Step one: Search Console as the source of truth
If the client has an existing site, even a small one, Google Search Console is the highest-quality data source you will get. Export queries with their impression counts over the last ninety days and filter to queries with at most one impression (a query with no impressions at all never appears in the export); that set is your long tail. Twenty to forty percent of those queries are reachable with a single new leaf page each. Search Console under-reports queries with very low impression counts (under three), so the export underestimates the true tail by a factor of one and a half to two in our experience. The Google Search Central docs on Search Console explain the export format and the impression sampling.
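The filter in step one can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the column names "Top queries" and "Impressions" are assumptions about the export header and should be checked against the actual file, and the sample rows are invented for the example.

```python
import csv
import io


def low_impression_queries(csv_text, max_impressions=1,
                           query_col="Top queries",
                           impressions_col="Impressions"):
    """Return the queries whose impression count is at or below max_impressions.

    query_col and impressions_col are assumed header names; adjust them
    to match the export you actually downloaded.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    tail = []
    for row in reader:
        # Counts may carry thousands separators such as "1,200".
        impressions = int(row[impressions_col].replace(",", ""))
        if impressions <= max_impressions:
            tail.append(row[query_col])
    return tail


# Invented sample data standing in for a ninety-day export.
sample_export = """Top queries,Clicks,Impressions,CTR,Position
crm software,40,"1,200",3.33%,8.2
crm for two person law firm,0,1,0%,41.0
export crm contacts to vcard,0,1,0%,55.5
"""

long_tail = low_impression_queries(sample_export)
print(long_tail)
```

Raising max_impressions to two or three widens the set if the single-impression list turns out too thin to work with.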
Step two: Bing Webmaster Tools for the second-engine signal
Bing Webmaster Tools exposes a similar export with different sampling biases. Bing tends to surface query-string variations that Google's export elides, especially for B2B and developer-leaning topics. Combining the two exports and deduplicating on a normalized form (lowercase, strip punctuation, collapse whitespace) gives a list that is fifteen to thirty percent larger than the Google export alone. For surfaces with no existing site (a greenfield engagement), skip step one and start here; Bing will give you a usable baseline within twenty-four hours of site verification.
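The merge-and-deduplicate step can be sketched as follows. The function names normalize and merge_exports are hypothetical, the inputs are assumed to be plain lists of query strings already pulled from each export, and the sample queries are invented.

```python
import re


def normalize(query):
    """Lowercase, strip punctuation, collapse whitespace."""
    q = query.lower()
    # Drop anything that is not a letter, digit, underscore, or whitespace.
    q = re.sub(r"[^\w\s]", "", q)
    # Collapse runs of whitespace to one space and trim the ends.
    return re.sub(r"\s+", " ", q).strip()


def merge_exports(google_queries, bing_queries):
    """Combine both lists, keeping the first spelling seen for each normalized form."""
    seen = {}
    for query in list(google_queries) + list(bing_queries):
        key = normalize(query)
        if key and key not in seen:
            seen[key] = query
    return list(seen.values())


# Invented sample queries.
google = ["CRM  for law-firms?", "crm pricing"]
bing = ["crm for lawfirms", "CRM Pricing!", "crm api rate limits"]

merged = merge_exports(google, bing)
print(merged)
```

Removing punctuation outright is lossy for developer-leaning queries ("c++" and "c" collapse to the same key), so it may be worth reviewing the collisions by hand before discarding the duplicates.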
Steps three and four: autosuggest scrape and manual expansion
Step three is a polite autosuggest scrape: seed the Google and Bing suggest endpoints with the head term you are targeting, walk the suggestion tree two levels deep, and collect every result. Twenty-six head terms plus the alphabet produce six to eight hundred candidate phrases for a typical surface. Step four is a manual sit-down in a spreadsheet, scoring each candidate by intent (informational, transactional, navigational) and by proximity to the product. Anything scoring below five out of ten on proximity gets cut. The remaining set is your candidate slug list. For this surface, the manual expansion produced thirty-six candidates; we shipped twenty-two.
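The two-level walk in step three can be sketched without committing to any particular endpoint. In this illustration fetch_suggestions is a hypothetical callable supplied by the caller (it takes a phrase and returns a list of suggestions); the real request, rate limits, and terms of service are left to that function. Seeding with the head term followed by each letter is one reading of "plus the alphabet" and is an assumption, as is the stub data.

```python
import string
import time


def walk_suggestions(head_term, fetch_suggestions, depth=2, delay_seconds=0.0):
    """Walk the suggestion tree rooted at head_term, level by level.

    fetch_suggestions is a caller-supplied (hypothetical) function:
    phrase -> list of suggested phrases.
    """
    # Assumed seeding: the bare head term, then the head term plus each letter.
    frontier = [head_term] + [f"{head_term} {c}" for c in string.ascii_lowercase]
    found = set()
    for _ in range(depth):
        next_frontier = []
        for phrase in frontier:
            for suggestion in fetch_suggestions(phrase):
                if suggestion not in found:
                    found.add(suggestion)
                    next_frontier.append(suggestion)
            if delay_seconds:
                time.sleep(delay_seconds)  # pause between requests to stay polite
        frontier = next_frontier
    return sorted(found)


# Stub standing in for a real suggest endpoint; the data is invented.
STUB = {
    "crm": ["crm software", "crm pricing"],
    "crm a": ["crm api"],
    "crm software": ["crm software for startups"],
    "crm api": ["crm api rate limits"],
}


def stub_fetch(phrase):
    return STUB.get(phrase, [])


candidates = walk_suggestions("crm", stub_fetch)
print(candidates)
```

Passing the fetcher in as an argument keeps the walk testable offline and lets the same loop serve both engines; the results from each can then go through the same deduplication as the exports.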
