AI Search vs Google Search: What Actually Changed
Google gives you a list of links. ChatGPT gives you a name. That is the simplest way to describe the shift.
We built an open-source tool called canonry that monitors both sides of this. It tracks indexing via Google Search Console and Bing Webmaster Tools, and separately runs queries against ChatGPT, Gemini, Claude, and Perplexity to record which businesses get cited. The two systems overlap in interesting ways, but they are not the same. Here is what the data shows.
The fundamental difference: lists vs answers
Google returns a ranked list of 10 pages and the user clicks one. AI search returns a direct answer with 3 to 5 names and reasons. There is no page two. There is no position 7 that still gets some traffic.
In one canonry dataset (11 keywords, 66 runs for a local service business), the split is binary. For branded + location queries where the site is well-positioned, it gets cited 82-90% of the time. For informational queries where the site has no content, citation rate is 0%. There is no "almost cited" or "showing up on the second page of AI results." You are in the answer or you are not.
Where the data comes from
Each AI system has its own retrieval pipeline. The exact details are not fully public, and they change frequently. Here is what we know and what we have observed:
ChatGPT (OpenAI)
- Has its own crawler (OAI-SearchBot) and browses the web in real time
- Has been observed pulling from both Bing and Google for web results
- Also relies on training data with a knowledge cutoff
- What this means: Being indexed by both Google and Bing gives you the best shot. The exact retrieval mix appears to shift over time.
Gemini (Google)
- Appears to use Google's search index for "grounding" answers with web data
- Also draws on Knowledge Graph data
- What this means: Google indexing seems critical for Gemini specifically. In canonry data, sites with 0 indexed pages in GSC were invisible to Gemini entirely.
Perplexity
- Runs its own real-time web search with visible source citations
- Has its own crawler (PerplexityBot)
- What this means: Perplexity appears to re-fetch aggressively. Freshness and accessibility seem to matter most here.
Claude (Anthropic)
- Has web search capabilities and its own crawler (ClaudeBot)
- Also relies on training data
- What this means: Claude can pull live web data, though the specifics of its retrieval system are less documented than the others.
Important caveat: These retrieval systems are opaque. We are making educated guesses based on announced features, observed behavior in canonry monitoring, and published documentation. Any of this could change tomorrow. The practical takeaway: do not bet on understanding any one provider's pipeline. Be indexed and structured well enough that any retrieval system can find and parse your content.
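One concrete accessibility check: make sure none of those crawlers are shut out by robots.txt. Below is a minimal sketch in TypeScript (Node 18+ for the built-in fetch). The bot names are the published user-agent tokens mentioned above plus Googlebot and Bingbot; the parsing is deliberately naive, a smoke test rather than a spec-compliant robots.txt parser.

```typescript
// Rough check: fetch robots.txt and flag AI crawlers that appear to be fully
// blocked. This is a smoke test, not a spec-compliant robots.txt parser.
const AI_BOTS = ["OAI-SearchBot", "PerplexityBot", "ClaudeBot", "Googlebot", "Bingbot"];

async function checkRobots(origin: string): Promise<void> {
  const res = await fetch(new URL("/robots.txt", origin));
  if (!res.ok) {
    console.log(`No robots.txt (${res.status}); nothing is blocked by it.`);
    return;
  }

  const blocked = new Set<string>();
  let currentAgents: string[] = [];
  let inRules = false;

  for (const raw of (await res.text()).split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();

    if (key === "user-agent") {
      // Consecutive User-agent lines share one rule group; a User-agent line
      // appearing after rules starts a new group.
      if (inRules) {
        currentAgents = [];
        inRules = false;
      }
      currentAgents.push(value);
    } else if (key === "disallow") {
      inRules = true;
      // "Disallow: /" shuts the listed agents out of the whole site.
      if (value === "/") currentAgents.forEach((a) => blocked.add(a));
    } else {
      inRules = true; // Allow, Crawl-delay, etc. also end the agent list
    }
  }

  for (const bot of AI_BOTS) {
    // Naive: does not account for a bot-specific group overriding the * group.
    const fullyBlocked = blocked.has(bot) || blocked.has("*");
    console.log(`${bot}: ${fullyBlocked ? "blocked by Disallow: /" : "not fully blocked"}`);
  }
}

checkRobots("https://yourbusiness.com").catch(console.error);
```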
The key insight: these are independent systems. Canonry data shows queries where a site gets cited by one provider but not another. Optimizing only for Google and assuming AI will follow is a mistake.
Signals that overlap vs signals that diverge
Both Google and AI care about:
- Content quality. Thin content fails in both systems. Google's helpful content guidelines are a reasonable baseline.
- Authority. Strong backlinks and external mentions help in both, though the mechanisms differ.
- Technical health. Clean HTML, HTTPS, fast load times. Table stakes.
AI models weight these more heavily:
- Structured data. In the aeo-audit scoring framework (an open-source tool that scores any URL across 13 factors), structured data carries the highest weight of any single factor (12 of the 100 available points). The cited site in the dataset scores 100/100 on schema; the uncited site scores 42/100. AI models parse JSON-LD more reliably than unstructured HTML (a minimal example follows this list).
- Content extractability. This is the gap most SEO-optimized sites miss. Our cited site scores only 65/100 on extractability despite strong content depth (87/100). The content exists but the markup makes it harder to parse. Sites built with heavy page builders score worse here.
- Entity consistency. AI models cross-reference your business across the web. NAP (name, address, phone) consistency matters for AI in a way it has not mattered for Google ranking in years. BrightLocal's citation research covers the fundamentals.
- Definition blocks. "X is Y" opening statements, for example a service page that opens with "[Business name] is a [service type] company serving [city]." Google does not care whether your page starts with a definition. AI models do, because they need something to extract as an answer. The uncited site in the dataset scores 0/100 on this factor.
- llms.txt. A machine-readable file for AI crawlers. Does nothing for Google ranking. Does a lot for AI discoverability.
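To make the structured data point concrete, here is a minimal sketch of LocalBusiness JSON-LD expressed as a TypeScript object. Every business detail in it is a placeholder; in practice you would serialize the object into a script tag of type application/ld+json in your page template.

```typescript
// Minimal LocalBusiness JSON-LD, expressed as a TypeScript object.
// All business details below are placeholders.
const localBusinessSchema = {
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  name: "Example Plumbing Co.",
  url: "https://yourbusiness.com",
  telephone: "+1-555-555-0100",
  address: {
    "@type": "PostalAddress",
    streetAddress: "123 Example Ave",
    addressLocality: "Brooklyn",
    addressRegion: "NY",
    postalCode: "11201",
    addressCountry: "US",
  },
  areaServed: "Brooklyn, NY",
  sameAs: [
    "https://www.facebook.com/examplebusiness",
    "https://www.yelp.com/biz/examplebusiness",
  ],
};

// Emit the script tag server-side or in your page template.
const jsonLdTag = `<script type="application/ld+json">${JSON.stringify(localBusinessSchema)}</script>`;
```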
AI models care less about:
- Keyword density. AI understands semantics. Keyword stuffing does not help.
- Internal linking structure. Google uses internal links for crawling and authority flow. AI models care more about what is on the page they are reading.
- Meta descriptions for ranking. AI models extract from page content, not meta tags.
The indexing disconnect
Here is something that shows up regularly in canonry data: a site is fully indexed by Google (Search Console shows all pages crawled and indexed) but gets zero citations from Gemini.
This happens because Google indexing and Gemini grounding are not the same thing. Google knowing your page exists does not mean Gemini will use it in an answer. Gemini applies additional signals beyond indexing: entity strength, content relevance, answer quality, and competitive alternatives.
The reverse also happens. A site can be poorly indexed by Google but picked up by Perplexity's real-time search because Perplexity crawls independently.
This is why monitoring across providers matters. Canonry runs the same queries against multiple AI systems and tracks citation state independently. Without that, you are guessing about which providers see you and which do not.
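The shape of that monitoring data is simple. A hypothetical record and rate calculation might look like the sketch below; this is illustrative only, not canonry's actual schema.

```typescript
// Illustrative shape for per-provider citation tracking -- not canonry's
// actual schema. One record per (query, provider, run).
type Provider = "chatgpt" | "gemini" | "claude" | "perplexity";

interface CitationRecord {
  query: string;          // the prompt sent to every provider
  provider: Provider;
  runAt: string;          // ISO timestamp of the run
  cited: boolean;         // did the answer name the business?
  citedDomains: string[]; // which domains the answer pointed to instead, or as well
}

// Citation rate per provider for one query -- the number that exposes
// "cited by one provider but not another".
function citationRate(records: CitationRecord[], query: string, provider: Provider): number {
  const runs = records.filter((r) => r.query === query && r.provider === provider);
  if (runs.length === 0) return 0;
  return runs.filter((r) => r.cited).length / runs.length;
}
```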
Google is also becoming an answer engine
Google's AI Overviews are blurring the line. When you search on Google now, you often see an AI-generated summary above traditional results. This means Google itself is applying the same kind of extraction logic that ChatGPT and Gemini use.
The same structured data and extractable content that helps you get cited by ChatGPT also helps you appear in Google's AI Overviews. The investment is the same. The surface area is expanding.
What the data suggests you do
Based on the citation monitoring and audit scores described above:
- Submit your sitemap to both Google and Bing. Gemini appears to rely on Google's index. ChatGPT has been observed using both Bing and Google. Being indexed by both gives you the broadest coverage across all AI providers. Bing Webmaster Tools takes five minutes.
- Add structured data. The biggest gap between cited and uncited sites in the dataset is schema quality (100 vs 42). Start with LocalBusiness and Service. The schema guide has copy-pasteable JSON-LD.
- Fix your opening paragraphs. Add definition blocks to your key pages. The uncited site scores 0 here. This is a 15-minute fix that changes how models parse your page.
- Publish llms.txt. The cited site scores 100/100 on AI-readable content; the uncited site scores 56/100. An llms.txt file is part of that gap (a minimal sketch follows these steps).
- Monitor across providers. Do not check one AI system and assume the others agree. Run a free audit for your baseline, then set up monitoring:
```bash
# Point-in-time audit across 13 factors
npx @ainyc/aeo-audit@latest "https://yourbusiness.com"
```
Canonry handles the ongoing monitoring. Both are open source because the measurement gap in AI search should not be a bottleneck.
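For reference, here is a minimal llms.txt sketch following the proposed llmstxt.org format: an H1 with the business name, a one-line blockquote summary, then sections of links. The content is a placeholder, shown as a TypeScript string; serve it as plain text at /llms.txt.

```typescript
// Placeholder llms.txt content in the proposed llmstxt.org format.
// Serve this string as plain text at https://yourbusiness.com/llms.txt.
const llmsTxt = `# Example Plumbing Co.

> Licensed plumbing company serving Brooklyn, NY. Emergency repairs, drain cleaning, and water heater installation.

## Services
- [Drain cleaning](https://yourbusiness.com/services/drain-cleaning): Same-day drain and sewer service
- [Water heaters](https://yourbusiness.com/services/water-heaters): Installation, repair, and replacement

## About
- [Service area](https://yourbusiness.com/service-area): Neighborhoods we cover and typical response times
`;
```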
FAQ
Is AI search replacing Google?
Not replacing, but supplementing. Our monitoring data shows that AI models and Google operate as independent systems with different retrieval methods. A site can rank well on Google but get zero AI citations. Businesses need to optimize for both.
Do I need to optimize for both AI search and Google?
Yes. In our data, the optimized site scores 90/100 on AEO factors but still has gaps in extractability and definition blocks. Strong Google rankings help but do not guarantee AI citation.
Which AI search engine matters most?
Depends on your audience. Each provider has its own retrieval system: ChatGPT has been observed using both Bing and Google, Gemini appears to use Google's index, Perplexity runs its own search, and Claude has web search capabilities. These systems are opaque and change over time. Monitoring across all of them is the only way to know where you are visible.
Can AI search engines show wrong information about my business?
Yes. In our monitoring, we have seen models cite businesses with outdated information. Structured data and entity consistency reduce this risk by giving models authoritative facts to work with.
Try it yourself.
Run a free AEO audit to see how your site scores, or explore the tools and pages referenced in this article.