
Bots Now Outnumber Humans on the Web's HTML Pages

Arber Xhindoli · June 3, 2026 · 4 min read

Two Cloudflare Radar charts. The top chart, Bot vs. Human filtered to HTML content, shows bots at 57.5% and humans at 42.5%. The bottom chart shows HTTP responses by content type: JSON 33.1%, images 12.7%, HTML 12%, JavaScript 8.1%, plain text 6.4%, video 3.7%, other 24%.

Source: Cloudflare Radar, the seven days ending June 3, 2026.

A line just got crossed, quietly, and it changes who your website is for.

For the week ending June 3, 2026, Cloudflare Radar's Bot vs. Human view, filtered to HTML responses, put automated traffic at 57.5% and humans at 42.5%. On the part of the web that represents actual pages, machines are now the majority of requests.

What the chart is actually measuring

The number that matters here is scoped on purpose. Cloudflare filtered it to HTML responses, the documents a person opens in a browser. That filter is the point. It strips out the API calls, the image and font fetches, and the background chatter, and leaves the thing you think of as "a web page." On exactly that surface, bots passed humans.

Look one panel down and you can see why the filter matters. Across all HTTP responses by content type, HTML is only about 12% of the total. JSON leads at 33.1%, images are 12.7%, JavaScript is 8.1%, plain text 6.4%, video 3.7%, and everything else 24%. Most of the web's raw volume is plumbing: APIs moving JSON, browsers pulling assets. HTML, the human-readable page, is a thin slice of all that traffic. And that thin slice is the one that just tipped to a bot majority.

The bots reading your pages pull HTML

Here is the part that matters for anyone who publishes content. The automated traffic hitting your pages is not abstract. A large and growing share of it is AI: crawlers that index and train, and answer-engine fetchers that pull a page live to answer a question someone just asked.

All of them read the same thing: the HTML you return. GPTBot, ClaudeBot, PerplexityBot and the rest send a plain HTTP request and parse the HTML response. They generally do not run a browser or execute your client-side JavaScript. We covered the mechanics of this in why Google Analytics misses AI traffic: no browser, no JavaScript engine, no analytics tag firing. Just a request, an HTML response, and a reader on the other end that happens to be a model.

So the version of your page a machine understands is the raw HTML you ship. Not the hydrated app a human sees after the JavaScript runs. Not the number baked into a chart image. It is the text, the headings, the links, and the structured data that arrive in the initial HTML response.
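
To make that concrete, here is a minimal Python sketch of reading a page the way a non-rendering fetcher might. It uses only the standard library, never executes scripts, and collects the headings, links, visible text, and JSON-LD found in the raw HTML. The class name RawHtmlView, the sample markup, and the Widget facts are invented for illustration; real crawlers use their own parsers, so treat this as an approximation rather than a description of any particular bot.

```python
import json
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class RawHtmlView(HTMLParser):
    """Collects what is readable in the initial HTML without running scripts."""

    def __init__(self):
        super().__init__()
        self.headings, self.links, self.json_ld, self.text = [], [], [], []
        self._mode = None  # None, "heading", "json_ld", or "skip"
        self._buf = []     # text collected for the current heading or JSON-LD block

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "script":
            is_ld = attrs.get("type") == "application/ld+json"
            self._mode = "json_ld" if is_ld else "skip"
        elif tag == "style":
            self._mode = "skip"
        elif tag in HEADINGS:
            self._mode = "heading"

    def handle_endtag(self, tag):
        if tag == "script" and self._mode == "json_ld":
            self.json_ld.append(json.loads("".join(self._buf)))
        elif tag in HEADINGS and self._mode == "heading":
            self.headings.append(" ".join("".join(self._buf).split()))
        if tag in HEADINGS or tag in ("script", "style"):
            self._mode = None
            self._buf = []

    def handle_data(self, data):
        if self._mode in ("json_ld", "heading"):
            self._buf.append(data)
        # Ordinary script and style bodies are never treated as page text.
        if self._mode in (None, "heading") and data.strip():
            self.text.append(data.strip())


# Invented sample page: one fact in HTML, one JSON-LD block, one app mount point.
SAMPLE = """
<html><head>
<script type="application/ld+json">{"@type": "Product", "name": "Widget"}</script>
<script>document.title = "rendered later";</script>
</head><body>
<h1>Widget pricing</h1>
<p>The Widget costs $49. See <a href="/docs">the docs</a>.</p>
<div id="app"></div>
</body></html>
"""

view = RawHtmlView()
view.feed(SAMPLE)
view.close()

print("headings:", view.headings)
print("links:", view.links)
print("json-ld:", view.json_ld)
print("text:", view.text)
```

In the sample, the empty div with id "app" is where a client-side app would mount. Nothing that app would later render appears in this view, which is the gap the paragraph above describes.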

Your website's audience changed

Put the two facts together. Across Cloudflare's network, more than half of the requests for HTML pages now come from machines, and those machines read your HTML directly. By volume, the primary reader of a website is increasingly not a person scrolling. It is a model parsing markup.

You are now writing for two audiences at once. The human still needs the page to look good and read well. The machine needs the same facts to be present, plain, and parseable in the HTML. When those two pull apart, with the polished version living in JavaScript and images while the HTML stays thin, the machine reader, now the majority, gets the worse version of your page.

What to do about it

This is the whole premise of answer engine optimization (AEO), and it is increasingly literal: write for the answer engine that may be, by request count, your biggest visitor.

Put facts in HTML text. Your claims, prices, locations, and entity details belong in the HTML, not locked inside images or rendered only after JavaScript runs. If a machine reader has to execute your app to learn a fact, assume many will not.
Add structured data. Schema.org JSON-LD lets a machine parse your claims without guessing. Mark up your organization, articles, FAQs, and products so the facts are unambiguous.
Ship machine-readable surfaces. A clean sitemap and an llms.txt file (a proposed convention, not a formal standard) help machine readers find and prioritize your important pages instead of inferring them.
Measure your own split. Cloudflare's figure is an aggregate. Classify your own server logs by user-agent, verified against published operator IP ranges, to see which engines read you and how often. Your analytics dashboard will not show this.
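
The measuring step above could be sketched roughly like this in Python. The sketch assumes access logs in the common "combined" format, matches a bot name in the user-agent, and then checks the client IP against that operator's ranges. Everything concrete here is a placeholder: BOT_RANGES uses reserved documentation networks rather than real operator ranges, the user-agent strings are simplified, and classify is a hypothetical helper. Real ranges have to come from each operator's published list.

```python
import ipaddress
import re
from collections import Counter

# Placeholder data: reserved documentation networks, NOT real operator ranges.
BOT_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "ClaudeBot": ["198.51.100.0/24"],
    "PerplexityBot": ["203.0.113.0/24"],
}

# Assumes combined log format: the client IP comes first and the
# user-agent is the last quoted field on the line.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<ua>[^"]*)"$')


def classify(line):
    """Label one log line by claimed bot name, verified against IP ranges."""
    match = LOG_LINE.match(line.strip())
    if not match:
        return "unparsed"
    try:
        ip = ipaddress.ip_address(match["ip"])
    except ValueError:
        return "unparsed"
    agent = match["ua"].lower()
    for name, cidrs in BOT_RANGES.items():
        if name.lower() in agent:
            # A user-agent is only a claim; the IP check is what verifies it.
            verified = any(ip in ipaddress.ip_network(c) for c in cidrs)
            return name if verified else f"{name} (unverified)"
    # "other" covers humans and every bot not listed above.
    return "other"


# Invented sample lines with simplified user-agent strings.
SAMPLE_LOG = [
    '192.0.2.10 - - [03/Jun/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "ExampleAgent/1.0 (compatible; GPTBot/1.0)"',
    '10.1.2.3 - - [03/Jun/2026:10:00:01 +0000] "GET / HTTP/1.1" 200 4096 "-" "ExampleAgent/1.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [03/Jun/2026:10:00:02 +0000] "GET /blog HTTP/1.1" 200 8192 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0"',
]

counts = Counter(classify(line) for line in SAMPLE_LOG)
for label, n in counts.most_common():
    print(f"{label}: {n}")
```

Keeping "unverified" as its own label is a deliberate choice: a request that claims to be a known crawler but arrives from outside its ranges is worth counting separately rather than trusting or discarding.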

The web did not announce that its audience changed. A chart crossed 50% and kept climbing. The pages that win the next few years will be the ones whose HTML answers well on its own, because that is the only version most of your visitors will ever read.

Want to know how a machine reads your site today? Run a free AEO check, or see how Canonry classifies AI traffic straight from your server logs.

FAQ

Is most web traffic now bots instead of humans?

On the surface that represents actual web pages, yes. Cloudflare Radar's Bot vs. Human metric, filtered to HTML responses, showed bots at 57.5% and humans at 42.5% for the week ending June 3, 2026. HTML responses are the documents a person opens in a browser, so this is the most direct measure of web page traffic. Across all HTTP responses, which include JSON APIs, images, scripts, and video, HTML is only about 12% of the total, so the overall mix can look different. But for the human-readable page itself, automated requests are now the majority.

Why does it matter that bots pull HTML?

Because HTML is what machine readers actually consume. AI crawlers and answer-engine fetchers like GPTBot, ClaudeBot, and PerplexityBot send a plain HTTP request and parse the HTML you return. They generally do not run a browser or execute your client-side JavaScript. So the version of your page they understand is the raw HTML: its text, its headings, and its structured data. If a fact only appears after JavaScript renders, or only inside an image, the machine reader can miss it.
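
One rough way to test for that last case is to list the facts you care about and check which ones appear in the visible text of the raw HTML. The Python sketch below does that with the standard library. The helper name missing_facts, the sample page, and the cafe details are all made up for illustration, and a plain substring match is a simplification that will miss reworded or reformatted facts.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Gathers text outside <script> and <style>, without executing anything."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def missing_facts(html, facts):
    """Return the facts that do not appear in the page's visible HTML text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse whitespace so line breaks in the markup do not hide a match.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [fact for fact in facts if fact.lower() not in text]


# Invented page: the price exists only in an image and in a script call.
PAGE = """
<h1>Acme Cafe</h1>
<p>Open daily in Lisbon.</p>
<img src="menu.png">
<script>render({"espresso": "4.50 EUR"})</script>
"""

print(missing_facts(PAGE, ["Acme Cafe", "Lisbon", "4.50 EUR"]))
```

Here the name and city are found, while the price is reported missing because it only exists inside the image and the script.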

Does this mean AI crawlers are the majority of my visitors?

It depends on your site, but the industry-wide direction is clear. Cloudflare's figure is an aggregate across the sites behind its network, not a guarantee for any single domain. The way to know your own split is to classify your server logs by user-agent and verify against published operator IP ranges, because browser analytics like Google Analytics cannot see most bot traffic at all.

What should I do if bots are the majority of my web page traffic?

Write the page so the HTML answers well on its own. Put your key facts, claims, and entity details in HTML text rather than in images or JavaScript-rendered widgets. Add schema.org structured data so machines can parse those claims unambiguously. Keep a clean sitemap and an llms.txt so machine readers can find your important pages. Then measure the split from your own server logs so you know which engines read you.
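
As an illustration of the structured-data step, the Python sketch below builds a schema.org Article block as JSON-LD and wraps it in the script tag that goes in the page's HTML. The helper json_ld_script is a hypothetical name, and the fields shown are a small subset chosen for this example, filled with this article's own title, author, and date. Which properties a given answer engine actually uses is not something this sketch can promise.

```python
import json


def json_ld_script(data):
    """Serialize a dict as a JSON-LD <script> tag for the initial HTML."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value can never close the script tag early.
    # "\/" is a valid JSON escape, so the data parses back unchanged.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Bots Now Outnumber Humans on the Web's HTML Pages",
    "author": {"@type": "Person", "name": "Arber Xhindoli"},
    "datePublished": "2026-06-03",
}

print(json_ld_script(article))
```

The point of emitting this server-side is that the facts arrive in the first HTML response, where a reader that never runs JavaScript can still parse them.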

