What is an llms.txt file and should your brand have one?

April 13, 2026 · AI Visibility · 8 min read
An llms.txt file is a plain markdown document that sits at the root of your site and tells AI systems which content matters. It is one of the most-discussed technical artifacts in AI visibility right now, with over 844,000 sites adopting it by late 2025. It is also one of the most-overhyped. Google's John Mueller has publicly said no AI system currently reads it at inference time. Anthropic publishes one but has not confirmed Claude uses it. So the real question is not whether llms.txt is a silver bullet. It is whether it is worth the five minutes it takes to add. Our answer is yes, and this post explains why in enough detail that you can ship one today.

Where llms.txt came from

The format was proposed on September 3, 2024 by Jeremy Howard, Co-Founder of Answer.AI. Howard's rationale was simple: "Site authors know best, and can provide a list of content that an LLM should use." Robots.txt already solved the "stay out" problem for crawlers. Sitemap.xml solved the "here is everything" problem for search engines. Neither format helps an LLM answer the question every LLM actually needs answered: of the millions of pages on this domain, which few hundred are the canonical ones a model should care about? The spec lives at llmstxt.org and is deliberately short.

What the format looks like

An llms.txt file is markdown, served at the domain root as /llms.txt. The spec defines a handful of required and optional elements: an H1 with the project or brand name (required), a blockquote with a one-sentence summary, optional body paragraphs with context, and H2-delimited sections containing markdown links to the actual canonical content. That is the whole grammar. Here is a minimal working example for a fictional brand:

# Acme Software

> Acme Software builds developer tools that help teams ship faster. This document lists our canonical content for AI systems.

## Documentation
- [API Reference](https://example.com/docs/api): Complete API documentation with authentication, endpoints, and error codes.
- [Getting Started](https://example.com/docs/start): A 15-minute setup guide for new users.

## Blog
- [Engineering Blog](https://example.com/blog): Technical posts from our engineering team.

That is the entire artifact. No JSON. No YAML. No headings beyond the H1 and H2s. The LLM reads the file, parses the links, and in theory uses them as an authoritative map of the site.
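Because the grammar is so small, parsing it takes only a few lines. The sketch below shows one way to read an llms.txt file into a simple dictionary; the field names (`title`, `summary`, `sections`) are our own convention, since the spec defines the markdown shape but not a data model.

```python
import re

# Matches "- [text](url): optional description" per the llmstxt.org shape.
LINK_RE = re.compile(r"^-\s*\[(?P<text>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$")

def parse_llms_txt(text: str) -> dict:
    """Parse an llms.txt document into title, summary, and H2 link sections."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()      # required H1: project or brand name
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()    # optional blockquote summary
        elif line.startswith("## "):
            current = line[3:].strip()           # each H2 opens a new link section
            doc["sections"][current] = []
        elif current and (m := LINK_RE.match(line)):
            doc["sections"][current].append(
                {"text": m["text"], "url": m["url"], "desc": m["desc"] or ""}
            )
    return doc
```

Running this over the Acme example above yields a title of "Acme Software" and two sections, Documentation and Blog, each holding its links and descriptions.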

The llms-full.txt variant

The spec also defines a companion file at /llms-full.txt, which contains the full markdown content of the listed pages concatenated into a single document. The idea is that an LLM can pull one file and get the entire canonical corpus in one retrieval, without needing to fetch each linked page individually. For documentation-heavy sites this is useful because the model gets the context without the cost of multiple HTTP round-trips. For marketing sites it matters less, because most marketing copy is not the content you want an LLM ingesting verbatim.

Ship llms.txt first. Add llms-full.txt only if your site has documentation worth centralizing in one file. Most brands do not need the full variant.
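If you do decide to ship the full variant, generating it is mostly concatenation. Here is a hedged sketch that assumes your canonical pages already exist as local markdown source files; the `---` separator is our own convention, since the spec only says the full variant contains all the content in one file.

```python
from pathlib import Path

def build_llms_full(doc_paths: list[Path], out_path: Path) -> None:
    """Concatenate local markdown sources into a single llms-full.txt.

    Assumes each canonical page has a markdown source on disk. The
    horizontal-rule separator between documents is a convention we chose,
    not a spec requirement.
    """
    parts = [path.read_text(encoding="utf-8").strip() for path in doc_paths]
    out_path.write_text("\n\n---\n\n".join(parts) + "\n", encoding="utf-8")
```

Wire this into your docs build so /llms-full.txt regenerates whenever the underlying pages change, rather than maintaining it by hand.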

Adoption and the reality check

By October 25, 2025, over 844,000 websites had published an llms.txt file, according to PPC Land's adoption tracking. That is a large number until you notice the article's headline: adoption has stalled because major AI platforms do not appear to be using the file. Google's John Mueller publicly stated that no AI system currently reads llms.txt at inference time. Anthropic publishes one on its own site but has not confirmed Claude reads it when answering questions. OpenAI has said nothing. Perplexity has said nothing.

So why has anyone bothered? Two reasons:

  1. Platform defaults. Mintlify rolled out llms.txt across every hosted docs site on its platform in November 2025, which means customers including Anthropic, Cursor, Pinecone, and Windsurf got the file added automatically as part of their existing hosting relationship. That single decision accounted for a big chunk of the 844,000 number.
  2. Hedge-bet signal. Vercel reported in the same Mintlify post that 10 percent of its signups now come from ChatGPT referrals, which is a strong enough signal about AI-driven traffic to justify any cheap hedge that might help it grow.

Why we add it to every engagement anyway

The honest argument for llms.txt is not that it works today. It is that it costs nothing to add, it might start working tomorrow, and if it does, the brands who already have the file will be indexed first. This is standard hedge-bet logic. The file takes five minutes to write, serves at zero CPU cost, and requires zero maintenance after the initial setup. There is no credible downside.

There is also a migration argument. If a major LLM provider announces support for llms.txt next quarter, every site that already has one gets the benefit automatically. Every site that does not has to scramble. Shipping the file now is future-proofing, not present optimization.

We set up llms.txt as part of the week-one technical baseline on every engagement. It lives in the same checklist as robots.txt audits and schema markup review. The five minutes it takes is not worth debating internally. What matters is the strategic layer above it, which is deciding which content to highlight in the file. That part actually moves metrics.

How to structure yours for a brand site

The spec gives you a format. It does not tell you what to put in it. Here is how we think about the content decisions for a brand site.

Start with the H1 and a single-sentence summary that would make sense to a model that has never heard of your company. Then divide the links into sections that match how an LLM would categorize a query about you. The usual sections are Documentation, Blog, Product, Case Studies, and About. Each section holds markdown links to the canonical pages for that category, with a short description after each link explaining what the page actually contains. The descriptions are not decoration. They are the context the model will use to decide which link to fetch when it needs to answer a specific question.

Keep the file under 100 links for most brand sites. If you have more than 100 pages worth including, you probably have a taxonomy problem that needs to be solved before the file will help. The job of llms.txt is to reduce noise. Listing everything defeats the purpose.
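The 100-link guideline is easy to enforce automatically. This is a minimal linter sketch under our own conventions: the threshold comes from the guidance above, not from the spec, and the only hard requirement it checks is the H1.

```python
import re

MAX_LINKS = 100  # our guideline from this post, not a spec requirement

def lint_llms_txt(text: str, max_links: int = MAX_LINKS) -> list[str]:
    """Return a list of warnings for an llms.txt file; empty means it passes."""
    warnings = []
    links = re.findall(r"\[[^\]]+\]\([^)]+\)", text)
    if not text.lstrip().startswith("# "):
        warnings.append("missing required H1 title")
    if not links:
        warnings.append("no markdown links found")
    elif len(links) > max_links:
        warnings.append(f"{len(links)} links; trim below {max_links} to reduce noise")
    return warnings
```

Dropping a check like this into CI keeps the file honest as the site grows, which is when the "list everything" temptation creeps in.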

What llms.txt does not replace

llms.txt is not a robots.txt substitute. Crawlers still need explicit allow and disallow rules, which is why we maintain a separate post on the AI bots robots.txt guide. It is not a sitemap substitute either, because sitemap.xml is still the file search engines use for discovery. It is not schema markup and does not affect how your pages appear in Google AI Overviews. For that discussion, read schema.org markup for AI citations in 2026. llms.txt is a new file with a narrow job: tell AI systems which of your pages matter. Everything else still applies.

The honest verdict

Add llms.txt. Do not expect miracles. Write the file, push it to the root of your site, and move on to work that actually moves the metric. If you want the list of things that do move the metric, the 2026 GEO guide is the long version, and the 90-day GEO program is the short version. llms.txt is one item on a 30-item checklist. It is not the top of the list and not the bottom. It is a cheap, low-risk addition that takes less time to ship than to argue about.

Conclusion

llms.txt is a low-cost hedge, not a silver bullet. Google says no AI system currently reads it. Over 844,000 sites have added one anyway. The cost of shipping it is five minutes. The cost of being the last brand to add it if an LLM provider announces support is higher than that. Write the file, keep it under 100 links, describe each link honestly, and stop thinking about it. Spend the saved energy on the content and community work that actually moves AI citation rates.

How Soar saves you time and money

llms.txt is one of the technical low-cost hedges we set up automatically in every engagement, on the week-one checklist next to robots.txt audits, schema markup review, and the baseline AI visibility audit. The strategic layer above it is deciding which content to highlight for AI systems and which of the four AI engines to prioritize for your specific buyer profile. That strategic work is where we actually spend our time.

A do-it-yourself AI visibility program typically spends weeks debating whether llms.txt is worth the effort while missing the much larger opportunities in source intervention, prompt tracking, and community content. We compress the technical baseline into the first week so the rest of the engagement can focus on the work that actually moves metrics. If you want a week-one technical baseline that covers llms.txt, robots.txt, schema, and the baseline AI visibility audit, request a proposal and we will scope it for your specific stack.
