What is an llms.txt file and should your brand have one?

April 13, 2026 · AI Visibility · 8 min read
An llms.txt file is a plain markdown document that sits at the root of your site and tells AI systems which content matters. It is one of the most-discussed technical artifacts in AI visibility right now, with over 844,000 sites adopting it by late 2025. It is also one of the most-overhyped. Google's John Mueller has publicly said no AI system currently reads it at inference time. Anthropic publishes one but has not confirmed Claude uses it. So the real question is not whether llms.txt is a silver bullet. It is whether it is worth the five minutes it takes to add. Our answer is yes, and this post explains why in enough detail that you can ship one today.

Where llms.txt came from

The format was proposed on September 3, 2024 by Jeremy Howard, Co-Founder of Answer.AI. Howard's rationale was simple: "Site authors know best, and can provide a list of content that an LLM should use." Robots.txt already solved the "stay out" problem for crawlers. Sitemap.xml solved the "here is everything" problem for search engines. Neither format helps an LLM answer the question every LLM actually needs answered: of the millions of pages on this domain, which few hundred are the canonical ones a model should care about? The spec lives at llmstxt.org and is deliberately short.

What the format looks like

An llms.txt file is markdown, served at the domain root as /llms.txt. The spec defines a handful of required and optional elements: an H1 with the project or brand name (required), a blockquote with a one-sentence summary, optional body paragraphs with context, and H2-delimited sections containing markdown links to the actual canonical content. That is the whole grammar. Here is a minimal working example for a fictional brand:

# Acme Software

> Acme Software builds developer tools that help teams ship faster. This document lists our canonical content for AI systems.

## Documentation
- [API Reference](https://example.com/docs/api): Complete API documentation with authentication, endpoints, and error codes.
- [Getting Started](https://example.com/docs/start): A 15-minute setup guide for new users.

## Blog
- [Engineering Blog](https://example.com/blog): Technical posts from our engineering team.

That is the entire artifact. No JSON. No YAML. No headings beyond the H1 and H2s. The LLM reads the file, parses the links, and in theory uses them as an authoritative map of the site.
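Because the grammar is so small, parsing it takes only a few lines. The sketch below shows one way to read an llms.txt file into a simple dictionary; the field names (`title`, `summary`, `sections`) are our own convention, since the spec defines the markdown shape but not a data model.

```python
import re

# Matches "- [text](url): optional description" per the llmstxt.org shape.
LINK_RE = re.compile(r"^-\s*\[(?P<text>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$")

def parse_llms_txt(text: str) -> dict:
    """Parse an llms.txt document into title, summary, and H2 link sections."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()      # required H1: project or brand name
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()    # optional blockquote summary
        elif line.startswith("## "):
            current = line[3:].strip()           # each H2 opens a new link section
            doc["sections"][current] = []
        elif current and (m := LINK_RE.match(line)):
            doc["sections"][current].append(
                {"text": m["text"], "url": m["url"], "desc": m["desc"] or ""}
            )
    return doc
```

Running this over the Acme example above yields a title of "Acme Software" and two sections, Documentation and Blog, each holding its links and descriptions.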

The llms-full.txt variant

The spec also defines a companion file at /llms-full.txt, which contains the full markdown content of the listed pages concatenated into a single document. The idea is that an LLM can pull one file and get the entire canonical corpus in one retrieval, without needing to fetch each linked page individually. For documentation-heavy sites this is useful because the model gets the context without the cost of multiple HTTP round-trips. For marketing sites it matters less, because most marketing copy is not the content you want an LLM ingesting verbatim.

Ship llms.txt first. Add llms-full.txt only if your site has documentation worth centralizing in one file. Most brands do not need the full variant.
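If you do decide to ship the full variant, generating it is mostly concatenation. Here is a hedged sketch that assumes your canonical pages already exist as local markdown source files; the `---` separator is our own convention, since the spec only says the full variant contains all the content in one file.

```python
from pathlib import Path

def build_llms_full(doc_paths: list[Path], out_path: Path) -> None:
    """Concatenate local markdown sources into a single llms-full.txt.

    Assumes each canonical page has a markdown source on disk. The
    horizontal-rule separator between documents is a convention we chose,
    not a spec requirement.
    """
    parts = [path.read_text(encoding="utf-8").strip() for path in doc_paths]
    out_path.write_text("\n\n---\n\n".join(parts) + "\n", encoding="utf-8")
```

Wire this into your docs build so /llms-full.txt regenerates whenever the underlying pages change, rather than maintaining it by hand.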

Adoption and the reality check

By October 25, 2025, over 844,000 websites had published an llms.txt file, according to PPC Land's adoption tracking. That is a large number until you notice the article's headline: adoption has stalled because major AI platforms do not appear to be using the file. Google's John Mueller publicly stated that no AI system currently reads llms.txt at inference time. Anthropic publishes one on its own site but has not confirmed Claude reads it when answering questions. OpenAI has said nothing. Perplexity has said nothing.

So why has anyone bothered? Two reasons:

  1. Platform defaults. Mintlify rolled out llms.txt across every hosted docs site on its platform in November 2025, which means customers including Anthropic, Cursor, Pinecone, and Windsurf got the file added automatically as part of their existing hosting relationship. That single decision accounted for a big chunk of the 844,000 number.
  2. Hedge-bet signal. Vercel reported in the same Mintlify post that 10 percent of its signups now come from ChatGPT referrals, which is a strong enough signal about AI-driven traffic to justify any cheap hedge that might help it grow.

Why we add it to every engagement anyway

The honest argument for llms.txt is not that it works today. It is that it costs nothing to add, it might start working tomorrow, and if it does, the brands who already have the file will be indexed first. This is standard hedge-bet logic. The file takes five minutes to write, serves at zero CPU cost, and requires zero maintenance after the initial setup. There is no credible downside.

There is also a migration argument. If a major LLM provider announces support for llms.txt next quarter, every site that already has one gets the benefit automatically. Every site that does not has to scramble. Shipping the file now is future-proofing, not present optimization.

We set up llms.txt as part of the week-one technical baseline on every engagement. It lives in the same checklist as robots.txt audits and schema markup review. The five minutes it takes is not worth debating internally. What matters is the strategic layer above it, which is deciding which content to highlight in the file. That part actually moves metrics.

How to structure yours for a brand site

The spec gives you a format. It does not tell you what to put in it. Here is how we think about the content decisions for a brand site.

Start with the H1 and a single-sentence summary that would make sense to a model that has never heard of your company. Then divide the links into sections that match how an LLM would categorize a query about you. The usual sections are Documentation, Blog, Product, Case Studies, and About. Each section holds markdown links to the canonical pages for that category, with a short description after each link explaining what the page actually contains. The descriptions are not decoration. They are the context the model will use to decide which link to fetch when it needs to answer a specific question.

Keep the file under 100 links for most brand sites. If you have more than 100 pages worth including, you probably have a taxonomy problem that needs to be solved before the file will help. The job of llms.txt is to reduce noise. Listing everything defeats the purpose.
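The 100-link guideline is easy to enforce automatically. This is a minimal linter sketch under our own conventions: the threshold comes from the guidance above, not from the spec, and the only hard requirement it checks is the H1.

```python
import re

MAX_LINKS = 100  # our guideline from this post, not a spec requirement

def lint_llms_txt(text: str, max_links: int = MAX_LINKS) -> list[str]:
    """Return a list of warnings for an llms.txt file; empty means it passes."""
    warnings = []
    links = re.findall(r"\[[^\]]+\]\([^)]+\)", text)
    if not text.lstrip().startswith("# "):
        warnings.append("missing required H1 title")
    if not links:
        warnings.append("no markdown links found")
    elif len(links) > max_links:
        warnings.append(f"{len(links)} links; trim below {max_links} to reduce noise")
    return warnings
```

Dropping a check like this into CI keeps the file honest as the site grows, which is when the "list everything" temptation creeps in.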

What llms.txt does not replace

llms.txt is not a robots.txt substitute. Crawlers still need explicit allow and disallow rules, which is why we maintain a separate post on the AI bots robots.txt guide. It is not a sitemap substitute either, because sitemap.xml is still the file search engines use for discovery. It is not schema markup and does not affect how your pages appear in Google AI Overviews. For that discussion, read schema.org markup for AI citations in 2026. llms.txt is a new file with a narrow job: tell AI systems which of your pages matter. Everything else still applies.

The honest verdict

Add llms.txt. Do not expect miracles. Write the file, push it to the root of your site, and move on to work that actually moves the metric. If you want the list of things that do move the metric, the 2026 GEO guide is the long version, and the 90-day GEO program is the short version. llms.txt is one item on a 30-item checklist. It is not the top of the list and not the bottom. It is a cheap, low-risk addition that takes less time to ship than to argue about.

Conclusion

llms.txt is a low-cost hedge, not a silver bullet. Google says no AI system currently reads it. Over 844,000 sites have added one anyway. The cost of shipping it is five minutes. The cost of being the last brand to add it if an LLM provider announces support is higher than that. Write the file, keep it under 100 links, describe each link honestly, and stop thinking about it. Spend the saved energy on the content and community work that actually moves AI citation rates.

How Soar saves you time and money

llms.txt is one of the technical low-cost hedges we set up automatically in every engagement, on the week-one checklist next to robots.txt audits, schema markup review, and the baseline AI visibility audit. The strategic layer above it is deciding which content to highlight for AI systems and which of the four AI engines to prioritize for your specific buyer profile. That strategic work is where we actually spend our time.

A do-it-yourself AI visibility program typically spends weeks debating whether llms.txt is worth the effort while missing the much larger opportunities in source intervention, prompt tracking, and community content. We compress the technical baseline into the first week so the rest of the engagement can focus on the work that actually moves metrics. If you want a week-one technical baseline that covers llms.txt, robots.txt, schema, and the baseline AI visibility audit, request a proposal and we will scope it for your specific stack.
