
LLMs.txt Explained: What It Is, How to Set It Up, and Whether It Actually Works

The llms.txt specification is the most discussed technical GEO signal right now. We cover the format, test platform support across five AI systems, and give you an honest verdict on whether it belongs in your stack.

8 min read · Intermediate · Technical · Rajiv, Archon Media · Updated April 2026
TL;DR

LLMs.txt is a plain-text file at your domain root that tells AI crawlers what your site contains and which pages matter most. It's confirmed active on Perplexity. Adoption by ChatGPT, Claude, and Google remains unclear as of early 2026. Worth implementing: it takes about 10 minutes and carries no real downside.

LLMs.txt is a plain-text markdown file you place at the root of your website — at yourdomain.com/llms.txt — that tells AI systems what your site contains, which pages are most important, and how to accurately describe what you do. It's a voluntary convention, not a web standard, but it's gaining traction among the crawlers that power AI-generated answers.

Think of it as the AI-native version of robots.txt — except instead of telling crawlers what to avoid, it tells them what to prioritize and how to understand it.

This piece covers where it came from, how the format works, which platforms actually read it, how to implement it in under 10 minutes, and what the honest evidence says about whether it moves the needle on AI citations.

Where llms.txt came from

The specification was proposed in late 2024 by Jeremy Howard, co-founder of fast.ai and one of the researchers behind the original ULMFiT paper that shaped modern transfer learning in NLP. His reasoning was straightforward: AI systems consume web content, but the web wasn't built for them. Pages are full of navigation menus, cookie banners, footers, and boilerplate that adds noise when a language model is trying to understand what a site actually does.

LLMs.txt gives site owners a clean channel to communicate directly with AI systems — here's who we are, here's our best content, here's how to describe us accurately.

It has no formal standards body behind it. Its adoption depends entirely on AI platforms choosing to respect it, which is happening gradually and unevenly. But the logic is sound, the implementation is trivial, and the spec is already referenced by enough crawlers to be worth knowing.

The file format

An llms.txt file is a structured markdown document. The format has four elements:

  1. An H1 heading with your site or brand name
  2. A blockquote with a one-paragraph description of what you do
  3. One or more H2 sections organizing your content into categories
  4. Markdown links within each section, each with a brief description of the linked page

Here's what a well-formed llms.txt file looks like:

Example: llms.txt
# Archon Media

> Archon Media is a GEO (Generative Engine Optimization) agency that helps
> ecommerce brands, local businesses, and SaaS companies get cited and recommended
> by AI platforms including ChatGPT, Perplexity, Google AI Overviews, and Claude.

## Services

- [Free AI Visibility Audit](/): Baseline assessment of current AI citation presence
- [Entity Optimization](/): Clarifying brand signals so AI systems can confidently recommend you
- [Structured Data Implementation](/): Schema markup for Organization, Product, FAQPage, and more

## Resources

- [What Is GEO?](/resources/what-is-geo.html): Plain-English introduction to generative engine optimization
- [Ecommerce GEO](/resources/ecommerce-geo.html): 6 signals that drive ChatGPT product recommendations
- [LLMs.txt Explained](/resources/llms-txt.html): This article — format, platform support, and implementation

## Case Studies

- [Scott's Protein Balls](/case-studies/scotts-protein-balls.html): 1,682% organic traffic growth, 6,600+ backlinks, Target shelf placement
- [18 Chestnuts](/case-studies/18-chestnuts.html): Zero to 1,043 backlinks and Whole Foods nationwide distribution

The file should be served at exactly /llms.txt — not /llms/index.txt or any other path. Plain text, UTF-8, publicly accessible without authentication.
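
The four-element format above can be checked mechanically before you publish. The following is a minimal sketch, not a reference parser: the function names `parse_llms_txt` and `find_problems` are hypothetical, and the link pattern assumes the `- [Title](url): description` shape used in the example, which is stricter than what a real crawler may accept.

```python
import re

# Matches "- [Title](url): optional description" (assumed link shape).
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Parse llms.txt text into a title, a description, and H2 sections."""
    title = None
    description_lines = []
    sections = {}  # section name -> list of (title, url, note)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first H2 form the description.
            description_lines.append(line.lstrip(">").strip())
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    return {
        "title": title,
        "description": " ".join(description_lines),
        "sections": sections,
    }


def find_problems(parsed):
    """Return human-readable problems; an empty list means the file looks well formed."""
    problems = []
    if not parsed["title"]:
        problems.append("missing H1 title")
    if not parsed["description"]:
        problems.append("missing blockquote description")
    if not parsed["sections"]:
        problems.append("no H2 sections")
    for name, links in parsed["sections"].items():
        if not links:
            problems.append(f"section '{name}' has no links")
        for link_title, _url, note in links:
            if not note:
                problems.append(f"link '{link_title}' has no description")
    return problems


SAMPLE = """# Example Co

> Example Co sells widgets
> to small businesses.

## Docs

- [Guide](/guide.html): Setup walkthrough
- [Pricing](/pricing.html)
"""

print(find_problems(parse_llms_txt(SAMPLE)))
```

Run against the sample, the check flags the Pricing link because it has no description, which is the fourth element in the list above.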

The llms-full.txt variant

The spec also defines an optional companion file: llms-full.txt. Where llms.txt is an index, llms-full.txt is a dump — it contains the full text content of your most important pages concatenated into a single document.

The purpose is efficiency: instead of an AI crawler following every link in your llms.txt, visiting each page, and parsing the HTML, it can read the entire content of your site in one request. This matters for RAG (retrieval-augmented generation) pipelines that index your content on a schedule.

When to create llms-full.txt

For most sites, llms.txt alone is sufficient to start. Consider adding llms-full.txt if your site has a lot of valuable long-form content that might otherwise be missed, or if you're trying to get indexed by RAG-based systems. For small to medium sites, it's optional for now.
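
If you do decide to build llms-full.txt, the concatenation step might look something like the sketch below. The layout is an assumption made for illustration (one H2 per page followed by a `Source:` line); the article does not prescribe a layout, and `build_llms_full` is a hypothetical helper. It also assumes you have already extracted each page's text without navigation or boilerplate.

```python
def build_llms_full(site_name, pages):
    """Concatenate page texts into one llms-full.txt style document.

    `pages` is a list of (title, url, body) tuples. Each body is assumed to be
    plain text or markdown with navigation and boilerplate already removed.
    """
    parts = [f"# {site_name}"]
    for title, url, body in pages:
        # One H2 per page, with the source URL kept next to the content.
        parts.append(f"## {title}\n\nSource: {url}\n\n{body.strip()}")
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    document = build_llms_full(
        "Example Co",
        [
            ("Guide", "/guide.html", "Install the widget."),
            ("Pricing", "/pricing.html", "Plans start free."),
        ],
    )
    print(document)
```

Keeping the source URL beside each page's text means a system that ingests the file can still attribute a passage to the page it came from.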

Platform support: who actually reads it

This is the part most people want to know, and the honest answer is: adoption is patchy. Here's what we know as of April 2026:

| Platform | Status | Notes |
| --- | --- | --- |
| Perplexity | Confirmed | Perplexity's crawler respects llms.txt and has publicly acknowledged it. Most likely to benefit from implementation today. |
| ChatGPT / OpenAI | Unclear | No official statement. GPTBot crawls the web broadly; unclear if it specifically looks for or prioritizes llms.txt. |
| Claude / Anthropic | Unclear | ClaudeBot crawls for training data. No documented llms.txt support as of this writing. |
| Google AI Overviews | Unclear | Google uses its own crawler infrastructure. No official word on llms.txt. Structured data (schema.org) remains the primary technical signal for Google AI systems. |
| Grok / xAI | Unclear | No documented position on llms.txt. |
| RAG pipelines / custom AI tools | Growing | Many developer-built RAG systems and AI agents explicitly check for llms.txt as an efficient way to index site content. This is where near-term value is highest. |

The picture is uneven. Perplexity is the clearest win. For the major consumer platforms, the direct evidence of llms.txt influencing citation rates is thin — but the indirect logic holds: making your content easier for AI crawlers to parse is unlikely to hurt, and the implementation cost is low enough that waiting for definitive proof isn't rational.
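
Because public statements are sparse, your own server logs are one way to see which crawlers request the file on your site. The sketch below assumes access logs in the common Combined Log Format and matches on user-agent substrings. GPTBot and ClaudeBot are the crawler names mentioned above; treating `PerplexityBot` as Perplexity's token is an assumption to verify against your own logs, and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Combined Log Format: ... "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Substrings to look for in the User-Agent header (assumed tokens).
AI_AGENT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def count_llms_txt_hits(log_lines):
    """Count requests for /llms.txt and /llms-full.txt, grouped by crawler token."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match:
            continue
        path = match["path"].split("?", 1)[0]  # ignore any query string
        if path not in ("/llms.txt", "/llms-full.txt"):
            continue
        agent = match["agent"]
        label = next((t for t in AI_AGENT_TOKENS if t in agent), "other")
        hits[label] += 1
    return hits


# Illustrative, made-up log lines using documentation IP ranges.
SAMPLE_LOG = [
    '203.0.113.5 - - [02/Apr/2026:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [02/Apr/2026:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '192.0.2.9 - - [02/Apr/2026:10:02:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "curl/8.0"',
]

if __name__ == "__main__":
    print(count_llms_txt_hits(SAMPLE_LOG))
```

A request for the file shows only that a crawler fetched it, not that the content influenced any answer, so treat the counts as a weak signal.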

LLMs.txt vs robots.txt: not the same thing

These files are often confused, so let's be precise.

robots.txt is a gating mechanism. It tells crawlers which pages they are permitted or forbidden from accessing. Most AI crawlers respect it. If you want to block a crawler from a page, robots.txt is the right tool.

llms.txt is a signposting mechanism. It doesn't restrict access to anything — it guides AI systems toward your most valuable content and explains how to interpret your site. A crawler that ignores llms.txt can still access everything robots.txt permits. They operate at different layers.

In practice, you need both. robots.txt is non-negotiable for controlling access. llms.txt is the additional layer that helps AI systems understand what they're looking at once they have access.
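
One practical consequence of the two layers: a page you signpost in llms.txt is of little use to a crawler that robots.txt blocks from it. The sketch below uses Python's standard `urllib.robotparser` to find such conflicts. The helper name `blocked_llms_links`, the robots.txt rules, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocked_llms_links(robots_txt, link_urls, user_agent):
    """Return the llms.txt link URLs that robots.txt forbids for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in link_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt: GPTBot is kept out of /private/, everyone else is allowed.
ROBOTS = """User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

# Hypothetical URLs taken from an llms.txt file.
LINKS = [
    "https://example.com/guide.html",
    "https://example.com/private/report.html",
]

if __name__ == "__main__":
    print(blocked_llms_links(ROBOTS, LINKS, "GPTBot"))
```

Any URL the function returns is listed in llms.txt but gated for that crawler, so either the robots.txt rule or the llms.txt entry probably needs to change.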

How to implement llms.txt in 10 minutes

  1. Create the file. Open any text editor and create a plain-text file named llms.txt. Start with the format above: H1 brand name, blockquote description, H2 sections with links.
  2. Write a tight description. The blockquote is the most important sentence in the file — it's how AI systems will summarize what you do. Be specific about your category, your audience, and your differentiator. Avoid marketing language. Write the way a Wikipedia editor would describe your business.
  3. List your most important pages. Don't list everything. Prioritize: your best resource pages, your product or service pages, case studies, and your homepage. 10–25 links is a reasonable range. Each link needs a short description that tells the AI what the page contains.
  4. Place it at the root. Upload or deploy the file so it's accessible at yourdomain.com/llms.txt. On Netlify, placing it in the root of your project directory is sufficient — no configuration needed.
  5. Verify it. Visit yourdomain.com/llms.txt in your browser and confirm it loads as plain text. Check that the markdown formatting is clean.
  6. Create llms-full.txt (optional). If you want to go further, create a second file at /llms-full.txt that concatenates the full text content of your key pages. This is most valuable for documentation-heavy or content-rich sites.
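
If you would rather generate the file than write it by hand, steps 1 to 3 and the check in step 5 can be sketched as below. Both helper names are hypothetical. Treating `text/plain` as the only acceptable Content-Type is an assumption drawn from the "loads as plain text" check, not a rule stated by any platform.

```python
def render_llms_txt(name, description, sections):
    """Render llms.txt text from a brand name, a description, and sections.

    `sections` maps an H2 heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {description}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"


def served_correctly(status, content_type):
    """Step 5 check: HTTP 200 with a text/plain Content-Type (assumed criterion)."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return status == 200 and media_type == "text/plain"


if __name__ == "__main__":
    text = render_llms_txt(
        "Example Co",
        "Example Co sells widgets.",
        {"Docs": [("Guide", "/guide.html", "Setup walkthrough")]},
    )
    # Write as UTF-8, then deploy the file to the site root.
    with open("llms.txt", "w", encoding="utf-8") as handle:
        handle.write(text)
    print(served_correctly(200, "text/plain; charset=utf-8"))
```

After deploying, you can pass the status code and Content-Type header from any HTTP client into `served_correctly` to confirm the file is being served as described in step 5.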

Does it actually move the needle?

Honestly: the direct evidence is limited. There are no published controlled experiments that show llms.txt in isolation causing a measurable increase in AI citation rate for a given domain. The signal is real for Perplexity; for other major platforms, the mechanism isn't fully documented.

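What we can say with confidence is narrower: Perplexity reads the file, developer-built RAG pipelines and AI agents increasingly check for it, and implementation takes minutes with no downside.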

Our recommendation

Add llms.txt to your site. It's not a magic lever, and anyone claiming definitive ROI data is overstating what's known. But it's fast, it's free, it aligns with where AI infrastructure is going, and the brands that establish good AI-crawlability early will have an advantage as the ecosystem matures. This is table stakes, not a strategy.

LLMs.txt vs structured data: which matters more

For most brands, the honest answer is: structured data (schema.org markup) still matters more — particularly for Google AI Overviews, which is the platform with the widest consumer reach.

Schema markup gives AI systems machine-readable facts embedded directly in your HTML: your business type, your products, your reviews, your FAQs, your operating hours. This markup is read by the major search crawlers and is available to any AI answer system that retrieves from the live web.

LLMs.txt is a newer and narrower signal. Its primary confirmed value is with Perplexity and the growing ecosystem of developer-built AI tools. For Google and ChatGPT, schema remains the primary technical lever.

The practical conclusion: implement both. Schema markup is the foundation. LLMs.txt is the additional layer that costs nothing to add once the foundation is in place.
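
For reference, the schema side of that pairing can be as small as an Organization block in JSON-LD. The sketch below builds one with Python's `json` module. The helper name and the example.com values are hypothetical, and a real implementation would normally add more properties (logo, sameAs, contact details) than the three shown.

```python
import json


def organization_jsonld(name, url, description):
    """Return a <script> tag holding minimal schema.org Organization JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
    }
    # Escape "</" so a value can never close the script tag early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    print(
        organization_jsonld(
            "Example Co",
            "https://example.com",
            "Example Co sells widgets to small businesses.",
        )
    )
```

The description here can reuse the blockquote from your llms.txt, which keeps the two signals consistent with each other.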

Frequently asked questions

Is llms.txt an official standard?
No. It was proposed by Jeremy Howard (fast.ai) in 2024 as a voluntary convention, similar to how robots.txt started. It has no formal standards body backing it. Whether it becomes a true standard depends on adoption by AI platforms — which is currently limited to a handful of crawlers.
Which AI platforms actually read llms.txt?
Perplexity is the most openly documented supporter. Some lesser-known AI crawlers and RAG pipelines also reference it. ChatGPT, Claude, and Google AI Overviews have not made official statements about llms.txt crawling as of April 2026.
Does llms.txt improve my AI citation rate?
There is no controlled evidence yet that llms.txt directly causes citation rate improvement. Its value is primarily indirect: it helps AI crawlers efficiently index your best content and understand your site structure, which may improve how accurately AI platforms represent you. It's a 10-minute implementation with no downside.
What is the difference between llms.txt and llms-full.txt?
llms.txt is a structured index — it lists your key pages with brief descriptions so AI crawlers know where to go. llms-full.txt is an optional companion file that includes the full text content of those pages in one place, so AI systems can ingest it without crawling every URL individually. For most sites, llms.txt alone is sufficient to start.
Is llms.txt the same as robots.txt?
No. robots.txt blocks or permits crawlers from accessing pages. llms.txt doesn't control access — it guides AI systems toward your most useful content and explains what your site is about. Think of robots.txt as a gate and llms.txt as a signpost. You need both.
Should I create llms.txt even if most AI platforms don't use it yet?
Yes, for two reasons. First, adoption is growing and adding it now means you're already set up when more platforms support it. Second, it forces you to articulate what your site is about in structured, machine-readable terms — which is a useful GEO exercise in itself.