The llms.txt specification is the most discussed technical GEO signal right now. We cover the format, test platform support across five AI systems, and give you an honest verdict on whether it belongs in your stack.
LLMs.txt is a plain-text file at your domain root that tells AI crawlers what your site contains and which pages matter most. It's confirmed active on Perplexity. Adoption by ChatGPT, Claude, and Google remains unclear as of early 2026. Worth implementing: it takes 20 minutes at most and the downside is negligible.
LLMs.txt is a plain-text markdown file you place at the root of your website — at yourdomain.com/llms.txt — that tells AI systems what your site contains, which pages are most important, and how to accurately describe what you do. It's a voluntary convention, not a web standard, but it's gaining traction among the crawlers that power AI-generated answers.
Think of it as the AI-native version of robots.txt — except instead of telling crawlers what to avoid, it tells them what to prioritize and how to understand it.
This piece covers where it came from, how the format works, which platforms actually read it, how to implement it in under 10 minutes, and what the honest evidence says about whether it moves the needle on AI citations.
The specification was proposed in late 2024 by Jeremy Howard, co-founder of fast.ai and one of the researchers behind the original ULMFiT paper that shaped modern transfer learning in NLP. His reasoning was straightforward: AI systems consume web content, but the web wasn't built for them. Pages are full of navigation menus, cookie banners, footers, and boilerplate that adds noise when a language model is trying to understand what a site actually does.
LLMs.txt gives site owners a clean channel to communicate directly with AI systems — here's who we are, here's our best content, here's how to describe us accurately.
It has no formal standards body behind it. Its adoption depends entirely on AI platforms choosing to respect it, which is happening gradually and unevenly. But the logic is sound, the implementation is trivial, and the spec is already referenced by enough crawlers to be worth knowing.
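An llms.txt file is a structured markdown document. The format has four elements: an H1 with your site or brand name, a blockquote that summarizes what you do, H2 headings that group your content into sections, and under each heading a list of markdown links, each with a short description.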
Here's what a well-formed llms.txt file looks like:
# Archon Media

> Archon Media is a GEO (Generative Engine Optimization) agency that helps
> ecommerce brands, local businesses, and SaaS companies get cited and recommended
> by AI platforms including ChatGPT, Perplexity, Google AI Overviews, and Claude.

## Services

- [Free AI Visibility Audit](/): Baseline assessment of current AI citation presence
- [Entity Optimization](/): Clarifying brand signals so AI systems can confidently recommend you
- [Structured Data Implementation](/): Schema markup for Organization, Product, FAQPage, and more

## Resources

- [What Is GEO?](/resources/what-is-geo.html): Plain-English introduction to generative engine optimization
- [Ecommerce GEO](/resources/ecommerce-geo.html): 6 signals that drive ChatGPT product recommendations
- [LLMs.txt Explained](/resources/llms-txt.html): This article: format, platform support, and implementation

## Case Studies

- [Scott's Protein Balls](/case-studies/scotts-protein-balls.html): 1,682% organic traffic growth, 6,600+ backlinks, Target shelf placement
- [18 Chestnuts](/case-studies/18-chestnuts.html): Zero to 1,043 backlinks and Whole Foods nationwide distribution
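To make the structure concrete, here is a minimal Python sketch that reads a file laid out like the example and reports obvious gaps. It is an illustration, not an official parser: the function names `parse_llms_txt` and `find_problems` are our own, and the link pattern assumes the `- [Title](url): description` shape used in the example.

```python
import re

# Assumed link shape: "- [Title](url): optional description"
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Split llms.txt text into a title, a summary, and H2 sections of links."""
    title = None
    summary_lines = []
    sections = {}
    current = None  # name of the H2 section we are inside, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first section form the summary.
            summary_lines.append(line.lstrip("> ").strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(match.groupdict())
    return {
        "title": title,
        "summary": " ".join(summary_lines),
        "sections": sections,
    }


def find_problems(parsed):
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if not parsed["title"]:
        problems.append("missing H1 title")
    if not parsed["summary"]:
        problems.append("missing blockquote summary")
    if not parsed["sections"]:
        problems.append("no H2 sections")
    for name, links in parsed["sections"].items():
        if not links:
            problems.append(f"section {name!r} has no links")
    return problems
```

Running `find_problems(parse_llms_txt(open("llms.txt", encoding="utf-8").read()))` before you publish gives a quick sanity check; it only inspects structure and says nothing about how any AI platform will treat the file.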
The file should be served at exactly /llms.txt — not /llms/index.txt or any other path. Plain text, UTF-8, publicly accessible without authentication.
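The serving requirements above can be checked with a short script. This is a sketch under our own assumptions: `check_response` and `fetch_and_check` are hypothetical helper names, and treating anything other than a `text/plain` content type as a problem is our reading of "plain text", not a rule from any platform.

```python
from urllib.error import HTTPError
from urllib.parse import urlparse
from urllib.request import urlopen


def check_response(url, status, content_type, body):
    """Compare one HTTP response against the serving requirements."""
    problems = []
    if urlparse(url).path != "/llms.txt":
        problems.append("path is not exactly /llms.txt")
    if status != 200:
        problems.append(f"expected HTTP 200 without authentication, got {status}")
    if not content_type.lower().startswith("text/plain"):
        problems.append(f"expected text/plain, got {content_type!r}")
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("body is not valid UTF-8")
    return problems


def fetch_and_check(url):
    """Fetch the URL anonymously and run the checks on whatever comes back."""
    try:
        with urlopen(url, timeout=10) as resp:
            content_type = resp.headers.get("Content-Type", "")
            return check_response(url, resp.status, content_type, resp.read())
    except HTTPError as err:
        # 401, 403, 404 and similar arrive here; report them as problems.
        content_type = err.headers.get("Content-Type", "") if err.headers else ""
        return check_response(url, err.code, content_type, b"")
```

For example, `fetch_and_check("https://yourdomain.com/llms.txt")` returns an empty list when the path, status, content type, and encoding all look right.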
There is also an optional companion file in common use: llms-full.txt. Where llms.txt is an index, llms-full.txt is a dump: it contains the full text content of your most important pages concatenated into a single document.
The purpose is efficiency: instead of following every link in your llms.txt, visiting each page, and parsing the HTML, an AI crawler can read your key content in one request. This matters for RAG (retrieval-augmented generation) pipelines that index your content on a schedule.
For most sites, llms.txt alone is sufficient to start. Consider adding llms-full.txt if your site has a lot of valuable long-form content that might otherwise be missed, or if you're trying to get indexed by RAG-based systems. For small to medium sites, it's optional for now.
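If you do decide to add llms-full.txt, generating it can be as simple as concatenating page text you already have. The sketch below assumes you have extracted the clean text of each page yourself; the function name `build_llms_full` and the heading-plus-source layout are our own choices, since we are not aware of a required layout for this file.

```python
def build_llms_full(site_name, pages):
    """Concatenate (title, url, text) tuples into one markdown document.

    The layout (H1 site name, one H2 per page, a Source line, then the
    page text) is an illustrative assumption, not a mandated format.
    """
    parts = [f"# {site_name}"]
    for title, url, text in pages:
        parts.append(f"## {title}\nSource: {url}\n\n{text.strip()}")
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    # Hypothetical pages; replace with text extracted from your own site.
    pages = [
        ("Guide", "https://example.com/guide.html", "Step one."),
        ("FAQ", "https://example.com/faq.html", "Q and A."),
    ]
    with open("llms-full.txt", "w", encoding="utf-8") as handle:
        handle.write(build_llms_full("Example Co", pages))
```

Regenerating the file as part of your build or deploy step keeps it in sync with the pages it summarizes.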
This is the part most people want to know, and the honest answer is: adoption is patchy. Here's what we know as of April 2026:
The picture is uneven. Perplexity is the clearest win. For the major consumer platforms, the direct evidence of llms.txt influencing citation rates is thin — but the indirect logic holds: making your content easier for AI crawlers to parse is unlikely to hurt, and the implementation cost is low enough that waiting for definitive proof isn't rational.
These files are often confused, so let's be precise.
robots.txt is a gating mechanism. It tells crawlers which pages they may and may not access. Most AI crawlers respect it. If you want to block a crawler from a page, robots.txt is the right tool.
llms.txt is a signposting mechanism. It doesn't restrict access to anything — it guides AI systems toward your most valuable content and explains how to interpret your site. A crawler that ignores llms.txt can still access everything robots.txt permits. They operate at different layers.
In practice, you need both. robots.txt is non-negotiable for controlling access. llms.txt is the additional layer that helps AI systems understand what they're looking at once they have access.
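One practical consequence of the two layers is that a page you highlight in llms.txt can still be blocked by robots.txt. The sketch below uses Python's standard `urllib.robotparser` to flag that conflict; the helper name `split_by_robots`, the `ExampleBot` user agent, and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def split_by_robots(robots_txt, user_agent, urls):
    """Split URLs into those robots.txt allows for user_agent and those it blocks."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = [url for url in urls if parser.can_fetch(user_agent, url)]
    blocked = [url for url in urls if url not in allowed]
    return allowed, blocked


if __name__ == "__main__":
    robots_txt = "User-agent: *\nDisallow: /private/\n"
    # URLs you might list in llms.txt (hypothetical).
    listed = [
        "https://example.com/resources/guide.html",
        "https://example.com/private/notes.html",
    ]
    allowed, blocked = split_by_robots(robots_txt, "ExampleBot", listed)
    print("reachable:", allowed)
    print("listed in llms.txt but blocked by robots.txt:", blocked)
```

Anything that lands in the blocked list is signposted but gated, so a crawler that honors robots.txt will not fetch it no matter what llms.txt says.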
1. Create a file named llms.txt. Open any text editor. Start with the format above: H1 brand name, blockquote description, H2 sections with links.
2. Upload it so it is served at yourdomain.com/llms.txt. On Netlify, placing it in the root of your project directory is sufficient; no configuration is needed.
3. Open yourdomain.com/llms.txt in your browser and confirm it loads as plain text. Check that the markdown formatting is clean.
4. Optionally, add a /llms-full.txt that concatenates the full text content of your key pages. This is most valuable for documentation-heavy or content-rich sites.

Honestly: the direct evidence is limited. There are no published controlled experiments that show llms.txt in isolation causing a measurable increase in AI citation rate for a given domain. The signal is real for Perplexity; for other major platforms, the mechanism isn't fully documented.
What we can say with confidence:
Add llms.txt to your site. It's not a magic lever, and anyone claiming definitive ROI data is overstating what's known. But it's fast, it's free, it aligns with where AI infrastructure is going, and the brands that establish good AI-crawlability early will have an advantage as the ecosystem matures. This is table stakes, not a strategy.
For most brands, the honest answer is: structured data (schema.org markup) still matters more — particularly for Google AI Overviews, which is the platform with the widest consumer reach.
Schema markup gives AI systems machine-readable facts embedded directly in your HTML: your business type, your products, your reviews, your FAQs, your operating hours. The major search crawlers read it, and AI answer systems that retrieve from the live web can draw on it.
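As a rough illustration of what "machine-readable facts embedded in your HTML" means, here is a small Python sketch that renders a schema.org Organization record as a JSON-LD script tag. The helper name `organization_jsonld` and the sample values are hypothetical, and a real page would usually carry more properties than these three.

```python
import json


def organization_jsonld(name, url, description):
    """Render a minimal schema.org Organization record as a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
    }
    payload = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Hypothetical values; use your own brand details.
    print(organization_jsonld(
        "Example Co",
        "https://example.com",
        "Example Co sells widgets.",
    ))
```

The printed tag goes in the page's HTML, typically inside the head element. This sketch does not escape a literal `</script>` inside the values, so keep the inputs to plain descriptive text.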
LLMs.txt is a newer and narrower signal. Its primary confirmed value is with Perplexity and the growing ecosystem of developer-built AI tools. For Google and ChatGPT, schema remains the primary technical lever.
The practical conclusion: implement both. Schema markup is the foundation. LLMs.txt is the additional layer that costs nothing to add once the foundation is in place.