Should you write documentation differently for LLMs?

Posted on Jan 24, 2025

Writing for LLMs is the new SEO obsession. Not a day passes without some question popping up in tech writing communities about how best to compose content for AI scrapers. Folks even wonder whether a different style guide is necessary, or whether tables should be avoided because they’ve seen a pull request rejected in a Microsoft repository on the silliest grounds.

With AI search engines like Perplexity becoming more popular by the day, content creators are asking reasonable questions. Will AIs be able to read our documentation properly? Will they misrepresent our product? Worse, could they hallucinate if we don’t restructure our documentation? AI is a new, unfathomable reader, so of course writers feel anxious about it.

Write for humans, let the AIs follow

In the case of AIs or LLMs, writing for humans usually does the trick. It’s a philosophical stance: if the creators of AIs aim to create a human-like intelligence, or even a superhuman form of artificial cognition, trying to appease their algorithms would stunt their growth. General artificial intelligence should be able to process human-made materials much in the same way humans do.

My answer is the same one I gave in 2015, during the SEO-craze years: no, you don’t have to worry about SEO too much beyond minor technical facilitators such as semantic tagging and page speed. The best way to position yourself in a search engine is still to create fresh, original, high-quality content. You are, therefore you’re positioned. You write well, therefore you’re found.

The authors’ mistake isn’t thinking that Google penalizes them or that they can do something about it; their fundamental error is thinking about Google at all. The moment a book modifies itself to slip along the virtual shelf, librarian Google erases it.

Everything that a user is supposed to see, test, try, know, read, wonder about, or ask, is something that an AI is supposed to be concerned about. That includes tables, diagrams, figures, videos, code snippets, and all forms of technical communication. Altering content to supposedly satisfy the way an LLM thinks not only does a disservice to human readers, it also poisons the AI well.

Strive for high quality, then package your docs for AIs

As I wrote in my 2025 predictions on tech writing, the advent of docs-as-data will be precipitated by LLMs. They’re the new channel in town, and they benefit from prepackaged, easy-to-download content enriched with additional metadata taken from your CMS database or the frontmatter attributes of your docs. We’ll see more initiatives and standards like llms.txt becoming popular.
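As a sketch of what such prepackaged content can look like, here’s a minimal llms.txt file for a fictional product. The product name and URLs are invented; the structure follows the proposal’s convention of a title, a summary blockquote, and sections of annotated links to Markdown versions of the docs:

```markdown
# AcmeDocs

> AcmeDocs is a fictional documentation site. This file points LLM
> crawlers to Markdown versions of the docs, already enriched with
> metadata from the CMS.

## Docs

- [Quickstart](https://docs.example.com/quickstart.md): installation and first steps
- [API reference](https://docs.example.com/api.md): endpoints and parameters

## Optional

- [Changelog](https://docs.example.com/changelog.md): release history
```

The file does nothing to the human-facing docs; it’s pure packaging for the new channel.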

Providing a better channel for AI consumption is not the same as writing for AIs. You must keep writing for humans, because humans are the ones who’ll buy your goods or hire you; the difference is that this relationship might now be mediated by LLMs, which act as a channel. Much in the same way you present content differently for mobile or print, you need to think about how to present content to AI agents.

In this sense, writing FAQs to benefit LLMs is wrong because it caricatures AI agents as chatbots. If anything, the Q&A format might prove beneficial at times because it follows a predictable pattern. A better way of providing FAQ-like content to LLMs is to turn questions and answers into knowledge base articles, each peppered with metadata and living within a semantically sound structure.
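To make that concrete, here’s a hypothetical knowledge base article converted from a FAQ entry, with frontmatter metadata that an LLM ingestion pipeline could pick up; all field names and values are illustrative, not part of any standard:

```yaml
---
title: Why does the export job time out?
doc_type: troubleshooting   # hypothetical taxonomy
product: AcmeDocs           # invented product name
audience: administrators
keywords: [export, timeout, batch-size]
last_reviewed: 2025-01-10
---
```

The answer itself then follows as the article body, written for humans as usual; the metadata supplies the context that a human reader would get from the page’s surroundings.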

An opportunity to improve semantics and accessibility

As readers of your docs, LLMs have special needs. They don’t see things the same way humans do (or not yet), nor do they care about layout and typography. Their short-term memory is limited, and they have serious problems seeing the big picture. The best way to provide content to LLMs is to think in terms of accessibility: expose metadata and label things carefully.

There’s a certain irony in improving accessibility by bowing to the needs of an artificial being, but this might be the best opportunity yet to get resources for a11y. It’s also the best opportunity for restoring highly semantic and structured content formats to their former glory. As Michael Iantosca says in his defence of DITA as the vehicle for better LLM retrieval:

Sooner or later, developers are going to realize that the quality and utility of generative AI, especially for help and user assistance, rely as much or more on the quality and intelligence of the content as the model and its implementation.

While I don’t necessarily agree on DITA being the best vehicle for structured and semantic content these days, I do agree that there’s an opportunity in implementing better structured content for documentation, as it would serve both a multichannel content strategy and the needs of docs-as-data for AI consumption. We still need a better demo for this to become popular, though.

Documentation must serve everyone, organic or artificial

The emergence of LLMs as content consumers doesn’t fundamentally change our mission as technical writers. Just as we learned from the SEO era, our focus must remain on writing clear, high-quality content for humans. What changes is how we deliver this content: LLMs represent a new channel with specific accessibility needs, much like mobile users or screen readers.

By implementing proper semantic structure and metadata – the kind of facilitators that made SEO worthwhile – we can serve both human and AI readers without compromising our documentation’s primary purpose. The key, as always, isn’t to write differently, but to package thoughtfully.