Episode VII of Phase One: Skills, agentic workflows, and the tinker's burden
The seventh episode of the AI & Docs podcast series is up! In this one, Tom, Larah Vasquez, and I talk about local LLMs and the shift away from API dependence, why AI output is bad by default, skills as a token-efficient alternative to stuffing everything into MCP servers, the memory problem in LLMs, and whether tech writers are quietly architecting themselves into a higher role, or out of one.
You can watch or listen to the episode here:
Some of the things I said:
On local models and the wine cellar of LLMs
“It’s almost like wines, you know, like a fine wine. You have to be like a connoisseur of, oh, this is quantized. Okay. Or this is, you know, 40 billion parameters, no, something lighter, please. And yeah, you have like this bodega with the vintage.”
On AI tooling strategy and choosing the right model
“At work, I’m thinking about an AI tooling strategy. So we are thinking about the tiers, the kind of tools you’re going to use and for what. And the model of course is part of that calculation. Like, it’s overkill to use something like Opus for something like typos, for example, and it’s also slower.”
On why AI needs human signal to produce anything good
“What I’ve noticed is that as long as you provide some high quality signal that comes from a human being, and that can be a Vale rule or can be some pre-existing art like documents following the same content type — maybe it’s templates, it’s a style guide, it’s a skill, whatever, but done by humans, which means signal doesn’t mean noise — then the output, the quality of the output improves dramatically.”
“That to me signals the need of a human in the loop at all times. And not just as a reviewer or approver, but even as an initiator or as someone who keeps feeding the boiler with some coal or whatever.”
On the memory problem and model personality
“LLMs cannot learn unless they are retrained. But would you like to have them learn things for good? I would, I know I would.”
“I see people saying, I’m getting much better results now with GPT-5. But the personality, I don’t really like the personality. It’s like, gosh, now we have to deal with an entirely different artificial being with a different way of outputting stuff.”
“We don’t want too much personality. We don’t want too much memory. And I wonder if that’s because we don’t really want them to be that human or to resemble humans. There’s probably something there.”
On thin skills calling the MCP server
“We even had some internal tests where it turned out that skills that were linking to the documentation, but were light in terms of tokens, performed better, or at worst the same, as skills with everything just dumped into the skill itself — very long skills with all the documentation.”
“They are even easier to keep up to date because it’s the docs that we keep up to date. So the skill just has to retrieve them.”
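The thin-skill pattern described above can be pictured as a skill file that points at the docs instead of containing them. A minimal sketch, assuming a SKILL.md-style format with YAML frontmatter; the skill name, the tool name, and the query are all hypothetical, not our actual setup:

```markdown
---
name: release-notes
description: Drafts release notes following our docs content type.
---

Do not rely on any guidelines baked into this file. Before drafting:

1. Call the docs MCP server's search tool with a query like
   "release notes content type".
2. Follow the template and style rules in the retrieved pages.
```

Because the skill only carries the retrieval instruction, updating the docs automatically updates what the skill produces.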
On the meta-skill trick
“There’s something there that helped the Docs team because what we put into that skill creator skill is that any skill you create, if it refers to documentation, it has to call our MCP server. Whenever you want to retrieve documentation, when you create a skill based on things we do, first search the docs.”
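The meta-skill trick amounts to a standing rule inside the skill-creator skill. A hedged sketch of what that rule might look like (hypothetical wording, not the actual internal skill):

```markdown
When generating a new skill:

- If the skill references product behavior or documentation,
  do NOT paste docs content into the skill body.
- Instead, have the skill call the docs MCP server's search
  tool first and work from the retrieved pages.
```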
On the MCP server as knowledge layer
“The MCP server is really one of the best things we have done. I know that some folks are already saying that MCP is dead. I think MCP are just like APIs for LLMs. And having the ability of performing semantic searches in our docs with the MCP server, it has become like the knowledge layer that everything else, every other AI tool is using.”
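The “knowledge layer” idea is easy to picture in code: a docs tool on an MCP server is essentially a function that takes a query and returns the most relevant snippet. A toy sketch, assuming nothing about the real server — production setups would use vector embeddings for the semantic search, while plain token overlap stands in here, and every name (`search_docs`, `DOCS`) is hypothetical:

```python
# Toy stand-in for an MCP docs search tool.
# Real "semantic" search would embed query and docs as vectors;
# simple token overlap illustrates the retrieval idea.
DOCS = {
    "install": "How to install the CLI and authenticate.",
    "skills": "How to write a skill that retrieves documentation.",
    "mcp": "How the MCP server exposes semantic search over the docs.",
}

def search_docs(query: str) -> str:
    """Return the doc snippet whose words best overlap the query."""
    q = set(query.lower().split())

    def score(item: tuple[str, str]) -> int:
        _, text = item
        return len(q & set(text.lower().split()))

    _, best = max(DOCS.items(), key=score)
    return best
```

Every other AI tool can then treat this one function as its source of truth, which is what makes the server a shared knowledge layer rather than just another integration.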
On doing more, not doing less
“There are two ways of looking at this. You either say, well, with skills, we can do the same work with less people, or more wisely, you can say we can do more work with the same people. And we don’t have to reduce the team.”
“Documentation is a never ending effort and we all have a huge backlog always. If we can have skills or agent workflows help us deal with the chores, with the business as usual stuff — that’s better. If people want to reduce teams because they want to buy more GPUs for whatever project, it’s up to them, but then they shouldn’t be surprised if they see the docs getting worse.”
On the tinkerer profile
“Larah and I are the tinkerer profile and we’ve always been busy. It was AI, it’s AI now, but before it was static site generators, it was scripts, it was linters. The tool type of writer is always going to be busy just with a different kind of tool.”
On where skills are heading
“I think skills are just the beginning. And my prediction is the direction is packaging — skills will become part of something bigger where you have a personality, you have long-term memory, you have the whole complete package of a full agent.”
“Right now I feel everything is too molecular. I think the direction is probably going towards a user experience and nobody really has thought about user experience of these things.”
On Simon Willison and calling things by their name
“We all follow Simon Willison and his Pelican experiments. I’m just a little mad at him because one day he said something about readme powered something, like skills or whatever. And I was like, it’s documentation. It’s not a readme.”
On reminding influencers what they’re actually talking about
“I like being there and sometimes I try to interact with those folks, the influencers, because sometimes they kind of overlook the fact that what they’re talking about is documentation. It’s nice if we are there and we interact and remind folks about what we do. I’m a very vocal guy, but I think everybody can throw their opinions and just say, well, what you’re talking about is docs, actually.”
On the operator mattering as much as the skill
“The only thing that everybody needs to remember is that even if the skill is good, who operates the skill is equally important.”