Code linters support developers by catching errors and stylistic issues in code, such as bad formatting or keywords in the wrong places. The term comes from the lint traps in clothes dryers, which capture the tiny bits of fiber that come off fabric.
Below is an example of a Python language linter at work, capturing a constant name that does not conform to the PEP8 style guide; this issue would not prevent the script from running, but it would make the code harder to read and to maintain over time.
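To make that kind of check concrete, here is a toy sketch in Python. It is not how any real linter is implemented: the function name `find_bad_constant_names` and the sample source are made up for illustration, and the rule is simplified to "top-level names should be upper case".

```python
import ast
import re

# PEP 8 asks for constants written in UPPER_CASE_WITH_UNDERSCORES.
CONSTANT_NAME = re.compile(r"^[A-Z_][A-Z0-9_]*$")


def find_bad_constant_names(source):
    """Return (line, name) pairs for top-level assignments that are not upper case."""
    issues = []
    for node in ast.parse(source).body:  # top-level statements only
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and not CONSTANT_NAME.match(target.id):
                    issues.append((node.lineno, target.id))
    return issues


# Hypothetical two-line script: the first name breaks the convention.
SAMPLE = "max_retries = 3\nTIMEOUT_SECONDS = 30\n"

for line, name in find_bad_constant_names(SAMPLE):
    print(f"line {line}: constant name '{name}' is not UPPER_CASE")
# prints: line 1: constant name 'max_retries' is not UPPER_CASE
```

The script being checked would still run fine; the linter only flags that the name drifts from the convention.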
What about natural language? After all, most programming languages can be described as structured subsets of English, with their grammar and stylistic conventions. Couldn’t we apply the same idea to news articles and technical documentation? The answer is yes.
After a handful of decades, linting software has finally made the jump to the world of prose, in no small part owing to the emergence of docs-as-code and the advances in natural language processing. Popular prose linters include proselint, textlint, and Vale.
The goal of a prose linter is quite different from that of the grammar checkers you are used to invoking in Microsoft Word or Google Drive; as the authors of proselint put it, a linter for prose does not focus on grammar at all. Rather, it catches all that makes prose worse.
[…] we consider usage: redundancy, jargon, illogic, clichés, sexism, misspelling, inconsistency, misuse of symbols, malapropisms, oxymorons, security gaffes, hedging, apologizing, pretension […]
Many of the above are the stuff of documentation style guides, those hefty rulebooks that tech writers love to hate, and that technical contributors simply ignore for the most part. Getting writers to follow a style guide—any style guide—can be quite the struggle.
The benefits of faithfully adhering to a style guide, though, cannot be overstated: a consistent writing style also improves the reading experience. Hence the importance of linters and their ability to capture issues with more focus than any copy editor.
Vale is one of the most popular prose linters in circulation. Think of it as a robotic reconnaissance team that scouts documentation for stylistic issues before they get to anybody’s eyes. This sort of computer-assisted editing can save you tons of time.
When executed in a folder that has a `.vale.ini` file, Vale checks all files that match the local configuration against the styles stored in the `StylesPath`. Upon detecting issues, Vale outputs warnings, errors, and suggestions to the console, like this:
Getting started takes around ten minutes:

1. Install `vale` on your machine using a package manager (it’s the fastest way).
2. Create a `styles` directory in the root of your repo and copy any of the available styles into subdirectories (for example, `Google` and `Microsoft`). The resulting folder structure would look like this:

        ├───styles
        │   ├───Google
        │   └───Microsoft
        ├───docs
        │   └───...
        │   ...

3. Create a `.vale.ini` file with basic Vale settings in the root of your repo. This example assumes that you only want to scan Markdown files using the Google style guide.

        StylesPath = styles

        [*.md]
        BasedOnStyles = Google

4. Run `vale .` from the root folder of your project or repository.
That’s it. If you have Markdown files in any subfolder, Vale will scan them.
Leveraging existing styles, such as `Microsoft`, is alright, but what about your own? Most tech companies end up adopting mainstream style guides and building upon them, adding terminology and rules that are specific to their brand voice and tone.
Styles are collections of YAML files, each describing a single rule, how it should be captured, and the message to be shown to users. Here is a snippet from the `Wordiness.yml` file of the Microsoft style:

    extends: substitution
    message: "Consider using '%s' instead of '%s'."
    link: https://docs.microsoft.com/en-us/style-guide/word-choice/use-simple-words-concise-sentences
    ignorecase: true
    level: warning
    action:
      name: replace
    swap:
      (?:give|gave) rise to: lead to
      (?:previous|prior) to: before
      a (?:large)? majority of: most
      a (?:large)? number of: many
The key component of a style is the `extends` declaration, which describes the type of issue you aim to detect (substitutions, repetitions, capitalization, spelling, etc.). Each `extends` type defines the allowed fields.
Notice the usage of regular expressions for pattern matching: learning regex is essential for building robust styles. Even if you’re a beginner, online helpers such as Regex101 can speed up your regex crafting tenfold.
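If you want a feel for what a substitution rule does with those patterns, here is a rough Python sketch. It is only an approximation for illustration, not Vale's actual implementation: the `check` function is hypothetical, and the patterns, message, and case-insensitive matching are copied from the `swap`, `message`, and `ignorecase` fields of the `Wordiness.yml` snippet.

```python
import re

# Patterns and replacements copied from the swap section of Wordiness.yml.
SWAP = {
    r"(?:give|gave) rise to": "lead to",
    r"(?:previous|prior) to": "before",
    r"a (?:large)? majority of": "most",
    r"a (?:large)? number of": "many",
}
MESSAGE = "Consider using '%s' instead of '%s'."


def check(text):
    """Return one message per match, in the order the rules are listed."""
    alerts = []
    for pattern, replacement in SWAP.items():
        # re.IGNORECASE mirrors "ignorecase: true" in the rule.
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            alerts.append(MESSAGE % (replacement, match.group(0)))
    return alerts


for alert in check("Prior to the release, this gave rise to confusion."):
    print(alert)
# prints:
# Consider using 'lead to' instead of 'gave rise to'.
# Consider using 'before' instead of 'Prior to'.
```

Pasting a pattern and a few test sentences into a regex helper works the same way: you see right away which phrases a rule would catch and which it would miss.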
The best way of creating new styles in Vale is using the browser-based Vale Studio, a free editor that allows you to compose and test new rules easily. It even has Dark Mode!
At the time of writing this post, Visual Studio Code is the best general-purpose code editor for all platforms. It’s free, it’s powerful, and it comes with tons of free add-ons, such as the awesome Vale for VSC extension, maintained by Chris Chinchilla.
The advantage of coupling Vale to Visual Studio Code is that you can lint your prose as you write it and get it underlined dynamically instead of having to run Vale every time you need to check the style of your docs. Here’s an example of what I mean by dynamically:
To install the VSC extension for Vale, search for `vale` in the Extensions panel and install it. Alternatively, open the VSC Marketplace in your browser and select Install.
You could say I’m a Visual Studio Code evangelist and it wouldn’t be untrue.
The full power of docs-as-code and linting is unlocked through automation, which nowadays means Continuous Integration (CI). If you’re not familiar with the concept, picture a conveyor belt of tests and operations triggered by a code merge.
Traditionally, setting up a CI pipeline requires learning how to use tools like Jenkins or Travis, but things like GitHub Actions have simplified CI considerably, to the point that you only have to copy and paste a snippet in a config file to get it up and running.
If you’d like to get your docs checked every time you push them to a GitHub repo, give the Vale Linter GitHub action a try.
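If you would rather wire the check into a pipeline by hand, the idea can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the GitHub action's own logic: it assumes `vale` is on the PATH, that `vale --output=JSON` prints an object mapping each file path to a list of alerts, and that each alert carries a `Severity` field. The function names `count_alerts` and `lint` are made up for illustration.

```python
import json
import subprocess


def count_alerts(vale_json, severities=("error",)):
    """Count alerts in a Vale JSON report whose severity is in `severities`."""
    report = json.loads(vale_json)  # assumed shape: {file path: [alert, ...]}
    return sum(
        1
        for alerts in report.values()
        for alert in alerts
        if alert.get("Severity") in severities
    )


def lint(path=".", severities=("error",)):
    """Run Vale on `path` and return an exit code: 0 if clean, 1 otherwise."""
    result = subprocess.run(
        ["vale", "--output=JSON", path],
        capture_output=True,
        text=True,
    )
    return 1 if count_alerts(result.stdout, severities) else 0
```

A CI step could then call `sys.exit(lint("docs"))`, or pass `severities=("error", "warning")` to make the build stricter. Choosing which severities fail the build is the main reason to script this yourself rather than calling Vale directly.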
In this quick overview, I left out some advanced topics, such as scoping and plugins. There is so much more to Vale than just plugging some styles into your docs, though that already provides immense value to doc teams.
Bring your ideas and questions to the `#testthedocs` channel in the Write the Docs Slack; we are a pretty welcoming and helpful bunch.