First steps with the Vale prose linter

May 20, 2021

Code linters support developers by catching errors and stylistic issues in code, such as bad formatting or keywords in the wrong places. The term comes from the lint traps in clothes dryers, which capture the tiny bits of fiber that separate from fabric.

A Python linter, for example, can flag a constant name that does not conform to the PEP 8 style guide; this issue would not prevent the script from running, but it would make the code harder to read and maintain over time.
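As a minimal sketch of the kind of issue meant here, the snippet below defines the same module-level constant twice: once in lowercase and once in the UPPER_CASE form that PEP 8 recommends for constants. The names `timeout_seconds`, `TIMEOUT_SECONDS`, and `describe` are made up for illustration; a linter such as Pylint would typically report the first assignment as a naming issue, although the exact message depends on the tool and its configuration.

```python
# Module-level constant written in lowercase: the script runs fine, but
# PEP 8 recommends UPPER_CASE names for constants, so a linter would
# typically flag this line as a naming-convention issue.
timeout_seconds = 30

# The same constant, renamed to follow the PEP 8 convention.
TIMEOUT_SECONDS = 30


def describe(timeout: int) -> str:
    """Build a short, human-readable description of the timeout."""
    return f"Requests time out after {timeout} seconds."


# Both names hold the same value; only the spelling differs.
print(describe(TIMEOUT_SECONDS))
```

Both versions behave identically at runtime, which is exactly why this class of problem needs a linter rather than a test suite to catch it.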

What about natural language? After all, most programming languages can be described as structured subsets of English, with their own grammar and stylistic conventions. Couldn’t we apply the same idea to news articles and technical documentation? The answer is yes.

After a handful of decades, linting software has finally made the jump to the world of prose, in no small part owing to the emergence of docs-as-code and the advances in natural language processing. Popular prose linters include proselint, textlint, and Vale.

The goal of a prose linter is quite different from that of the grammar checkers you are used to invoking in Microsoft Word or Google Docs; as the authors of proselint put it, a linter for prose does not focus on grammar at all. Rather, it catches everything else that makes prose worse.

[…] we consider usage: redundancy, jargon, illogic, clichés, sexism, misspelling, inconsistency, misuse of symbols, malapropisms, oxymorons, security gaffes, hedging, apologizing, pretension […]

Many of the above are the stuff of documentation style guides, those hefty rulebooks that tech writers love to hate, and that technical contributors simply ignore for the most part. Getting writers to follow a style guide—any style guide—can be quite the struggle.

The benefits of faithfully adhering to a style guide, though, cannot be overstated: a consistent writing style improves the reading experience. Hence the importance of linters and their ability to catch issues more consistently than any copy editor.

Linting your prose with Vale

Vale is one of the most popular prose linters in circulation. Think of it as a robotic reconnaissance team that scouts documentation for stylistic issues before they get to anybody’s eyes. This sort of computer-assisted editing can save you tons of time.

When executed in a folder that has a .vale.ini file, Vale checks all files that match the local configuration against the styles stored in the StylesPath. Upon detecting issues, Vale prints warnings, errors, and suggestions to the console.

Getting started takes around ten minutes:

  1. Install Vale on your machine using a package manager (it’s the fastest way).

  2. Create a styles directory in the root of your repo and copy any of the available styles into subdirectories (for example, Google or Microsoft). The resulting folder structure would look like this:

    ├───styles
    │   ├───Google
    │   └───Microsoft 
    ├───docs
    │   └───...
    │ ...
    
  3. Create a .vale.ini file with basic Vale settings in the root of your repo. This example assumes that you only want to scan Markdown files using the Google style guide.

    StylesPath = styles
    
    [*.md]
    BasedOnStyles = Google
    
  4. Run vale . from the root folder of your project or repository.

That’s it. If you have Markdown files in any subfolder, Vale will scan them.

The true power of Vale lies in styles

Using existing styles, such as Google or Microsoft, is fine, but what about your own? Most tech companies end up adopting mainstream style guides and building upon them, adding terminology and rules that are specific to their brand voice and tone.

Styles are collections of YAML files, each describing a single rule, how it should be captured, and the message to be shown to users. Here is a snippet from the Wordiness.yml file of the Microsoft style.

extends: substitution
message: "Consider using '%s' instead of '%s'."
link: https://docs.microsoft.com/en-us/style-guide/word-choice/use-simple-words-concise-sentences
ignorecase: true
level: warning
action:
  name: replace
  swap:
    (?:give|gave) rise to: lead to
    (?:previous|prior) to: before
    a (?:large)? majority of: most
    a (?:large)? number of: many

The key component of a rule is the extends declaration, which describes the type of check you want to run (substitution, repetition, capitalization, spelling, and so on). Each extends type determines which other fields are allowed.

Notice the use of regular expressions for pattern matching: learning regex is essential for building robust styles. Even if you’re a beginner, online helpers such as Regex101 can speed up your regex crafting considerably.
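To get a feel for how such patterns behave, here is a rough emulation in Python of a substitution-style check. It is only a sketch, not how Vale implements rules: it borrows two patterns and the message template from the Wordiness.yml snippet, while the function name `find_wordiness`, the `\b` word-boundary anchors, and the sample sentence are assumptions added for illustration.

```python
import re

# Two patterns copied from the Wordiness.yml snippet; the mapping mirrors
# the rule's `swap` field (pattern -> suggested replacement).
SWAP = {
    r"(?:give|gave) rise to": "lead to",
    r"(?:previous|prior) to": "before",
}

# Same template as the rule's `message` field.
MESSAGE = "Consider using '%s' instead of '%s'."


def find_wordiness(text: str) -> list[str]:
    """Return one message per match, grouped in the order SWAP lists patterns."""
    messages = []
    for pattern, replacement in SWAP.items():
        # re.IGNORECASE plays the role of `ignorecase: true`. The \b anchors
        # are an assumption of this sketch so that only whole words match.
        for match in re.finditer(rf"\b(?:{pattern})\b", text, re.IGNORECASE):
            messages.append(MESSAGE % (replacement, match.group(0)))
    return messages


for message in find_wordiness("Prior to the release, the bug gave rise to outages."):
    print(message)
```

Experimenting with a few lines like these, or with the same patterns in Regex101, is a cheap way to check what a pattern really matches before committing it to a style.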

The best way of creating new styles in Vale is using the browser-based Vale Studio, a free editor that allows you to compose and test new rules easily. It even has Dark Mode!

Vale and Visual Studio Code are best friends

At the time of writing, Visual Studio Code is the best general-purpose code editor for all platforms. It’s free, it’s powerful, and it comes with tons of free add-ons, such as the awesome Vale for VSC extension, maintained by Chris Chinchilla.

The advantage of coupling Vale to Visual Studio Code is that you can lint your prose as you write it: issues get underlined in the editor as you type, instead of you having to run Vale every time you need to check the style of your docs.

To install the VSC extension for Vale:

  1. Open Visual Studio Code and search for vale in the Extensions panel. Alternatively, open the VSC Marketplace in your browser and select Install.
  2. Go to File > Preferences, select Extensions, scroll to Vale, and check Use CLI.
  3. Restart Visual Studio Code.

You could say I’m a Visual Studio Code evangelist, and it wouldn’t be untrue.

Automating your Vale checks

The full power of docs-as-code and linting is unlocked through automation, which nowadays means Continuous Integration (CI). If you’re not familiar with the concept, picture a conveyor belt of tests and operations triggered by a code merge.

Traditionally, setting up a CI pipeline required learning how to use tools like Jenkins or Travis CI, but services like GitHub Actions have simplified CI considerably, to the point that you only have to copy and paste a snippet into a config file to get it up and running.

If you’d like to get your docs checked every time you push them to a GitHub repo, give the Vale Linter GitHub action a try.
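If your pipeline runs somewhere else, the same idea can be sketched in a few lines of Python. The example below assumes you capture a report with `vale --output=JSON .` and that the report maps each file path to a list of alerts carrying a `Severity` field; the sample report is hand-written for illustration (it is not real Vale output), and the helper names `count_by_severity` and `should_fail` are hypothetical.

```python
import json
from collections import Counter

# Hand-written sample, NOT real Vale output: it only imitates the shape this
# sketch assumes for `vale --output=JSON .` (file path -> list of alerts).
SAMPLE_REPORT = """
{
  "docs/intro.md": [
    {"Check": "Microsoft.Wordiness", "Line": 3, "Severity": "warning",
     "Message": "Consider using 'before' instead of 'prior to'."},
    {"Check": "Example.Hypothetical", "Line": 9, "Severity": "error",
     "Message": "Hypothetical error-level alert."}
  ]
}
"""


def count_by_severity(report_json: str) -> Counter:
    """Tally alerts per severity across every file in the report."""
    report = json.loads(report_json)
    return Counter(
        alert["Severity"] for alerts in report.values() for alert in alerts
    )


def should_fail(report_json: str) -> bool:
    """Fail the build only when at least one error-level alert is present."""
    return count_by_severity(report_json)["error"] > 0


counts = count_by_severity(SAMPLE_REPORT)
print(dict(counts), "fail build:", should_fail(SAMPLE_REPORT))
```

Gating only on errors is a design choice: it lets warnings and suggestions surface in the logs without blocking a merge, which tends to make a new style easier to adopt gradually.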

Where do we go from here?

In this quick overview, I left out some advanced topics, such as scoping and plugins. There is so much more to Vale than just plugging some styles into your docs, though that already provides immense value to doc teams.

Bring your ideas and questions to the #testthedocs channel in the Write the Docs Slack. We are a pretty welcoming and helpful bunch.