We're all familiar with publishing typos in production docs. But what if there were a better way? After merging Juanito's super helpful pull request to replace 'blacklist' and 'whitelist' with friendlier terms, I was finally motivated to add a few tools and checks to stop myself and others from making those types of mistakes.
These ideas aren't new. Here are some lovely folks talking about improving the inclusivity of docs in 2017 and 2018:
- Don't Say Simply - Write the Docs
- Harriet Lawrence: Sociolinguistics and the Javascript community: a love story | JSConf EU 2017
In this post I'll show you how to configure a couple of different tools to run various checks on the documentation, and how to configure Buildkite to run them automatically.
Anyhow, to the tools.
Alex
First up, alex is a one-stop Node.js tool for catching insensitive or inconsiderate writing.
As long as you have npm (with Node.js) installed, you can run alex directly without installing it, using npx. Running alex on our docs repo looks a bit like this:
    $ npx alex@8 --diff pages/**/*.erb pages/**/*.txt
We're using npx because we're planning to run this in CI, pinning alex at version 8, and checking all docs in the pages directory. Some are *.erb files and some are plain text *.txt files. (*.erb files are Embedded Ruby template files, which are mostly Markdown with a sprinkling of Ruby code.)
The first time I ran it, alex returned a few hundred errors and warnings in this format:
    pages/integrations/cc_menu.md.erb: no issues found
    pages/integrations/github.md.erb: no issues found
    pages/integrations/github_enterprise.md.erb: no issues found
    pages/integrations/gitlab.md.erb: no issues found
    pages/integrations/phabricator.md.erb: no issues found
    pages/integrations/slack.md.erb: no issues found
    pages/integrations/sso.md.erb: no issues found
    pages/pipelines/artifacts.md.erb: no issues found
    pages/pipelines/block_step.md.erb
      3:37-3:46  warning  Be careful with `execution`, it's profane in some cases  execution  retext-profanities

    pages/pipelines/branch_configuration.md.erb
      68:65-68:72  warning  Be careful with `periods`, it's profane in some cases  periods  retext-profanities
I decided that neither periods nor execution were words that I need to avoid in a software context, so I added them to the list of exceptions in .alexrc:
    allow:
      - execution
      - periods
A combination of adding all of the checks that I didn't find useful to .alexrc and making edits to the docs where necessary removed most of the warnings. But sometimes, as a real thinking human person, you know better than the linting tool. In that situation you can explicitly tell alex to ignore something that it would otherwise warn you about, using an HTML-style comment (which is valid in both Markdown and HTML):
    <!--alex ignore whitelist -->
    This sentence will **not** trigger the check for whitelist. This can be useful when writing a style guide that includes examples of what *not* to write.
Great, our docs are already much more friendly and inclusive, but what else can I improve while I'm here?
Vale
Vale is a syntax-aware linter for prose built with speed and extensibility in mind, written in Go. Sure, but what does that mean? It means that we can configure it to do various different things, such as spellchecks, testing for unintentional repeated words, suggesting alternatives to incorrect capitalization, etc.
Exactly how you install vale depends on where and how you're using it, but we're doing something like this:
    curl -sfL https://install.goreleaser.com/github.com/ValeLint/vale.sh | sh -s v2.2.2
Note that you should never download and pipe software to a shell interpreter like this from a source you do not trust.
Configuring vale is a little trickier, because it does more things. Let's take a look at .vale.ini:
    # we'll keep our custom configuration in the vale/ directory
    StylesPath = vale

    # display warnings as well as errors
    MinAlertLevel = warning

    # Remember how we said we use both *.txt files and *.erb files?
    # This tells vale to treat both as markdown
    [formats]
    erb = md
    txt = md

    # For all file formats [*], run the tests and checks in vale/Buildkite
    # and run the built-in check for repeated words (Vale.Repetition)
    [*]
    BasedOnStyles = Buildkite, Vale.Repetition
In vale/Buildkite, which is where we configure our custom checks, there are currently two files, one for each check that we do.
existence.yml suggests alternatives to things that we know we don't want in the docs:
- blacklist → blocklist
- oAuth → OAuth
The configuration for it looks like this:
    extends: substitution
    message: Consider using '%s' instead of '%s'
    level: error
    # swap maps tokens in form of bad: good
    swap:
      whitelist: allowlist
      blacklist: blocklist
      oAuth: OAuth
We're using a customization of the built-in substitution rule. When it detects one of the words we don't want, such as whitelist, it raises an error and suggests the alternative (allowlist):
    Consider using 'allowlist' instead of 'whitelist'
spelling.yml uses the default dictionary (en-US, but you can use other languages of course) to highlight potential spelling mistakes. Due to the number of technical terms in software documentation, it might initially seem that you have far too many of these to want to fix, but we can work around that.
Vale already ignores anything marked as code or in links. We then used some regular expressions and shell scripts to get a list containing one (and only one) of each word that vale considers a spelling mistake. (I'll leave that regular expression work out of this article, but if you're super interested, drop me a line.)
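In place of our real scripts, here is a rough sketch of what that kind of pipeline could look like. It assumes vale prints one alert per line and that the spelling rule uses the message "Did you really mean '%s'?". The function name unique_misspellings is made up for illustration.

```shell
# Hypothetical helper: read vale's output on stdin and print each
# flagged word exactly once.
# Assumes the spelling message is "Did you really mean '%s'?".
unique_misspellings() {
  # 1. keep only the message text
  # 2. pull out the word between the single quotes
  # 3. sort and de-duplicate (LC_ALL=C keeps the order predictable)
  grep -o "Did you really mean '[^']*'" |
    cut -d "'" -f 2 |
    LC_ALL=C sort -u
}

# Example usage (assumes vale is installed at ./bin/vale):
#   ./bin/vale pages | unique_misspellings > vale/vocab.txt
```

The resulting file still needs a human pass: real typos get deleted from the list and fixed in the docs, and only genuine technical terms stay.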
We quickly went through the list, deleting any mistakes, and leaving in technical terms that we do not consider errors:
    Atlassian
    autogenerated
    Autoscaling
    autoscaling
    Basecamp
    Basscss
    Bazel
    ...
Then we configured the spelling rule like this, telling vale to ignore any words that we've put in vale/vocab.txt:
    extends: spelling
    message: "Did you really mean '%s'?"
    level: error
    ignore: vale/vocab.txt
    filters:
      - ':[a-z\-]*:' # Ignore all custom emoji words
The last couple of lines of that configuration file show how you can ignore words using regular expressions. In this example we're ignoring any words in between colons, like :example:, a common shortcut for emoji.
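If you want to see what a pattern like that catches before adding it to the rule, you can try it against some sample text. This is only a sketch: it uses grep rather than vale, the helper name emoji_shortcodes is made up, and the character class is written as [a-z-] because that is how grep spells the [a-z\-] class from the config.

```shell
# Hypothetical helper: print the tokens an emoji-style filter would skip.
# grep -E is used for illustration only; vale applies the pattern itself.
emoji_shortcodes() {
  grep -oE ':[a-z-]*:'
}

# Example usage:
#   echo 'Ship it :rocket: then run :lint-roller:' | emoji_shortcodes
```

Shortcodes that contain digits or other symbols, such as :+1:, are not matched by this pattern.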
And similarly to Alex, if you need to tell vale to ignore some parts of your documents, you can mark them with comments:
    This part will be checked

    <!-- vale off -->

    This part won't be checked

    <!-- vale on -->

    This part will also be checked
There are many other types of checks that you can use vale for, from making sure that your capitalization and punctuation are consistent, to making sure that you're writing to a popular style guide such as the Google developer documentation style guide.
Tying it together
You can see how this all ties together in the buildkite/docs repository.
We run both vale and Alex automatically on every pull request, using Buildkite, configured in .buildkite/pipeline.yml. We're running Vale directly on the Buildkite Agent, but using the Docker plugin to run Alex on a Docker image that already has Node.js installed.
    steps:
      - label: ":lint-roller: Linting for insensitive words"
        command: npx alex@8 --diff pages/**/*.erb pages/**/*.txt
        plugins:
          - docker#v3.5.0:
              image: "node:alpine"
      - label: ":lint-roller: Linting"
        commands:
          - curl -sfL https://install.goreleaser.com/github.com/ValeLint/vale.sh | sh -s v2.2.2
          - ./bin/vale pages
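It can also be handy to run the same two checks locally before pushing. The following is a rough sketch of a wrapper script, not something that lives in our repo: run_check and failures are made-up names, and it assumes npx is available and vale has been installed to ./bin/vale as in the pipeline above.

```shell
# Hypothetical local wrapper: run every check, even if an earlier one
# fails, and count the failures so the script can exit non-zero at the end.
failures=0

run_check() {
  # $1 is a label; the remaining arguments are the command to run
  label="$1"
  shift
  if "$@"; then
    echo "ok: $label"
  else
    echo "failed: $label"
    failures=$((failures + 1))
  fi
}

# Example usage (assumes npx is available and vale is at ./bin/vale):
#   run_check "alex" npx alex@8 --diff pages/**/*.erb pages/**/*.txt
#   run_check "vale" ./bin/vale pages
#   exit "$failures"
```

Counting failures instead of stopping at the first one means a single local run shows the output of both tools, much like the two separate steps do in CI.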
We still have a lot of room for improvement, and will look at things like renaming the master branch of the git repository, linting the app as well as the docs, and improving our contributing and style guides.