
Setting up a pre-commit spell checker using husky and spellchecker-cli


Someone noticed a spelling mistake in the blog

A year after setting up this website, and after getting some views on my earlier posts, I came across feedback from a reader that brought a grave mistake to light.

The mistake was not the single spelling error the reader found, but that there were tens of such errors across different pages of the website. The feedback made me realize that I really needed some sort of spell check for all my pages. This led me to set up a pre-commit task that spell checks the changed files in my Next.js project.

This post walks through that setup.

Starting with an empty project

For the purpose of this post, we will start by creating a basic React app with Create React App. However, the setup below works with most modern frontend frameworks, including Next.js.

You can skip this section if you already have a working project that you want to integrate this into.

We will start by creating a starter React project.

npx create-react-app spell-checker-example
cd spell-checker-example

While we don't need to, the dev server is now available through:

npm start

Setting up the spell checker

Let's install the spellchecker-cli npm package, which abstracts away the logic behind spell checking. This package checks for spelling errors and basic grammar, and supports a bunch of customizations (link to the documentation).

npm i -D spellchecker-cli

The spell checking can now be run as a script from package.json.

// package.json
{
  ...
  "scripts": {
    ...
    // .spellcheckerrc.json is the config file respected by spellchecker-cli.
    "spell-check": "spellchecker --config .spellcheckerrc.json"
  }
}

As a next step, let's add our basic configuration for the spellchecker-cli.

touch .spellcheckerrc.json
// .spellcheckerrc.json
{
  "files": [
    "./src"
  ],
  "generateDictionary": true,
  "quiet": false
}

With this, our spell-checker is now working with:

npm run spell-check

Running spell-check finds all spelling errors in the files listed in the files array. Because generateDictionary is enabled, the flagged words from the run are written to dictionary.txt.
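
On a first run, dictionary.txt can get long, so a small helper may make it easier to review. The sketch below is not part of spellchecker-cli. It assumes the generated file holds one flagged word per line, and parseDictionary is a hypothetical name chosen for illustration.

```javascript
// Hypothetical helper (not part of spellchecker-cli).
// Assumption: dictionary.txt holds one flagged word per line.
const fs = require("fs");

function parseDictionary(text) {
  const words = text
    .split(/\r?\n/)
    .map((word) => word.trim())
    .filter((word) => word.length > 0);
  // De-duplicate, then sort case-insensitively so similar words sit together.
  return [...new Set(words)].sort((a, b) =>
    a.toLowerCase().localeCompare(b.toLowerCase())
  );
}

// Only read the file when it exists, so the sketch is safe to run anywhere.
if (fs.existsSync("dictionary.txt")) {
  const words = parseDictionary(fs.readFileSync("dictionary.txt", "utf8"));
  console.log(`${words.length} flagged words:\n${words.join("\n")}`);
}
```

Going through the sorted list once makes it easier to tell genuine typos apart from project vocabulary.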

Our current setup has a few limitations:

  • It has to be run manually.
  • It can be slow, since it always checks every file in the src folder.
  • It flags variable names and other words that are not part of the English vocabulary.

Solving for the limitations

Limitation 1: It runs manually

Husky is a library that makes native Git hooks easier to work with. We will use it to create a pre-commit hook that runs the spell checker before every commit. Depending on your use case, the same logic would work for a pre-push or any other Git hook.

husky-init is a one-time command to quickly set up Husky.

npx husky-init && npm install

This should create a sample .husky/pre-commit file. Edit it to run the spell-check command before every commit.

#!/bin/sh
. "$(dirname "$0")/_/husky.sh"

npm run spell-check

That's about it. From now on, spell-check runs before every commit in the repository.

Limitation 2: Spell-check is slow and checks more files than needed

This can be solved by checking only the files that have changed and are staged. Luckily, the .husky/pre-commit file is just another shell script with access to the Git environment.

We can get the list of staged files by using git diff --name-only --cached.

Let's use this command in the hook to override the hardcoded list of files in .spellcheckerrc.json.

#!/bin/sh
. "$(dirname "$0")/_/husky.sh"

changedFiles="$(git diff --name-only --cached)"
# The "--" tells npm to pass the flags after it on to the spell-check script.
npm run spell-check -- --files ${changedFiles}
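
Two edge cases are worth keeping in mind with the staged list: deleted files still show up in it, and a commit can have nothing staged at all. As a rough sketch, the same logic could live in a small Node script. The names getStagedFiles and buildSpellCheckArgs are hypothetical, and --diff-filter=ACMR (added, copied, modified, renamed) is an addition that is not part of the hook above.

```javascript
// Hypothetical Node version of the staged-file lookup; a sketch, not the hook itself.
const { execSync } = require("child_process");

// `run` is injectable so the parsing can be tried without a Git repository.
function getStagedFiles(run = (cmd) => execSync(cmd, { encoding: "utf8" })) {
  // --diff-filter=ACMR skips deleted files, which no longer exist on disk.
  const output = run("git diff --name-only --cached --diff-filter=ACMR");
  return output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Returns the npm arguments, or null when there is nothing to check.
function buildSpellCheckArgs(files) {
  if (files.length === 0) return null;
  // "--" makes npm forward --files to the spell-check script.
  return ["run", "spell-check", "--", "--files", ...files];
}

// Demo with canned git output, so nothing is executed here.
const fakeGit = () => "src/App.js\nREADME.md\n";
console.log(buildSpellCheckArgs(getStagedFiles(fakeGit)));
```

The resulting array could be handed to child_process.execFileSync("npm", args, { stdio: "inherit" }). Passing the arguments as an array also avoids the word splitting that an unquoted shell variable applies to file names containing spaces.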

The hook now works as expected: the spell checker only checks files that have been staged. However, it also checks images and any other staged non-text files. To get past that, one can pass negated glob patterns for the assets/public folders along with the list of files.

npm run spell-check -- --files ${changedFiles} "!public/**" "!**/*.scss" <other paths to public folders>
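
Negated globs work well when the asset folders are known up front. A possible alternative is an allow-list of extensions, so that anything unexpected is skipped by default. The sketch below is only an illustration: filterSpellCheckable is a hypothetical name, and the extension list is an assumption to adjust for your project.

```javascript
// Hypothetical allow-list filter for staged paths; a sketch, not part of spellchecker-cli.
const path = require("path");

// Assumed set of text-like extensions; adjust for your project.
const TEXT_EXTENSIONS = new Set([".md", ".mdx", ".txt", ".js", ".jsx", ".ts", ".tsx"]);

function filterSpellCheckable(files) {
  // Compare extensions case-insensitively so "Guide.MD" is kept as well.
  return files.filter((file) =>
    TEXT_EXTENSIONS.has(path.extname(file).toLowerCase())
  );
}

const staged = ["src/App.js", "public/logo192.png", "src/index.scss", "README.md"];
console.log(filterSpellCheckable(staged));
```

With this approach, images and stylesheets drop out without having to list every asset folder.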

Limitation 3: Spell-check does not recognize non-English keywords

There are two ways of solving this:

  • The ignore option in .spellcheckerrc.json lets us pass an array of regexes. Words matching any of them are ignored during spell checking.

    // .spellcheckerrc.json
    // Example regex to match camelCased words
    // (variables are camelCased in JS)
    "ignore": [
      "[a-z]+((\\d)|([A-Z0-9][a-z0-9]+))+"
    ]

  • With "generateDictionary": true, every run of spell-check generates a dictionary.txt file containing the words that were flagged as spelling errors. The ideal flow is to fix the genuine spelling errors from that file until a run passes cleanly. Any valid words that the spell checker should recognize can be added to an extra dictionary file.

    // .spellcheckerrc.json
    // Our project-specific vocabulary and any other valid words
    "dictionaries": [
      "./dictionary-clean.txt"
    ]
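
Before committing an ignore regex to the config, it can help to try it against a few sample words. The sketch below assumes the pattern has to match the whole word, so it wraps the pattern with ^ and $. Treat that anchoring as an assumption and check the spellchecker-cli documentation for the exact behaviour. The name isIgnored is hypothetical.

```javascript
// Same pattern as in .spellcheckerrc.json (the backslash is doubled in both JSON and JS strings).
const ignorePattern = "[a-z]+((\\d)|([A-Z0-9][a-z0-9]+))+";

// Assumption: the pattern must match the entire word, so it is anchored here.
const camelCase = new RegExp(`^(?:${ignorePattern})$`);

function isIgnored(word) {
  return camelCase.test(word);
}

for (const word of ["changedFiles", "generateDictionary", "utf8", "spellchecker", "Husky"]) {
  console.log(word, isIgnored(word));
}
// changedFiles, generateDictionary and utf8 match the pattern;
// spellchecker (all lowercase) and Husky (leading capital) do not.
```

Words that do not match, such as library names, are better candidates for the extra dictionary file.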

With this, our spell checker is all ready to go.


Hope this was helpful! Feel free to share your thoughts on this post through the Contact Page, or by dropping an email to saranshgupta1995@gmail.com. If you have anything that would help improve this setup, I'd love to hear about it and discuss it.