Blog coding and discussion of coding about JavaScript, PHP, CGI, general web building etc.

Friday, August 26, 2016

CodeMirror with spell checker


I would like to use the functionality of CodeMirror (such as line numbering, wrapping, search, etc.) for plain text, with no particular need for code highlighting, but with the Google Chrome spell checker or some other natural-language (especially English) spell checking activated. I do not need it to work in other browsers. How can I do this? Is it possible to write a plain-text mode add-on that enables spell checking?

Answer by Doug for CodeMirror with spell checker


CodeMirror is not based on an HTML textarea, so you can't use the browser's built-in spell check.

You could implement your own spell check for CodeMirror with something like typo.js.

I don't believe anyone has done this yet.

Answer by James Westgate for CodeMirror with spell checker


I wrote a squiggly-underline spell checker a while ago. To be honest, it needs a rewrite, as I was very new to JavaScript then. But the principles are all there.

https://github.com/jameswestgate/SpellAsYouType

Answer by hsk81 for CodeMirror with spell checker


I actually integrated typo.js with CodeMirror while coding for NoTex.ch; you can have a look at it in CodeMirror.rest.js. I needed a way to get the reStructuredText markup spell checked, and since I use CodeMirror's excellent syntax highlighting capabilities, it was quite straightforward to do.

You can check the code at the provided link, but I'll summarize what I've done:

  1. Initialize the typo.js library; see also the author's blog/documentation:

    var typo = new Typo ("en_US", AFF_DATA, DIC_DATA, {
        platform: 'any'
    });

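  2. Define a regular expression that matches a word, built from your word separators:

    // A word is a run of characters that are neither separators nor whitespace.
    var separators = "!\"#$%&()*+,\\-./:;<=>?@[\\\\\\]^_`{|}~";
    var rx_word = new RegExp ("^[^" + separators + "\\s]+");
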
  3. Define an overlay mode for CodeMirror:

    CodeMirror.defineMode ("myoverlay", function (config, parserConfig) {
        var overlay = {
            token: function (stream, state) {

                if (stream.match (rx_word) &&
                    typo && !typo.check (stream.current ()))

                    return "spell-error"; //CSS class: cm-spell-error

                while (stream.next () != null) {
                    if (stream.match (rx_word, false)) return null;
                }

                return null;
            }
        };

        var mode = CodeMirror.getMode (
            config, parserConfig.backdrop || "text/x-myoverlay"
        );

        return CodeMirror.overlayMode (mode, overlay);
    });

  4. Use the overlay with CodeMirror; the user manual explains exactly how to do this. I've done it in my code, so you could check it out there too, but I recommend the user manual.

  5. Define CSS class:

    .CodeMirror .cm-spell-error {
        background: url(images/red-wavy-underline.gif) bottom repeat-x;
    }
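
As a rough sketch of step 4, the overlay can also be attached to an existing editor with CodeMirror's `addOverlay` method instead of defining a named mode. This is a swapped-in technique, not the code from NoTex.ch. The names `makeSpellOverlay`, `isKnownWord`, `WORD` and the element id `"editor"` are hypothetical, the word pattern is a simplifying assumption, and `typo` is assumed to be the instance from step 1.

```javascript
// Sketch only: makeSpellOverlay, isKnownWord, WORD and the "editor" id
// are hypothetical names, not part of CodeMirror or typo.js.
var WORD = /^[A-Za-z\u00C0-\u024F']+/; // assumed definition of a "word"

function makeSpellOverlay(isKnownWord) {
  return {
    token: function (stream) {
      // At the start of a word: consume it and flag it if it is unknown.
      if (stream.match(WORD)) {
        return isKnownWord(stream.current()) ? null : "spell-error";
      }
      // Otherwise skip ahead until the next word starts (or the line ends).
      while (stream.next() != null) {
        if (stream.match(WORD, false)) break;
      }
      return null;
    }
  };
}

// Browser-only wiring; skipped when CodeMirror is not loaded.
if (typeof CodeMirror !== "undefined" && typeof document !== "undefined") {
  var editor = CodeMirror.fromTextArea(document.getElementById("editor"), {
    mode: "text/plain",
    lineNumbers: true,
    lineWrapping: true
  });
  // Assumes `typo` from step 1 is in scope.
  editor.addOverlay(makeSpellOverlay(function (word) {
    return typo.check(word);
  }));
}
```

Passing the checker in as a function keeps the tokenizer independent of typo.js, so any object with a comparable lookup could be substituted.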

This approach works great for German, English and Spanish. With the French dictionary, typo.js seems to have some (accent) problems, and for languages like Hebrew, Hungarian, and Italian, where the list of affixes is long or the dictionary is quite extensive, it does not really work, since the current implementation of typo.js uses too much memory and is too slow.

With German (and Spanish), typo.js can block the JavaScript VM for a few hundred milliseconds (but only during initialization!), so you might want to consider background threads with HTML5 web workers (see CodeMirror.typo.worker.js for an example). Furthermore, typo.js does not seem to support Unicode (due to JavaScript restrictions): at least, I did not manage to get it to work with non-Latin languages like Russian, Greek, Hindi, etc.
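
The web worker idea could be sketched roughly as below. This is not the content of CodeMirror.typo.worker.js. The file name, the `"typo.js"` script path, the message shapes (`init`, `check`, `ready`, `result`) and the `handleMessage` helper are all assumptions made for illustration, and it is also assumed that typo.js loads cleanly inside a worker.

```javascript
// spell-worker.js (hypothetical file name). Sketch under assumptions:
// the message shapes and the "typo.js" path are made up for illustration.
var checker = null;

// Pure message handler so it can be exercised outside a worker.
// createChecker(aff, dic) must return an object with a check(word) method.
function handleMessage(msg, createChecker) {
  if (msg.type === "init") {
    checker = createChecker(msg.aff, msg.dic); // slow part, off the UI thread
    return { type: "ready" };
  }
  if (msg.type === "check") {
    if (!checker) return { type: "error", reason: "not initialized" };
    var misspelled = msg.words.filter(function (word) {
      return !checker.check(word);
    });
    return { type: "result", misspelled: misspelled };
  }
  return { type: "error", reason: "unknown message type" };
}

// Worker-only wiring; importScripts exists only inside a web worker.
if (typeof importScripts === "function") {
  importScripts("typo.js");
  self.onmessage = function (event) {
    self.postMessage(handleMessage(event.data, function (aff, dic) {
      return new Typo("en_US", aff, dic, { platform: "any" });
    }));
  };
}
```

Because an overlay's `token` function runs synchronously, the page would still need to cache the worker's answers and re-highlight once they arrive; that part is left out here.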

I've not refactored the described solution into a nice separate project apart from the (now quite big) NoTex.ch, but I might do it quite soon; until then you'll have to put together your own solution based on the above description or the linked code. I hope this helps.

Answer by Doug Blank for CodeMirror with spell checker


This is a working version of hsk81's answer. It uses CodeMirror's overlay mode, and looks for any word inside quotes, HTML tags, etc. It has a sample typo.check that should be replaced with something like Typo.js. It underlines unknown words with a red squiggly line.

This was tested in an IPython %%html cell.
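
A stand-in `typo.check` of the kind mentioned above could look roughly like this. The word list and the shape of the `typo` object are assumptions for illustration, not the demo's actual code; the only point is that the object exposes the same `check(word)` call that a Typo.js instance does.

```javascript
// Hypothetical stand-in for a Typo.js instance: same check(word) shape,
// backed by a tiny hard-coded word list instead of a real dictionary.
var typo = {
  words: { the: true, quick: true, brown: true, fox: true },
  check: function (word) {
    // Case-insensitive lookup on own properties only, so inherited
    // names such as "constructor" are not treated as known words.
    return Object.prototype.hasOwnProperty.call(this.words, word.toLowerCase());
  }
};
```

Swapping in the real library should then only require replacing this object with a `new Typo(...)` instance.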

    
