Node.js Syntax Highlighter

Update: In general my preferences (and requirements) have evolved slightly and I have switched from using pre-generated HTML in practice to using Prism.js... for now. That said, I still use highlight.js as noted below but wanted to point this out in case you happen to notice. I may create a follow-up post in the future to discuss the "why".

When it comes to highlighting source code for the web, there are several options available, each with different bells and whistles. One such option is highlight.js. It can be included in your web application's source code for dynamic syntax highlighting, or you can use it to pre-render the highlighted source code as static HTML. Before you decide on any particular option you should determine how your web application will be architected. Here is the list of requirements I used to decide how to handle syntax highlighting for my own use case:

  1. Enabled for use in blogs or other web applications
  2. Support for highlighting languages that I use most often
  3. Support for rendering highlighted code as static HTML
  4. Use of CSS classes to enable custom colors and styles
  5. Ability to highlight code using a familiar toolchain (JS, Python, etc.)

There are no doubt a few options that meet these criteria but ultimately I decided to go with highlight.js. It is based on JavaScript and supports all of the above requirements. I wanted to be able to render highlighted source code as static HTML to eliminate the client-side overhead as well as the need for additional JavaScript code in my web application.

The Architecture

Here is a high-level picture that describes my workflow to use highlight.js by converting source code into static HTML which is then highlighted using CSS.

Highlight.js takes care of inserting all of the necessary CSS class selectors into the resulting HTML. By the time a web browser gets the source code it is already highlighted. By that I mean the HTML is pre-rendered by wrapping the source code in HTML tags and CSS classes. An alternative approach to this involves referencing the highlight.js library in your application and allowing the web browser to run JavaScript to highlight your source code once the DOM is loaded. I may do this someday but for now I want to keep it static.

Using Node.js

I assume that you are already somewhat familiar with Node.js and that you already have the node and npm (Node Package Manager) executables installed. npm is used to install modules in a Node.js application and node is used to actually run a Node program. If you don't have Node.js yet then you will want to head on over to nodejs.org and get yourself up and running.

Since highlight.js has been ported to a Node module and my application is rendered using Node, I decided to stick with this approach and use Node.js. I use Node.js to read source code from a file, highlight the code and output the HTML version of my code snippets.

Step 1: Open a command prompt, make a new directory if needed, then install the highlight.js module by typing the following command:

npm install highlight.js

This will install the highlight.js module including supported language definitions and the various syntax highlighting CSS files that create the look and feel of the highlighted source code. At the time of this post, highlight.js ships with 112 different programming languages and 49 different options for cascading style sheets that can be used to highlight the HTML formatted source code.

ls -l ./node_modules/highlight.js/
total 48
-rw-r--r--   1 jbiard  staff   1498 Nov 18 13:37 LICENSE
-rw-r--r--   1 jbiard  staff   3362 Nov 18 13:37 README.md
drwxr-xr-x  13 jbiard  staff    442 Mar 14 23:27 docs
drwxr-xr-x   6 jbiard  staff    204 Mar 14 23:35 lib
-rw-r--r--   1 jbiard  staff  13585 Mar 14 23:27 package.json
drwxr-xr-x  51 jbiard  staff   1734 Mar 14 23:27 styles

The available languages can be found in node_modules/highlight.js/lib/languages and the various CSS files can be found under node_modules/highlight.js/styles. Now that we have highlight.js installed we can create a little script that can be executed on the command line to read source code and output HTML fragments.
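As an aside, the language and style directories mentioned above can be listed with a few lines of Node. This is only a minimal sketch: the helper name listNames is my own, and it assumes the module was installed under node_modules in the current working directory with the layout shown in the listing. Newer releases may ship extra files, so the counts you see may not match the numbers quoted earlier.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: return the file names in `dir` that end with `ext`,
// with the extension stripped and sorted alphabetically.
// Returns an empty array if the directory does not exist.
function listNames(dir, ext) {
  if (!fs.existsSync(dir)) {
    return [];
  }
  return fs.readdirSync(dir)
    .filter((name) => name.endsWith(ext))
    .map((name) => path.basename(name, ext))
    .sort();
}

// Assumed install location, relative to the current working directory.
const root = path.join('node_modules', 'highlight.js');
const languages = listNames(path.join(root, 'lib', 'languages'), '.js');
const styles = listNames(path.join(root, 'styles'), '.css');

console.log(languages.length + ' languages, ' + styles.length + ' styles');
```

If highlight.js is not installed where the script expects it, both lists are simply empty rather than the script failing.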

Step 2: Create a new Node.js program and add the following source code (I called mine highlighter.js).

const fs = require('fs'),  
    hljs = require('highlight.js'),
    file = process.argv[2],
    lang = process.argv[3],
    data = fs.readFileSync(file);

var code = data.toString();

console.log("<pre class=\"hljs\"><code class=\"" + lang + "\">"  
           + hljs.highlight(lang, code).value 
           + "</code></pre>");

This program expects two command-line parameters. It is deliberately simple and contains no error checking, so add whatever checks you need. The program reads the contents of the file identified by argv[2] and produces HTML-formatted source code by calling hljs.highlight(...).value with the programming language identified by argv[3]. (Later releases of highlight.js, 10.7 and up, deprecate this argument order in favor of hljs.highlight(code, {language: lang}), so adjust the call if you see a deprecation warning.) The final piece of code simply writes the HTML to the console using console.log(...). The usage of this program looks like this:

Usage:
   node highlighter.js <file> <language>
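If you do want some error checking, here is one way it might look. Treat it as a sketch rather than the definitive version: the function name highlightFile is hypothetical, the hljs object is passed in as a parameter so the function does not care how it was loaded, and the highlight call assumes the newer argument order from highlight.js 10.7 or later.

```javascript
const fs = require('fs');

// Hypothetical helper: validate the inputs, then return the HTML fragment.
// `hljs` is passed in so the caller decides which highlight.js build to use.
function highlightFile(hljs, file, lang) {
  if (!file || !lang) {
    throw new Error('Usage: node highlighter.js <file> <language>');
  }
  if (!fs.existsSync(file)) {
    throw new Error('File not found: ' + file);
  }
  // getLanguage() returns undefined for names highlight.js does not know.
  if (!hljs.getLanguage(lang)) {
    throw new Error('Unknown language: ' + lang);
  }
  const code = fs.readFileSync(file, 'utf8');
  // Assumption: highlight.js 10.7 or later, where the call takes the code
  // first and an options object second. Older releases used highlight(lang, code).
  const html = hljs.highlight(code, { language: lang }).value;
  return '<pre class="hljs"><code class="' + lang + '">' + html + '</code></pre>';
}

// Example wiring for the command line (requires highlight.js to be installed):
//
//   try {
//     console.log(highlightFile(require('highlight.js'),
//                               process.argv[2], process.argv[3]));
//   } catch (err) {
//     console.error(err.message);
//     process.exit(1);
//   }
```

Throwing errors from the function and handling them in one place keeps the command-line wrapper short, and it lets the same function be reused from a build script.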

Note: I have included pre and code blocks within the console output shown above because this is what my web application expects. This makes it easier for me to insert the HTML into my application pages without having to manually add these tags. If you are working with div tags or any other HTML to wrap the formatted source code then you may need to adjust this example accordingly to suit your own needs.
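To make the wrapper easier to change, the tag names could be pulled out into parameters. The following is a rough sketch of that idea; wrapFragment and escapeAttr are names I made up for illustration, and the input is assumed to be HTML that highlight.js has already produced.

```javascript
// Minimal escaping for text placed inside a double-quoted HTML attribute.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: wrap already-highlighted HTML in the given tags.
// `outer` and `inner` default to the pre/code pair used in this post;
// pass inner = null to use a single wrapper element such as a div.
function wrapFragment(html, lang, outer = 'pre', inner = 'code') {
  const cls = escapeAttr(lang);
  if (inner === null) {
    return '<' + outer + ' class="hljs ' + cls + '">' + html + '</' + outer + '>';
  }
  return '<' + outer + ' class="hljs"><' + inner + ' class="' + cls + '">'
    + html + '</' + inner + '></' + outer + '>';
}

const highlighted = '<span class="hljs-keyword">const</span> x;';
console.log(wrapFragment(highlighted, 'javascript'));
console.log(wrapFragment(highlighted, 'javascript', 'div', null));
```

Keep in mind that a div does not preserve whitespace the way pre does, so a div wrapper generally needs a CSS rule such as white-space: pre to keep the indentation intact.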

Step 3: Test that the program is working correctly by running it from the command line against itself and observing the output.

node highlighter.js highlighter.js javascript

This command should output the following HTML fragment to the console from where you ran the command. From there you should be able to copy/paste the code or incorporate this script in other ways into your own process. The following HTML fragment represents the source code of highlighter.js wrapped in HTML tags and CSS class selectors.

<pre class="hljs"><code class="javascript"><span class="hljs-keyword">const</span> fs = <span class="hljs-built_in">require</span>(<span class="hljs-string">'fs'</span>),  
    hljs = <span class="hljs-built_in">require</span>(<span class="hljs-string">'highlight.js'</span>),
    file = process.argv[<span class="hljs-number">2</span>],
    lang = process.argv[<span class="hljs-number">3</span>],
    data = fs.readFileSync(file);

<span class="hljs-keyword">var</span> code = data.toString();

<span class="hljs-built_in">console</span>.log(<span class="hljs-string">"&lt;pre class=\"hljs\"&gt;&lt;code class=\""</span> + lang + <span class="hljs-string">"\"&gt;"</span>  
               + hljs.highlight(lang, code).value 
               + <span class="hljs-string">"&lt;/code&gt;&lt;/pre&gt;"</span>);
</code></pre>

The method I describe in this post assumes you want to output static HTML fragments. If you want to generate highlights dynamically then take a look at highlightjs.org for more information on usage. With the approach shown here you can use the resulting HTML in any context where valid HTML and CSS are allowed. The hljs prefixes in the CSS class selectors are used to apply the various style and color rules to the source code using plain old CSS.
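If you plan to write your own style sheet instead of using one that ships with highlight.js, it helps to know which classes a fragment actually uses. Here is a small sketch that collects them; listHljsClasses is a hypothetical helper, and the simple regular expression assumes class attributes are double-quoted, as they are in the output shown above.

```javascript
// Hypothetical helper: return the distinct hljs-* class names in a fragment.
function listHljsClasses(html) {
  const found = new Set();
  const attr = /class="([^"]*)"/g;
  let match;
  while ((match = attr.exec(html)) !== null) {
    // A class attribute may hold several space-separated names.
    for (const name of match[1].split(/\s+/)) {
      if (name.startsWith('hljs-')) {
        found.add(name);
      }
    }
  }
  return Array.from(found).sort();
}

const sample = '<span class="hljs-keyword">const</span> fs = '
  + '<span class="hljs-built_in">require</span>'
  + '(<span class="hljs-string">\'fs\'</span>)';

console.log(listHljsClasses(sample));
```

For the sample above this reports hljs-built_in, hljs-keyword and hljs-string, which are the selectors a custom style sheet would need to cover for that snippet.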

Wrapping Up

With this approach to highlighting source code for the web I am able to generate static HTML and CSS. This static HTML can be included within any other web content management or similar application where plain old HTML is acceptable. Finally, by pre-rendering source code to static HTML I am able to eliminate some overhead from an HTTP perspective, as well as from the web browser rendering of this content. I hope you found this to be useful.
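To illustrate that last point, here is a sketch of dropping a pre-rendered fragment into a minimal page that carries no script tags at all. The buildPage helper and the styles/default.css path are assumptions for the example; point the link at wherever you have copied one of the highlight.js style sheets.

```javascript
// Hypothetical helper: place a pre-rendered fragment inside a minimal page.
// `cssHref` is an assumed path to one of the highlight.js style sheets.
function buildPage(title, fragment, cssHref) {
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head>',
    '  <meta charset="utf-8">',
    '  <title>' + title + '</title>',
    '  <link rel="stylesheet" href="' + cssHref + '">',
    '</head>',
    '<body>',
    fragment,
    '</body>',
    '</html>'
  ].join('\n');
}

const fragment = '<pre class="hljs"><code class="javascript">'
  + '<span class="hljs-keyword">var</span> x;</code></pre>';

console.log(buildPage('Example', fragment, 'styles/default.css'));
```

The browser only has to fetch the HTML and one style sheet; no highlighting work happens on the client.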

Cheers!