# Chroma - A general purpose syntax highlighter in pure Go

[Golang Documentation](https://godoc.org/github.com/alecthomas/chroma) | [CI](https://github.com/alecthomas/chroma/actions/workflows/ci.yml) | [Slack chat](https://invite.slack.golangbridge.org/)

Chroma takes source code and other structured text and converts it into syntax
highlighted HTML, ANSI-coloured text, etc.

Chroma is based heavily on [Pygments](http://pygments.org/), and includes
translators for Pygments lexers and styles.

## Table of Contents

<!-- TOC -->

1. [Supported languages](#supported-languages)
2. [Try it](#try-it)
3. [Using the library](#using-the-library)
   1. [Quick start](#quick-start)
   2. [Identifying the language](#identifying-the-language)
   3. [Formatting the output](#formatting-the-output)
   4. [The HTML formatter](#the-html-formatter)
4. [More detail](#more-detail)
   1. [Lexers](#lexers)
   2. [Formatters](#formatters)
   3. [Styles](#styles)
5. [Command-line interface](#command-line-interface)
6. [Testing lexers](#testing-lexers)
7. [What's missing compared to Pygments?](#whats-missing-compared-to-pygments)

<!-- /TOC -->

## Supported languages

| Prefix | Language |
| :----: | -------- |
| A | ABAP, ABNF, ActionScript, ActionScript 3, Ada, Agda, AL, Alloy, Angular2, ANTLR, ApacheConf, APL, AppleScript, ArangoDB AQL, Arduino, ArmAsm, AutoHotkey, AutoIt, Awk |
| B | Ballerina, Bash, Bash Session, Batchfile, BibTeX, Bicep, BlitzBasic, BNF, BQN, Brainfuck |
| C | C, C#, C++, Caddyfile, Caddyfile Directives, Cap'n Proto, Cassandra CQL, Ceylon, CFEngine3, cfstatement, ChaiScript, Chapel, Cheetah, Clojure, CMake, COBOL, CoffeeScript, Common Lisp, Coq, Crystal, CSS, Cython |
| D | D, Dart, Dax, Desktop Entry, Diff, Django/Jinja, dns, Docker, DTD, Dylan |
| E | EBNF, Elixir, Elm, EmacsLisp, Erlang |
| F | Factor, Fennel, Fish, Forth, Fortran, FortranFixed, FSharp |
| G | GAS, GDScript, Genshi, Genshi HTML, Genshi Text, Gherkin, Gleam, GLSL, Gnuplot, Go, Go HTML Template, Go Text Template, GraphQL, Groff, Groovy |
| H | Handlebars, Hare, Haskell, Haxe, HCL, Hexdump, HLB, HLSL, HolyC, HTML, HTTP, Hy |
| I | Idris, Igor, INI, Io, ISCdhcpd |
| J | J, Java, JavaScript, JSON, Jsonnet, Julia, Jungle |
| K | Kotlin |
| L | Lighttpd configuration file, LLVM, Lua |
| M | Makefile, Mako, markdown, Mason, Materialize SQL dialect, Mathematica, Matlab, MCFunction, Meson, Metal, MiniZinc, MLIR, Modula-2, MonkeyC, MorrowindScript, Myghty, MySQL |
| N | NASM, Natural, Newspeak, Nginx configuration file, Nim, Nix, NSIS |
| O | Objective-C, OCaml, Octave, Odin, OnesEnterprise, OpenEdge ABL, OpenSCAD, Org Mode |
| P | PacmanConf, Perl, PHP, PHTML, Pig, PkgConfig, PL/pgSQL, plaintext, Plutus Core, Pony, PostgreSQL SQL dialect, PostScript, POVRay, PowerQuery, PowerShell, Prolog, PromQL, Promela, properties, Protocol Buffer, PRQL, PSL, Puppet, Python, Python 2 |
| Q | QBasic, QML |
| R | R, Racket, Ragel, Raku, react, ReasonML, reg, Rego, reStructuredText, Rexx, RPMSpec, Ruby, Rust |
| S | SAS, Sass, Scala, Scheme, Scilab, SCSS, Sed, Sieve, Smali, Smalltalk, Smarty, SNBT, Snobol, Solidity, SourcePawn, SPARQL, SQL, SquidConf, Standard ML, stas, Stylus, Svelte, Swift, SYSTEMD, systemverilog |
| T | TableGen, Tal, TASM, Tcl, Tcsh, Termcap, Terminfo, Terraform, TeX, Thrift, TOML, TradingView, Transact-SQL, Turing, Turtle, Twig, TypeScript, TypoScript, TypoScriptCssData, TypoScriptHtmlData, Typst |
| V | V, V shell, Vala, VB.net, verilog, VHDL, VHS, VimL, vue |
| W | WDTE, WebGPU Shading Language, Whiley |
| X | XML, Xorg |
| Y | YAML, YANG |
| Z | Z80 Assembly, Zed, Zig |

_I will attempt to keep this section up to date, but an authoritative list can be
displayed with `chroma --list`._

## Try it

Try out various languages and styles on the [Chroma Playground](https://swapoff.org/chroma/playground/).

## Using the library

This is version 2 of Chroma; use the import path:

```go
import "github.com/alecthomas/chroma/v2"
```

Chroma, like Pygments, has the concepts of
[lexers](https://github.com/alecthomas/chroma/tree/master/lexers),
[formatters](https://github.com/alecthomas/chroma/tree/master/formatters) and
[styles](https://github.com/alecthomas/chroma/tree/master/styles).

Lexers convert source text into a stream of tokens, styles specify how token
types are mapped to colours, and formatters convert tokens and styles into
formatted output.

A package exists for each of these, containing a global `Registry` variable
with all of the registered implementations. There are also helper functions
for using the registry in each package, such as looking up lexers by name or
matching filenames.

In all cases, if a lexer, formatter or style cannot be determined, `nil` will
be returned. In this situation you may want to default to the `Fallback`
value in each respective package, which provides sane defaults.

### Quick start

The `quick` package provides a convenience function that highlights source
text in a single call:

```go
// Arguments: writer, source text, lexer name, formatter name, style name.
err := quick.Highlight(os.Stdout, someSourceCode, "go", "html", "monokai")
```

### Identifying the language

To highlight code, you'll first have to identify what language the code is
written in. There are three primary ways to do that:

1. Detect the language from its filename.

   ```go
   lexer := lexers.Match("foo.go")
   ```

2. Explicitly specify the language by its Chroma syntax ID (a full list is available from `lexers.Names()`).

   ```go
   lexer := lexers.Get("go")
   ```

3. Detect the language from its content.

   ```go
   lexer := lexers.Analyse("package main\n\nfunc main()\n{\n}\n")
   ```

In all cases, `nil` will be returned if the language cannot be identified.

```go
if lexer == nil {
	lexer = lexers.Fallback
}
```

Note that some lexers can be extremely chatty, emitting many small tokens. To
mitigate this, wrap the lexer in the coalescing lexer, which merges runs of
identical token types into a single token:

```go
lexer = chroma.Coalesce(lexer)
```
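To make the effect of coalescing concrete, here is a minimal standard-library sketch of the idea. It is not Chroma's implementation: the `Token` struct and the `coalesce` function are hypothetical stand-ins that only illustrate how adjacent tokens of the same type collapse into one.

```go
package main

import "fmt"

// Token is a hypothetical stand-in for a lexer token: a type name plus its text.
type Token struct {
	Type  string
	Value string
}

// coalesce merges each run of adjacent tokens that share a Type into a
// single token whose Value is the concatenation of the run.
func coalesce(tokens []Token) []Token {
	var out []Token
	for _, tok := range tokens {
		if n := len(out); n > 0 && out[n-1].Type == tok.Type {
			out[n-1].Value += tok.Value
			continue
		}
		out = append(out, tok)
	}
	return out
}

func main() {
	// A "chatty" stream: the two leading Text tokens could be one.
	chatty := []Token{
		{"Text", " "}, {"Text", " "}, {"Keyword", "func"}, {"Text", " "}, {"Name", "main"},
	}
	for _, tok := range coalesce(chatty) {
		fmt.Printf("%s %q\n", tok.Type, tok.Value)
	}
}
```

Fewer tokens generally means less markup for a formatter to emit, which is the main reason to coalesce before formatting.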

### Formatting the output

Once a language is identified, you will need to pick a formatter and a style (theme).

```go
style := styles.Get("swapoff")
if style == nil {
	style = styles.Fallback
}
formatter := formatters.Get("html")
if formatter == nil {
	formatter = formatters.Fallback
}
```

Then obtain an iterator over the tokens:

```go
// r is an io.Reader supplying the source text.
contents, err := io.ReadAll(r)
// Check err here before continuing (omitted for brevity).
iterator, err := lexer.Tokenise(nil, string(contents))
```

And finally, format the tokens from the iterator:

```go
// w is the io.Writer that receives the highlighted output.
err = formatter.Format(w, style, iterator)
```

### The HTML formatter

By default the `html` registered formatter generates standalone HTML with
embedded CSS. More flexibility is available through the `formatters/html` package.

The output generated by the formatter can be customised with the
following constructor options:

- `Standalone()` - Generate standalone HTML with embedded CSS.
- `WithClasses()` - Use classes rather than inlined style attributes.
- `ClassPrefix(prefix)` - Prefix each generated CSS class.
- `TabWidth(width)` - Set the rendered tab width, in characters.
- `WithLineNumbers()` - Render line numbers (style with `LineNumbers`).
- `WithLinkableLineNumbers()` - Make each line number a link to itself.
- `HighlightLines(ranges)` - Highlight lines in these ranges (style with `LineHighlight`).
- `LineNumbersInTable()` - Use a table for formatting line numbers and code, rather than spans.

If `WithClasses()` is used, the corresponding CSS can be obtained from the formatter with:

```go
formatter := html.New(html.WithClasses(true))
err := formatter.WriteCSS(w, style)
```

## More detail

### Lexers

See the [Pygments documentation](http://pygments.org/docs/lexerdevelopment/)
for details on implementing lexers. Most concepts apply directly to Chroma,
but see existing lexer implementations for real examples.

In many cases lexers can be automatically converted directly from Pygments by
using the included Python 3 script `pygments2chroma_xml.py`. I use something like
the following:

```sh
python3 _tools/pygments2chroma_xml.py \
  pygments.lexers.jvm.KotlinLexer \
  > lexers/embedded/kotlin.xml
```

See notes in [pygments-lexers.txt](https://github.com/alecthomas/chroma/blob/master/pygments-lexers.txt)
for a list of lexers, and notes on some of the issues importing them.

### Formatters

Chroma supports HTML output, as well as terminal output in 8 colour, 256 colour, and true-colour.

A `noop` formatter is included that outputs the token text only, and a `tokens`
formatter outputs raw tokens. The latter is useful for debugging lexers.

### Styles

Chroma styles are defined in XML. The style entries use the
[same syntax](http://pygments.org/docs/styles/) as Pygments.

All Pygments styles have been converted to Chroma using the `_tools/style.py`
script.

When you work with one of [Chroma's styles](https://github.com/alecthomas/chroma/tree/master/styles),
know that the `Background` token type provides the default style for tokens. It does so
by defining a foreground color and background color.

For example, this gives each token name not defined in the style a default color
of `#f8f8f2` and uses `#000000` for the highlighted code block's background:

```xml
<entry type="Background" style="#f8f8f2 bg:#000000"/>
```

Also, token types in a style file are hierarchical. For instance, when `CommentSpecial` is not defined, Chroma uses the token style from `Comment`. So when several comment tokens use the same color, you'll only need to define `Comment` and override the one that has a different color.
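The lookup order described above can be sketched with plain maps. This is a simplified model, not Chroma's implementation: the `parent` table, the `resolve` function and the colour values other than the `Background` entry are hypothetical, and the sketch replaces whole entries rather than merging individual attributes.

```go
package main

import "fmt"

// parent maps a hypothetical token type name to its broader category.
// Types without an entry here are treated as top-level.
var parent = map[string]string{
	"CommentSpecial": "Comment",
	"CommentPreproc": "Comment",
	"KeywordType":    "Keyword",
}

// resolve returns the first style entry found while walking from tokenType
// up through its parents, falling back to the Background entry.
func resolve(style map[string]string, tokenType string) string {
	for t := tokenType; t != ""; t = parent[t] {
		if entry, ok := style[t]; ok {
			return entry
		}
	}
	return style["Background"]
}

func main() {
	// Only Comment and one override are defined; everything else inherits.
	style := map[string]string{
		"Background":     "#f8f8f2 bg:#000000",
		"Comment":        "#75715e",
		"CommentPreproc": "#f92672",
	}
	for _, t := range []string{"CommentSpecial", "CommentPreproc", "KeywordType"} {
		fmt.Printf("%s -> %s\n", t, resolve(style, t))
	}
}
```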

For a quick overview of the available styles and how they look, check out the [Chroma Style Gallery](https://xyproto.github.io/splash/docs/).

## Command-line interface

A command-line interface to Chroma is included.

Binaries are available to install from [the releases page](https://github.com/alecthomas/chroma/releases).

The CLI can be used as a preprocessor to colorise the output of `less(1)`;
see the documentation for the `LESSOPEN` environment variable.

The `--fail` flag suppresses output and returns exit status 1 when chroma
cannot resolve a specific lexer for the given file, which makes it easy to
fall back to some other preprocessor. For example:

```shell
export LESSOPEN='| p() { chroma --fail "$1" || cat "$1"; }; p "%s"'
```

Replace `cat` with your favourite fallback preprocessor.

When invoked as `.lessfilter`, the `--fail` flag is automatically turned
on under the hood for easy integration with [lesspipe shipping with
Debian and derivatives](https://manpages.debian.org/lesspipe#USER_DEFINED_FILTERS);
for that setup the `chroma` executable can simply be symlinked to `~/.lessfilter`.

## Testing lexers

If you edit a lexer and want to try it out, open a shell in `cmd/chromad` and run:

```shell
go run . --csrf-key=securekey
```

A link will be printed. Open it in your browser to test your local changes on the Playground.

To run the lexer tests, open a shell in the root directory and run:

```shell
go test ./lexers
```

When updating or adding a lexer, please add tests. See [lexers/README.md](lexers/README.md) for more.

## What's missing compared to Pygments?

- Quite a few lexers, for various reasons (pull-requests welcome):
  - Pygments lexers for complex languages often include custom code to
    handle certain aspects, such as Raku's ability to nest code inside
    regular expressions. These require time and effort to convert.
  - I mostly only converted languages I had heard of, to reduce the porting cost.
- Some more esoteric features of Pygments are omitted for simplicity.
- Though the Chroma API supports content detection, very few languages support it.
  I have plans to implement a statistical analyser at some point, but not enough time.