Previous: Multiline Font Lock Constructs, Up: Font Lock Mode [Contents][Index]
Besides simple syntactic font lock and regexp-based font lock, Emacs also provides complete syntactic font lock with the help of a parser. Currently, Emacs uses the tree-sitter library (see Parsing Program Source) for this purpose.
Parser-based font lock and other font lock mechanisms are not mutually exclusive. By default, if enabled, parser-based font lock runs first, replacing syntactic font lock, and regexp-based font lock runs after it.
Although parser-based font lock doesn’t share the same customization variables with regexp-based font lock, it uses similar customization schemes. The tree-sitter counterpart of font-lock-keywords is treesit-font-lock-settings.
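In general, tree-sitter fontification works as follows:
- A Lisp program (usually part of a major mode) provides a query consisting of patterns, each associated with a capture name.
- The tree-sitter library finds the nodes in the parse tree that match these patterns, tags them with the corresponding capture names, and returns them to the Lisp program.
- The Lisp program uses the capture names as faces to highlight the nodes in the buffer. For instance, a node tagged font-lock-keyword would be highlighted in font-lock-keyword face.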
For more information about queries, patterns, and capture names, see Pattern Matching Tree-sitter Nodes.
To set up tree-sitter fontification, a major mode should first set treesit-font-lock-settings with the output of treesit-font-lock-rules, then call treesit-major-mode-setup.
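The setup sequence described above can be sketched as follows. This is a minimal illustration, not code from any real mode: the names my-json-ts-mode and my-json-ts--font-lock-rules are hypothetical, and the sketch assumes that a tree-sitter grammar for json is installed and that its node types include string, true, false, and null.

```elisp
(require 'treesit)

(defun my-json-ts--font-lock-rules ()
  "Return font-lock settings for the hypothetical `my-json-ts-mode'."
  (treesit-font-lock-rules
   :language 'json
   :feature 'string
   '((string) @font-lock-string-face)
   :language 'json
   :feature 'constant
   '([(true) (false) (null)] @font-lock-constant-face)))

(define-derived-mode my-json-ts-mode prog-mode "MyJSON"
  "Hypothetical major mode illustrating tree-sitter font lock setup."
  ;; Only set things up when the grammar is actually usable.
  (when (treesit-ready-p 'json)
    (treesit-parser-create 'json)
    ;; First, set the settings from the output of `treesit-font-lock-rules'.
    (setq-local treesit-font-lock-settings (my-json-ts--font-lock-rules))
    ;; Group the features used in the rules into decoration levels.
    (setq-local treesit-font-lock-feature-list '((string) (constant)))
    ;; Then let Emacs enable parser-based font lock.
    (treesit-major-mode-setup)))
```

Building the rules inside a function, rather than at load time, keeps the file loadable in an Emacs where the grammar is missing; whether that matters depends on how the mode is distributed.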
The function treesit-font-lock-rules is used to set treesit-font-lock-settings. It takes care of compiling queries and other post-processing, and outputs a value that treesit-font-lock-settings accepts. Here’s an example:
    (treesit-font-lock-rules
     :language 'javascript
     :feature 'constant
     :override t
     '((true) @font-lock-constant-face
       (false) @font-lock-constant-face)
     :language 'html
     :feature 'script
     "(script_element) @font-lock-builtin-face")
This function takes a series of query-specs, where each query-spec is a query preceded by one or more :keyword/value pairs. Each query is a tree-sitter query in either the string, s-expression or compiled form.
For each query, the :keyword/value pairs that precede it add meta information to it. The :language keyword declares query’s language. The :feature keyword sets the feature name of query. Users can control which features are enabled with treesit-font-lock-level and treesit-font-lock-feature-list (described below). These two keywords are mandatory.
Other keywords are optional:
| Keyword | Value | Description |
|---|---|---|
| :override | nil | If the region already has a face, discard the new face |
| | t | Always apply the new face |
| | append | Append the new face to existing ones |
| | prepend | Prepend the new face to existing ones |
| | keep | Fill in regions without an existing face |
Lisp programs mark patterns in query with capture names (names that start with @), and tree-sitter will return matched nodes tagged with those same capture names. For the purpose of fontification, capture names in query should be face names like font-lock-keyword-face. The captured node will be fontified with that face.
Capture names can also be function names, in which case the function is called with 4 arguments: node, override, start, and end, where node is the node itself, override is the override property of the rule which captured this node, and start and end limit the region in which this function should fontify. (If this function wants to respect the override argument, it can use treesit-fontify-with-override.)
Beyond the 4 arguments presented, this function should accept more arguments as optional arguments for future extensibility.
If a capture name is both a face and a function, the face takes priority. If a capture name is neither a face nor a function, it is ignored.
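A capture function of the kind described above might look like the following sketch. The name my-fontify-boolean is hypothetical, and the argument order passed to treesit-fontify-with-override (start, end, face, override, then the optional region bounds) is stated here as it is understood for Emacs 29; check its docstring in your Emacs version before relying on it.

```elisp
(require 'treesit)

(defun my-fontify-boolean (node override start end &rest _)
  "Fontify NODE with `font-lock-constant-face', honoring OVERRIDE.
START and END bound the region being fontified.  Any further
arguments are accepted and ignored, for future extensibility."
  (treesit-fontify-with-override
   (treesit-node-start node) (treesit-node-end node)
   'font-lock-constant-face
   override start end))

;; In a rule, the function name is then used as the capture name:
;;   :language 'javascript
;;   :feature 'constant
;;   '((true) @my-fontify-boolean
;;     (false) @my-fontify-boolean)
```

Because a face takes priority over a function of the same name, the function should not share its name with an existing face.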
The variable treesit-font-lock-feature-list is a list of lists of feature symbols. Each element of the list is a list that represents a decoration level; treesit-font-lock-level controls which levels are activated.
Each element of the list is a list of the form (feature …), where each feature corresponds to the :feature value of a query defined in treesit-font-lock-rules. Removing a feature symbol from this list disables the corresponding query during font-lock.
Common feature names, for many programming languages, include definition, type, assignment, builtin, constant, keyword, string-interpolation, comment, doc, string, operator, preprocessor, escape-sequence, and key. Major modes are free to subdivide or extend these common features.
Some of these features warrant some explanation: definition highlights whatever is being defined, e.g., the function name in a function definition, the struct name in a struct definition, the variable name in a variable definition; assignment highlights whatever is being assigned to, e.g., the variable or field in an assignment statement; key highlights keys in key-value pairs, e.g., keys in a JSON object, or a Python dictionary; doc highlights docstrings or doc-comments.
For example, the value of this variable could be:
    ((comment string doc) ; level 1
     (function-name keyword type builtin constant) ; level 2
     (variable-name string-interpolation key)) ; level 3
Major modes should set this variable before calling treesit-major-mode-setup.
For this variable to take effect, a Lisp program should call treesit-font-lock-recompute-features (which resets treesit-font-lock-settings accordingly), or treesit-major-mode-setup (which calls treesit-font-lock-recompute-features).
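As a rough sketch of changing the feature list after a mode has already been set up, the helper below replaces the list and then recomputes. The name my-use-minimal-features is hypothetical, the feature names are only examples that the current mode's rules may or may not define, and the final font-lock-flush call is an assumption about what is needed to make the change visible immediately.

```elisp
(require 'treesit)

(defun my-use-minimal-features ()
  "Restrict tree-sitter font lock in the current buffer to a few features.
Hypothetical helper; does nothing in buffers without tree-sitter
font-lock settings."
  (when treesit-font-lock-settings
    ;; Two decoration levels: comments and strings, then keywords.
    (setq-local treesit-font-lock-feature-list
                '((comment string)
                  (keyword)))
    ;; Make the new list take effect.
    (treesit-font-lock-recompute-features)
    ;; Refontify the buffer so the result is visible right away.
    (font-lock-flush)))
```

A helper like this would typically be run from the hook of a tree-sitter major mode, after the mode's own setup has finished.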
The variable treesit-font-lock-settings is a list of settings for tree-sitter based font lock. The exact format of each setting is considered internal. One should always use treesit-font-lock-rules to set this variable.
Multi-language major modes should provide range functions in treesit-range-functions, and Emacs will set the ranges accordingly before fontifying a region (see Parsing Text in Multiple Languages).