Emacs 29 tree-sitter support
If you follow Emacs development, you’ve probably heard that the upcoming 29 release will support tree-sitter, an incremental code parser. In short, it provides a syntax tree for the source file that you’re currently viewing, which is especially useful for syntax highlighting and code navigation. Because it builds a syntax tree for the language, we get proper highlighting that doesn’t depend on regular expressions, which is Emacs' highlighting strategy for most languages. As such, we’ll get faster highlighting and avoid re-fontification when simple characters, like commas or equal signs, are missing.
Code navigation will also improve, since it’ll now be possible to be more specific while editing: getting the variables inside a specific class, acting on all functions, acting on inner blocks, etc. There is already some work in this direction, in the form of tree-edit. A nice user on reddit also put together a list of structural editing modes for Emacs.
The new support for tree-sitter can be found by building from the
branch or building from the main branch. It has been dubbed
treesit.el and is
already documented; you’ll have to type
C-h i to open the info pages. Then,
navigate to “Elisp” and then “Parsing Program Source”, which is the information
page detailing the use of this new package.
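If you'd rather skip the menu navigation, the same page can be opened by evaluating a single form, for example with M-: or in the scratch buffer. This is a small sketch that assumes the Elisp manual shipped with your Emacs 29 build is installed:

```elisp
;; Jump straight to the tree-sitter chapter of the Elisp manual.
;; Assumes the Emacs 29 info files are installed alongside Emacs.
(info "(elisp) Parsing Program Source")
```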
Right now, some modes like python-mode or c-mode have a tree-sitter equivalent, python-ts-mode and c-ts-mode respectively. There are other modes as well; you just need to look for that extra “ts” in the mode’s name. If you try to use those modes, though, you’ll get an error explaining that the corresponding library is not available. That’s because, on top of installing tree-sitter, you need to install each language’s parsing library. There are already some notes on how to install those languages, as well as a script that automates it.
I wanted a simple Makefile alternative though, which is explained below.
Building the tree-sitter parsers manually
The following Makefile rests on some assumptions:
You already have tree-sitter installed and built Emacs after installing tree-sitter and with the
You have a Unix-like system, though this should be easy to change for a DOS-like system.
You’re storing the different parsers in the same directory.
You’re putting the shared libraries in the
~/.emacs.d/tree-sitter/ folder. If not, you need to point Emacs to the proper directory with the
treesit-extra-load-path variable in your
For example, I have a directory called
pkgs/langs/ where I put the packages I
manage manually and I added a languages sub-folder where I clone all the
repositories, which you can find here, on the project’s page. If I wanted to
download the Python parser, I would do:
cd ~/pkgs/langs/
git clone git@github.com:tree-sitter/tree-sitter-python.git
Now, I have the following Makefile in the
pkgs/langs/ directory which builds
all the languages and stores the parsers (which are shared libraries) in the
# or clang
CC := gcc
SUBDIRS := $(wildcard tree-sitter-*)
# dylib on macOS and dll on Windows
EXT := so
SRC_DIR := src
CPPFLAGS := -shared -fPIC -g -O2
# HOME is defined on Unix-like systems
OUTPUT_DIR := $(HOME)/.emacs.d/tree-sitter
# tree-sitter-python -> libtree-sitter-python.so
EXECS := $(patsubst %,$(OUTPUT_DIR)/lib%.$(EXT),$(SUBDIRS))
# For each subdir, find the files in the source folder which are parser.c* or scanner.c*
FILES := $(foreach dir,$(SUBDIRS),$(wildcard $(dir)/$(SRC_DIR)/parser.c* $(dir)/$(SRC_DIR)/scanner.c*))

all: $(SUBDIRS)

# Compiler + flags + include source + filter only files from current dir and output with correct extension.
# The filter pattern ends with a slash so that tree-sitter-c does not also pick up tree-sitter-cpp files.
$(SUBDIRS):
	mkdir -p $(OUTPUT_DIR)
	$(CC) $(CPPFLAGS) -I$@/$(SRC_DIR) $(filter $@/%,$(FILES)) -o $(OUTPUT_DIR)/lib$@.$(EXT)

# use the del command on DOS systems
clean:
	rm -f $(EXECS)

# avoids considering all, clean or the subdirs as files and override targets
.PHONY: all clean $(SUBDIRS)
Now, you can simply run:
make -j12 # or whatever number of cores you have
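Once the build finishes, a quick way to confirm that Emacs can see the new libraries is to evaluate something like the following in the scratch buffer. This is only a sketch: it assumes an Emacs 29 build with tree-sitter support and the Python grammar from the example above, and the ~/pkgs/langs/build directory is a purely hypothetical location that you only need if you changed OUTPUT_DIR.

```elisp
(require 'treesit)

;; Only needed when the libraries are NOT in ~/.emacs.d/tree-sitter/.
;; The directory below is a hypothetical example; adjust it to your setup.
(setq treesit-extra-load-path
      (list (expand-file-name "~/pkgs/langs/build")))

;; Check that tree-sitter is compiled in and that the Python grammar loads.
(if (and (treesit-available-p)
         (treesit-language-available-p 'python))
    (message "Python grammar found, python-ts-mode should work")
  (message "Python grammar not found, check the output directory"))
```

If the second message shows up, the usual suspects are a library name that doesn't follow the libtree-sitter-LANG pattern or a directory that Emacs isn't searching.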
Native Emacs solution
I learned that as of December 30th 2022, there is a native solution to
install the grammars. One only needs to specify
with the languages and their corresponding URL to the parser repository. This
provides the required shared library in the
.emacs.d/tree-sitter directory. I
also took the liberty of adding a function that installs (or updates) all the
currently available parsers.
That way, you can either use the interactive
treesit-install-language-grammar function to install or update one specific
language from the list, or my own function to update them all.
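As an illustration, such a setup can be sketched as follows. This is not necessarily the exact configuration described above: it assumes the variable in question is treesit-language-source-alist, which in Emacs 29 maps a language symbol to its grammar repository. The two entries and the helper name my/treesit-install-all-grammars are illustrative only.

```elisp
(require 'treesit)

;; Assumed example entries: one (LANG URL) pair per grammar you want.
(setq treesit-language-source-alist
      '((python "https://github.com/tree-sitter/tree-sitter-python")
        (c      "https://github.com/tree-sitter/tree-sitter-c")))

;; Hypothetical helper: install or update every grammar listed above.
(defun my/treesit-install-all-grammars ()
  "Install or update all grammars in `treesit-language-source-alist'."
  (interactive)
  (dolist (entry treesit-language-source-alist)
    ;; Each entry starts with the language symbol, e.g. `python'.
    (treesit-install-language-grammar (car entry))))
```

Running M-x my/treesit-install-all-grammars then clones and compiles each grammar in turn, so git and a C compiler need to be available on your PATH.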