Indexing a C++ repository with LSIF
For C and C++ code, we recommend following the steps in the lsif-clang repository, which describe how to use a prebuilt binary bundle or build
lsif-clang from source. Both options are easier to set up than Docker. The Docker-based approach is described below as an alternative.
Local dev setup
Copy the files in the lsif-docker directory of sourcegraph/tesseract to a local lsif-docker directory in your C++ repository (the one you wish to index).
Replace the contents of:
- lsif-docker/install_build_deps.sh with commands that install any requisite build dependencies of the project. These should be dependencies that do not vary from revision to revision.
- lsif-docker/checkout.sh with commands that clone your repository to the /source directory in the Docker container’s filesystem.
- lsif-docker/gen_compile_commands.sh with commands that generate a compilation database (compile_commands.json).
If you use autotools to build your project (./autogen.sh && ./configure && make), you can probably keep the existing contents.
If you build your project using CMake, you can use
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
If you use Bazel, you can use bazel-compilation-database:
git clone --depth=10 https://github.com/grailbio/bazel-compilation-database.git /bazel-compilation-database
/bazel-compilation-database/generate.sh
If you use another build system or if any of the above steps break, consult this very helpful guide to generating compilation databases for various build systems. It may be helpful to docker build your container and docker run -it $IMAGE to get an interactive shell into the container, so you can ensure the build environment is correct. We recommend getting the project to build normally first (e.g., emit a binary) and then following the aforementioned guide to modify the regular build steps to emit a compilation database.
- Most often, the compile_commands.json file will be emitted in the root directory of the repository. If this is not the case, you’ll also need to modify the relevant lsif-docker/*.sh script to cd into the directory containing it and then run lsif-clang --project-root=/source compile_commands.json. If you’re unsure of where compile_commands.json will be emitted, just continue to the next step for now.
Run docker build lsif-docker to build the Docker image.
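If you are unsure where the compilation database ends up, a small helper run inside the container can locate it. The following is a minimal sketch and not part of the lsif-docker scripts: the function name find_compile_db_dir is hypothetical, and it assumes the repository is checked out at /source as described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of lsif-docker): print the directory that
# contains compile_commands.json.
# Usage: find_compile_db_dir [ROOT]   (ROOT defaults to /source)
find_compile_db_dir() {
    root="${1:-/source}"
    # Take the first match; a build may emit more than one database.
    db="$(find "$root" -type f -name compile_commands.json 2>/dev/null | head -n 1)"
    if [ -z "$db" ]; then
        echo "no compile_commands.json found under $root" >&2
        return 1
    fi
    dirname "$db"
}
```

Inside the container, running cd "$(find_compile_db_dir /source)" before the lsif-clang command would then start the indexer from the directory that holds the database.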
Generate a Sourcegraph access token from your Sourcegraph instance (Settings > Access tokens). Give it
Run the following command to generate and upload LSIF data to Sourcegraph:
docker run -e SRC_ACCESS_TOKEN=$ACCESS_TOKEN -e SRC_ENDPOINT=https://sourcegraph.example.com -e PROJECT_REV=HEAD $IMAGE_ID
with the following substitutions:
SRC_ACCESS_TOKEN=: the Sourcegraph access token you just created
SRC_ENDPOINT=: the URL to your Sourcegraph instance
PROJECT_REV=: the revision of your repository to be indexed
$IMAGE_ID: the ID of the Docker image you just built
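To review the substitutions before anything is executed, the command can be assembled by a small dry-run helper. This is only a sketch under stated assumptions: lsif_docker_run_cmd is a hypothetical name, and instead of inlining the token it relies on Docker's behavior of passing a variable through from the calling environment when -e is given a name without a value.

```shell
#!/bin/sh
# Hypothetical helper: print the docker run command for a given image and
# revision so it can be reviewed before it is executed.
# Usage: lsif_docker_run_cmd IMAGE_ID [PROJECT_REV]
lsif_docker_run_cmd() {
    image_id="$1"
    project_rev="${2:-HEAD}"
    if [ -z "$image_id" ]; then
        echo "usage: lsif_docker_run_cmd IMAGE_ID [PROJECT_REV]" >&2
        return 1
    fi
    # SRC_ACCESS_TOKEN and SRC_ENDPOINT are passed through from the calling
    # environment (-e NAME with no value), so the token is never printed.
    printf 'docker run -e SRC_ACCESS_TOKEN -e SRC_ENDPOINT -e PROJECT_REV=%s %s\n' \
        "$project_rev" "$image_id"
}
```

With SRC_ACCESS_TOKEN and SRC_ENDPOINT exported in your shell, lsif_docker_run_cmd "$IMAGE_ID" HEAD prints the command to run. Passing the variables by name keeps the token out of the printed command and shell history.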
If successful, the upload should be visible on the repository settings page.
For reference, some examples of Dockerized C++ LSIF generation are:
Incorporating LSIF generation and uploading in CI will allow precise code navigation to remain up-to-date without any human intervention.
If you created a Dockerfile that encapsulates LSIF generation, you can use the same one in your CI.
If the docker run command fails, you likely have an error in one of the lsif-docker/*.sh files. The general rule is that if you can get your project to build normally (i.e., generate an executable), you can get the LSIF indexer to generate LSIF. We therefore recommend the following approach if things don’t work on the first try:
- Build the Docker image:
docker build lsif-docker
- Run the container with an interactive shell:
docker run -it $IMAGE_ID bash
- In the container shell, cd /source and figure out what steps are needed to build the project.
- Once the build successfully completes, figure out which steps are needed to generate the compile_commands.json file. We have found this guide to be a useful resource.
- Once you’ve successfully generated compile_commands.json, cd into the directory containing it and run lsif-clang --project-root=/source compile_commands.json. This should generate a dump.lsif file in the same directory. This dump.lsif should contain JSON describing all the symbols and references in the codebase (it should be rather large).
- Once the dump.lsif file is generated correctly, set the environment variables SRC_ACCESS_TOKEN and SRC_ENDPOINT to the appropriate values in your shell. Then run src code-intel upload from the directory containing the dump.lsif file. This should successfully upload the LSIF dump to Sourcegraph.
- After you’ve successfully done all of the above in the container’s interactive shell, incorporate these steps into the lsif-docker/*.sh files. Then rebuild the Docker image and try running the container again.
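The manual checks that precede the upload step can be sketched as a small shell function to run in the container's interactive shell. It is an illustration only: preflight_upload is a hypothetical name, and the only assumptions are the ones stated in the steps above (a non-empty dump.lsif and the SRC_ACCESS_TOKEN and SRC_ENDPOINT variables).

```shell
#!/bin/sh
# Hypothetical pre-upload check: verify that the dump exists and is non-empty
# and that the upload environment variables are set.
# Usage: preflight_upload [DUMP_FILE]   (DUMP_FILE defaults to dump.lsif)
preflight_upload() {
    dump="${1:-dump.lsif}"
    status=0
    if [ ! -s "$dump" ]; then
        echo "missing or empty dump file: $dump" >&2
        status=1
    fi
    if [ -z "${SRC_ACCESS_TOKEN:-}" ]; then
        echo "SRC_ACCESS_TOKEN is not set" >&2
        status=1
    fi
    if [ -z "${SRC_ENDPOINT:-}" ]; then
        echo "SRC_ENDPOINT is not set" >&2
        status=1
    fi
    # Report every problem found, then fail if there was at least one.
    return "$status"
}
```

Running preflight_upload && src code-intel upload from the directory containing the dump would then attempt the upload only when all three checks pass.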