Precise code intelligence relies on LSIF (Language Server Index Format) data to deliver precomputed code intelligence. It is fast and highly accurate, but the LSIF data must be generated and uploaded to your Sourcegraph instance periodically. Precise code intelligence is an opt-in feature: repositories for which you have not uploaded LSIF data will continue to use basic search-based code intelligence.
First navigate to your global settings on your Sourcegraph instance and enable LSIF:
Then select a language-specific guide from the list below to generate and upload LSIF files for your repository. Sourcegraph instances use the LSIF data to power code intelligence requests (such as hovers, definitions, and references). If you don’t see a guide for the language you need below, follow our general quickstart guide to set up precise code intelligence.
After completing the initial setup, follow the continuous integration guide to automate indexing of code changes as part of your CI/CD practice.
Cross-repository code intelligence will only be powered by LSIF when both repositories have LSIF data. When the current file has LSIF data and the other repository doesn’t, the missing precise results will be supplemented with imprecise search-based code intelligence.
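The cross-repository rule above can be sketched as a small decision function. This is only an illustration of the behavior described here, not Sourcegraph's implementation; the names `Source` and `cross_repo_source` are hypothetical.

```python
from enum import Enum


class Source(Enum):
    """Where a code intelligence result comes from (hypothetical names)."""
    PRECISE = "precise (LSIF)"
    SEARCH_BASED = "search-based"


def cross_repo_source(current_has_lsif: bool, other_has_lsif: bool) -> Source:
    """Pick the source for results that live in the other repository.

    Precise results require LSIF data on both sides; in every other
    case the result is supplied by search-based code intelligence.
    """
    if current_has_lsif and other_has_lsif:
        return Source.PRECISE
    return Source.SEARCH_BASED


if __name__ == "__main__":
    # Current file is indexed, but the other repository has no LSIF data.
    print(cross_repo_source(True, False).value)
```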
If LSIF data is not found for a particular file in a repository, Sourcegraph will fall back to basic code intelligence. You may occasionally see results from basic code intelligence even when you have uploaded LSIF data. Such results are indicated with a tooltip. This can happen in the following scenarios:
The following table gives a rough estimate of the space and time requirements for indexing and processing. These repositories are a representative sample of public Go repositories available on GitHub. The working tree size column gives the size of the clone at the given commit (without git history), followed by the number of files indexed and the number of lines of Go code in the repository. The index size is the size of the uncompressed LSIF output of the indexer. The post-processing size is the total amount of disk space occupied after uploading the dump to a Sourcegraph instance.
| Repository | Working tree size | Index time | Index size | Processing time | Post-processing size |
|---|---|---|---|---|---|
| bigcache | 216KB, 32 files, 2.585k loc | 1.18s | 3.5MB | 0.45s | 0.6MB |
| sqlc | 396KB, 24 files, 7.041k loc | 1.53s | 7.2MB | 1.62s | 1.6MB |
| nebula | 700KB, 71 files, 10.704k loc | 2.48s | 16MB | 1.63s | 2.9MB |
| cayley | 5.6MB, 226 files, 36.346k loc | 5.58s | 51MB | 4.68s | 11MB |
| go-ethereum | 27MB, 945 files, 317.664k loc | 20.53s | 255MB | 77.40s | 50MB |
| kubernetes | 301MB, 4577 files, 1.550M loc | 1.21min | 910MB | 80.06s | 162MB |
| aws-sdk-go | 119MB, 1759 files, 1.067M loc | 8.20min | 1.3GB | 155.82s | 358MB |
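To get a feel for how much smaller the stored data is than the raw indexer output, the two size columns can be compared directly. The sketch below copies the figures from the table and assumes that 1.3GB means 1300MB; the names `SIZES_MB` and `size_ratio` are illustrative only.

```python
# (index size in MB, post-processing size in MB), copied from the table above.
SIZES_MB = {
    "bigcache": (3.5, 0.6),
    "sqlc": (7.2, 1.6),
    "nebula": (16.0, 2.9),
    "cayley": (51.0, 11.0),
    "go-ethereum": (255.0, 50.0),
    "kubernetes": (910.0, 162.0),
    "aws-sdk-go": (1300.0, 358.0),  # assumption: 1.3GB taken as 1300MB
}


def size_ratio(repo: str) -> float:
    """Return post-processing size divided by index size for one repository."""
    index_mb, stored_mb = SIZES_MB[repo]
    return stored_mb / index_mb


if __name__ == "__main__":
    for repo in SIZES_MB:
        print(f"{repo:12s} {size_ratio(repo):.2f}")
```

For this sample the ratio stays between about 0.17 and 0.28, so the disk space used on the instance is roughly a fifth to a quarter of the uncompressed index. Treat this as a rough guide for these Go repositories only.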
The bulk of LSIF data is stored on disk, and code intelligence data for a commit becomes less useful as it ages. Sourcegraph will automatically remove the least recently uploaded data if the amount of used disk space exceeds a configurable threshold. This value defaults to 10 GiB (10 × 2^30 = 10737418240 bytes), and can be changed via the DBS_DIR_MAXIMUM_SIZE_BYTES environment variable.
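The cleanup policy described above can be illustrated with a minimal sketch: drop the least recently uploaded data until the total fits under the threshold. This is a simplified model written for illustration, not Sourcegraph's actual cleanup code; the `Upload` type and the `evict` function are hypothetical.

```python
from dataclasses import dataclass

# Default threshold from the text: 10 GiB = 10 * 2^30 bytes.
DEFAULT_MAX_BYTES = 10 * 2**30


@dataclass
class Upload:
    name: str         # hypothetical identifier for an uploaded dump
    uploaded_at: int  # upload time as a Unix timestamp
    size_bytes: int   # disk space the processed dump occupies


def evict(uploads: list[Upload], max_bytes: int = DEFAULT_MAX_BYTES) -> list[Upload]:
    """Return the uploads that remain after removing the least recently
    uploaded ones until the total size no longer exceeds max_bytes."""
    kept = sorted(uploads, key=lambda u: u.uploaded_at)  # oldest first
    total = sum(u.size_bytes for u in kept)
    while kept and total > max_bytes:
        oldest = kept.pop(0)
        total -= oldest.size_bytes
    return kept


if __name__ == "__main__":
    uploads = [
        Upload("a", uploaded_at=1, size_bytes=6),
        Upload("b", uploaded_at=2, size_bytes=6),
        Upload("c", uploaded_at=3, size_bytes=3),
    ]
    # With a 10-byte limit the oldest upload is removed first.
    print([u.name for u in evict(uploads, max_bytes=10)])
```

Sorting by upload time rather than by size mirrors the statement that older data is less useful: a large, recent dump is kept in preference to a small, stale one.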
To learn more, check out our lightning talk about LSIF from GopherCon 2019 or the introductory blog post.