How precise code intelligence queries are resolved

Precise code intelligence results are obtained by making GraphQL requests to the frontend service. The code intelligence extensions are an example consumer of this API, and their documentation details how code intelligence results are used.

Definitions

A definitions request returns the set of locations that define the symbol at a particular location (defined uniquely by a repository, commit, path, line offset, and character offset). The sequence of actions required to resolve a definitions query is described below.

First, the repository, commit, and path inputs are used to determine the set of LSIF uploads that can answer queries for that data. Such an upload may have been indexed on another commit. In this case, the output of git diff between the two commits is used to adjust the input path and line number.
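As a minimal sketch of the line adjustment, the following assumes the diff was produced without context lines (for example with `git diff -U0`) and handles only the line number, not renamed paths. The names `parse_hunks` and `adjust_line` are hypothetical and are not taken from the Sourcegraph codebase. The function maps a line on the "old" side of the diff to the "new" side; which commit plays which role depends on the order in which the two commits were passed to git diff.

```python
import re
from typing import List, Optional, Tuple

# (old_start, old_count, new_start, new_count), as found in "@@ -a,b +c,d @@".
Hunk = Tuple[int, int, int, int]

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunks(diff_text: str) -> List[Hunk]:
    """Collect the hunk headers of a unified diff for a single file."""
    hunks = []
    for row in diff_text.splitlines():
        match = HUNK_HEADER.match(row)
        if match:
            old_start, old_count, new_start, new_count = match.groups()
            hunks.append((
                int(old_start),
                # A count omitted from the header means exactly one line.
                int(old_count) if old_count is not None else 1,
                int(new_start),
                int(new_count) if new_count is not None else 1,
            ))
    return hunks


def adjust_line(hunks: List[Hunk], line: int) -> Optional[int]:
    """Map a 1-indexed line from the old side of the diff to the new side.

    Returns None when the line itself was modified or deleted, in which
    case no trustworthy position exists on the other commit.
    """
    delta = 0
    for old_start, old_count, _new_start, new_count in hunks:
        if old_count == 0:
            # Pure insertion: new lines were added after old_start.
            if old_start >= line:
                break
            delta += new_count
            continue
        if old_start > line:
            break  # This hunk and all later ones lie below the line.
        if line <= old_start + old_count - 1:
            return None  # The line falls inside a changed region.
        delta += new_count - old_count
    return line + delta
```

Lines above the first hunk map to themselves, lines below a hunk are shifted by the net number of lines that the hunk added or removed, and lines inside a hunk have no counterpart.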

The adjusted path and position are used to query the definitions at that position using the selected upload data. If a definition is local to the upload, the LSIF store can resolve the query without any additional data. If the definition is remote (defined in a different root of the same repository, or defined in a different repository), the import monikers of the symbol at the adjusted path and position in the selected upload are determined, along with the package information attached to those monikers. Definitions of the associated moniker are then queried from the codeintel database using an upload that provides one of the selected packages.
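The local and remote cases can be sketched with in-memory data. Everything in this example is an assumption made for illustration: the `Upload`, `Moniker`, and `Location` types, their fields, and the `resolve_definitions` function are hypothetical stand-ins for the LSIF store and the codeintel database, not their actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Position = Tuple[str, int, int]  # (path, line, character)
Package = Tuple[str, str]        # (name, version)


@dataclass(frozen=True)
class Location:
    upload_id: int
    path: str
    line: int
    character: int


@dataclass(frozen=True)
class Moniker:
    scheme: str
    identifier: str
    package: Package  # package information attached to the moniker


@dataclass
class Upload:
    id: int
    provides: Optional[Package] = None
    # Definitions that live in the same upload as the queried position.
    local_definitions: Dict[Position, List[Location]] = field(default_factory=dict)
    # Import monikers attached to the symbol at a position.
    import_monikers: Dict[Position, List[Moniker]] = field(default_factory=dict)
    # Definitions this upload exports, keyed by (scheme, identifier).
    exported: Dict[Tuple[str, str], List[Location]] = field(default_factory=dict)


def resolve_definitions(
    upload: Upload, all_uploads: List[Upload], position: Position
) -> List[Location]:
    # Local case: the selected upload can answer on its own.
    local = upload.local_definitions.get(position)
    if local:
        return list(local)

    # Remote case: follow import monikers to uploads providing the package.
    results: List[Location] = []
    for moniker in upload.import_monikers.get(position, []):
        for candidate in all_uploads:
            if candidate.provides == moniker.package:
                key = (moniker.scheme, moniker.identifier)
                results.extend(candidate.exported.get(key, []))
    return results
```

In this sketch the remote lookup only runs when the local lookup finds nothing, which mirrors the order described above.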

If the resulting locations were provided by an upload that was indexed on a commit distinct from the input commit, git diff is used again to adjust the results back to the target commit.

Code appendix

References

A references request returns the set of locations that reference the symbol at a particular location (defined uniquely by a repository, commit, path, line offset, and character offset). Unlike the set of definitions, which should generally have only one member, the set of references can be unbounded for popular repositories. The resolution of references is therefore done in chunks, allowing the user to request reference results page-by-page. The sequence of actions required to resolve a references query is described below.

First, the repository, commit, and path inputs are used to determine the set of LSIF uploads that can answer queries for that data. Such an upload may have been indexed on another commit. In this case, the output of git diff between the two commits is used to adjust the input path and line number.

A references request optionally supplies a cursor that encodes the state of the previous request (if any). If a cursor is supplied, it is decoded and validated. Otherwise, one is created with some additional state including the repository, commit, adjusted path and position, the selected indexes providing intelligence for this result set, monikers attached to the range intersecting the input position, and the index defining the target symbol (if remotely defined). Note that this step may be repeated over multiple uploads: each upload returned in the previous step will have its own cursor, encoded/decoded independently at the GraphQL resolver layer.
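The cursor handling can be sketched as an opaque string that round-trips through JSON and base64. This wire format is an assumption chosen for illustration, and the `ReferencesCursor` type, its field names, and the helper functions are hypothetical; only the kinds of state it carries come from the description above.

```python
import base64
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Optional


@dataclass
class ReferencesCursor:
    repository: str
    commit: str
    path: str                # adjusted path
    line: int                # adjusted position
    character: int
    upload_ids: List[int]    # indexes providing intelligence for this result set
    monikers: List[str]      # monikers attached to the range at the position
    defining_upload_id: Optional[int] = None  # set if the symbol is remotely defined


def encode_cursor(cursor: ReferencesCursor) -> str:
    payload = json.dumps(asdict(cursor), sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")


def decode_cursor(raw: str) -> ReferencesCursor:
    """Decode and validate a cursor string supplied by the client."""
    try:
        payload = json.loads(base64.urlsafe_b64decode(raw.encode("ascii")))
        # Missing or unexpected fields raise TypeError here.
        return ReferencesCursor(**payload)
    except (ValueError, TypeError) as err:
        raise ValueError("invalid cursor") from err


def cursor_for_request(
    raw: Optional[str], make_initial: Callable[[], ReferencesCursor]
) -> ReferencesCursor:
    # Reuse the client's cursor if one was supplied, otherwise start fresh.
    return decode_cursor(raw) if raw else make_initial()
```

Because the client treats the string as opaque, the server remains free to change what state the cursor carries between releases.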

The cursor decoded or created above is used to drive the resolution of the current page of results. While the number of results in the current page is less than the requested number of results, another batch of locations is requested using the current cursor and it is appended to the current page. This cursor is ultimately sent back to the client so they can make a subsequent request, and is also used as the new current cursor if a subsequent batch of locations is requested.

Each batch of locations is fetched using the contents of the request's cursor. If the cursor is not in its remote phase, it will fetch additional locations from the index that is providing intelligence. Once these locations are exhausted, the cursor switches to the remote phase. In this phase, a batch of remote dumps referencing one of the monikers attached to the range intersecting the input position is used as the set of indexes from which to fetch the next sequence of location results.
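The page-filling loop and the phase switch can be sketched together. This is a simplified model under stated assumptions: references are plain in-memory lists, the remote phase reads from one flat list rather than from batches of remote dumps, and a `done` phase is added so the loop can terminate. The `Cursor`, `next_batch`, and `resolve_page` names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cursor:
    phase: str = "local"  # "local", then "remote", then "done" (assumed)
    offset: int = 0       # how far into the current phase's results we are


def next_batch(
    cursor: Cursor, local: List[str], remote: List[str], size: int
) -> Tuple[List[str], Cursor]:
    """Fetch one batch of locations and the cursor that follows it."""
    if cursor.phase == "done":
        return [], cursor
    source = local if cursor.phase == "local" else remote
    batch = source[cursor.offset:cursor.offset + size]
    new_offset = cursor.offset + len(batch)
    if new_offset < len(source):
        return batch, Cursor(cursor.phase, new_offset)
    # The current source is exhausted: move on to the next phase.
    following = "remote" if cursor.phase == "local" else "done"
    return batch, Cursor(following, 0)


def resolve_page(
    cursor: Cursor, local: List[str], remote: List[str],
    limit: int, batch_size: int = 2,
) -> Tuple[List[str], Cursor]:
    """Append batches until the page holds `limit` results or none remain."""
    page: List[str] = []
    while len(page) < limit and cursor.phase != "done":
        size = min(batch_size, limit - len(page))
        batch, cursor = next_batch(cursor, local, remote, size)
        page.extend(batch)
    # The final cursor is what would be sent back to the client.
    return page, cursor
```

Passing the cursor returned by one call into the next call resumes exactly where the previous page stopped, including partway through the remote phase.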

For each returned batch of locations, if the resulting locations were provided by an upload that was indexed on a commit distinct from the input commit (only possible for non-remote locations), git diff is used again to adjust the results back to the target commit.

Code appendix

Hover

A hover request returns the hover text associated with the symbol at a particular location (defined uniquely by a repository, commit, path, line offset, and character offset), as well as the range of the hovered symbol. The sequence of actions required to resolve a hover query is described below.

First, the repository, commit, and path inputs are used to determine the set of LSIF uploads that can answer queries for that data. Such an upload may have been indexed on another commit. In this case, the output of git diff between the two commits is used to adjust the input path and line number.

The adjusted path and position are used to query the hover text at that position using the selected upload data. If there is no hover text associated with a reference (which may be the case for indexers that do not provide third-party hover text), we attempt to resolve the location of the definition in another dump. This moniker search is almost identical to the flow within the definitions resolver. Once a definition location is known, its hover text can be queried directly by position.
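The fallback can be sketched as two lookups. The dictionaries here are hypothetical stand-ins for the upload data and for the moniker-based definition search, and `resolve_hover` is an illustrative name rather than the resolver's real signature. The range of the hovered symbol is left out to keep the example short.

```python
from typing import Dict, Optional, Tuple

Position = Tuple[str, int, int]  # (path, line, character)
Target = Tuple[int, Position]    # (upload ID, position)


def resolve_hover(
    hover_text: Dict[Target, str],
    definitions: Dict[Target, Target],
    upload_id: int,
    position: Position,
) -> Optional[str]:
    # First try the hover text stored at the queried position.
    text = hover_text.get((upload_id, position))
    if text:
        return text

    # Otherwise locate the definition, which may live in another upload.
    definition = definitions.get((upload_id, position))
    if definition is None:
        return None

    # Hover text is then queried directly at the definition's position.
    return hover_text.get(definition)
```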

If the resulting range was provided by an upload that was indexed on a commit distinct from the input commit, git diff is used again to adjust the result back to the target commit.

Code appendix