Find answers to the most common questions about Cody.
Does Cody train on my code?
No, Cody does not train on your code. Our third-party Large Language Model (LLM) providers also do not train on your specific codebase. Cody generates answers to your queries through the following process:
- User query: A user asks a question
- Code retrieval: Sourcegraph, our underlying code intelligence platform, performs a search and code intelligence operation to retrieve code snippets relevant to the user's question. During this process, strict permissions are enforced to ensure that only code that the user has read permission for is retrieved
- Prompt to Language Model: Sourcegraph sends a prompt, together with the retrieved code snippets, to a Large Language Model (LLM). This prompt provides the context for the LLM to generate a meaningful response
- Response to user: The response generated by the LLM is then sent back to Cody and presented to the user
This process ensures that Cody can provide helpful answers to your questions while respecting data privacy and security by not training on or retaining your specific code.
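The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not Sourcegraph's implementation: the `Snippet` type, the function names, the naive keyword matching, and the `llm` callable are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    repo: str
    path: str
    text: str


def retrieve_snippets(query, index, readable_repos):
    """Return snippets that mention a query term and live in a repo the user can read."""
    terms = [t.lower() for t in query.split()]
    return [
        s for s in index
        if s.repo in readable_repos                  # permission check comes first
        and any(t in s.text.lower() for t in terms)  # naive keyword match (assumption)
    ]


def build_prompt(query, snippets):
    """Combine the retrieved snippets and the user's question into one prompt."""
    context = "\n\n".join(f"# {s.repo}/{s.path}\n{s.text}" for s in snippets)
    return f"Context:\n{context}\n\nQuestion: {query}"


def answer(query, index, readable_repos, llm):
    """Run the steps in order: query, code retrieval, prompt, response."""
    snippets = retrieve_snippets(query, index, readable_repos)
    prompt = build_prompt(query, snippets)
    return llm(prompt)  # `llm` is any callable standing in for the LLM provider
```

Because the permission filter runs before the prompt is built, a snippet from a repository the user cannot read never reaches the LLM in this sketch.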
Does Cody work with self-hosted Sourcegraph?
Yes, Cody is compatible with self-hosted Sourcegraph instances. However, there are a few considerations:
- Cody operates by sending code snippets (up to 28 KB per request) to a third-party cloud service. By default, this service is Anthropic but can also be OpenAI
- For certain repositories, Cody may utilize embeddings, which involves sending repository data to another third-party service like OpenAI
- To use Cody effectively, your self-hosted Sourcegraph instance must have internet access for these interactions with external services
Is there a public-facing Cody API?
Currently, there is no public-facing Cody API available.
Does Cody require Sourcegraph to function?
Yes, Cody relies on Sourcegraph for two essential functions:
- It is used to retrieve context relevant to user queries
- Sourcegraph acts as a proxy for the LLM provider to facilitate the interaction between Cody and the LLM
What programming languages does Cody support?
Cody supports a wide range of programming languages, including:
What are embeddings for?
Embeddings help Sourcegraph retrieve relevant code to feed the Large Language Model as context. Embeddings, often associated with vector search, complement other strategies in the code retrieval process.
While embeddings excel in semantic matching — determining "what is this code about" and "what does it do" — they may not capture syntax and other specific matching details as effectively. Sourcegraph's approach involves getting the best results from various sources to deliver the most accurate and comprehensive answers possible.
Do embeddings enforce permissions? Does Cody receive code that users don't have access to?
When using embeddings, permissions are enforced to ensure Cody does not receive code the user cannot access. Currently, Sourcegraph uses embeddings search for a single repository, with a prior check to confirm user access.
In the future, the process will involve the following steps:
- Determine which repositories the user has access to
- Query embeddings for each of these repositories
- Select the most relevant results and provide them to the user
This approach safeguards data privacy and ensures that Cody's responses are based on code accessible to the user.
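The planned multi-repository flow can be illustrated with a small sketch. It assumes embeddings are plain lists of floats compared by cosine similarity; the function names and the `embeddings_by_repo` layout are hypothetical and do not reflect Sourcegraph's actual data model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_accessible(query_vec, embeddings_by_repo, accessible_repos, top_k=3):
    """Query only repositories the user can access, then keep the best matches."""
    scored = []
    for repo in accessible_repos:                             # step 1: accessible repos only
        for path, vec in embeddings_by_repo.get(repo, []):    # step 2: query each repo
            scored.append((cosine(query_vec, vec), repo, path))
    scored.sort(key=lambda item: item[0], reverse=True)       # step 3: most relevant first
    return scored[:top_k]
```

Repositories outside `accessible_repos` are never scanned, so their embeddings cannot appear in the results regardless of how well they match.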
Why isn't my scheduled embedding job listed?
There can be several reasons why your scheduled one-off embedding job isn't appearing in the job list:
- The repository is already in the queue or currently being processed
- The system has successfully completed a job for the same repository and revision
- Another job for the same repository is in the queue, scheduled within the
How do I stop a running embeddings job?
A running embeddings job with the state PROCESSING can be stopped by admins from the Cody > Embeddings Jobs page. To do so:
- Click on the "Cancel" button associated with the job you wish to terminate
- The job will then be tagged for cancellation. Please note that the time required for the job to be fully canceled may vary depending on its current state, ranging from a few seconds to a few minutes
Why are files skipped?
Files may be skipped for the following reasons:
- The file size exceeds 1 MB
- The file path matches an exclusion pattern
- The repository has already reached the maximum limit for generated embeddings, as specified by
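The skip rules above can be expressed as a single check. The following is a minimal sketch under stated assumptions: it treats 1 MB as 1,000,000 bytes, uses shell-style glob matching for exclusion patterns, and takes the per-repository limit as a hypothetical `max_embeddings` parameter, since the actual setting name is not given here.

```python
from fnmatch import fnmatch

MAX_FILE_SIZE_BYTES = 1_000_000  # "1 MB" from the list above; exact byte count is an assumption


def skip_reason(path, size_bytes, exclude_patterns, embedded_so_far, max_embeddings):
    """Return why a file would be skipped, or None if it would be embedded."""
    if size_bytes > MAX_FILE_SIZE_BYTES:
        return "file too large"
    if any(fnmatch(path, pattern) for pattern in exclude_patterns):
        return "path excluded"  # glob-style matching is an assumption
    if embedded_so_far >= max_embeddings:
        return "repository limit reached"  # `max_embeddings` is a hypothetical parameter
    return None
```

For example, `skip_reason("app.min.js", 10, ["*.min.js"], 0, 100)` reports the exclusion pattern as the reason, while a small file with no matching pattern and remaining capacity returns `None`.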
Third-party dependencies
What is the default sourcegraph provider for completions and embeddings?
The default provider for completions and embeddings, specified as "provider": "sourcegraph", refers to the Sourcegraph Cody Gateway. The Cody Gateway facilitates access to completions and embeddings for Sourcegraph enterprise instances by leveraging third-party services such as Anthropic and OpenAI.
What third-party cloud services does Cody depend on?
Cody relies on one primary third-party dependency: Anthropic's Claude API. Users can instead configure Cody to use the OpenAI API.
Additionally, Cody can optionally use OpenAI for generating embeddings, enhancing the quality of its context snippets, although this is not mandatory.
It's worth noting that these dependencies remain consistent when utilizing the default sourcegraph provider, Cody Gateway, which uses the same third-party providers.
What is the retention policy for Anthropic and OpenAI?
Please refer to the terms and conditions for details regarding the retention policy for data managed by Anthropic and OpenAI.
Can I use my own API keys?
Yes, you can use your own API keys.
Can I use Cody with my Cloud IDE?
Yes, Cody supports the following cloud development environments:
- vscode.dev and GitHub Codespaces (install from the VS Code extension marketplace)
- Any editor supporting the Open VSX Registry, including Gitpod, Coder, and code-server (install from the Open VSX Registry)
For more information on what to do next, we recommend the following resources: