Cody (experimental)
Cody is an AI coding assistant that lives in your editor and can find, explain, and write code. Cody uses a combination of Large Language Models (LLMs), Sourcegraph search, and Sourcegraph code intelligence to provide answers that eliminate toil and keep human programmers in flow. You can think of Cody as your programmer buddy who has read through all the code in open source, all the questions on StackOverflow, and all your organization's private code, and is always there to answer questions you might have or suggest ways of doing something based on prior knowledge.
Cody is in private alpha (tagged as an experimental feature) at this stage.
- If you are an existing Sourcegraph Enterprise customer or want to use Cody for your team, contact your technical advisor or sign up here to get access.
- If you want to try Cody on open source code, sign up here and we'll e-mail you instructions to connect Cody to sourcegraph.com as soon as your account is added.
Currently, Cody is available for VS Code. More editors are on the way—join the Discord to inquire about your editor of choice.
Cody on Sourcegraph.com
Cody uses Sourcegraph to fetch relevant context to generate answers and code. These instructions walk through installing Cody and connecting it to sourcegraph.com. For private instances of Sourcegraph, see the section below about enabling Cody for Enterprise.
- Sign in to sourcegraph.com
- Request access here and we'll send you an e-mail as soon as your account is added. At this time, we are approving all requests.
- Create a Sourcegraph access token
- Install the Cody VS Code extension
- Set the Sourcegraph URL to be https://sourcegraph.com
- Set the access token to be the token you just created

After installing, we recommend the following:
- See the list of embedded repositories and request any that you'd like to add by pinging a Sourcegraph team member in Discord. Embeddings significantly improve the accuracy and quality of Cody's responses. Note that embeddings are only available for public repositories on sourcegraph.com. If you want to use Cody with embeddings on private code, consider moving to a Sourcegraph Enterprise instance.
- Spread the word online and send us your feedback in Discord. Cody is open source and we'd love to hear from you if you have bug reports or feature requests.
Cody on Sourcegraph Cloud
On Sourcegraph Cloud, Cody is a managed service and you do not need to follow the self-hosted installation guide. Cody can be enabled on demand by contacting your account manager.
Learn more from Cody on Cloud
Cody on your self-hosted Sourcegraph Enterprise instance
There are two steps required to enable Cody for Enterprise: enable Cody on your Sourcegraph instance and configure the VS Code extension.
Step 1: Enable Cody on your Sourcegraph instance
Note that this requires site-admin privileges.
- Cody uses one or more third-party LLM (Large Language Model) providers. Make sure you review the Cody usage and privacy notice. In particular, code snippets will be sent to a third-party language model provider when you use the Cody extension.
- To turn Cody on, you will need to set an access token for Sourcegraph to authenticate with the third-party large language model provider (currently Anthropic, but we may use different or several models over time). Reach out to your Sourcegraph Technical Advisor to get a key.
- Once you have the key, go to Site admin > Site configuration (/site-admin/configuration) on your instance and set:
"completions": { "enabled": true, "accessToken": "<token>", "model": "claude-v1", "provider": "anthropic" }
- You're done!
- (Optional) Cody can be configured to use embeddings to improve the quality of its responses. This involves sending your entire codebase to a third-party service to generate a low-dimensional semantic representation, which is used for improved context fetching. See the embeddings section for more.
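Before pasting the fragment into site configuration, it can help to confirm that the four keys shown above are present and that the token placeholder has been replaced. The following is a minimal sketch, not a Sourcegraph tool: the helper name check_completions_config is hypothetical, and it assumes the configuration text is plain JSON without comments.

```python
import json

# Keys taken from the "completions" fragment shown above.
REQUIRED_KEYS = ("enabled", "accessToken", "model", "provider")


def check_completions_config(site_config_text):
    """Return a list of problems found in the "completions" block.

    Hypothetical helper for illustration; assumes comment-free JSON input.
    """
    config = json.loads(site_config_text)
    completions = config.get("completions")
    if completions is None:
        return ['missing "completions" block']
    problems = [f'missing key "{key}"' for key in REQUIRED_KEYS if key not in completions]
    if completions.get("enabled") is not True:
        problems.append('"enabled" is not true')
    if completions.get("accessToken") in ("", "<token>"):
        problems.append('"accessToken" still holds a placeholder')
    return problems


example = (
    '{"completions": {"enabled": true, "accessToken": "<token>", '
    '"model": "claude-v1", "provider": "anthropic"}}'
)
print(check_completions_config(example))
```

An empty list means the fragment has all four keys, is enabled, and carries a non-placeholder token. The check says nothing about whether the token itself is valid.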
Step 2: Configure the VS Code extension
Now that Cody is enabled on your Sourcegraph instance, any user can configure and use the Cody VS Code extension. This does not require admin privileges.
- If you currently have a previous version of Cody installed, uninstall it and reload VS Code before proceeding to the next steps.
- Search for “Sourcegraph Cody” in your VS Code extension marketplace, and install it.

- Reload VS Code, and open the Cody extension. Review and accept the terms.
- Now you'll need to point the Cody extension to your Sourcegraph instance. On your instance, go to settings / access token (https://<your-instance>.sourcegraph.com/users/<your-username>/settings/tokens). Generate an access token, copy it, and set it in the Cody extension.

- In the Cody VS Code extension, set your instance URL and the access token

You're all set!
Step 3: Try Cody!
A few things you can ask Cody:
- "What are popular Go libraries for building CLIs?"
- Open your workspace, and ask "Do we have a React date picker component in this repository?"
- Right-click on a function, and ask Cody to explain it
- Try any of the Cody recipes!

Embeddings
Embeddings are a semantic representation of text, usually floating-point vectors with 256 or more elements. They are useful because they allow us to search over textual information using the semantic correlation between the query and the text, not just syntactic matches on keywords. We use embeddings to create a search index over an entire codebase, which allows us to perform natural language code search over that codebase. Indexing involves splitting the entire codebase into searchable chunks and sending them to the external service specified in the site config for embedding. The final embedding index is stored in a managed object storage service. The available storage configurations are listed in the Storing embedding indexes section below.
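To make the idea of semantic search concrete, here is a minimal sketch that ranks code chunks by how close their vectors are to a query vector. It rests on several assumptions: cosine similarity is used as the closeness measure (the text above does not say which measure Cody uses), the three-element vectors are toy stand-ins for real embeddings, and the file paths are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy index: chunk path -> embedding. Real embeddings have hundreds of elements.
index = {
    "auth/login.go": [0.9, 0.1, 0.0],
    "ui/datepicker.tsx": [0.1, 0.9, 0.2],
    "db/migrate.sql": [0.0, 0.2, 0.9],
}


def search(query_vector, top_k=2):
    """Return the paths of the top_k chunks most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [path for path, _ in ranked[:top_k]]


# Hypothetical embedding of the question "do we have a date picker component?"
query = [0.2, 0.8, 0.1]
print(search(query))
```

The query shares no keywords with any path, yet the date picker chunk ranks first because its vector points in nearly the same direction as the query vector.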
Configuring embeddings
Here is the config for the OpenAI Embeddings API:
"embeddings": { "enabled": true, "url": "https://api.openai.com/v1/embeddings", "accessToken": "<token>", "model": "text-embedding-ada-002", "dimensions": 1536 }
- Navigate to Site admin > Cody (/site-admin/cody) and schedule repositories for embedding.
Storing embedding indexes
To target a managed object storage service, you will need to set a handful of environment variables for configuration and authentication to the target service. If you are running a sourcegraph/server deployment, set the environment variables on the server container. Otherwise, if running via Docker Compose or Kubernetes, set the environment variables on the frontend, embeddings, and worker containers.
Using S3
To target an S3 bucket you've already provisioned, set the following environment variables. Authentication can be done through an access and secret key pair (and optional session token), or via the EC2 metadata API.
Warning: Never commit AWS access keys to Git. Consider using a secret handling service offered by your cloud provider.
EMBEDDINGS_UPLOAD_BACKEND=S3
EMBEDDINGS_UPLOAD_BUCKET=<my bucket name>
EMBEDDINGS_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com
EMBEDDINGS_UPLOAD_AWS_ACCESS_KEY_ID=<your access key>
EMBEDDINGS_UPLOAD_AWS_SECRET_ACCESS_KEY=<your secret key>
EMBEDDINGS_UPLOAD_AWS_SESSION_TOKEN=<your session token> (optional)
EMBEDDINGS_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true (optional; set to use EC2 metadata API over static credentials)
EMBEDDINGS_UPLOAD_AWS_REGION=us-east-1 (default)
Note: If a non-default region is supplied, ensure that the subdomain of the endpoint URL (the AWS_ENDPOINT value) matches the target region.
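The region note can be checked mechanically before deploying. The sketch below assumes the endpoint follows the s3.<region>.amazonaws.com host pattern used in the example above; the function name endpoint_matches_region is hypothetical, and custom or S3-compatible endpoints would need a different rule.

```python
from urllib.parse import urlparse


def endpoint_matches_region(env):
    """Check that the endpoint host agrees with the configured region.

    `env` is a mapping of environment variable names to values.
    Assumes the s3.<region>.amazonaws.com host pattern (an assumption).
    """
    region = env.get("EMBEDDINGS_UPLOAD_AWS_REGION", "us-east-1")
    endpoint = env.get(
        "EMBEDDINGS_UPLOAD_AWS_ENDPOINT", "https://s3.us-east-1.amazonaws.com"
    )
    host = urlparse(endpoint).hostname or ""
    return host == f"s3.{region}.amazonaws.com"


consistent = {
    "EMBEDDINGS_UPLOAD_AWS_REGION": "eu-west-1",
    "EMBEDDINGS_UPLOAD_AWS_ENDPOINT": "https://s3.eu-west-1.amazonaws.com",
}
mismatched = {
    "EMBEDDINGS_UPLOAD_AWS_REGION": "eu-west-1",
    "EMBEDDINGS_UPLOAD_AWS_ENDPOINT": "https://s3.us-east-1.amazonaws.com",
}
print(endpoint_matches_region(consistent))
print(endpoint_matches_region(mismatched))
```

Passing os.environ as the mapping would run the same check against the variables of the current process.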
Using GCS
To target a GCS bucket you've already provisioned, set the following environment variables. Authentication is done through a service account key, supplied as either a path to a volume-mounted file, or the contents read in as an environment variable payload.
EMBEDDINGS_UPLOAD_BACKEND=GCS
EMBEDDINGS_UPLOAD_BUCKET=<my bucket name>
EMBEDDINGS_UPLOAD_GCP_PROJECT_ID=<my project id>
EMBEDDINGS_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>
EMBEDDINGS_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>
Provisioning buckets
If you would like to allow your Sourcegraph instance to control the creation and lifecycle configuration management of the target buckets, set the following environment variable:
EMBEDDINGS_UPLOAD_MANAGE_BUCKET=true
Environment variables for the embeddings service
- EMBEDDINGS_REPO_INDEX_CACHE_SIZE: Number of repository embedding indexes to cache in memory (the default cache size is 5). Increasing the cache size improves search performance but requires more memory.
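The trade-off behind the cache size can be illustrated with a small in-memory cache. This is a sketch only: the source does not describe how the embeddings service evicts entries, so least-recently-used eviction is an assumption here, and the class RepoIndexCache and the repository names are hypothetical.

```python
from collections import OrderedDict


class RepoIndexCache:
    """Keeps at most max_size repository indexes in memory (LRU eviction assumed)."""

    def __init__(self, max_size=5):
        self.max_size = max_size
        self._entries = OrderedDict()
        self.loads = 0  # how many times an index had to be fetched from storage

    def get(self, repo, load_index):
        if repo in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(repo)
            return self._entries[repo]
        # Cache miss: fetch the index (the slow path) and remember it.
        index = load_index(repo)
        self.loads += 1
        self._entries[repo] = index
        if len(self._entries) > self.max_size:
            # Drop the least recently used index to stay within max_size.
            self._entries.popitem(last=False)
        return index


def count_loads(max_size, accesses):
    """Replay a sequence of repository accesses and count storage fetches."""
    cache = RepoIndexCache(max_size=max_size)
    for repo in accesses:
        cache.get(repo, load_index=lambda name: f"index for {name}")
    return cache.loads


accesses = ["repo-a", "repo-b", "repo-a", "repo-c", "repo-b"]
print(count_loads(2, accesses))
print(count_loads(3, accesses))
```

With room for only two indexes, the replay fetches from storage four times; with room for three, it fetches three times, at the cost of holding one more index in memory.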