How `src` executes a campaign spec

This document is meant to help with debugging and troubleshooting the writing and execution of campaign specs with the Sourcegraph CLI, src.

It explains what happens under the hood when a user applies or previews a campaign spec by running src campaign apply or src campaign preview.

Overview

src campaign apply and src campaign preview execute a campaign spec the same way:

1. Parse the campaign spec
2. Resolve the namespace
3. Prepare container images
4. Resolve repositories
5. Execute steps
6. Import changesets
7. Send changeset specs
8. Send the campaign spec
9. Preview or apply the campaign spec

The difference is the last step: src campaign apply applies the campaign spec, whereas src campaign preview only prints a URL at which you can preview what would happen if you applied it.

The rest of the document explains each step in more detail.

Parse campaign spec

src reads in, parses and validates the campaign spec YAML specified with the -f flag.

It validates the campaign spec against its schema and does some semantic checks to make sure that, for example, changesetTemplate is specified if steps are specified, or that no feature is used that's not supported by the Sourcegraph instance.
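As a rough illustration of the kind of semantic check described above, the following sketch tests only the one rule named in this section. It assumes the spec has already been parsed from YAML into a dictionary; the function name and error wording are hypothetical and are not taken from src.

```python
def semantic_errors(spec: dict) -> list:
    """Return a list of problems found in an already-parsed campaign spec.

    Only the rule mentioned in the text is checked here; the real
    validation in src also covers the schema and feature support.
    """
    errors = []
    # steps produce changes, so a template describing the changesets is needed
    if spec.get("steps") and not spec.get("changesetTemplate"):
        errors.append("changesetTemplate must be specified when steps are specified")
    return errors
```

An empty list means the check passed; a spec without steps is not required to have a changesetTemplate in this sketch.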

Resolving namespace

src resolves the given namespace in which to apply/preview the campaign spec by sending a GraphQL request to the Sourcegraph instance to fetch the ID for the given namespace name.

If no namespace is specified with -namespace (or -n) then the currently authenticated user is used as the namespace. See "Connect to Sourcegraph" in the CLI docs for details on how to authenticate.

Preparing container images

If the campaign spec contains steps, then for each step src checks its container image to see whether it's already available locally.

To do that it runs docker image inspect --format {{.Id}} -- <container-image-name> to get the specific image ID for the container image.

If that fails with a "No such image" error, src tries to pull the image by running docker image pull <container-image-name> and then running docker image inspect --format {{.Id}} -- <container-image-name> again.
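The inspect, pull, inspect sequence can be sketched as follows. This is a minimal illustration, not the code src uses: the function name and the injectable run parameter (useful for testing without Docker) are assumptions, while the docker commands are the ones quoted above.

```python
import subprocess


def ensure_image(image, run=subprocess.run):
    """Return the local image ID, pulling the image first if it is missing."""
    inspect = ["docker", "image", "inspect", "--format", "{{.Id}}", "--", image]
    result = run(inspect, capture_output=True, text=True)
    if result.returncode != 0:
        # Only a missing image is recoverable; any other error is reported.
        if "No such image" not in result.stderr:
            raise RuntimeError(result.stderr.strip())
        pull = run(["docker", "image", "pull", image], capture_output=True, text=True)
        if pull.returncode != 0:
            raise RuntimeError(pull.stderr.strip())
        # Inspect again now that the image should be available locally.
        result = run(inspect, capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()
```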

Resolving repositories

src resolves each entry in the campaign spec's on property to produce a unique list of repositories (!) in which to execute the campaign spec's steps.

With an on property like this:

on:
  - repositoriesMatchingQuery: lang:go fmt.Sprintf("%d", :[v]) patterntype:structural -file:vendor
  - repositoriesMatchingQuery: repohasfile:README.md
  - repository: github.com/sourcegraph/sourcegraph
  - repository: github.com/sourcegraph/automation-testing
    branch: thorstens-test-branch

src will do the following:

For each repositoriesMatchingQuery it will:
1. Send a request to the Sourcegraph API to execute the search query.
2. Collect each result's repository: the ID, the name, the default branch, and the current revision of the default branch. If the search result is a repository result (for example, a type:repo query produces only repositories), that repository is used. If it's a file match, the file match's repository is used.
3. If the results are file matches, also save their paths in the repository, so that they can be used in the steps with templating.

For each repository without a branch it will:
1. Send a request to the Sourcegraph API to get the repository's ID, name, default branch, and the current revision of the default branch.

For each repository with a branch it will:
1. Send a request to the Sourcegraph API to get the repository's ID, name, and the current revision of the specified branch.

It then creates a unique list of all repositories yielded by the previous three steps. It goes through all repositories, skips duplicates and those for which no current revision of a branch could be resolved, and checks whether each repository is on a supported code host. If a repository is on an unsupported code host and the -allow-unsupported flag is not given, a warning is printed and the repository is not added to the list.
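The final filtering pass might look roughly like the sketch below. The Repo type, its field names, and the choice to treat two entries with the same ID as duplicates are assumptions made for illustration; the text above does not say how src compares repositories.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Repo:
    id: str
    name: str
    rev: Optional[str]      # None when no revision could be resolved
    supported: bool = True  # whether the code host is supported


def unique_repos(candidates, allow_unsupported=False):
    """Return (repositories to use, names to warn about)."""
    seen, result, warned = set(), [], []
    for repo in candidates:
        if repo.id in seen:
            continue  # duplicate (assumed: same ID means same repository)
        if repo.rev is None:
            continue  # no current revision of a branch could be resolved
        seen.add(repo.id)
        if not repo.supported and not allow_unsupported:
            warned.append(repo.name)  # caller prints a warning for these
            continue
        result.append(repo)
    return result, warned
```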

Executing steps

If a campaign spec contains steps then src executes the steps locally, on the machine on which src is run, for each repository yielded by the previous "Resolving repositories" step.

If -clear-cache is not set and src previously executed the same steps for the same repository at the same revision of the base branch, it will try to use cached results instead of re-executing the steps.
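One way to picture the cache lookup is a key derived from exactly the three inputs named above: the repository, the base revision, and the steps. The hashing scheme below is an assumption for illustration only; how src actually computes its cache keys is not described here.

```python
import hashlib
import json


def cache_key(repo_id, base_rev, steps):
    """Derive a stable key; any change to repo, revision, or steps changes it."""
    payload = json.dumps(
        {"repo": repo_id, "rev": base_rev, "steps": steps},
        sort_keys=True,  # make the key independent of dictionary ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With a key like this, a new commit on the base branch or an edited run command yields a different key, so the steps are executed again.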

The following is what src does for each repository:

1. Download archive and prepare

Download an archive of the repository. This is equivalent to:

curl -L -v -X GET -H 'Accept: application/zip' \
  -H 'Authorization: token <THE_SRC_TOKEN>' \
  'http://sourcegraph.example.com/github.com/my-org/my-repo@refs/heads/master/-/raw' \
  --output ~/tmp/my-repo.zip

Unzip the archive into the workspace. Where the workspace lives depends on the workspace mode, which can be controlled with the -workspace flag. The two modes are:

Bind mount mode (the default everywhere except Intel macOS): the workspace will be somewhere on the filesystem, e.g. ~/.cache/sourcegraph/campaigns (see src campaign preview -h for the default cache directory; override it with -cache).
Volume mount mode (the default on Intel macOS): a Docker volume will be created using docker volume create and attached to all running containers, then removed before src exits.

cd into the workspace, which now contains the unzipped archive

In the workspace, create a git repository:

Configure git to not use local configuration (see the code for explanations of what each variable does):

export GIT_CONFIG_NOSYSTEM=1 \
       GIT_CONFIG=/dev/null \
       GIT_AUTHOR_NAME=Sourcegraph \
       GIT_AUTHOR_EMAIL=[email protected] \
       GIT_COMMITTER_NAME=Sourcegraph \
       GIT_COMMITTER_EMAIL=[email protected]

Run git init
Run git config --local user.name Sourcegraph
Run git config --local user.email [email protected]
Run git add --force --all
Run git commit --quiet --all -m sourcegraph-campaigns
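The git setup above can be summarized as one environment plus a fixed sequence of commands. The sketch below is an illustration under assumptions: the function name, the email parameter (the address is not reproduced here), and the injectable run parameter are hypothetical; the commands and variables are the ones listed in this step.

```python
import os
import subprocess


def init_workspace(workspace, email, run=subprocess.run):
    """Turn the unzipped archive into a git repository with one base commit."""
    env = dict(os.environ)
    env.update({
        "GIT_CONFIG_NOSYSTEM": "1",
        "GIT_CONFIG": "/dev/null",
        "GIT_AUTHOR_NAME": "Sourcegraph",
        "GIT_AUTHOR_EMAIL": email,
        "GIT_COMMITTER_NAME": "Sourcegraph",
        "GIT_COMMITTER_EMAIL": email,
    })
    commands = [
        ["git", "init"],
        ["git", "config", "--local", "user.name", "Sourcegraph"],
        ["git", "config", "--local", "user.email", email],
        ["git", "add", "--force", "--all"],
        ["git", "commit", "--quiet", "--all", "-m", "sourcegraph-campaigns"],
    ]
    for cmd in commands:
        # Every command runs inside the workspace with the same environment.
        run(cmd, cwd=workspace, env=env, check=True)
    return commands
```

The base commit matters because the later diff is computed against it, so only changes made by the steps end up in the changeset.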

2. Run the steps

For each step in the campaign spec's steps:

Probe container image (the container property of the step) to see whether it has /bin/sh or /bin/bash
Write the step's run command to a temp file on the host, e.g. /tmp-script
Run chmod 644 /tmp-script
Run the Docker container. The exact command will depend on the workspace mode:

Bind:

docker run --rm --init --workdir /work \
  --mount type=bind,source=/unzipped-archive-locally,target=/work \
  --mount type=bind,source=/tmp-script,target=/tmp-file-in-container \
  --entrypoint /bin/bash -- <IMAGE> /tmp-file-in-container

Volume:

docker run --rm --init --workdir /work \
  --mount type=volume,source=temporary-docker-volume-id,target=/work \
  --mount type=bind,source=/tmp-script,target=/tmp-file-in-container \
  --entrypoint /bin/bash -- <IMAGE> /tmp-file-in-container

Add all produced changes to the git index: git add --all
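The two docker run variants differ only in how the workspace is mounted, which the following sketch makes explicit. The function and parameter names are hypothetical; the flags and mount strings mirror the commands shown above.

```python
def docker_run_args(image, script_path, workspace, mode="bind",
                    shell="/bin/bash", container_script="/tmp-file-in-container"):
    """Build the docker run command line for one step.

    workspace is a host directory in bind mode and a Docker volume
    name in volume mode.
    """
    if mode not in ("bind", "volume"):
        raise ValueError(f"unknown workspace mode: {mode}")
    return [
        "docker", "run", "--rm", "--init", "--workdir", "/work",
        # The workspace holding the repository contents.
        "--mount", f"type={mode},source={workspace},target=/work",
        # The step's run command, written to a temp file on the host.
        "--mount", f"type=bind,source={script_path},target={container_script}",
        "--entrypoint", shell, "--", image, container_script,
    ]
```

Passing the shell as a parameter reflects the probing step: an image without /bin/bash would be run with /bin/sh instead.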

3. Create final diff

In the workspace:

Create a diff by running: git diff --cached --no-prefix --binary
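A minimal sketch of staging the changes and producing the diff, assuming git is installed and the workspace already contains the base commit. The function name is hypothetical; the two git commands are the ones named in this document.

```python
import subprocess


def stage_and_diff(workspace):
    """Stage everything the steps changed and return the diff as bytes."""
    subprocess.run(["git", "add", "--all"], cwd=workspace, check=True)
    result = subprocess.run(
        ["git", "diff", "--cached", "--no-prefix", "--binary"],
        cwd=workspace, check=True, capture_output=True,
    )
    # Bytes rather than text: file contents in the diff need not be valid UTF-8.
    return result.stdout
```

Because of --cached, the diff compares the index with the base commit, so it contains exactly the changes the steps produced, including new and deleted files.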

4. Save a changeset spec

The produced diff is added to the local cache so that re-executing the same steps in the same repository can be skipped if the base branch has not changed.

The diff is then combined with information about the repository in which the changes have been made (the name and ID of the repository, the revision of its base branch) and together with the changesetTemplate turned into a changeset spec: a description of what the changeset should look like.
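To make the combination concrete, the sketch below merges a diff, repository information, and a changesetTemplate into one dictionary. All field names in the result and the shape of the inputs are illustrative assumptions, not the schema the Sourcegraph instance expects.

```python
def build_changeset_spec(repo, diff, template):
    """Describe the changeset that should exist for one repository.

    repo:     dict with hypothetical keys "id", "name", "rev"
    diff:     the diff produced by the steps, as a string
    template: the parsed changesetTemplate from the campaign spec
    """
    return {
        "repository": {"id": repo["id"], "name": repo["name"]},
        "baseRev": repo["rev"],  # revision of the base branch the diff applies to
        "title": template["title"],
        "body": template.get("body", ""),
        "branch": template["branch"],
        "commit": {"message": template["commit"]["message"], "diff": diff},
    }
```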

Importing changesets

If the campaign spec contains importChangesets then src goes through the list of importChangesets and for each entry it will:

Resolve the repository name, trying to get an ID, base branch, and revision for the given repository name.
Parse the externalIDs, checking that they're valid strings or numbers.
For each external ID it saves a changeset spec that describes that a changeset with the given external ID, in the given repository, should be imported and tracked in the campaign.
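The per-entry handling described above can be sketched like this. The function name and the keys of the resulting dictionaries are hypothetical; only the rule that external IDs must be strings or numbers comes from the text.

```python
def import_specs(repo_id, external_ids):
    """Create one changeset spec per external ID of an importChangesets entry."""
    specs = []
    for ext in external_ids:
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(ext, bool) or not isinstance(ext, (str, int)):
            raise TypeError(f"external ID must be a string or number, got {ext!r}")
        specs.append({"repository": repo_id, "externalID": str(ext)})
    return specs
```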

Sending changeset specs

The previous two steps, "Executing steps" and "Importing changesets", can produce changeset specs, each one describing either a changeset to create or to import.

These changeset specs are now uploaded to the connected Sourcegraph instance, one request per changeset spec.

Each request yields an ID that uniquely identifies the changeset spec on the Sourcegraph instance. These IDs are used for the next step.

Sending campaign spec

The IDs of the changeset specs that were created in the previous step, "Sending changeset specs", are collected into a list and used for the next request with which src uploads the campaign spec to the connected Sourcegraph instance.

src creates the campaign spec on the Sourcegraph instance, together with the changeset spec IDs, so that the campaign spec fully describes the desired state of a campaign: its name, its description, and which changesets should be created or imported from which repository on which code host.

That request yields an ID that uniquely identifies this expanded version of the campaign spec.
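The two upload phases fit in a few lines once the API requests are abstracted away. In this sketch, create_changeset_spec and create_campaign_spec are hypothetical callables standing in for the requests to the Sourcegraph instance; they are not names from src or the Sourcegraph API.

```python
def upload_specs(campaign_spec, changeset_specs,
                 create_changeset_spec, create_campaign_spec):
    """Upload changeset specs one by one, then the campaign spec with their IDs."""
    # One request per changeset spec; each returns the spec's ID on the instance.
    ids = [create_changeset_spec(spec) for spec in changeset_specs]
    # The campaign spec is created together with the collected IDs and
    # yields its own ID, which the final preview or apply step uses.
    return create_campaign_spec(campaign_spec, ids)
```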

Preview or apply the campaign spec

If src campaign apply was used, the ID of the campaign spec is used to send another request to the Sourcegraph instance, to apply the campaign spec.

If src campaign preview was used to execute and create the campaign spec, then a URL is printed, pointing to a preview page on the Sourcegraph instance on which we can see what would happen if we were to apply the campaign spec.
