Campaigns design doc

Why are campaigns designed the way they are?

Principles

  • Declarative API (not imperative). You declare your intent, such as "lint files in all repositories with a package.json file". The campaign figures out how to achieve your desired state. The external state (of repositories, changesets, code hosts, access tokens, etc.) can change at any time, and temporary errors frequently occur when reading from and writing to code hosts. These factors would make an imperative API very cumbersome because each API client would need to handle the complexity of the distributed system itself.
  • Define a campaign in a file (not some online API). The source of truth of a campaign's definition is a file that can be stored in version control, reviewed in code review, and re-applied by CI. This is in the same spirit as IaC (infrastructure as code; e.g., storing your Terraform/Kubernetes/etc. files in Git). We prefer this approach over a (worse) alternative where you define a campaign in a UI with a bunch of text fields, checkboxes, buttons, etc., and need to write a custom API client to import/export the campaign definition.
  • Shareable and portable. You can share your campaign specs, and it's easy for other people to use them. A campaign spec expresses an intent that's high-level enough to (usually) not be specific to your own particular repositories. You declare and inject configuration and secrets to customize it instead of hard-coding those values.
  • Large-scale. You can run campaigns across tens of thousands of repositories. It might take a while to compute and push everything, and the current implementation might cap out lower than that, but the fundamental design scales well.
  • Accommodates a variety of code hosts and review/merge processes. Specifically, we don't want to limit campaigns to only working for GitHub pull requests. (See current support list.)
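The declarative principle above can be illustrated with a minimal sketch. It is not the campaigns implementation: FakeCodeHost, TransientError, and apply are hypothetical names invented for this example, and the "code host" is an in-memory stand-in that fails every other write. The point is only that the client states the desired result once, while the apply loop absorbs temporary errors and skips work that is already done.

```python
from dataclasses import dataclass, field


class TransientError(Exception):
    """Stands in for a temporary code host failure (rate limit, timeout)."""


@dataclass
class FakeCodeHost:
    """Hypothetical in-memory code host whose writes fail every other call."""
    branches: dict = field(default_factory=dict)  # repo name -> branch name
    calls: int = 0

    def set_branch(self, repo: str, branch: str) -> None:
        self.calls += 1
        if self.calls % 2 == 1:
            raise TransientError(f"temporary failure writing to {repo}")
        self.branches[repo] = branch


def apply(desired: dict, host: FakeCodeHost, max_attempts: int = 5) -> int:
    """Drive the host toward `desired`; return how many writes succeeded."""
    writes = 0
    for repo, branch in desired.items():
        if host.branches.get(repo) == branch:
            continue  # already in the desired state, nothing to do
        for _ in range(max_attempts):
            try:
                host.set_branch(repo, branch)
                writes += 1
                break
            except TransientError:
                continue  # retry; the caller never sees the temporary error
        else:
            raise RuntimeError(f"gave up on {repo} after {max_attempts} attempts")
    return writes


host = FakeCodeHost()
desired = {"repo-a": "hello-world", "repo-b": "hello-world"}
first = apply(desired, host)   # performs the writes, retrying failures
second = apply(desired, host)  # re-applying is a no-op
```

Re-applying the same desired state does nothing the second time, which is what makes it safe to re-run a definition from CI.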

Comparison to other distributed systems

Kubernetes is a distributed system with an API that many people are familiar with. Campaigns is also a distributed system. All APIs for distributed systems need to handle a similar set of concerns around robustness, consistency, etc. Here's a comparison showing how these concerns are handled for a Kubernetes Deployment and a Sourcegraph campaign. In some cases, we've found Kubernetes to be a good source of inspiration for the campaigns API, but resembling Kubernetes is not an explicit goal.

Kubernetes Deployment vs. Sourcegraph campaign

What underlying thing does this API manage?

Kubernetes Deployment: Pods running on many (possibly unreliable) nodes

Sourcegraph campaign: Branches and changesets on many repositories that can be rate-limited and externally modified (and our authorization can change)

Spec YAML

Kubernetes Deployment:

    # File: foo.Deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      # Evaluate this to enumerate instances of...
      replicas: 2
      # ...this template.
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
            - name: nginx
              image: nginx:1.14.2
              ports:
                - containerPort: 80

Sourcegraph campaign:

    # File: hello-world.campaign.yaml
    name: hello-world
    description: Add Hello World to READMEs
    # Evaluate this to enumerate instances of...
    on:
      - repositoriesMatchingQuery: file:README.md
    steps:
      - run: echo Hello | tee -a $(find -name '*.md')
        container: alpine:3
    # ...this template.
    changesetTemplate:
      title: Hello World
      body: My first campaign!
      branch: hello-world
      commit:
        message: Append Hello to .md files
      published: false

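How desired state is computed

Kubernetes Deployment:
  1. Evaluate replicas, etc. (the part of the spec commented "Evaluate this to enumerate instances of...") to determine pod count and other template inputs
  2. Instantiate template (the part commented "...this template") once for each pod to produce PodSpecs

Sourcegraph campaign:
  1. Evaluate on, steps (the part of the spec commented "Evaluate this to enumerate instances of...") to determine list of patches
  2. Instantiate changesetTemplate (the part commented "...this template") once for each patch to produce ChangesetSpecs
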
Desired state consists of...

Kubernetes Deployment:
  • DeploymentSpec file (the YAML above)
  • List of PodSpecs (template instantiations)

Sourcegraph campaign:
  • CampaignSpec file (the YAML above)
  • List of ChangesetSpecs (template instantiations)

Where is the desired state computed?

Kubernetes Deployment: The deployment controller (part of the Kubernetes cluster) consults the DeploymentSpec and continuously computes the desired state.

Sourcegraph campaign: The Sourcegraph CLI (running on your local machine, not on the Sourcegraph server) consults the campaign spec and computes the desired state when you invoke src campaign apply.

Difference vs. Kubernetes: A campaign's desired state is computed locally, not on the server. It requires executing arbitrary commands, which is not yet supported by the Sourcegraph server. See campaigns known issue "Campaign steps are run locally...".

Reconciling desired state vs. actual state

Kubernetes Deployment: The "deployment controller" reconciles the resulting PodSpecs against the current actual PodSpecs (and does smart things like rolling deploy).

Sourcegraph campaign: The "campaign controller" (i.e., our backend) reconciles the resulting ChangesetSpecs against the current actual changesets (and does smart things like gradual roll-out/publishing and auto-merging when checks pass).
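
The campaign side of the comparison above has two phases: instantiate the changeset template once per patch, then reconcile the result against what actually exists. A minimal sketch of both follows. It is an illustration under assumptions, not the Sourcegraph backend: the ChangesetSpec fields, the compute_desired_state and reconcile functions, and the "create"/"update"/"close" action names are all hypothetical, and the real controller does considerably more (gradual publishing, auto-merging, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangesetSpec:
    """Hypothetical, simplified shape of one template instantiation."""
    repo: str
    branch: str
    title: str
    diff: str


def compute_desired_state(patches: dict, template: dict) -> list:
    """Instantiate the changeset template once per patch (repo -> diff)."""
    return [
        ChangesetSpec(repo=repo, branch=template["branch"],
                      title=template["title"], diff=diff)
        for repo, diff in sorted(patches.items())
    ]


def reconcile(desired: list, actual: dict) -> list:
    """Return (action, repo) pairs that move `actual` toward `desired`.

    `actual` maps repo -> the ChangesetSpec currently on the code host.
    The action names are assumptions made for this sketch.
    """
    desired_by_repo = {spec.repo: spec for spec in desired}
    actions = []
    for repo, spec in desired_by_repo.items():
        if repo not in actual:
            actions.append(("create", repo))
        elif actual[repo] != spec:
            actions.append(("update", repo))
    for repo in actual:
        if repo not in desired_by_repo:
            actions.append(("close", repo))
    return actions


template = {"branch": "hello-world", "title": "Hello World"}
patches = {"repo-a": "diff-a", "repo-b": "diff-b"}  # output of on + steps
desired = compute_desired_state(patches, template)

actual = {
    "repo-b": ChangesetSpec("repo-b", "hello-world", "Hello World", "old-diff"),
    "repo-c": ChangesetSpec("repo-c", "hello-world", "Hello World", "diff-c"),
}
plan = reconcile(desired, actual)
```

Because reconcile only compares desired and actual state, running it again after the plan has been carried out yields an empty plan, regardless of how the actual state got there.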

These docs explain more about Kubernetes' design: