Executors are Sourcegraph's solution for running untrusted code in a secure and controllable way.
To deploy executors to target your Sourcegraph instance, follow our deployment guide. We currently provide resources to deploy on Google Cloud, AWS, and bare-metal machines.
Why use executors?
Running untrusted code is a core requirement of features such as precise code intelligence auto-indexing and running batch changes server-side.
Auto-indexing jobs, in particular, require the invocation of arbitrary and untrusted code to support the resolution of project dependencies. Invocation of post-install hooks, use of insecure package management tools, and package manager proxy attacks can give an adversary the opportunity to gain unlimited use of compute or to exfiltrate data. The latter outcome is particularly dangerous for on-premises installations of Sourcegraph, which are the chosen option for companies that want to keep their code strictly private.
Instead of performing this work within the Sourcegraph instance, where code is available on disk and unprotected internal services are available over the local network, we move untrusted compute into a sandboxed environment, the executor, that has access only to the clone of a single repository on disk (its workspace) and to the public internet.
How it works
Compute jobs are coordinated by the executor binary, which polls a configured Sourcegraph instance for work over HTTPS. There is no need to forward ports or provide incoming firewall access, and the executors can be run across any number of machines and networks.
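The pull-based model described above can be sketched as a simple polling loop. This is a minimal illustration, not the executor's actual implementation: `poll_for_job`, the `dequeue` callable, and the dictionary-shaped job payload are all hypothetical names chosen for the example. In a real deployment, `dequeue` would stand in for an outbound HTTPS request to the Sourcegraph instance.

```python
import time
from typing import Callable, Optional

# Hypothetical job payload shape; the real payload is not described here.
Job = dict


def poll_for_job(
    dequeue: Callable[[], Optional[Job]],
    interval_seconds: float = 1.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Job]:
    """Call `dequeue` until it returns a job or `max_polls` is exhausted.

    `dequeue` is an assumption: it represents an outbound request from the
    executor to the Sourcegraph instance and returns None when no work is
    available.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        job = dequeue()
        if job is not None:
            return job
        polls += 1
        # Wait before asking again so an idle executor does not spin.
        sleep(interval_seconds)
    return None
```

Because every request originates from the executor, the only network requirement in this sketch is outbound connectivity to the instance, which is why no inbound ports or firewall rules are needed.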
When a compute job is available, it is handed out to an executor polling for work. After accepting a job, the executor spawns an empty Firecracker microVM via Weaveworks Ignite. A workspace prepared with the target repository is moved into the virtual machine. A series of Docker commands is then invoked inside the microVM, which generally produces an artifact on disk that is sent back to the Sourcegraph instance via src CLI. The status and logs of the compute job are streamed back to the Sourcegraph instance as the job progresses.
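The order of these stages can be sketched as follows. This is only a simulation under stated assumptions: the `Job` fields, the `run_job` function, and the log messages are hypothetical, and nothing is actually spawned or executed. Each stage simply reports what an executor would do at that point, with `stream_log` standing in for streaming logs back to the Sourcegraph instance.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Job:
    """Hypothetical job description; field names are illustrative only."""
    job_id: int
    repository: str
    docker_steps: List[str] = field(default_factory=list)


def run_job(job: Job, stream_log: Callable[[str], None]) -> str:
    """Walk through the lifecycle stages in order, emitting one log line each.

    No virtual machine or container is started here; the function only
    mirrors the sequence of stages described in the text.
    """
    stream_log(f"job {job.job_id}: accepted")
    stream_log("spawn empty microVM")
    stream_log(f"move workspace for {job.repository} into microVM")
    for step in job.docker_steps:
        stream_log(f"run step in microVM: {step}")
    artifact = f"artifact-{job.job_id}"  # placeholder name for the result on disk
    stream_log(f"upload {artifact} to Sourcegraph instance")
    status = "completed"
    stream_log(f"job {job.job_id}: {status}")
    return status
```

For example, `run_job(Job(1, "github.com/example/repo", ["index"]), print)` prints the six stages in order, from acceptance through the final status.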
We apply layered security (defense in depth) at untrusted boundaries. Untrusted code is run only within a fresh virtual machine, and the host machine running untrusted code does not have privileged access to the Sourcegraph instance. The API to which executor instances can authenticate provides only the exact data needed to perform the job. See Firecracker: Lightweight Virtualization for Serverless Applications for an in-depth look at the isolation model provided by the Firecracker Virtual Machine Monitor (VMM).