Developing code monitoring

What are code monitors?

In the simplest case, a code monitor runs a user-defined query and alerts the user whenever a new search result appears. More generally, a code monitor lets a user define one trigger and at least one action. A trigger defines a condition; when that condition evaluates to true, the trigger fires its actions.
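
The relationship between a trigger and its actions can be sketched in a few lines of Python. This is only an illustration: the names `CodeMonitor`, `Trigger`, `Action`, and `check` are hypothetical and do not come from the Sourcegraph codebase, and the trigger is modeled as a plain function that returns the new search results.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical model, for illustration only: none of these names come from
# the Sourcegraph codebase.
Trigger = Callable[[], List[str]]      # returns the new search results
Action = Callable[[List[str]], None]   # receives the results of an event


@dataclass
class CodeMonitor:
    trigger: Trigger                                      # exactly one trigger
    actions: List[Action] = field(default_factory=list)   # one or more actions

    def check(self) -> bool:
        """Evaluate the trigger once and run every action if it fires."""
        results = self.trigger()
        if not results:              # condition is false: no event
            return False
        for action in self.actions:  # condition is true: this is an event
            action(results)
        return True


# Usage: a trigger that finds one new result and an action that records it.
sent: List[str] = []
monitor = CodeMonitor(
    trigger=lambda: ["commit abc123"],
    actions=[lambda results: sent.append(f"email: {len(results)} new result(s)")],
)
fired = monitor.check()
```

Here `fired` is true and `sent` holds one message, because the trigger returned a non-empty result list.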

Glossary

| term | description | example |
| --- | --- | --- |
| code monitor | A code monitor is a group of 1 trigger and possibly multiple actions. | |
| trigger | A trigger defines a condition which is checked regularly. | New results for a diff/commit query |
| event | If a condition evaluates to true, we call this an event. | New results found |
| action | Events trigger actions. | Send email |

Starting up your environment

Code monitoring is an enterprise feature. To run it locally, start Sourcegraph as an enterprise service:

<sourcegraph root>/enterprise/dev/start.sh

Code monitoring is still a prototype, which means you have to enable it in the settings before it is visible in the UI. Open your local instance of Sourcegraph, go to settings > User account > settings, and add the following:

{
  "experimentalFeatures": {
    "codeMonitoring": true
  }
}

Database layout

| Table | Description |
| --- | --- |
| cm_monitors | Holds metadata of a code monitor. |
| cm_queries | Contains data for each (trigger) query. |
| cm_emails | Contains data for each (action) email. |
| cm_recipients | Each email action can have multiple recipients. Each recipient can be either a user or an organization. Each row in this table corresponds to one recipient. |
| cm_trigger_jobs | Contains jobs (past, present, future) to run triggers. Trigger jobs are linked to their triggers via a foreign key. |
| cm_action_jobs | Contains jobs (past, present, future) to run actions. Action jobs are linked to their action and to the event that triggered them via foreign keys. |

Each type of trigger or type of action is represented by its own table in the database; queries are represented by cm_queries, and emails are represented by cm_emails and cm_recipients. The job tables (cm_trigger_jobs and cm_action_jobs), on the other hand, contain the jobs for all types of triggers and actions.

For example, each type of action is represented by a separate nullable column in cm_action_jobs. The dequeue worker reads a record and dispatches based on which of the columns is filled. The table below shows cm_action_jobs with two jobs enqueued, one for sending emails (id=1) and one for posting to a webhook (id=2). The details for each action are contained in the records linked to with the foreign keys in columns email and webhook.

cm_action_jobs

| id | email | webhook | state |
| --- | --- | --- | --- |
| 1 | 1 | null | queued |
| 2 | null | 1 | queued |
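
The dispatch on nullable columns described above can be sketched as follows. This is a minimal Python illustration, not the actual worker: the `ActionJob` type and the `dispatch` function are hypothetical names, and the handlers only return a description of what they would do.

```python
from typing import List, Optional, TypedDict


class ActionJob(TypedDict):
    """One row of cm_action_jobs, reduced to the columns shown above."""
    id: int
    email: Optional[int]    # foreign key into cm_emails, or None
    webhook: Optional[int]  # foreign key to a webhook record, or None
    state: str


def dispatch(job: ActionJob) -> str:
    """Pick a handler based on which action column is filled (hypothetical)."""
    if job["email"] is not None:
        return f"send email {job['email']}"
    if job["webhook"] is not None:
        return f"post to webhook {job['webhook']}"
    raise ValueError(f"action job {job['id']} has no action column set")


# The two rows from the table above.
jobs: List[ActionJob] = [
    {"id": 1, "email": 1, "webhook": None, "state": "queued"},
    {"id": 2, "email": None, "webhook": 1, "state": "queued"},
]
handled = [dispatch(job) for job in jobs]
```

One consequence of this layout is that adding a new action type means adding a new nullable column and a new branch in the dispatcher, while the queueing logic stays shared.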

For more details, see schema.md.

Life of a code monitor

Let's follow the life of a code monitor with a query as a trigger and one email action.

  1. After you have created a code monitor, the following tables are filled:
    1. cm_monitors (1 entry)
    2. cm_queries (1 entry)
    3. cm_emails (at least 1 entry)
    4. cm_recipients (at least 1 entry)
  2. Enqueue trigger: Periodically, a background job enqueues queries, i.e. for each active query (column enabled=true in cm_queries), we create an entry in cm_trigger_jobs.
  3. Dequeue trigger/enqueue actions: Periodically, a background worker dequeues the trigger job and processes it. In our case the query is run. The last_result and next_run are logged to cm_queries; the num_results and the query are logged to cm_trigger_jobs. If the query returned at least 1 result, we call it an event. For each event the corresponding actions are enqueued in cm_action_jobs.
  4. Dequeue actions: Periodically, a background worker dequeues the action jobs queued in cm_action_jobs and processes them. In our case we retrieve all relevant information from cm_monitors, cm_trigger_jobs, cm_queries, cm_emails, and cm_recipients and send out an email to the recipients.
  5. Clean-up: Job logs are deleted after a predefined retention period. Job logs without search results are deleted soon after the trigger jobs have run.
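
Steps 2 to 4 above can be simulated with a small Python sketch. It uses an in-memory SQLite database and a heavily simplified, hypothetical schema (only a handful of columns, with made-up names such as query_text, query_id, and trigger_event), so it shows the flow of rows between tables rather than the real implementation. The `search` argument stands in for running the query.

```python
import sqlite3

# Simplified, hypothetical schema: only the columns this sketch needs.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE cm_queries (id INTEGER PRIMARY KEY, monitor INTEGER,
                         query_text TEXT, enabled INTEGER);
CREATE TABLE cm_emails (id INTEGER PRIMARY KEY, monitor INTEGER);
CREATE TABLE cm_trigger_jobs (id INTEGER PRIMARY KEY, query_id INTEGER,
                              state TEXT DEFAULT 'queued', num_results INTEGER);
CREATE TABLE cm_action_jobs (id INTEGER PRIMARY KEY, email INTEGER,
                             trigger_event INTEGER, state TEXT DEFAULT 'queued');
""")


def enqueue_triggers(db):
    # Step 2: one trigger job per enabled query.
    db.execute("INSERT INTO cm_trigger_jobs (query_id) "
               "SELECT id FROM cm_queries WHERE enabled = 1")


def run_trigger_jobs(db, search):
    # Step 3: run each queued query; a non-empty result is an event,
    # and every event enqueues the monitor's actions.
    rows = db.execute(
        "SELECT j.id, q.query_text, q.monitor FROM cm_trigger_jobs j "
        "JOIN cm_queries q ON q.id = j.query_id WHERE j.state = 'queued'"
    ).fetchall()
    for job_id, query_text, monitor in rows:
        num_results = len(search(query_text))
        db.execute("UPDATE cm_trigger_jobs SET state = 'completed', "
                   "num_results = ? WHERE id = ?", (num_results, job_id))
        if num_results > 0:
            db.execute("INSERT INTO cm_action_jobs (email, trigger_event) "
                       "SELECT id, ? FROM cm_emails WHERE monitor = ?",
                       (job_id, monitor))


def run_action_jobs(db):
    # Step 4: process queued action jobs; here "sending" only records the id.
    rows = db.execute("SELECT id, email FROM cm_action_jobs "
                      "WHERE state = 'queued'").fetchall()
    sent = []
    for job_id, email in rows:
        sent.append(email)
        db.execute("UPDATE cm_action_jobs SET state = 'completed' "
                   "WHERE id = ?", (job_id,))
    return sent


# Step 1: a monitor with one query trigger and one email action.
db.execute("INSERT INTO cm_queries VALUES (1, 1, 'type:diff TODO', 1)")
db.execute("INSERT INTO cm_emails VALUES (1, 1)")

enqueue_triggers(db)
run_trigger_jobs(db, search=lambda q: ["one new result"])
sent = run_action_jobs(db)
```

After this run, `sent` contains the id of the single email action, and the trigger job is marked completed with num_results set to 1.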

Architecture

The back end of code monitoring is split into two parts: the GraphQL API, running on frontend, and the background workers, running on repo-updater. Both rely on the store to access the database.

GraphQL API

The GraphQL API is defined here. The interfaces and stub resolvers are defined here, while the enterprise resolvers are defined here.

Background workers

The background workers utilize our internal/workerutil framework to run as background jobs on repo-updater.

Diving into the code as a backend developer

  1. A good start is to visualize the GraphQL schema and interact with it via the UI Console. Start from the node user and go to monitors from there.
  2. Check out the interfaces and stub resolvers and understand how they relate to the GraphQL schema.
  3. Do the same for the enterprise resolvers.
  4. Take a look at the background workers and look through each of the jobs that run in the background.
  5. Start up Sourcegraph locally, connect to your local db instance, create a code monitor from the UI and follow its life cycle in the db. Start by looking at cm_queries and cm_trigger_jobs. Depending on the search query you defined, you might have to wait a long time before the first action is enqueued in cm_action_jobs. You can accelerate the process by backdating columns last_result and next_run to the past.
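
The backdating trick from the last step amounts to a single UPDATE on cm_queries. The sketch below demonstrates it in Python against an in-memory SQLite table so that it is self-contained; the table here is a hypothetical stand-in with only the two columns named above, and the `backdate` helper is made up for this example. Against a local instance you would run the equivalent UPDATE in your database client instead.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for cm_queries with only the relevant columns.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cm_queries "
           "(id INTEGER PRIMARY KEY, last_result TEXT, next_run TEXT)")

now = datetime.now(timezone.utc)
db.execute(
    "INSERT INTO cm_queries (id, last_result, next_run) VALUES (1, ?, ?)",
    (now.isoformat(), (now + timedelta(minutes=5)).isoformat()),
)


def backdate(db, query_id, by):
    """Move last_result and next_run into the past so the query is due now."""
    past = (datetime.now(timezone.utc) - by).isoformat()
    db.execute(
        "UPDATE cm_queries SET last_result = ?, next_run = ? WHERE id = ?",
        (past, past, query_id),
    )
    return past


# Pretend the query last produced a result, and was due to run, a day ago.
past = backdate(db, 1, timedelta(days=1))
```

With next_run in the past, the next pass of the enqueue job should pick the query up instead of waiting for its regular schedule.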