Life of a ping

Each Sourcegraph instance periodically sends an HTTP request to sourcegraph.com to check whether an update is available and to report anonymized, aggregated usage statistics.
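
The exact endpoint and payload live in the Sourcegraph codebase; as a rough sketch of the client side, assuming a hypothetical `/.api/updates` endpoint and `version`/`site` query parameters:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"time"
)

// checkForUpdates sketches the instance side of the ping. The endpoint
// path and parameter names are illustrative, not the exact ones
// Sourcegraph uses.
func checkForUpdates(currentVersion, siteID string) (string, error) {
	q := url.Values{}
	q.Set("version", currentVersion) // currently deployed version
	q.Set("site", siteID)            // anonymous instance identifier
	// Anonymized, aggregated usage statistics would be attached here too.

	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get("https://sourcegraph.com/.api/updates?" + q.Encode())
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusNoContent: // 204: already on the newest version
		return "", nil
	case http.StatusOK: // 200: body carries the newer version string
		b, err := io.ReadAll(resp.Body)
		if err != nil {
			return "", err
		}
		return string(b), nil
	default:
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
}

func main() {
	newer, err := checkForUpdates("5.0.0", "site-abc123")
	if err != nil {
		fmt.Println("update check failed:", err)
		return
	}
	if newer == "" {
		fmt.Println("up to date")
	} else {
		fmt.Println("new version available:", newer)
	}
}
```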

On sourcegraph.com, this request is handled here: the server responds with 204 No Content if the instance is already on the newest available version, or with 200 OK and the newer version string otherwise. In both cases, the request data is serialized into a JSON blob and published to a Pub/Sub topic.
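
A rough sketch of that handler's shape, using the standard cloud.google.com/go/pubsub client; the route, project ID, topic ID, and payload fields are illustrative, not the production values:

```go
package main

import (
	"context"
	"encoding/json"
	"log"
	"net/http"

	"cloud.google.com/go/pubsub"
)

var (
	latestVersion = "5.0.0"     // illustrative; maintained elsewhere in production
	pingTopic     *pubsub.Topic // the update-check Pub/Sub topic
)

// handleUpdateCheck responds 204 No Content if the instance is current,
// or 200 OK with the newer version string, and publishes the ping data
// to Pub/Sub in both cases.
func handleUpdateCheck(w http.ResponseWriter, r *http.Request) {
	instanceVersion := r.URL.Query().Get("version")

	// Serialize the request data into a JSON blob and publish it.
	payload, err := json.Marshal(map[string]string{
		"version": instanceVersion,
		"site":    r.URL.Query().Get("site"),
	})
	if err == nil {
		// Publish is asynchronous; its result can be awaited if needed.
		pingTopic.Publish(r.Context(), &pubsub.Message{Data: payload})
	}

	if instanceVersion == latestVersion {
		w.WriteHeader(http.StatusNoContent) // up to date
		return
	}
	w.WriteHeader(http.StatusOK)
	w.Write([]byte(latestVersion)) // newer version available
}

func main() {
	ctx := context.Background()
	client, err := pubsub.NewClient(ctx, "my-project") // project ID is illustrative
	if err != nil {
		log.Fatal(err)
	}
	pingTopic = client.Topic("server-update-checks") // topic ID is illustrative
	http.HandleFunc("/.api/updates", handleUpdateCheck)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```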

A Dataflow job subscribes to the topic and transforms each message payload into a row of a BigQuery table. The job applies a small pre-transform (defined here) to the payload before converting it to the BigQuery table's schema. The resulting records are inserted into the table in batches.
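
A sketch of what such a pre-transform looks like, with hypothetical payload and schema field names; the real transform is the one defined in the linked job:

```go
package transform

import (
	"encoding/json"
	"strings"
	"time"
)

// rawPing mirrors the JSON blob published to the Pub/Sub topic.
// Field names are illustrative, not the production payload.
type rawPing struct {
	Version  string `json:"version"`
	Site     string `json:"site"`
	RemoteIP string `json:"remote_ip"`
}

// updateCheckRow mirrors the BigQuery table schema: the insert must send
// exactly these fields and nothing more (see the caution below).
type updateCheckRow struct {
	Version    string    `json:"version"`
	SiteID     string    `json:"site_id"`
	ReceivedAt time.Time `json:"received_at"`
}

// preTransform converts one Pub/Sub message payload into a record that
// matches the table schema, renaming fields and dropping ones the
// schema doesn't have.
func preTransform(data []byte, now time.Time) (updateCheckRow, error) {
	var p rawPing
	if err := json.Unmarshal(data, &p); err != nil {
		return updateCheckRow{}, err
	}
	return updateCheckRow{
		Version:    strings.TrimSpace(p.Version),
		SiteID:     p.Site, // RemoteIP is deliberately dropped: not in the schema
		ReceivedAt: now,
	}, nil
}
```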

Caution: Data sent to BigQuery must match the table schema exactly. Sending extra fields causes the insertion to fail, and every ping received while insertions are failing is permanently lost.
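
To make the failure mode concrete, here is a sketch of a batch insert with the cloud.google.com/go/bigquery client; the project, dataset, table, and field names are assumptions:

```go
package loader

import (
	"context"
	"fmt"
	"time"

	"cloud.google.com/go/bigquery"
)

// updateCheckRow must line up with the table schema exactly; the
// field names here are illustrative.
type updateCheckRow struct {
	Version    string    `bigquery:"version"`
	SiteID     string    `bigquery:"site_id"`
	ReceivedAt time.Time `bigquery:"received_at"`
}

// insertRows streams a batch of records into the update-checks table.
// If any row carried a field the schema lacks, Put would reject it,
// and the pings in the failing batch would be lost.
func insertRows(ctx context.Context, rows []updateCheckRow) error {
	client, err := bigquery.NewClient(ctx, "my-project") // IDs are illustrative
	if err != nil {
		return err
	}
	defer client.Close()

	ins := client.Dataset("pings").Table("update_checks").Inserter()
	if err := ins.Put(ctx, rows); err != nil {
		return fmt.Errorf("insert failed, batch dropped: %w", err)
	}
	return nil
}
```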

BigQuery scheduled queries are used to populate materialized views over the update-checks table for easier analysis and visualization in external tools such as Looker. For example, this scheduled query populates a table with the daily usage count for each instance.
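
For consistency with the examples above, here is a sketch in Go of the kind of aggregation such a scheduled query runs; the dataset, table, and column names are assumptions:

```go
package report

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery"
	"google.golang.org/api/iterator"
)

// usageRow holds one (day, instance) ping count.
type usageRow struct {
	Day    string `bigquery:"day"`
	SiteID string `bigquery:"site_id"`
	Pings  int64  `bigquery:"pings"`
}

// dailyUsage runs the kind of aggregation the scheduled query
// materializes: ping counts per day per instance.
func dailyUsage(ctx context.Context) error {
	client, err := bigquery.NewClient(ctx, "my-project")
	if err != nil {
		return err
	}
	defer client.Close()

	q := client.Query(`
		SELECT FORMAT_DATE('%F', DATE(received_at)) AS day,
		       site_id,
		       COUNT(*) AS pings
		FROM pings.update_checks
		GROUP BY day, site_id
		ORDER BY day DESC`)
	it, err := q.Read(ctx)
	if err != nil {
		return err
	}
	for {
		var row usageRow
		err := it.Next(&row)
		if err == iterator.Done {
			break
		}
		if err != nil {
			return err
		}
		fmt.Println(row.Day, row.SiteID, row.Pings)
	}
	return nil
}
```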