Telemetry

Telemetry describes the logging of user events, such as a page view or search, from various components of the Sourcegraph and Cody applications. There are currently two ways to log product telemetry:

  • legacy mechanisms outlined in DEPRECATED: Telemetry, including writing directly to the event_logs database table or using mutation { logEvent }
  • the new telemetry framework, available in Sourcegraph 5.2 and later (documented on this page)

All usages of old telemetry mechanisms should be migrated to the new framework.

Why a new framework and APIs?

The new telemetry framework and APIs aim to address the following issues:

  • The existing event_logs parameters are arbitrarily shaped. To provide stronger guarantees against accidentally exporting sensitive data, the new APIs enforce stricter requirements, such as numeric metadata - see recording events for more details.
  • The shape of existing event_logs entries has grown organically over time without a clear structured schema. Callsites must construct full events on their own, and we cannot easily prune event objects of potentially sensitive attributes before export.

Events recorded in the new framework and APIs are still translated into the existing event_logs table for admin analytics on a best-effort basis - see event lifecycle for more details.

Event lifecycle

All events stay in the instance where they are recorded until they are exported. Users of standalone Sourcegraph instances should no longer report any telemetry directly to the Sourcegraph.com deployment, and should instead report events to their own Sourcegraph instance.

In general, the lifecycle of an event in the new system looks like this:

  1. A telemetry event is recorded. This can happen in clients using SDKs like @sourcegraph/telemetry, or using internal/telemetry/telemetryrecorder in the backend.
  2. Within each telemetry SDK, additional metadata is automatically injected - in clients through processors and the GraphQL mutation, and in the backend through the events adapter.
  3. The telemetry event is translated into the existing event_logs table (for use in admin analytics), and stored in a temporary queue for export - see storing events.
  4. Periodically, events are read from the queue and exported to Sourcegraph's Telemetry Gateway service, which forwards them to our data warehouse - see exported events and exporting events.

See telemetry export architecture for more details.
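
The lifecycle steps above can be illustrated with a minimal, self-contained sketch. It is not the actual implementation: Event, Queue, Record, and Export are hypothetical names, and the queue is kept in memory here, whereas a real instance stores queued events until they are exported.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical, simplified stand-in for a telemetry event.
type Event struct {
	Feature   string
	Action    string
	Metadata  map[string]float64
	Timestamp time.Time // injected by the recorder, not supplied by the caller
}

// Queue is a hypothetical in-memory export queue.
type Queue struct {
	pending []Event
}

// Record injects shared metadata (step 2) and enqueues the event (step 3).
func (q *Queue) Record(feature, action string, metadata map[string]float64) {
	q.pending = append(q.pending, Event{
		Feature:   feature,
		Action:    action,
		Metadata:  metadata,
		Timestamp: time.Now(),
	})
}

// Export hands up to batchSize queued events to send (step 4). Events are
// removed from the queue only if send succeeds, so a failed export can be
// retried on the next run.
func (q *Queue) Export(batchSize int, send func([]Event) error) (int, error) {
	n := len(q.pending)
	if batchSize < n {
		n = batchSize
	}
	if n <= 0 {
		return 0, nil
	}
	if err := send(q.pending[:n]); err != nil {
		return 0, err
	}
	q.pending = q.pending[n:]
	return n, nil
}

func main() {
	q := &Queue{}
	q.Record("myFeature", "view", map[string]float64{"count": 1})
	q.Record("myFeature", "search", nil)

	exported, err := q.Export(10, func(batch []Event) error {
		for _, e := range batch {
			fmt.Printf("exporting %s.%s\n", e.Feature, e.Action)
		}
		return nil
	})
	fmt.Println(exported, err, len(q.pending))
}
```

Removing events only after a successful send is one way to model "events stay in the instance until they get exported"; the real exporter may handle retries and retention differently.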

Recording events

Events are recorded via the recording APIs available on each of the platforms documented below.

Note that:

  • Recording APIs are intentionally stricter and have a smaller surface area than the full events we end up exporting. This makes it clear what properties should be injected in a uniform manner serverside instead of being constructed ad-hoc by callers - see event lifecycle for details.
  • Metadata that gets exported by default only accepts numeric values. This offers a guard against accidentally exporting sensitive data. Arbitrarily shaped metadata can be collected, but not exported, via the privateMetadata parameter - see sensitive attributes.
  • An escape hatch to export arbitrarily shaped metadata is available via an instance-side allowlist - see sensitive attributes.
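
Because exported metadata only accepts numeric values, categorical values need a numeric representation before they can go into it. The sketch below shows one possible approach, using a small enum. EventParameters here is a simplified local stand-in, and SearchKind and searchParameters are hypothetical names invented for the example, not part of any Sourcegraph package.

```go
package main

import "fmt"

// EventParameters is a simplified, local stand-in for the recording
// parameters described above.
type EventParameters struct {
	Metadata        map[string]float64 // numeric only; exported by default
	PrivateMetadata map[string]any     // arbitrary shape; not exported by default
}

// SearchKind is a hypothetical enum that represents a categorical value
// as a number, so that it fits into numeric metadata.
type SearchKind int

const (
	SearchKindLiteral SearchKind = iota + 1
	SearchKindRegexp
	SearchKindStructural
)

// searchParameters keeps exportable, numeric facts in Metadata and the
// potentially sensitive query string in PrivateMetadata.
func searchParameters(kind SearchKind, resultCount int, query string) EventParameters {
	return EventParameters{
		Metadata: map[string]float64{
			"kind":    float64(kind),
			"results": float64(resultCount),
		},
		PrivateMetadata: map[string]any{"query": query},
	}
}

func main() {
	p := searchParameters(SearchKindRegexp, 3, "foo.*")
	fmt.Println(p.Metadata["kind"], p.Metadata["results"], p.PrivateMetadata["query"])
}
```

Starting the enum at 1 keeps 0 free to mean "unset", which is a design choice of this sketch rather than a requirement of the framework.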

Clients

Clients (web apps, extensions, etc.) should use @sourcegraph/telemetry, providing client-specific metadata and an implementation for exporting to a Sourcegraph instance's mutation { telemetry { recordEvent(...) }} GraphQL mutation. sourcegraph/cody#1192 is a pull request demonstrating how to integrate @sourcegraph/telemetry into a client by extending specific classes and providing backing implementations for various interfaces.

Cody extensions

VS Code

Event-recording development documentation for the VS Code extension is available in sourcegraph/cody/vscode/CONTRIBUTING.md's "Telemetry events" section.

Cody Agent

Sourcegraph web app

A shared event recorder for web app components is available in the platform context type, under (PlatformContext).telemetryRecorder:

import type { PlatformContext } from '@sourcegraph/shared/src/platform/context'

In the web app, if a component has PlatformContext available, the telemetryRecorder instance can be used directly - otherwise, it can be prop-drilled in from the closest parent component with PlatformContext available.

Backend services

In the backend, events are recorded using EventRecorder instances created from the internal/telemetry/telemetryrecorder package. For example:

import (
  "context"

  "github.com/sourcegraph/sourcegraph/internal/database"
  "github.com/sourcegraph/sourcegraph/internal/telemetry"
  "github.com/sourcegraph/sourcegraph/internal/telemetry/telemetryrecorder"
)

func doMyThing(ctx context.Context, db database.DB) error {
  recorder := telemetryrecorder.New(db)

  if err := recorder.Record(ctx, "myFeature", "myAction", &telemetry.EventParameters{
    Version: 0,
    // Exported by default: numeric values only.
    Metadata: telemetry.EventMetadata{"my_metadata": 12},
    // Not exported by default - see 'Sensitive attributes'.
    PrivateMetadata: map[string]any{"my_private_metadata": 42},
  }); err != nil {
    return err
  }
  return nil
}

If you don't care about failures to record telemetry, you can use telemetryrecorder.NewBestEffort(log.Logger, database.DB) to automatically have errors logged and not returned.
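
The best-effort pattern can be sketched as a thin wrapper that logs recording errors instead of returning them. This is only an illustration of the idea, not the telemetryrecorder implementation: Recorder, BestEffort, and failingRecorder are hypothetical types defined for the example.

```go
package main

import (
	"context"
	"errors"
	"log"
)

// Recorder is a hypothetical minimal interface for something that records
// events and may fail.
type Recorder interface {
	Record(ctx context.Context, feature, action string) error
}

// BestEffort wraps a Recorder and logs failures instead of returning them,
// so callers do not need to handle telemetry errors.
type BestEffort struct {
	inner Recorder
	logf  func(format string, args ...any)
}

// Record forwards to the wrapped Recorder and swallows any error after
// logging it.
func (b BestEffort) Record(ctx context.Context, feature, action string) {
	if err := b.inner.Record(ctx, feature, action); err != nil {
		b.logf("failed to record telemetry event %s.%s: %v", feature, action, err)
	}
}

// failingRecorder always fails; it exists only to exercise the wrapper.
type failingRecorder struct{}

func (failingRecorder) Record(context.Context, string, string) error {
	return errors.New("store unavailable")
}

func main() {
	r := BestEffort{inner: failingRecorder{}, logf: log.Printf}
	// The caller has no error to handle; the failure only shows up in logs.
	r.Record(context.Background(), "myFeature", "myAction")
}
```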

Note that not all attributes are exported - see Sensitive attributes for details.

Exported events

See telemetry export architecture for more details on how exporting events works.

A detailed schema is available in the Telemetry Gateway protocol documentation, which also has more details about what kind of data gets exported and what components are generally pruned.

Exported event schema

The full event schema is intentionally a significant superset of the shape of the event-recording APIs. Standardized metadata (users, feature flags, etc.) is automatically added at various points in an event's lifecycle - callsites should only be concerned with properties associated with the specific event.

The full event schema that ends up getting exported is defined in telemetrygateway.proto's Event message type. The event forwarded from Telemetry Gateway currently has the following shape:

{
  "metadata": {
    "identifier": {
      // ... telemetrygatewayv1.Identifier
    }
  },
  "event": {
    // ... telemetrygatewayv1.Event
  }
}

Also see sensitive attributes below for the attributes that are not exported by default.
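
As an illustration, a consumer could decode the forwarded envelope shown above while leaving the nested messages untouched. The sketch below assumes the payload arrives as JSON with exactly the two top-level keys shown. ForwardedEvent and decodeForwarded are hypothetical names, and the contents of the sample "identifier" and "event" objects are placeholders, not the real schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ForwardedEvent mirrors the envelope shape shown above. The nested
// messages are kept as raw JSON because their schema is defined in
// telemetrygateway.proto, not in this sketch.
type ForwardedEvent struct {
	Metadata struct {
		Identifier json.RawMessage `json:"identifier"`
	} `json:"metadata"`
	Event json.RawMessage `json:"event"`
}

// decodeForwarded parses the envelope without interpreting the nested
// identifier and event messages.
func decodeForwarded(payload []byte) (ForwardedEvent, error) {
	var fe ForwardedEvent
	err := json.Unmarshal(payload, &fe)
	return fe, err
}

func main() {
	// Placeholder payload: the fields inside "identifier" and "event" are
	// made up for this example.
	payload := []byte(`{"metadata":{"identifier":{"example":true}},"event":{"feature":"myFeature"}}`)

	fe, err := decodeForwarded(payload)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(string(fe.Metadata.Identifier))
	fmt.Println(string(fe.Event))
}
```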

Sensitive attributes

There are two core attributes in events that are considered potentially sensitive, and thus not exported from individual Sourcegraph instances:

  • parameters.privateMetadata: this field allows the recording of arbitrarily shaped metadata, as opposed to the numeric values supported in parameters.metadata. Due to the risk of sensitive data and PII exposure, we do not export this field by default.
  • marketingTracking: this field tracks a lot of properties around URLs visited and marketing tracking that may contain sensitive data. This is only exported from the Sourcegraph.com instance.
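
The pruning rules above can be sketched as a function applied to each event before export. This is a simplified illustration under stated assumptions, not the actual export code: Event and ExportPolicy are hypothetical types, and keying the allowlist by "feature.action" is an assumption made for the example.

```go
package main

import "fmt"

// Event is a hypothetical, simplified event with the two potentially
// sensitive attributes described above.
type Event struct {
	Feature           string
	Action            string
	Metadata          map[string]float64
	PrivateMetadata   map[string]any
	MarketingTracking map[string]string
}

// ExportPolicy is a hypothetical description of what an instance may export.
type ExportPolicy struct {
	IsDotcom             bool            // true only for the Sourcegraph.com instance
	AllowPrivateMetadata map[string]bool // assumed allowlist, keyed by "feature.action"
}

// Prune returns a copy of the event with sensitive attributes dropped
// unless the policy explicitly permits exporting them.
func (p ExportPolicy) Prune(e Event) Event {
	out := e // shallow copy: fields are dropped on the copy, the input is untouched
	if !p.AllowPrivateMetadata[e.Feature+"."+e.Action] {
		out.PrivateMetadata = nil
	}
	if !p.IsDotcom {
		out.MarketingTracking = nil
	}
	return out
}

func main() {
	e := Event{
		Feature:           "myFeature",
		Action:            "myAction",
		Metadata:          map[string]float64{"my_metadata": 12},
		PrivateMetadata:   map[string]any{"my_private_metadata": "value"},
		MarketingTracking: map[string]string{"url": "https://example.com"},
	}

	pruned := ExportPolicy{}.Prune(e)
	fmt.Println(pruned.Metadata, pruned.PrivateMetadata == nil, pruned.MarketingTracking == nil)
}
```

Numeric metadata passes through unchanged, which reflects why the recording APIs restrict it to numbers in the first place.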

Testing events

When adding events in the new telemetry framework, you can verify that they are being recorded by:

  1. Checking your events stored directly in event_logs after recording.
  2. Observing the raw payloads that the Telemetry Gateway ends up publishing in logs when running Telemetry Gateway locally.
    • Note that the internal queue table only stores events until they are exported, and events are stored in raw Protobuf wire format - see storing events.

In integration and unit tests, you can also provide a mocked telemetry recording implementation to assert that various events are recorded as expected. For example, in the backend, you can use package internal/telemetry/telemetrytest, which provides a variety of testing utilities:

import (
  "context"
  "testing"

  "github.com/stretchr/testify/require"

  "github.com/sourcegraph/sourcegraph/internal/telemetry"
  "github.com/sourcegraph/sourcegraph/internal/telemetry/telemetrytest"
)

func TestRecorder(t *testing.T) {
  store := telemetrytest.NewMockEventsStore()
  recorder := telemetry.NewEventRecorder(store)

  err := recorder.Record(context.Background(), "Feature", "Action", nil)
  require.NoError(t, err)

  // stored once
  require.Len(t, store.StoreEventsFunc.History(), 1)
  // called with 1 event
  require.Len(t, store.StoreEventsFunc.History()[0].Arg1, 1)
  // stored event has the expected feature
  require.Equal(t, "Feature", store.StoreEventsFunc.History()[0].Arg1[0].Feature)
}