Testing principles

This file documents how we test code at Sourcegraph.

Related pages: How to run tests | Testing Go code | Testing web code | Continuous integration

Philosophy

We rely on automated testing to ensure the quality of our product.

Any addition or change to our codebase should be covered by an appropriate number of automated tests to ensure that:

  1. Our product and code work as intended when we ship them to customers.
  2. Our product and code don’t accidentally break as we make changes over time.

A good automated test suite increases the velocity of our team because it allows engineers to confidently edit and refactor code, especially code authored by someone else.

Engineers should budget an appropriate amount of time for writing tests when making iteration plans.

Types of tests

SOC2/GN-105

To stay true to our philosophy, we test our codebase in several ways.

These include, but are not limited to:

  • Image vulnerability scanning
  • Infrastructure as code static analyses
  • Unit, integration, and end-to-end tests as outlined in the testing pyramid

Our goal is to ensure that our product and code work, and that all reasonable effort has been taken to reduce the risk of a security-related incident associated with Sourcegraph.

Also see continuous integration and internal infrastructure testing.

Failures on the main branch

A red main build is not okay and must be fixed. Consecutive failed builds on the main branch mean that the releasability contract is broken, and that we can neither confidently ship that revision to our customers nor deploy it in the Cloud environment.

Flaky tests

We do not tolerate flaky tests of any kind. Any engineer who sees a flaky test in continuous integration should immediately disable it.

Why are flaky tests undesirable? Because these tests stop being an informative signal that the engineering team can rely on, and if we keep them around then we eventually train ourselves to ignore them and become blind to their results. This can hide real problems under the cover of flakiness.
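As a minimal sketch of what disabling a flaky test can look like in Go, the standard `t.Skip` call marks a test as skipped while leaving its body in place for whoever fixes it. The helper `flakyReason`, the test name, and the placeholder issue reference are hypothetical, and recording a tracking issue in the skip message is an assumption rather than a documented Sourcegraph convention.

```go
package main

import "testing"

// flakyReason builds the message shown in test output for a skipped test.
// Hypothetical helper: pointing at a tracking issue is an assumption.
func flakyReason(issue string) string {
	return "disabled as flaky, see " + issue
}

// TestSyncFinishesInTime is a hypothetical test that was observed to flake.
func TestSyncFinishesInTime(t *testing.T) {
	// t.Skip stops this test immediately and reports it as skipped,
	// so it no longer turns the build red.
	t.Skip(flakyReason("ISSUE-LINK-PLACEHOLDER"))

	// The original assertions stay below so the test can be re-enabled
	// by deleting the single t.Skip line once the flakiness is fixed.
}
```

Running `go test -v` lists the test as skipped together with the reason, which keeps the disabled test visible instead of silently deleting it.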

Other kinds of flakes include flaky steps and flaky infrastructure.

Testing pyramid

Unit tests

Unit tests test individual functions in our codebase and are the most desirable kind of test to write.

Benefits:

  • They are usually very fast to execute because slow operations can be mocked.
  • They are the easiest tests to write, debug, and maintain because the code under test is small.
  • They only need to run on changes that touch code which could make the test fail, which makes CI faster and minimizes the impact of any flakiness.

Tradeoffs:

  • They don’t verify our systems are wired up correctly end-to-end.
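The points above can be illustrated with a small Go sketch in which a slow operation sits behind an interface and the test swaps in an in-memory fake. All names here (`CommitCounter`, `IsActive`, `fakeCounter`) are hypothetical and not taken from the Sourcegraph codebase.

```go
package main

import (
	"errors"
	"testing"
)

// CommitCounter is a hypothetical seam that hides a slow operation,
// such as a call to a Git server.
type CommitCounter interface {
	CountCommits(repo string) (int, error)
}

// IsActive is the hypothetical function under test: a repository is
// active when it has at least minCommits commits.
func IsActive(c CommitCounter, repo string, minCommits int) (bool, error) {
	n, err := c.CountCommits(repo)
	if err != nil {
		return false, err
	}
	return n >= minCommits, nil
}

// fakeCounter is an in-memory stand-in, so the test needs no network or disk.
type fakeCounter map[string]int

func (f fakeCounter) CountCommits(repo string) (int, error) {
	n, ok := f[repo]
	if !ok {
		return 0, errors.New("repo not found")
	}
	return n, nil
}

func TestIsActive(t *testing.T) {
	fake := fakeCounter{"a/busy": 120, "a/quiet": 3}

	tests := []struct {
		repo    string
		want    bool
		wantErr bool
	}{
		{repo: "a/busy", want: true},
		{repo: "a/quiet", want: false},
		{repo: "a/missing", wantErr: true},
	}
	for _, tc := range tests {
		got, err := IsActive(fake, tc.repo, 10)
		if (err != nil) != tc.wantErr {
			t.Errorf("IsActive(%q) error = %v, wantErr %v", tc.repo, err, tc.wantErr)
		}
		if got != tc.want {
			t.Errorf("IsActive(%q) = %v, want %v", tc.repo, got, tc.want)
		}
	}
}
```

Because the fake replaces the slow dependency, the test runs quickly and a failure points directly at `IsActive`. It says nothing about whether a real `CommitCounter` is wired up correctly, which is the tradeoff noted above.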

Integration tests

Integration tests test the behavior of a subset of our entire system to ensure that subset of our system is wired up correctly.

Benefits:

  • To the extent that fewer systems are under test compared to e2e tests, they are faster to run, easier to debug, have clearer ownership, and are less vulnerable to flakiness.
  • They only need to run on changes that touch code which could make the test fail, which makes CI faster and minimizes the impact of any flakiness.

Tradeoffs:

  • They don’t verify our systems are wired up correctly end-to-end.
  • They are not as easy to write as unit tests.

Examples:

  • Tests that call our search API to test the behavior of our entire search system.
  • Tests that validate UI behavior in the browser while mocking out all network requests so no backend is required.
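In the spirit of the first example, here is a rough Go sketch of an integration test that exercises routing, a handler, and JSON encoding together through the standard `net/http/httptest` package. The `/search` endpoint, `searchIndex`, `newSearchHandler`, and `runSearch` are hypothetical stand-ins, not Sourcegraph's actual search API.

```go
package main

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
	"testing"
)

// searchIndex is a hypothetical in-memory backend.
type searchIndex []string

// search returns every document that contains q.
func (idx searchIndex) search(q string) []string {
	matches := []string{}
	for _, doc := range idx {
		if strings.Contains(doc, q) {
			matches = append(matches, doc)
		}
	}
	return matches
}

// newSearchHandler wires the index into an HTTP endpoint (hypothetical route).
func newSearchHandler(idx searchIndex) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/search", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		results := idx.search(r.URL.Query().Get("q"))
		if err := json.NewEncoder(w).Encode(results); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})
	return mux
}

// runSearch sends a request through the handler and decodes the response,
// covering routing, query parsing, search, and serialization in one pass.
func runSearch(h http.Handler, q string) (int, []string, error) {
	req := httptest.NewRequest(http.MethodGet, "/search?q="+url.QueryEscape(q), nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)

	var results []string
	err := json.NewDecoder(rec.Body).Decode(&results)
	return rec.Code, results, err
}

func TestSearchEndpoint(t *testing.T) {
	h := newSearchHandler(searchIndex{"func main", "func helper", "type T"})

	code, got, err := runSearch(h, "func")
	if err != nil {
		t.Fatalf("decoding response: %v", err)
	}
	if code != http.StatusOK || len(got) != 2 {
		t.Fatalf("got status %d and results %v, want 200 and 2 results", code, got)
	}
}
```

The sketch uses `httptest.NewRecorder` rather than a listening server, so several layers are tested together without opening a network port. A test against a real deployment of the whole product would belong in the e2e category instead.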

End-to-end tests (e2e)

E2e tests test our entire product from the perspective of a user. We try to use them sparingly. Instead, we prefer to get as much confidence as possible from our unit tests and integration tests.

Benefits:

  • They verify our systems are wired up correctly end-to-end.

Tradeoffs:

  • They are typically the slowest tests to execute because we have to build and run our entire product.
  • They are the hardest tests to debug because failures can be caused by a defect anywhere in our system. This can also make ownership of failures unclear.
  • They are the most vulnerable to flakiness because there are a lot of moving parts.

Examples:

  • Run our Sourcegraph Docker image and verify that site admins can complete the registration flow.
  • Run our Sourcegraph Docker image and verify that users can sign in and perform a search.

Visual testing

Visual testing is useful to catch visual regressions and verify designs for new features. See the visual testing philosophy for more info.

We use Chromatic Storybook to detect visual changes in specific React components. Post a message in #dev-chat that you need access to Chromatic, and someone will add you to our organization (you will also receive an invitation via e-mail). You should sign into Chromatic with your GitHub account. If a PR you author has visual changes, a UI Review in Chromatic will be generated. We recommend that a designer approve the UI review.

We use Percy to detect visual changes in Sourcegraph features during browser-based tests (client integration tests and end-to-end tests). You may need permissions to update screenshots if your feature introduces visual changes. Post a message in #dev-chat that you need access to Percy, and someone will add you to our organization (you will also receive an invitation via e-mail). Once you’ve been invited to the Sourcegraph organization and created a Percy account, you should then link it to your GitHub account.

Ownership

Conventions