Migrating to Sourcegraph 3.31.x

In Sourcegraph 3.31.x, both the built-in main Postgres (pgsql) and codeintel (codeintel-db) databases have switched to an Alpine-based Docker image - this has been done to resolve vulnerabilities found in Debian but not Alpine. Upon upgrading, Sourcegraph will need to re-index the entire database. This process requires some preparation, so please read through all of the instructions on the rest of the page beforehand.

For customers who previously upgraded to 3.27, there is a possibility that upgrade inadvertently introduced a major glibc change. This may have caused corruption in some indexes. While this is unlikely to have been noticable up til now, upgrading to 3.31 (or more specifically the subsequent reindexing) will fail unless certain steps are taken before hand. Customers who have installed fresh from 3.27 or later should be unaffected by this potential data corruption, but are still likely to see the delay caused by the reindexing following upgrade.


Check for prior index corruption before upgrading

There is a possibility that prior Sourcegraph upgrades inadvertently introduced a major glibc change. This can cause corruption in indexes that have collatable key columns (e.g. any index with a text column). Read more about this here.

If your indexes are corrupted, then there is also a possibility that there is bad data in your databases that would cause the re-indexing process (and thus the 3.31.x upgrade) to fail. In order to do a first-pass check for corrupt indexes, please run the following SQL query against both of the following instances before upgrading to 3.31.x:

  1. pgsql
  2. codeintel-db

For more information about how to access a database container and run queries via psql see our admin documentation for kubernetes or docker-compose

create extension amcheck;

select bt_index_parent_check(c.oid, true), c.relname, c.relpages
from pg_index i
join pg_opclass op ON i.indclass[0] = op.oid
join pg_am am ON op.opcmethod = am.oid
join pg_class c ON i.indexrelid = c.oid
join pg_namespace n ON c.relnamespace = n.oid
where am.amname = 'btree'
-- Don't check temp tables, which may be from another session:
and c.relpersistence != 't'
-- Function may throw an error when this is omitted:
and i.indisready AND i.indisvalid;

If no errors are reported, then it’s probable that your indexes are fine (and thus no bad data is in your databases). You can proceed to “Prepare for downtime”.

If any errors are reported, please contact customer support to help you repair your database.

Prepare for downtime

Sourcegraph will be unavailable until the re-indexing process has completed. If the database containers are restarted/killed during the re-indexing process (for example, as a result of automated deployments), re-indexing will have to start over from scratch. Please plan accordingly, and communicate this downtime to your users.

For systems with large datasets, re-indexing can take 1-2+ hours.

To validate the size of your dataset relative to these estimates, we can help estimate your potential downtime with some additional information.

In order to check your dataset size, please run the following SQL query against both of the following instances:

  1. pgsql
  2. codeintel-db
   pg_size_pretty(total_bytes) AS total,
   pg_size_pretty(index_bytes) AS index,
   pg_size_pretty(toast_bytes) AS toast,
   pg_size_pretty(table_bytes) AS table 
         total_bytes - index_bytes - coalesce(toast_bytes, 0) AS table_bytes 
               relname AS table_name,
               pg_total_relation_size(c.oid) AS total_bytes,
               pg_indexes_size(c.oid) AS index_bytes,
               pg_total_relation_size(reltoastrelid) AS toast_bytes 
               pg_class c 
               LEFT JOIN
                  pg_namespace n 
                  ON n.oid = c.relnamespace 
               relkind = 'r' 
   total_bytes DESC;

Please contact either customer support or your Customer Engineer with this information and we’ll advise further.