Augur Migration

This guide covers migrating from an existing Augur installation to Aveloxis. Aveloxis is designed to coexist with Augur in the same database, so you can run both systems side-by-side.

Overview

Migrating from Augur involves four steps:

1. Point Aveloxis at your existing Augur database
2. Run aveloxis migrate to create the Aveloxis schemas
3. Import your API keys
4. Import your repos

No data in Augur’s schemas is modified or deleted. Aveloxis creates its own schemas (aveloxis_data and aveloxis_ops) and operates independently.

Step 1: Point at the existing Augur database

Create aveloxis.json with your Augur database connection:

{
  "database": {
    "host": "localhost",
    "port": 5432,
    "user": "augur",
    "password": "your-augur-db-password",
    "dbname": "augur",
    "sslmode": "prefer"
  }
}

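Use the same host, port, user, password, and dbname that Augur uses. The database user needs the CREATE privilege on the database, because aveloxis migrate creates new schemas. Aveloxis works only in its own schemas and does not modify augur_data or augur_operations.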
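Before running the migration, it can help to confirm that the credentials in aveloxis.json actually connect. The sketch below turns the "database" object shown above into a libpq keyword/value connection string that psql accepts. The helper names (quote, dsn_from_config) are hypothetical and not part of Aveloxis; requiring all six keys is an assumption of this sketch.

```python
import json

# Keys mirror the "database" object in aveloxis.json as shown in this guide.
FIELDS = ("host", "port", "user", "password", "dbname", "sslmode")


def quote(value):
    """Quote a value for a libpq keyword/value connection string."""
    text = str(value)
    # Plain values need no quoting; empty values and values containing
    # spaces, single quotes, or backslashes must be wrapped in single quotes.
    if text and not any(ch in text for ch in " '\\"):
        return text
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def dsn_from_config(raw_json):
    """Build a connection string from the text of aveloxis.json."""
    db = json.loads(raw_json)["database"]
    missing = [key for key in FIELDS if key not in db]
    if missing:
        raise KeyError("missing database keys: " + ", ".join(missing))
    return " ".join(f"{key}={quote(db[key])}" for key in FIELDS)
```

Passing the resulting string to psql as its database argument, for example with -c "SELECT 1", is one quick way to check that Aveloxis will be able to reach the Augur database.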

Step 2: Create the Aveloxis schemas

aveloxis migrate

This creates:

aveloxis_data – 84 tables + 19 materialized views for collected data
aveloxis_ops – 24 tables for operational state (queue, staging, credentials, etc.)

The migration uses CREATE ... IF NOT EXISTS throughout, so it is safe to run repeatedly. It never reads from or modifies the augur_data or augur_operations schemas.

Step 3: Import API keys

aveloxis add-key --from-augur

This copies all API tokens from augur_operations.worker_oauth into aveloxis_ops.worker_oauth. Duplicate keys (by token value) are skipped via ON CONFLICT DO NOTHING.
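The duplicate-skipping behaviour can be illustrated with a small stand-in. The sketch below uses an in-memory SQLite table rather than PostgreSQL, and the single token column is hypothetical, since this guide does not list the real columns of aveloxis_ops.worker_oauth. It only shows why re-running an import with ON CONFLICT DO NOTHING adds no duplicate rows.

```python
import sqlite3


def new_store():
    """Create an in-memory stand-in for the key table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS worker_oauth (token TEXT UNIQUE NOT NULL)"
    )
    return conn


def import_keys(conn, tokens):
    """Insert tokens, skipping duplicates; return how many rows were added."""
    before = conn.execute("SELECT COUNT(*) FROM worker_oauth").fetchone()[0]
    # Requires SQLite 3.24 or newer for the ON CONFLICT clause.
    conn.executemany(
        "INSERT INTO worker_oauth (token) VALUES (?) "
        "ON CONFLICT (token) DO NOTHING",
        [(token,) for token in tokens],
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM worker_oauth").fetchone()[0]
    return after - before
```

Calling import_keys twice with overlapping token lists only adds the tokens that were not already present, which is the property that makes a repeated import harmless.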

After import, keys are stored in the Aveloxis table and loaded automatically on every run. You do not need the --augur-keys flag (described in the tip below) from then on.

Tip

If you want to temporarily use Augur’s keys without copying them, pass the --augur-keys flag to serve or collect instead. This reads directly from augur_operations.worker_oauth at startup.

Step 4: Import repos

aveloxis add-repo --from-augur

This reads every repository URL from augur_data.repo and adds it to the Aveloxis collection queue. Each URL is verified via an HTTP HEAD request against the forge before being added:

200 OK – repo is added to the queue
301/302 redirect – the canonical URL is used instead
404/410 – repo is skipped (dead, private, or DMCA’d)

This verification ensures you do not import stale or dead repos that would waste API calls.
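The decision table above can be sketched as a small function. This is an illustration of the documented rules, not Aveloxis's actual code: the names classify and head are hypothetical, and the "review" outcome for status codes the guide does not mention is an assumption of this sketch.

```python
import http.client
from urllib.parse import urlsplit


def classify(url, status, location=None):
    """Map a HEAD response to (action, url_to_queue)."""
    if status == 200:
        return ("add", url)
    if status in (301, 302) and location:
        return ("add", location)  # queue the canonical URL instead
    if status in (404, 410):
        return ("skip", None)  # dead, private, or DMCA'd
    # Other codes are not covered by the guide; flagging them for
    # manual review is an assumption made here.
    return ("review", None)


def head(url, timeout=10):
    """Send one HTTPS HEAD request without following redirects."""
    parts = urlsplit(url)
    conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

A pre-flight check over your own list of URLs could call head(url) and pass the result to classify(url, status, location) to estimate how many repos the import will skip.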

Note

For large Augur installations (tens of thousands of repos), the import can take several minutes due to URL verification. Progress is logged at INFO level.

Schema coexistence

Aveloxis and Augur use completely separate schemas in the same PostgreSQL database:

Schema	Owner	Purpose
`augur_data`	Augur	Augur’s collected data
`augur_operations`	Augur	Augur’s operational tables
`aveloxis_data`	Aveloxis	Aveloxis collected data (84 tables + 19 matviews)
`aveloxis_ops`	Aveloxis	Aveloxis operational tables (24 tables)

Key points:

Aveloxis never writes to augur_data or augur_operations. It reads from them only on request: during explicit --from-augur imports, or at startup when the --augur-keys flag is passed.
Augur never reads from or writes to aveloxis_data or aveloxis_ops.
Both systems can collect from the same repos simultaneously without interference.
Schema names are hardcoded, so there is no risk of accidental cross-contamination.

Contributor ID compatibility

Aveloxis generates deterministic cntrb_id UUIDs using the same scheme as Augur. The UUID encodes:

Byte 0: platform ID (1 for GitHub, 2 for GitLab)
Bytes 1-4: gh_user_id (big-endian)

This means:

The same GitHub user always gets the same cntrb_id in both Augur and Aveloxis
UUIDs are byte-compatible between the two systems
Analytics queries that join on cntrb_id work across both schemas
If you later consolidate data, contributor identities match

This deterministic ID scheme is called GithubUUID internally.
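The byte layout described above can be sketched as follows. The guide specifies only bytes 0 through 4; filling the remaining 11 bytes with zeros is an assumption of this sketch, and the function names are hypothetical rather than taken from either code base.

```python
import uuid

PLATFORM_GITHUB = 1
PLATFORM_GITLAB = 2


def make_cntrb_id(platform_id, user_id):
    """Build a deterministic contributor UUID from platform and user ID."""
    raw = (
        bytes([platform_id])          # byte 0: platform ID
        + user_id.to_bytes(4, "big")  # bytes 1-4: user ID, big-endian
        + bytes(11)                   # bytes 5-15: zero (assumption)
    )
    return uuid.UUID(bytes=raw)


def decode_cntrb_id(cntrb_id):
    """Recover (platform_id, user_id) from a contributor UUID."""
    raw = cntrb_id.bytes
    return raw[0], int.from_bytes(raw[1:5], "big")
```

Because the ID is a pure function of the platform and the user ID, two independent collectors derive the same value without coordinating. Note that to_bytes raises OverflowError for a user ID that does not fit in four bytes.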

Running both systems side-by-side

You can run Augur and Aveloxis simultaneously against the same database. Common scenarios:

Gradual migration

Keep Augur running for repos already in its queue
Add new repos only to Aveloxis
Compare data quality between the two systems
Once satisfied, stop Augur and let Aveloxis handle everything

Parallel collection for validation

Have both systems collect the same repos
Compare issue counts, PR counts, commit counts across schemas
Verify contributor resolution quality
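One way to make the comparison repeatable is to collect the counts from each schema into a dictionary and flag the metrics that diverge. The sketch below assumes you have already gathered those counts yourself (for example with SELECT COUNT(*) queries; the table names are not listed in this guide). The count_drift helper and its 2% default tolerance are hypothetical choices, not something Aveloxis provides.

```python
def count_drift(augur_counts, aveloxis_counts, tolerance=0.02):
    """Return {metric: (augur, aveloxis)} where counts differ by more than tolerance."""
    drift = {}
    for metric in sorted(set(augur_counts) | set(aveloxis_counts)):
        a = augur_counts.get(metric, 0)
        b = aveloxis_counts.get(metric, 0)
        baseline = max(a, b)
        # Relative difference against the larger count; skip if both are zero.
        if baseline and abs(a - b) / baseline > tolerance:
            drift[metric] = (a, b)
    return drift
```

Small differences are expected when the two systems collect at different times, which is why the sketch compares relative rather than exact counts.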

Resource considerations

When running both systems:

Database connections: Both systems maintain connection pools. Ensure your PostgreSQL max_connections is high enough (Aveloxis uses up to 20 connections).
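API rate limits: Each system consumes rate limit from its own key pool. Do not share the same API tokens between the two systems, because both would draw on the same per-token quota and you will see rate limit errors. Keys imported with add-key --from-augur are the same tokens Augur uses, so give one of the systems different tokens while both are collecting.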
Disk space: Both systems maintain their own bare clones. The repo_clone_dir settings should point to different directories.
CPU/memory: Aveloxis is written in Go and typically uses less memory than Augur’s Python workers.
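The connection budget is simple arithmetic, sketched below. The Aveloxis figure of 20 comes from this guide, and 3 is PostgreSQL's default for superuser_reserved_connections. Augur's pool size depends on your deployment, so it is left as a parameter. The function names are hypothetical.

```python
def required_max_connections(augur_pool, aveloxis_pool=20, reserved=3, other_clients=0):
    """Minimum max_connections needed to run both systems at full pool size."""
    return augur_pool + aveloxis_pool + reserved + other_clients


def has_headroom(max_connections, augur_pool, **kwargs):
    """True if the configured max_connections covers both systems."""
    return max_connections >= required_max_connections(augur_pool, **kwargs)
```

You can compare the result against the value reported by SHOW max_connections; in PostgreSQL to see whether the setting needs raising.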

Differences from Augur

After migration, you will notice several improvements:

Area	Augur	Aveloxis
Dead repos	Retried every cycle	Permanently sidelined (data preserved)
Repo renames	Not detected	Detected and URLs auto-updated
Duplicate repos	Not detected	Detected via redirect resolution
Monitoring	Flower (separate service)	Built-in dashboard at `/`
Queue management	Opaque Celery state	SQL-queryable priority queue
Priority override	Not supported	`aveloxis prioritize` or dashboard Boost

Cleanup (optional)

Once you are confident in Aveloxis, you can optionally remove Augur’s schemas:

Warning

This permanently deletes all Augur data. Only do this after verifying that Aveloxis has collected everything you need.

-- DESTRUCTIVE: Only run after verifying Aveloxis data
DROP SCHEMA augur_data CASCADE;
DROP SCHEMA augur_operations CASCADE;
DROP SCHEMA IF EXISTS spdx CASCADE; -- no-op if the schema is absent

Next steps

Quick Start – verify collection is working
Monitoring – use the dashboard to track progress
Scaling – configure workers and keys for large instances