# Schema-change verification with `aveloxis data-test` `aveloxis data-test` is an operator-driven verification harness that catches data-loss regressions before they reach production. It compares a tagged release binary against the local working-tree binary by collecting the same repo into two scratch PostgreSQL databases and diffing row counts table-by-table. Shipped in **v0.22.8**. Designed primarily for contributors and maintainers who are about to ship a release that touches `schema.sql` or `migrate.go`. ## When to use it - **Contributors** — before submitting a PR that adds a column, changes a foreign-key declaration, alters a constraint, or modifies any DDL in `internal/db/schema.sql` or `internal/db/migrate.go`. - **Maintainers** — before merging a schema-touching PR, as a release gate. The harness exit code is 0 on PASS / FLAG-only and 1 on any FAIL (row loss), so it can drive CI. - **Operators** — before deploying a release with schema changes to a production fleet. The CLAUDE.md changelog for each release notes whether schema changed; if it did, run the harness against the prior released tag first. You do **not** need to run it for changes that don't touch the schema (API-layer changes, collector logic refactors, web GUI changes, etc.). ## What it actually verifies For each table in `aveloxis_data` and `aveloxis_ops`, the harness classifies the result: - **PASS** — equal row counts. The schema change is neutral for this table. - **FLAG** — the new schema captures more rows than the released schema. Usually means the change added new collection coverage (e.g., a previously-missed column now populates a child table). Manual review recommended but generally fine. - **FAIL** — the released schema captured rows that the new schema rejects. **This is the regression signal.** Common causes: - A new FK constraint rejects INSERTs that previously succeeded because no FK was enforced. - A new NOT NULL column on a table the collection path doesn't populate. - A reordered or renamed column the collector still writes by the old name. The harness does NOT do byte-for-byte content comparison. For deep semantic diff (per-row, per-column) on PR/issue tables, see the older `aveloxis shadow-diff` subcommand. Operators may run both: `data-test` first as a coarse filter, `shadow-diff` for deep inspection of any FLAGGED table. ## Prerequisites - A local clone of the aveloxis repository with the tag you want to test against. If the tag is missing locally: `git fetch --tags`. - PostgreSQL access via the operator's existing `aveloxis.json` credentials. The user must have **CREATEDB privilege** on the database server — required because the harness creates and drops scratch databases. Standard for development-mode aveloxis deployments. - API keys already loaded into the operator's primary `aveloxis_ops.api_keys` table. The harness copies keys into both scratch DBs automatically — you don't re-paste tokens. - ~1.5 hours of wall-clock time for a moderate test repo. Most of this is sequential **full-history** collection passes (two of them, ~30–45 min each on augurlabs/augur with a 73-token GitHub pool). The harness always passes `--full` to `aveloxis collect` so both scratch DBs receive complete data with all parent records present. This is essential — see "Why full collection" below. - ~5 GB of free disk space for the two scratch databases. ## Running the harness ```bash aveloxis data-test \ --released-tag 0.22.6 \ --repo https://github.com/augurlabs/augur ``` Required flags: - `--released-tag TAG` — a git tag in the local clone (e.g., `0.22.6`). The harness materializes the tag's source via `git worktree add` and builds it. - `--repo URL` — the test repo to collect. `augurlabs/augur` is the canonical choice: enough issues, PRs, and commits to exercise every collection path without being so large that the cycle takes hours. Optional flags: - `--keep-dbs` — retain `aveloxis_released` and `aveloxis_new` after the run. Default is to drop them. Pass this when you want to inspect a FAILing table directly via `psql` after the report is written. - `--work-dir PATH` — where to put binaries, logs, and the report. Default is a fresh `/tmp/aveloxis-data-test-`. ## Interpreting the report The report at `/report.md` opens with a summary: ```markdown # aveloxis data-test report - Released tag: 0.22.6 (in aveloxis_released) - Local version: working tree (in aveloxis_new) - Test repo: https://github.com/augurlabs/augur - Schemas: aveloxis_data, aveloxis_ops - Generated: 2026-05-17T14:30:00Z ## Summary - FAIL (released has more rows = data loss): 0 - FLAG (new has more rows = likely new coverage): 2 - PASS (equal counts): 71 ``` Followed by three tables — FAIL, FLAG, PASS — each showing per-table row counts and the delta. ### When all rows are PASS Ship the change. No data loss possible from the schema delta. ### When some rows are FLAG Review each flagged table individually. A FLAG means the new schema captures rows the released schema missed. Two scenarios: 1. **Genuinely new coverage** — e.g., a new column was added and the collector now populates it, so a previously-empty child table now has rows. This is a feature, not a bug. Ship. 2. **Insertion-order or constraint timing change** — e.g., a constraint was relaxed and rows that previously failed FK checks now pass. Investigate whether the previously-failed rows were actually correct. Usually yes. If a FLAG is unexpected, run with `--keep-dbs`, then: ```bash psql -d aveloxis_new -c " SELECT * FROM aveloxis_data. WHERE NOT IN (SELECT FROM aveloxis_released.aveloxis_data.); " ``` (That cross-database query requires `dblink` or a foreign-data wrapper — most operators just `\c aveloxis_released` and run separate queries.) ### When any rows are FAIL **Do not ship.** Investigate the root cause: 1. Open the failing table in both DBs via `psql` (use `--keep-dbs` if not already). 2. Find a row present in `aveloxis_released` but not `aveloxis_new`. 3. Read the collector code that writes that table and the migration delta in the new version. 4. The mismatch is usually one of: - **New NOT NULL column** with no DEFAULT, and the collector writes NULLs for that column. Fix: add a DEFAULT in schema.sql or update the collector to write a value. - **New CHECK constraint** that the existing data violates. Fix: loosen the constraint or backfill before constraining. - **New FK that rejects existing references** because the parent row doesn't exist at INSERT time. Fix: make the FK `DEFERRABLE INITIALLY DEFERRED` (the v0.22.7 pattern). - **Renamed column** that the collector still writes by the old name. Fix: update the collector. Once you've fixed the regression, re-run `data-test` to confirm. The harness drops and recreates the scratch DBs idempotently, so reruns are safe. ## Phase-by-phase walkthrough Useful to know if a phase fails partway through. 1. **Resolve binaries.** `git worktree add --detach /released-src ` (fast — reuses local clone's git objects, no remote fetch), then `go build` in the worktree. Builds the local working-tree binary the same way. Failures here usually mean the tag is missing locally — fix with `git fetch --tags`. 2. **Provision scratch DBs.** Connects to the configured PostgreSQL host's `postgres` system database, then DROPs+CREATEs `aveloxis_released` and `aveloxis_new`. Failures here usually mean missing CREATEDB privilege — fix at the database server. 3. **Generate scratch configs.** JSON-edit the operator's `aveloxis.json` to swap `database.dbname`, write to work-dir. This phase doesn't typically fail. 4. **Per-side collection** (sequential — both sides share the API key pool): - `migrate --skip-views` to bring the scratch DB to the binary's schema. - `copyAPIKeys` direct SQL from the operator's primary DB into the scratch DB. - `add-repo` to queue the test repo. - `collect` to run a one-shot collection. This is the ~30 min phase. Subprocess output streams live to the parent so progress is visible. 5. **Row-count diff.** Connects to both scratch DBs and runs `db.RowCountDiff` over `aveloxis_data` + `aveloxis_ops`. Fast (a few seconds). 6. **Write report + cleanup.** Markdown report at `/report.md`. Unless `--keep-dbs`, both scratch DBs are dropped. Cleanup failure is non-fatal — the report is already written. ## Why full collection (`--full`) The harness always invokes `aveloxis collect URL --full`, forcing `since=zero` (every parent issue/PR is fetched, not just recently- modified ones). Without `--full`, the collector uses the default incremental since-filter (typically 21 days), which produces a specific noise pattern that **hides real regressions**: - The events API returns events for issues/PRs that may have been last modified outside the since window. - The collector tries to INSERT those events; the FK constraint rejects them because the parent issue/PR isn't in the local DB. - Hundreds of `issue_events_issue_id_fkey` and `pull_request_events_pull_request_id_fkey` violations get logged as WARN. - The events are silently dropped on BOTH sides. - The diff sees `issue_events: released=0, new=0` — a false PASS. By forcing `--full`, every parent is present before its children are fetched. FK constraints are exercised against fully-populated parent tables. Any real regression introduced by a schema change will surface, instead of being masked by the partial-collection noise. Empirically confirmed in v0.22.10 after a 2026-05-17 diagnostic showed this exact pattern obscuring v0.22.7's FK behavior change. The trade-off is wall-clock time: a full collection takes ~30–45 minutes on augurlabs/augur vs ~20–25 minutes incremental. The signal quality is worth it; the whole point of the harness is to detect FK / data-loss regressions, and partial collection fundamentally cannot detect them. ## Design decisions worth knowing - **Sequential collection, not parallel.** Both sides share the operator's API key pool. Parallel runs would burn 2× the rate budget and produce non-reproducible timings. ~1.5 hours total (with --full) is the cost of reliable results. - **`git worktree`, not `git clone`.** Reuses the local clone's git objects. Works offline. Faster. - **Self-discovering tables, not a per-table fixture.** Adding a new table to `schema.sql` in a future release is automatically covered by the diff. No fixture to maintain. - **Drop scratch DBs by default.** Avoids multi-GB scratch DBs accumulating across runs. `--keep-dbs` is the opt-in. - **Row counts, not row content.** Catches data loss; doesn't catch per-column drift. For column-level diff use `aveloxis shadow-diff` (which covers ~17 PR/issue tables with deep semantic comparison). - **CREATEDB privilege required.** The harness creates and drops scratch DBs. Standard for development-mode deployments. Production fleets running the harness need the same privilege on whichever database server they target. ## CI integration ```yaml - name: Validate schema change run: | aveloxis data-test \ --released-tag ${{ env.LAST_RELEASED_TAG }} \ --repo https://github.com/augurlabs/augur # Exit code 1 on any FAIL → fails the build # Exit code 0 on PASS or FLAG-only → proceeds ``` For PR-level gating, the maintainer can require this job pass before merging any PR labeled `schema-change`. ## Limitations - **Single test repo per invocation.** Multi-repo tests are operator-scripted shell loops for now. - **No auto-fetch of missing tags.** If the operator specifies a tag not yet in the local clone, the worktree step fails. Fix: `git fetch --tags` then re-run. - **No materialized view comparison.** Matviews are derived data; a row-count mismatch on a matview would just reflect upstream base-table differences. The harness focuses on base tables only. - **Row counts only, not content.** Use `aveloxis shadow-diff` for per-column verification when you need that depth. - **Doesn't catch silent column drift.** If a column's *value* changes meaning but the row count stays the same (e.g., a unit changed from milliseconds to seconds), `data-test` won't flag it. This is a content-level concern handled by `shadow-diff` (for the tables it covers) or by domain-specific test suites. ## Related - [`aveloxis shadow-diff`](commands.md) — deep semantic diff for ~17 PR/issue tables. Use after `data-test` for FLAGGED tables. - [CI/CD](ci-cd.md) — how the integration test workflow gates schema migrations. - Architecture: `internal/db/rowcount_diff.go` is the row-count diff primitive. `cmd/aveloxis/data_test_cmd.go` is the orchestrator.