Materialized Views
Aveloxis creates 19 materialized views for compatibility with 8Knot and other Augur analytics tools. These views pre-compute common queries for fast read access.
View list
View |
Purpose |
|---|---|
|
Total PR count per repo |
|
Total distinct commit count per repo |
|
Total issue count per repo (excluding PRs) |
|
Repo list with group names for the 8Knot explorer |
|
Daily commit and committer counts |
|
All contributor actions (commits, issues, PRs, reviews, comments) with ranking |
|
First-time contributor tracking (first action date per contributor per repo) |
|
8Knot compatibility alias for |
|
PR assignment and unassignment events |
|
PR message response tracking |
|
Comprehensive PR metrics (time to close, response times, line/file/commit counts) |
|
Issue assignment events |
|
User-to-repo mapping (which contributors are active in which repos) |
|
Language breakdown from |
|
Full libyear data for all dependencies |
|
Summary libyear statistics per repo |
|
Detailed libyear per dependency per repo |
|
Same as |
|
Legacy issue reporter view (reporter ID and creation timestamp) |
8Knot compatibility
These views are designed to be drop-in replacements for Augur’s materialized views that 8Knot reads from. If you point 8Knot at the aveloxis_data schema, the same queries and dashboards work without modification.
Schema prefix
8Knot queries reference views without a schema prefix. To make this work, set the default search path for your analytics user:
ALTER ROLE analytics_user SET search_path = aveloxis_data, public;
Or set it per-session:
SET search_path = aveloxis_data, public;
Rebuild schedule
Automatic (weekly)
Every Saturday, aveloxis serve automatically rebuilds all 19 materialized views:
Collection workers are paused (no new repos are claimed from the queue)
In-progress repos are allowed to finish their current phase
All 19 views are refreshed
Collection workers are resumed
The pause ensures that data is consistent during the rebuild – no partial collection state is captured in the views.
Manual
You can trigger a rebuild at any time:
aveloxis refresh-views
This refreshes all 19 views immediately. If aveloxis serve is running, collection continues during the manual refresh (the automatic Saturday rebuild is the only one that pauses collection).
CONCURRENTLY vs non-concurrent refresh
PostgreSQL supports two modes for refreshing materialized views:
REFRESH MATERIALIZED VIEW CONCURRENTLY
Does not block reads during the refresh
Requires a unique index on the view
Slower than non-concurrent refresh (builds a new copy, then swaps)
REFRESH MATERIALIZED VIEW (non-concurrent)
Blocks reads during the refresh (acquires
ACCESS EXCLUSIVElock)No unique index required
Faster for large views
Aveloxis behavior
Aveloxis uses CONCURRENTLY for views that have unique indexes, and non-concurrent refresh for views that do not. This provides the best balance of read availability and refresh speed.
Unique indexes
The following views have unique indexes to support concurrent refresh:
View |
Unique Index Columns |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Views without unique indexes are refreshed non-concurrently. During the Saturday rebuild, collection is paused so the brief read lock does not affect analytics queries in practice.
View details
api_get_all_repo_prs
Counts total pull requests per repository.
SELECT repo_id, COUNT(*) AS pr_count
FROM aveloxis_data.pull_requests
GROUP BY repo_id;
api_get_all_repos_commits
Counts distinct commits per repository (not per-file rows, but distinct hashes).
SELECT repo_id, COUNT(DISTINCT cmt_commit_hash) AS commit_count
FROM aveloxis_data.commits
GROUP BY repo_id;
api_get_all_repos_issues
Counts issues per repository, excluding entries that are actually PRs (GitHub conflates issues and PRs).
SELECT repo_id, COUNT(*) AS issue_count
FROM aveloxis_data.issues
WHERE pull_request IS NULL
GROUP BY repo_id;
explorer_contributor_actions
Unions all contributor actions (commits, issue opens, PR opens, reviews, comments) into a single view with an action type column. Includes ranking per contributor per repo.
explorer_pr_response_times
Comprehensive PR metrics including:
Time from open to first response
Time from open to close/merge
Total lines added/removed
Number of files changed
Number of commits
explorer_repo_languages
Aggregates repo_labor data (from scc analysis) to provide language breakdowns:
Lines of code per language per repo
Percentage of total code per language
explorer_libyear_summary
Summary statistics per repo:
Total libyear across all dependencies
Average libyear
Maximum libyear (most outdated dependency)
Count of dependencies
Monitoring refresh progress
The Saturday rebuild logs progress at INFO level:
INFO starting weekly materialized view refresh
INFO pausing collection workers
INFO refreshing api_get_all_repo_prs (CONCURRENTLY)
INFO refreshing api_get_all_repos_commits (CONCURRENTLY)
...
INFO refreshing explorer_contributor_recent_actions
INFO materialized view refresh complete (took 4m23s)
INFO resuming collection workers
Checking view freshness
You can check when views were last refreshed by querying PostgreSQL catalog tables:
SELECT
schemaname,
matviewname,
pg_size_pretty(pg_total_relation_size(schemaname || '.' || matviewname)) AS size
FROM pg_matviews
WHERE schemaname = 'aveloxis_data'
ORDER BY matviewname;
Troubleshooting
“could not refresh materialized view concurrently”
This error means the view does not have a unique index. This should not happen with Aveloxis’s built-in views, but if you create custom views, ensure they have unique indexes before using CONCURRENTLY.
Slow refresh
If the Saturday rebuild takes too long:
Check PostgreSQL
work_memandmaintenance_work_memsettings. Increase to at least 256 MB and 1 GB respectively.Check if
VACUUM ANALYZEneeds to run on the underlying tables.Large tables (millions of rows in
commitsormessages) naturally take longer.
Views out of date
If analytics show stale data:
# Manual refresh
aveloxis refresh-views
Or wait for the next Saturday automatic rebuild.