Quick Start
Get Aveloxis collecting open source community health data in five steps.
Prerequisites
Before starting, ensure you have:
Aveloxis installed (see Installation)
A running PostgreSQL 14+ instance
At least one GitHub or GitLab personal access token
Step 1: Create a config file
cp aveloxis.example.json aveloxis.json
Edit aveloxis.json with your database credentials:
{
"database": {
"host": "localhost",
"port": 5432,
"user": "aveloxis",
"password": "your-password",
"dbname": "aveloxis",
"sslmode": "prefer"
}
}
Important
Local development over HTTP: If you plan to use the web GUI locally (without HTTPS), set "dev_mode": true in the "web" section of aveloxis.json. Without this, session cookies are marked Secure and browsers will not send them over plain HTTP, causing login to fail silently. Do not enable dev_mode in production.
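For reference, a local-development config with that setting might look like the sketch below. The "database" values are the placeholders from Step 1, and the "web" section with "dev_mode" follows the note above. The helper name write_dev_config and the output path are illustrative assumptions, not part of Aveloxis.

```shell
#!/bin/sh
# Hypothetical helper (not an Aveloxis command): writes a local-development
# config to the path given as $1. The "database" values are the Step 1
# placeholders; replace them with your own credentials.
write_dev_config() {
  cat > "$1" <<'EOF'
{
  "database": {
    "host": "localhost",
    "port": 5432,
    "user": "aveloxis",
    "password": "your-password",
    "dbname": "aveloxis",
    "sslmode": "prefer"
  },
  "web": {
    "dev_mode": true
  }
}
EOF
}
```

For example, run write_dev_config aveloxis.dev.json, review the result, and copy it over aveloxis.json on a development machine only.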
If you do not have a database yet, create one:
-- Run in psql as a superuser
CREATE DATABASE aveloxis;
CREATE USER aveloxis WITH ENCRYPTED PASSWORD 'your-password';
GRANT ALL PRIVILEGES ON DATABASE aveloxis TO aveloxis;
ALTER DATABASE aveloxis OWNER TO aveloxis;
Or use Docker (this example sets the database password to aveloxis, so put the same value in aveloxis.json):
docker run -d --name aveloxis-db -p 5432:5432 \
-e POSTGRES_DB=aveloxis \
-e POSTGRES_USER=aveloxis \
-e POSTGRES_PASSWORD=aveloxis \
postgres:16
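A freshly started container needs a few seconds before PostgreSQL accepts connections, so running the next step immediately can fail. A generic retry helper is one way to wait for it. This is a minimal sketch: the function name retry and its arguments are assumptions made for illustration.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have been made.
# Returns 0 on success and 1 if every attempt failed.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

With the container above, one possible call is: retry 30 1 docker exec aveloxis-db pg_isready -h 127.0.0.1 -U aveloxis -d aveloxis. pg_isready ships with the official postgres image; probing over TCP rather than the Unix socket should avoid a false positive while the image is still initializing.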
Step 2: Create the database schema
aveloxis migrate
This creates 108 tables and 19 materialized views across two PostgreSQL schemas (aveloxis_data and aveloxis_ops). It is safe to run repeatedly – all DDL uses CREATE ... IF NOT EXISTS.
Step 3: Store your API keys
# GitHub token
aveloxis add-key ghp_your_github_token --platform github
# GitLab token (optional)
aveloxis add-key glpat-your_gitlab_token --platform gitlab
Keys are stored in aveloxis_ops.worker_oauth and loaded automatically on every run. You can add multiple keys for better throughput via round-robin rotation.
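When scripting key registration, a small guard can catch a token being stored under the wrong platform. The sketch below guesses the platform from the token prefix, using the ghp_ and glpat- prefixes shown above plus github_pat_, which GitHub uses for fine-grained tokens. The function name platform_for_token is hypothetical, and self-managed GitLab instances may be configured with a different prefix.

```shell
#!/bin/sh
# Hypothetical helper (not an Aveloxis command): print the platform implied
# by a token's prefix, or fail if the prefix is not recognized.
platform_for_token() {
  case "$1" in
    ghp_*|github_pat_*) echo github ;;
    glpat-*)            echo gitlab ;;
    *)
      echo "unrecognized token prefix" >&2
      return 1
      ;;
  esac
}
```

A possible use, with the token held in an environment variable: platform=$(platform_for_token "$TOKEN") && aveloxis add-key "$TOKEN" --platform "$platform"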
Step 4: Add repos to the collection queue
Add a single repo
aveloxis add-repo https://github.com/chaoss/augur
Add multiple repos
aveloxis add-repo \
https://github.com/torvalds/linux \
https://github.com/chaoss/grimoirelab \
https://gitlab.com/fdroid/fdroidclient
Add all repos from a GitHub organization
aveloxis add-repo https://github.com/chaoss
When you pass an organization URL (no repo name), Aveloxis queries the GitHub/GitLab API to discover all repositories in that organization and adds them all to the queue.
Platform is auto-detected from the URL. GitLab nested subgroups are supported:
https://gitlab.com/group/subgroup/project
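For longer lists, the URLs can be kept in a text file and added in a loop. The sketch below assumes a file with one URL per line. The function name add_repos_from_file and the file format are illustrative; only aveloxis add-repo comes from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: call "aveloxis add-repo" once for every URL listed in
# the file given as $1. Blank lines and lines starting with '#' are skipped.
add_repos_from_file() {
  while IFS= read -r url || [ -n "$url" ]; do
    case "$url" in
      ''|'#'*) continue ;;
    esac
    # </dev/null keeps the command from consuming the rest of the list.
    aveloxis add-repo "$url" < /dev/null || echo "failed to add: $url" >&2
  done < "$1"
}
```

Usage: add_repos_from_file repos.txt. A URL that fails is reported on stderr and the loop continues with the next line.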
Step 5: Start the scheduler
aveloxis serve --monitor :5555
This starts the long-running scheduler that:
Continuously polls the queue for repos due for collection
Runs the full staged pipeline (API collection, processing, facade, commit resolution, analysis)
Serves a web monitoring dashboard
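Because aveloxis migrate is safe to run repeatedly (Step 2), a start script can apply the schema before every launch. The wrapper below is a minimal sketch: the function name start_aveloxis is an assumption, and the monitor address defaults to the :5555 used above.

```shell
#!/bin/sh
# Hypothetical wrapper: apply the schema, then start the scheduler.
# $1 optionally overrides the monitor address (default ":5555").
# If the migration fails, the scheduler is not started.
start_aveloxis() {
  aveloxis migrate && aveloxis serve --monitor "${1:-:5555}"
}
```

Calling start_aveloxis with no arguments is equivalent to running the Step 2 and Step 5 commands in sequence.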
Check the monitoring dashboard
Open your browser to:
http://localhost:5555
The dashboard shows:
Queue statistics – total repos, queued, currently collecting
Repo table – every repo with status, priority, due time, and last run results
Boost button – push any repo to the front of the queue
Auto-refreshes every 10 seconds
Verify data in the database
After the first repo finishes collecting, you can verify data with psql:
# Connect to your database (run in your shell, then enter the queries below at the psql prompt)
psql -U aveloxis -d aveloxis
-- Check collected repos
SELECT repo_id, repo_owner, repo_name, primary_language
FROM aveloxis_data.repos;
-- Count issues
SELECT r.repo_name, COUNT(*) AS issue_count
FROM aveloxis_data.issues i
JOIN aveloxis_data.repos r ON r.repo_id = i.repo_id
GROUP BY r.repo_name;
-- Count pull requests
SELECT r.repo_name, COUNT(*) AS pr_count
FROM aveloxis_data.pull_requests pr
JOIN aveloxis_data.repos r ON r.repo_id = pr.repo_id
GROUP BY r.repo_name;
-- Count commits (one row per file per commit)
SELECT r.repo_name, COUNT(DISTINCT c.cmt_commit_hash) AS commit_count
FROM aveloxis_data.commits c
JOIN aveloxis_data.repos r ON r.repo_id = c.repo_id
GROUP BY r.repo_name;
-- Check contributors
SELECT COUNT(*) AS total_contributors
FROM aveloxis_data.contributors;
-- Check collection queue status
SELECT status, COUNT(*)
FROM aveloxis_ops.collection_queue
GROUP BY status;
What happens next
Once aveloxis serve is running, it continuously:
Collects repos in priority order from the queue
Re-collects repos after days_until_recollect (default: 1 day)
Refreshes materialized views every Saturday
Runs contributor breadth discovery every 6 hours
Refreshes org membership every 4 hours
You can add more repos at any time without restarting:
aveloxis add-repo https://github.com/kubernetes/kubernetes
To push a specific repo to the front of the queue:
aveloxis prioritize https://github.com/kubernetes/kubernetes
Next steps
Configuration – fine-tune workers, batch sizes, and clone directories
Augur Migration – import repos and keys from an existing Augur database
Commands Reference – full CLI documentation
Collection Pipeline – understand what Aveloxis collects and how