Starter Template: Automated Tool-Usage Detector (Python + SQL + Dashboard)
Fork a GitHub-ready Python+SQL template that pulls billing and API usage, normalizes metrics, and surfaces underused SaaS with alerts.
Stop paying for what nobody uses — a forkable starter to detect wasted SaaS spend.
If you manage infrastructure or developer tooling in 2026, your inbox is full of invoices and your finance team is tired of surprises. The core problem is rarely an individual overpriced license — it’s the hidden layers of cost and friction across dozens of SaaS products. This article gives you a GitHub-ready template (Python + SQL + dashboard + alerts) that pulls billing and API usage, normalizes metrics, and surfaces underused platforms with alerting rules — ready to fork for any organization.
Why this matters in 2026: trends you need to account for
- Usage-based pricing and AI tokens: Through late 2024 and 2025, pricing shifted from flat per-seat licensing toward usage- and token-based models for AI services and analytics. That makes small, sporadic usage patterns harder to predict, and easy to burn cash on when left unchecked.
- Tool sprawl and FinOps focus: In 2025 more engineering orgs adopted FinOps practices for cloud; now, in 2026, the next frontier is SaaS spend observability and operationalization of deprovisioning workflows.
- APIs are better but fragmented: Many vendors added richer billing and reporting endpoints in late 2025, but each exports different metrics, units, and identity models — forcing normalization.
- Policy-first automation: Teams are shifting to automated governance: detect underused tools, open a ticket, and cut or renegotiate subscriptions — with human review gates.
What you’ll get from this template
- A lightweight, forkable repo layout and Docker-ready services
- Python connectors for billing & usage APIs (examples: Stripe, AWS cost, GitHub, Datadog, and a generic SaaS connector)
- Normalized SQL schema and example PostgreSQL queries to detect underuse
- An orchestration example (Prefect flow + GitHub Actions schedule)
- A Grafana/Metabase dashboard spec and concrete alerting rules (Slack + PagerDuty)
- Operational playbook: rate limits, secrets, tests, on-call workflows
High-level architecture (fork and run)
Design goals: minimal infra (managed Postgres), cron-style orchestration, and a dashboard + alert layer. Components:
- Ingest workers — Python scripts that fetch billing/usage from vendor APIs, normalize, and write to Postgres.
- Normalization DB — PostgreSQL with a small star schema for costs, usage events, identities, and metadata.
- Orchestrator — Prefect flows triggered by GitHub Actions or scheduler to run ingest, backfill, and checks.
- Dashboard — Grafana or Metabase connected to Postgres for visualizing cost/usage and underuse detectors.
- Alerting — Grafana alerts + webhook forwarding to Slack/PagerDuty and an automated ticket opener in Jira/GitHub Issues.
Repository layout (recommended)
usage-detector/
├─ README.md
├─ docker-compose.yml
├─ infra/
│ └─ postgres/ (init scripts)
├─ connectors/
│ ├─ stripe_connector.py
│ ├─ aws_cost_connector.py
│ ├─ generic_saas_connector.py
│ └─ github_connector.py
├─ normalization/
│ └─ sql_schema.sql
├─ orchestrator/
│ ├─ flows.py (Prefect)
│ └─ ci.yml (GitHub Actions; copy into .github/workflows/ when you fork)
├─ dashboards/
│ └─ grafana/ (JSON model)
├─ alerts/
│ └─ alert_rules.yml
└─ docs/
└─ runbook.md
Normalization schema (core tables)
Normalize vendor outputs into a consistent set of columns. Keep the table structure small to make queries fast.
-- normalization/sql_schema.sql
CREATE TABLE vendors (
    vendor_id    SERIAL PRIMARY KEY,
    vendor_key   TEXT UNIQUE,   -- e.g. "stripe", "aws", "datadog"
    display_name TEXT,
    metadata     JSONB
);

CREATE TABLE resources (
    resource_id  SERIAL PRIMARY KEY,
    vendor_id    INT REFERENCES vendors(vendor_id),
    resource_key TEXT,          -- vendor-provided id or SKU
    name         TEXT,
    type         TEXT,          -- "service", "seat", "api-key"
    metadata     JSONB,
    UNIQUE (vendor_id, resource_key)  -- makes re-running connectors idempotent
);

CREATE TABLE cost_events (
    event_id    BIGSERIAL PRIMARY KEY,
    vendor_id   INT REFERENCES vendors(vendor_id),
    resource_id INT REFERENCES resources(resource_id),
    usage_date  DATE,           -- the date the cost applies to
    cost_amount NUMERIC(12,4),
    currency    TEXT,
    raw_payload JSONB,
    ingested_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE usage_metrics (
    metric_id    BIGSERIAL PRIMARY KEY,
    vendor_id    INT REFERENCES vendors(vendor_id),
    resource_id  INT REFERENCES resources(resource_id),
    metric_name  TEXT,          -- e.g. "api_calls", "active_users", "seats_used"
    metric_value NUMERIC,
    metric_units TEXT,
    metric_date  DATE,
    raw_payload  JSONB,
    ingested_at  TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX ON usage_metrics(metric_name, metric_date);
CREATE INDEX ON cost_events(usage_date);
Python connector pattern (copy & adapt)
Each connector follows the same pattern: paginate API, map vendor fields to normalized schema, upsert resource metadata, and insert cost/usage rows.
# connectors/generic_saas_connector.py
import os

import psycopg2
import requests
from psycopg2.extras import Json  # adapts Python dicts to JSONB parameters

DB_DSN = os.getenv('DB_DSN')
API_KEY = os.getenv('SaaS_API_KEY')
BASE_URL = os.getenv('SaaS_BASE_URL')

def get_vendor_id(conn, vendor_key, display_name):
    with conn.cursor() as cur:
        cur.execute("SELECT vendor_id FROM vendors WHERE vendor_key=%s", (vendor_key,))
        row = cur.fetchone()
        if row:
            return row[0]
        cur.execute(
            "INSERT INTO vendors (vendor_key, display_name) VALUES (%s,%s) RETURNING vendor_id",
            (vendor_key, display_name)
        )
        return cur.fetchone()[0]

def upsert_resource(conn, vendor_id, resource_key, name, rtype, metadata):
    with conn.cursor() as cur:
        cur.execute(
            "SELECT resource_id FROM resources WHERE vendor_id=%s AND resource_key=%s",
            (vendor_id, resource_key)
        )
        row = cur.fetchone()
        if row:
            return row[0]
        cur.execute(
            "INSERT INTO resources (vendor_id,resource_key,name,type,metadata) VALUES (%s,%s,%s,%s,%s) RETURNING resource_id",
            (vendor_id, resource_key, name, rtype, Json(metadata))
        )
        return cur.fetchone()[0]

def fetch_and_write():
    conn = psycopg2.connect(DB_DSN)
    try:
        vendor_id = get_vendor_id(conn, 'generic_saas', 'Generic SaaS')
        # Example: paginated endpoint returning usage items
        page = 1
        while True:
            r = requests.get(
                f"{BASE_URL}/usage",
                params={"page": page},
                headers={"Authorization": f"Bearer {API_KEY}"},
                timeout=30,
            )
            r.raise_for_status()
            data = r.json()
            if not data.get('items'):
                break
            with conn.cursor() as cur:
                for item in data['items']:
                    resource_id = upsert_resource(
                        conn, vendor_id, item['id'], item.get('name'),
                        item.get('type', 'service'), item
                    )
                    # Insert usage metric
                    cur.execute(
                        "INSERT INTO usage_metrics (vendor_id,resource_id,metric_name,metric_value,metric_units,metric_date,raw_payload) VALUES (%s,%s,%s,%s,%s,%s,%s)",
                        (vendor_id, resource_id, 'api_calls', item.get('calls', 0), 'count', item.get('date'), Json(item))
                    )
                    # Insert a cost event when the item carries a cost
                    if item.get('cost') is not None:
                        cur.execute(
                            "INSERT INTO cost_events (vendor_id,resource_id,usage_date,cost_amount,currency,raw_payload) VALUES (%s,%s,%s,%s,%s,%s)",
                            (vendor_id, resource_id, item.get('date'), item.get('cost'), item.get('currency', 'USD'), Json(item))
                        )
            conn.commit()
            page += 1
    finally:
        conn.close()

if __name__ == '__main__':
    fetch_and_write()
Connector tips and 2026 considerations
- Respect rate limits: Use exponential backoff and persistent cursors. Many vendors tightened quotas in 2025 to protect AI inference capacity.
- Timezones & billing windows: Normalize dates to UTC and store vendor-reported windows in raw_payload for reconciliation.
- Identity mapping: Map vendor user ids or team ids to your SSO IDs where possible to attribute seats to teams.
- Idempotency: Use event ids and upserts to make connectors safe to re-run.
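The backoff advice above can be sketched as a small wrapper around `requests`. The `fetch_with_backoff` and `backoff_delay` helpers, the status-code list, and the retry limits are illustrative assumptions, not part of any vendor API:

```python
# Minimal exponential-backoff sketch for connector HTTP calls.
import time

import requests

def backoff_delay(attempt, retry_after=None, base_delay=1.0):
    """Seconds to wait before retry `attempt` (0-based).

    Honors a vendor-sent Retry-After header value when present;
    otherwise doubles the delay on each attempt.
    """
    if retry_after is not None:
        return float(retry_after)
    return base_delay * (2 ** attempt)

def fetch_with_backoff(url, headers=None, max_retries=5):
    """GET `url`, retrying on rate limits (429) and transient server errors."""
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code not in (429, 500, 502, 503, 504):
            resp.raise_for_status()
            return resp
        time.sleep(backoff_delay(attempt, resp.headers.get("Retry-After")))
    raise RuntimeError(f"Giving up on {url} after {max_retries} attempts")
```

Pair this with a persisted page cursor (stored in Postgres) so an interrupted run resumes where it left off instead of re-fetching from page 1.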
Underuse detection logic — a practical algorithm
Goal: identify resources that cost > X/month but have low activity. Basic scoring formula:
underuse_score = (normalized_monthly_cost / max(1, normalized_monthly_active_units))
Where:
- normalized_monthly_cost — vendor cost normalized to USD for the 30-day window
- normalized_monthly_active_units — a combined metric of active users, API calls (scaled), or hours used
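The scoring formula above is easy to express in Python. The function name and the 10,000-calls-per-active-unit scaling factor are illustrative choices for collapsing raw API calls into active units:

```python
def underuse_score(monthly_cost_usd, active_users=0, api_calls=0, calls_per_unit=10_000):
    """Cost per active unit over a 30-day window; higher means more likely underused.

    API calls are scaled down so that `calls_per_unit` calls count as one
    active unit, keeping them comparable to seat/user counts.
    """
    active_units = active_users + api_calls / calls_per_unit
    return monthly_cost_usd / max(1, active_units)

# A $2,000/month tool with only 2 active users scores 1,000 per active unit.
```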
Example SQL: detect platforms with high cost per active unit
-- Example: compute 30-day cost and active_units per resource
WITH cost_30 AS (
    SELECT resource_id, SUM(cost_amount) AS cost_30
    FROM cost_events
    WHERE usage_date >= current_date - INTERVAL '30 days'
    GROUP BY resource_id
),
active_30 AS (
    -- COALESCE each filtered sum: if a resource has only one metric type,
    -- the other FILTER sum is NULL and would otherwise null out the total.
    SELECT resource_id,
           COALESCE(SUM(metric_value) FILTER (WHERE metric_name IN ('active_users','seats_used')), 0)
         + COALESCE(SUM(metric_value) FILTER (WHERE metric_name = 'api_calls'), 0) / 10000.0 AS active_units
    FROM usage_metrics
    WHERE metric_date >= current_date - INTERVAL '30 days'
    GROUP BY resource_id
)
SELECT r.resource_id, r.name, v.display_name,
       COALESCE(c.cost_30, 0) AS cost_30,
       COALESCE(a.active_units, 0) AS active_units,
       COALESCE(c.cost_30, 0) / GREATEST(COALESCE(a.active_units, 0), 1) AS cost_per_active_unit
FROM resources r
JOIN vendors v ON r.vendor_id = v.vendor_id
LEFT JOIN cost_30 c ON c.resource_id = r.resource_id
LEFT JOIN active_30 a ON a.resource_id = r.resource_id
ORDER BY cost_per_active_unit DESC
LIMIT 50;
Interpretation
- Sort by cost_per_active_unit to prioritize targets.
- Filter out infra or reserved services where low activity is expected.
- Present top N to business owners for human review before deprovisioning.
Alerting rules and automation playbook
Automate detection → enrich with metadata → notify → open a ticket. Use thresholds and anomaly detection together.
- Static threshold rule: cost_per_active_unit > $1,000 for two consecutive 7-day windows → send a Slack alert to the owner channel.
- Anomaly rule: sudden 3x drop in active units while cost remains steady → open a ticket (label: investigate underuse).
- Human-in-the-loop: every automated action must include a review step before cutting access or cancelling subscriptions. Use a GitHub Issue template with checkboxes for negotiation/backup export.
# alerts/alert_rules.yml (Grafana-style, simplified)
- name: High cost per active unit (30d)
  query: |
    SELECT resource_id,
           cost_30 / GREATEST(active_units, 1) AS cost_per_active_unit
    FROM analytics_cost_per_unit
    WHERE date = current_date
  condition: cost_per_active_unit > 1000
  for: 2d
  notify:
    - slack: '#finops-alerts'
    - pagerduty: 'P123456'
  action: '/scripts/open_issue.sh --resource_id={resource_id} --template=underused'
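The "two consecutive windows" condition from the static threshold rule can also be evaluated in application code, e.g. when pre-filtering candidates before opening tickets. This helper and its inputs are a hypothetical sketch, not part of the template's API:

```python
def breaches_threshold(window_scores, threshold=1000.0, consecutive=2):
    """True if cost_per_active_unit exceeded `threshold` in `consecutive`
    windows in a row.

    `window_scores` is an ordered list of per-window scores (e.g. one score
    per 7-day window), oldest first.
    """
    run = 0
    for score in window_scores:
        run = run + 1 if score > threshold else 0
        if run >= consecutive:
            return True
    return False
```

Requiring consecutive breaches filters out one-off billing spikes (an annual invoice landing in a single window, for example) before anyone is paged.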
Dashboard essentials (Grafana / Metabase)
Minimum dashboards to include in the template:
- Overview: 30/90/365-day cost trends by vendor
- Usage vs cost: scatter plot of cost_per_active_unit
- Top candidates: table of resources with owner, cost_30, active_units, score, and recent anomalies
- Alert history: list of triggered alerts and resolution status
Orchestration example: Prefect flow + GitHub Actions
Use Prefect for local dev and graceful retries; schedule via GitHub Actions for small orgs to avoid extra infra.
# orchestrator/flows.py (simplified)
from prefect import flow, task

from connectors.generic_saas_connector import fetch_and_write

@task(retries=3, retry_delay_seconds=60)
def run_connectors():
    fetch_and_write()

@flow
def ingest_flow():
    run_connectors()

if __name__ == '__main__':
    ingest_flow()
# .github/workflows/ci.yml (schedule)
name: scheduled-ingest
on:
  schedule:
    - cron: '0 2 * * *'  # daily at 02:00 UTC
  workflow_dispatch: {}
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install deps
        run: pip install -r requirements.txt
      - name: Run ingest
        env:
          DB_DSN: ${{ secrets.DB_DSN }}
          SaaS_API_KEY: ${{ secrets.SAAS_API_KEY }}
          SaaS_BASE_URL: ${{ secrets.SAAS_BASE_URL }}
        run: python orchestrator/flows.py
Security and operational best practices
- Secrets management: never commit API keys to the repo. Use GitHub Secrets or Vault, and ephemeral tokens where possible.
- Least-privilege API keys: create read-only keys for billing/usage endpoints.
- Testing: include unit tests for mapping logic and integration tests with recorded vendor responses (VCR-style cassette files).
- Monitoring the monitor: track connector success rates, lag, and schema drift. Create an internal heartbeat metric written to Postgres and monitored by Grafana.
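The "monitoring the monitor" heartbeat might look like the sketch below: after each connector run, write a row to a small Postgres table that Grafana can alert on when a connector goes silent. The `connector_heartbeats` table and these helper names are illustrative assumptions, not part of the schema shown earlier:

```python
from datetime import datetime, timezone

def heartbeat_row(connector_name, ok, rows_written=0, now=None):
    """Build the row tuple for the hypothetical connector_heartbeats table."""
    return (connector_name, bool(ok), int(rows_written),
            now or datetime.now(timezone.utc))

def record_heartbeat(conn, connector_name, ok, rows_written=0):
    """Insert one heartbeat row using an open psycopg2 connection."""
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO connector_heartbeats (connector, ok, rows_written, run_at) "
            "VALUES (%s,%s,%s,%s)",
            heartbeat_row(connector_name, ok, rows_written),
        )
    conn.commit()
```

A Grafana alert on "no successful heartbeat in 26 hours" then catches silent connector failures, lag, and schema drift that would otherwise surface only as stale dashboards.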
Case study (hypothetical, realistic)
Team X has 120 SaaS subscriptions across engineering, marketing, and data. After deploying this starter template and running for 6 weeks, they discovered:
- 10 platforms with cost_per_active_unit > $5,000. After review, 6 were unused sandboxes and were cancelled — saving $22k/month.
- 3 AI services spiking due to a broken CI job. Automated alerts caught the issue and reduced runaway token spend by 80% in 48 hours.
- By attributing seat usage to teams, they renegotiated 2 large licenses to team-based access, saving another $10k/year.
Operationalizing detection + human review returned ROI in under one month for this hypothetical org.
Advanced strategies (if you want to level up)
- Attribution models: attribute costs to projects using tag maps, usage patterns, and SSO identities for chargeback/showback.
- Anomaly detection with ML: train a simple seasonal model (Prophet or a lightweight LSTM) on usage metrics to detect drop-offs or spikes beyond static thresholds.
- Automated negotiation workflow: where policies allow, raise a renewal note and capture historical usage to present to vendor reps for discounts.
- Centralized license pool: for organizations with many one-off tools, create a supervised pool of licenses accessible via approvals to reduce duplication.
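Before reaching for Prophet or an LSTM, an even lighter anomaly baseline often suffices: flag days whose usage deviates more than a few standard deviations from the trailing window. This rolling z-score detector is a sketch with illustrative parameter choices, not the template's built-in detector:

```python
from statistics import mean, stdev

def anomalies(values, window=7, z_max=3.0):
    """Return indices in `values` that look anomalous vs the preceding window.

    `values` is a daily usage series (e.g. api_calls per day), oldest first.
    """
    flagged = []
    for i in range(window, len(values)):
        hist = values[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        if sigma == 0:
            # Flat history: any change at all is worth a look.
            if values[i] != mu:
                flagged.append(i)
        elif abs(values[i] - mu) / sigma > z_max:
            flagged.append(i)
    return flagged
```

This catches both runaway spikes (the broken-CI-job scenario) and sudden drop-offs that precede a tool quietly going unused; graduate to a seasonal model only when weekly patterns start generating false positives.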
2026 future predictions and how to plan
- More granular vendor telemetry: Expect vendors to release more fine-grained usage endpoints (per-feature usage) — design schemas to accept arbitrary metric_name/metric_units now.
- Shift-left on SaaS procurement: Teams will gate purchases with a spend tag and automated enrollment into the usage detector before approvals.
- Policy APIs: In 2026 we’ll see broader support for policy-as-code for SaaS purchases, letting you automatically enforce rules from the template.
Onboarding checklist: fork & go
- Fork the repo and populate secrets (DB_DSN, vendor API keys) in GitHub Secrets or your CI/CD secrets store.
- Run docker-compose up to provision Postgres and Grafana for local testing.
- Enable one connector (start with the largest vendor) and run a backfill for 90 days to seed trends.
- Install the dashboard JSON and configure alert webhooks.
- Run the Prefect flow via GitHub Actions schedule; confirm heartbeats and connector success metrics.
- Invite finance and tool owners to review the Top Candidates dashboard weekly and define an SLA for response.
Actionable takeaways
- Start small: onboard 1–3 high-cost vendors first, measure impact, then scale connectors.
- Normalize early: store raw payloads plus normalized metrics to make audits straightforward.
- Automate detection, humanize remediation: auto-open tickets but require human approval before cancelling or revoking access.
- Measure ROI: track monthly saved cost vs operational hours spent — present this at the next FinOps review.
Where to find the template
The full GitHub-ready template is structured for forking: connectors you can copy, a normalization schema, orchestration examples, dashboard JSONs, and alert rules. Clone, run the onboarding checklist, and adapt metadata fields to your org.
Final notes and call-to-action
In 2026, saving on SaaS is no longer a purely procurement exercise — it’s an operational and engineering problem. This starter template turns scattered billing signals into a reproducible pipeline: ingest, normalize, detect, alert, and remediate. Fork it, run it, and share your custom connectors back to the community to accelerate others.
Ready to fork and run? Download the template on GitHub, run the onboarding checklist, and open an issue in the repo with your biggest blocker — we’ll add a connector or an example to the starter within a week.