System Architecture
The pipeline is a split-agent, disk-persisted design. Raw data is always written to disk before reports are generated — this prevents context-window overflows and makes individual stages re-runnable independently.
Pipeline flow: Data Pull → JSON/MD on disk → Report Gen (per engineer) → 00_MASTER.html, with context-enrichment.md as the context layer and the Weekly Checklist driving the cadence.
This is the confirmed working deploy protocol as of March 15, 2026. The pipeline is fully automated — no manual data entry. Follow these steps top to bottom every Monday morning.
Background Bash commands lose macOS keychain authentication. Running collect_github.py in background Bash returns 0 PRs for every engineer silently. Always run it interactively through a Claude agent (ask Claude Code: "run python3 scripts/collect_github.py").
- Confirm you are in the right directory: cd "/Users/lokicheema/Desktop/Hemut Files/Dashboard"
- Confirm no orphaned collection processes: ps aux | grep collect_github | grep -v grep — if any found, kill them: pkill -f collect_github.py
- Check engineers.json — any new hires or terminations this week? Update before running.
- Check if GitHub data was already pulled today: python3 -c "import json; d=json.load(open('data/github.json')); print(d.get('generated_at',''))" — if today's date shows, skip to Step 2.
Ask Claude Code to run this command. Do NOT run it as a background Bash command or in a separate terminal.
python3 scripts/collect_github.py
- Output shows timestamped progress per engineer: [HH:MM:SS] ✓ sumedh — 250 PRs, 4 repos, 8.2s
- After completion, verify data was written:
# Check 1: file was updated today
ls -lh data/github.json
# Check 2: data has real PR counts (must be > 0)
python3 -c "import json; d=json.load(open('data/github.json')); print('Total PRs:', sum(v.get('total_prs',0) for v in d['engineers'].values()))"
- Expected output: Total PRs: [some number above 0]. If 0, see Known Gotchas section (Issue G3).
- Expected file size: >100 KB. If 19 KB, data is empty — see Issue G4.
Check generated_at in data/github.json. If it matches today's date AND no new repos were added to engineers.json, skip directly to Step 2 with --skip-github.
Standard weekly run (after fresh GitHub collection, or after any score_prs.py change):
cd "/Users/lokicheema/Desktop/Hemut Files/Dashboard"
./run.sh --skip-github --rebuild-cache --deploy
Fast re-run (github.json already pulled today, no scoring logic changes):
./run.sh --skip-github --deploy
# saves ~5 min and ~$0.08 vs --rebuild-cache
Fix-only re-run (after fixing generate_dashboard.py bugs, no new data needed):
./run.sh --skip-github --deploy
# no --rebuild-cache — re-uses scored data, only re-generates HTML
What the pipeline does (8–12 min total)
- Step 2.1–2.7: CSV processing, PR scoring, data merge (~2 min)
- Step 2.8: GPT-4o-mini scores PRs — silent for 3–5 min, this is normal
- Step 2.9: Perplexity coaching queries — silent for 2–3 min, this is normal
- Step 2.10: HTML generation (~1 min, generates 101+ files)
- Step 2.11: Vercel deploy (~1 min)
PR scoring (step 2.8) and coaching generation (step 2.9) run batch API calls with no intermediate output. Wait up to 5 min before concluding something is wrong. Only abort if no output for >15 min total.
This step is mandatory after every deploy. The dashboard is used in real 1:1 performance conversations — wrong numbers mean wrong assessments. See Section 13 for the full QA protocol and what to check.
- Spin up 5 Haiku agents in parallel (one per batch)
- Batch 1: sumedh.html, ashish.html, a2z2.html, avetis.html
- Batch 2: federico.html, pranav.html, aaryan.html, harshiv.html
- Batch 3: andrewfe.html, kalpan.html, darius.html, gabriele.html
- Batch 4: tarun.html, tyler.html, justin.html, evan.html
- Batch 5: sprint.html, team.html, spend.html
- If QA found rendering bugs (float formatting, wrong counts, etc.): fix generate_dashboard.py
- Re-deploy without re-scoring: ./run.sh --skip-github --deploy
- If QA found data bugs (wrong PR counts, wrong spend): check source data files in data/ first
- Re-run QA after every fix until all batches pass
Agent 1 Prompt Template — Data Pull
Agent 1 pulls all raw GitHub data for one engineer and saves it to disk. It makes live API calls and writes structured JSON/MD. Never combine Agent 1 and Agent 2 into a single agent — see Known Issues #4.
Variables to Fill
Repos to Always Check
# All 16 repos in scope — always check all of them
Hemut-Prod
Voice-AI-Element
zones
RFP-Module
Web_Scrapers
hemut-sonar
tracking_backend
QuoteFlow_Backend
hemutchat
Hemut-News
SharedInfraLayer
MobileAppUpdated
Helm
Hemut-MCP
Hemut-Library
Hemut-Website
Skip Logins (Always Exclude)
# These are bots/service accounts — never count as engineer activity
hemut-swe # Andrew FE Zhang's service account (zhang@hemut.com)
dependabot[bot]
use-tusk[bot]
Core Agent 1 Commands
# 1. Pull all merged PRs in window (run per repo)
gh pr list \
--repo Hemut2025/[REPO] \
--state merged \
--author [GITHUB_HANDLE] \
--json number,title,additions,deletions,mergedAt,body,labels,createdAt,headRefName \
--limit 300 \
| jq '[.[] | select(.mergedAt >= "[START_DATE]" and .mergedAt <= "[END_DATE]")]'
# 2. Per-PR detail: files changed
gh api repos/Hemut2025/[REPO]/pulls/[PR_NUMBER]/files
# 3. Per-PR detail: reviews received
gh api repos/Hemut2025/[REPO]/pulls/[PR_NUMBER]/reviews
# 4. Per-PR detail: commits
gh api repos/Hemut2025/[REPO]/pulls/[PR_NUMBER]/commits
# 5. CI runs for this engineer
gh api "repos/Hemut2025/[REPO]/actions/runs?actor=[GITHUB_HANDLE]&per_page=100"
# 6. Direct commits (not via PR)
gh api "repos/Hemut2025/[REPO]/commits?author=[GITHUB_HANDLE]&since=[START_DATE]T00:00:00Z&until=[END_DATE]T23:59:59Z&per_page=100"
# 7. Pre-window activity check (CRITICAL for tenure correction)
gh api "repos/Hemut2025/[REPO]/commits?author=[GITHUB_HANDLE]&until=[START_DATE]T00:00:00Z&per_page=1" \
--jq '.[0].commit.author.date'
# 8. Reviews given by this engineer (not received)
gh api "repos/Hemut2025/[REPO]/pulls/[PR_NUMBER]/reviews" \
| jq '[.[] | select(.user.login == "[GITHUB_HANDLE]")]'
# 9. Check rate limit before and after heavy batches
gh api rate_limit --jq '.resources.core | {remaining, reset}'
>100 PRs: sample every 5th PR for files/reviews/commits detail. 50–100 PRs: sample every 3rd PR. <50 PRs: pull detail for all PRs.
Output JSON Structure
{
"engineer": {
"name": "[ENGINEER_NAME]",
"github": "[GITHUB_HANDLE]",
"role": "[ROLE]",
"monthly_salary": [MONTHLY_SALARY],
"known_data": {}
},
"repos_checked": [], // list of all 16 repos, even those with 0 PRs
"errors": {}, // any repos that 404'd or errored
"prs": [], // full PR objects with files/reviews/commits embedded
"ci_runs": [], // CI run objects per repo
"reviews_given": [], // PRs this engineer reviewed (authored by others)
"direct_commits": [], // commits not attached to a PR
"pre_window_first_commit": null, // ISO date string if exists; null = new hire
"pull_timestamp": "", // ISO datetime when Agent 1 ran
"window": {
"start": "[START_DATE]",
"end": "[END_DATE]"
}
}
Completion Signal
# Agent 1 must end with this exact line:
AGENT 1 COMPLETE — [NUMBER] PRs across [REPOS] repos. Saved to data/[XX]_[Name]-raw-data.json
File saves to: Outputs/2026-03-14_Engineer-Deep-Dives/data/[REPORT_NUMBER]_[firstname-lastname]-raw-data.json
Example: data/01_sumedh-kane-raw.json
Agent 2 Prompt Template — Report Generation
Agent 2 reads only from disk — zero live API calls. It reads the JSON produced by Agent 1 and generates the full HTML deep-dive report.
Agent 2 is a pure read-from-disk → write-HTML transformer. Any live calls violate the split-agent contract and make the stage non-repeatable.
Input / Output
# INPUT
data/[XX]_[Name]-raw-data.json # or .md for older format files
# OUTPUT
[XX]_[Name]-Deep-Dive.html # saved to Outputs/2026-03-14_Engineer-Deep-Dives/
Required Report Sections
Required 12 Charts (Chart.js CDN)
# Chart.js CDN (always use this exact URL)
https://cdn.jsdelivr.net/npm/chart.js
# The 12 required charts:
1. Weekly velocity line (13 weeks + team avg + frontier reference lines)
2. Additions vs deletions stacked area chart
3. Work type donut (Feature / Bug / Refactor / Infra / Chore / Test / Docs)
4. Time-of-day heatmap (7 days × 24 hours, derived from commit timestamps)
5. 90-day commit calendar (13×7 HTML grid — NOT Chart.js, pure HTML/CSS)
6. AI spend vs PR output dual-axis bar+line
7. PR size histogram (bins: XS <50, S 50-200, M 200-500, L 500-2K, XL >2K lines)
8. Depth/complexity trend over 13 weeks
9. Cost efficiency trend ($/PR vs $9.15 frontier threshold)
10. Codebase coverage radar (one axis per repo)
11. PR lifecycle funnel (opened → reviewed → merged, with day counts)
12. Review quality scatter (reviews given vs review depth)
Design Constants
/* CSS variables — use these in every report */
--bg: #0f1724;
--surface: #1B2A4A;
--gold: #C9A84C;
--green: #2ecc71;
--red: #e74c3c;
/* Fonts (Google Fonts CDN) */
IBM Plex Sans — body, UI
IBM Plex Mono — code, metrics, chart labels
/* Google Fonts URL */
https://fonts.googleapis.com/css2?family=IBM+Plex+Mono:wght@400;500;600&family=IBM+Plex+Sans:wght@300;400;500;600;700&display=swap
Tenure Correction Rule — CRITICAL
Check pre_window_first_commit in the JSON.
If NOT null → engineer existed before the window. Any gap at the start of the 90-day window is a real dark period. Use the full 13-week denominator.
If null → engineer is new. Adjust all velocity metrics: PRs / (days_since_first_PR_in_window / 7), NOT PRs / 13 weeks. Reporting a new hire's pre-hire weeks as "inactive" is a critical error.
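The tenure rule above can be sketched in Python. This is a minimal illustration, not the pipeline's actual code; the function and parameter names are assumptions, and `pre_window_first_commit` follows the JSON field defined in the Agent 1 output structure.

```python
from datetime import date

def weekly_velocity(prs, window_start, window_end, first_pr_date, pre_window_first_commit):
    """Tenure-corrected weekly PR velocity (illustrative sketch).

    pre_window_first_commit: ISO date string, or None for a new hire,
    matching the Agent 1 JSON field of the same name.
    """
    if pre_window_first_commit is not None:
        # Established engineer: full window denominator; early gaps are real dark periods.
        weeks = (window_end - window_start).days / 7
    else:
        # New hire: count only from their first PR inside the window.
        weeks = max((window_end - first_pr_date).days / 7, 1.0)
    return len(prs) / weeks
```

For a 91-day (13-week) window, an established engineer with 13 PRs scores 1.0 PR/week, while a new hire with 8 PRs over their first 4 weeks scores 2.0, not 8/13.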
Enrichment Agent Prompt
The Enrichment Agent runs once per sprint via MCP tools. No gh CLI needed. It provides the context layer — Linear ticket linkage, Sentry errors, Slack signals, and Otter meeting mentions — that GitHub data alone cannot provide.
Output Files
data/linear-data.md — teams, cycles, issue counts, ticket linkage per engineer
data/sentry-data.md — open issues per project, error counts, top issues
data/meetings-calendar-data.md — relevant Otter transcripts with engineer mentions
data/context-enrichment.md — combined synthesis of all three sources above
Linear Pulls (via MCP)
# MCP tool: mcp__linear__list_teams
# → Get all team IDs and names
# MCP tool: mcp__linear__list_cycles (per team)
# → For each engineering team: SWAG, ASH, CRODIE, GUDA, ADT
# → Identify current active cycle
# MCP tool: mcp__linear__list_issues (per team, limit 50)
# → For each team: pull 50 most recent issues
# → Current cycle completion rate = completedIssueCount / totalIssueCount
# Ticket linkage per engineer:
# Count issues where gitBranchName CONTAINS engineer's github_handle
# This is the primary ticket-linkage metric
# Save as: data/linear-data.md
Sentry Pulls (via MCP)
# MCP tool: mcp__sentry__search_issues
Organization: hemut
# Projects to check:
command-backend
command-frontend
hemut-quoteflow-backend
# Query: unresolved errors in the last 90 days
Query: "is:unresolved"
Date range: last 90 days
# Save: count of unresolved issues, top 10 by event count
# Save as: data/sentry-data.md
Slack Pulls (via MCP)
# MCP tool: mcp__slack__slack_search_public_and_private
# Primary channel query
Query: "in:#hemutdevs after:[DATE_30_DAYS_AGO]"
# What to look for and save:
# - Sprint retrospective updates or announcements
# - Engineer shoutouts or performance callouts
# - Blockers or incident reports
# - Any anomalies (sudden silence, repeated issues)
# - Messages mentioning specific engineers by @handle or name
# Save relevant messages with: sender, timestamp, channel, full text
# Save as: data/context-enrichment.md (Slack section)
Otter Pulls (via MCP)
# Step 1: List recent transcripts
# MCP tool: mcp__otter__otter_list_transcripts
# → Get last 20 transcripts
# Step 2: For each transcript with engineer names in summary
# MCP tool: mcp__otter__otter_get_transcript
# → Pull full transcript text
# Key patterns to extract and save:
# - Offer discussions (comp, equity, timeline mentions)
# - Performance mentions (positive or corrective)
# - Technical decisions or architectural discussions
# - Sprint planning decisions that affect workload
# Save as: data/meetings-calendar-data.md
Master Synthesis Agent Prompt
The Synthesis Agent reads ALL data files and produces the single master HTML. It does no live data pulls — everything comes from disk. This is the most context-heavy agent in the pipeline; give it a full context window.
Input Files (read all before generating)
# Raw data files (all 26)
data/01_sumedh-kane-raw.json
data/02_Ashish-Lathkar-raw-data.json
data/03_... through data/26_...
# Enrichment context
data/context-enrichment.md
data/linear-data.md
data/sentry-data.md
data/meetings-calendar-data.md
# Benchmark reference
data/elite-benchmarks.md ← frontier benchmarks (update manually quarterly)
# Engineer configuration
/Users/lokicheema/Desktop/Hemut\ Files/Dashboard/engineers.json
Required Output Sections
Section 04 — Systemic Issues (always include all 4)
- Ticket linkage gaps — engineers without Linear branch name linkage (invisible in product tracking)
- Review concentration — is one person reviewing most PRs? Bus factor risk.
- AI tooling gaps — engineers with $0 AI spend vs peers. Are they blocked or unaware?
- PR description discipline — % of PRs with <20 chars description. Affects reviewability and on-call debugging.
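The description-discipline metric can be sketched as follows. This is a hypothetical helper (not from the pipeline), assuming PR objects carry the `body` field pulled by the `gh pr list --json` commands in this runbook.

```python
def short_description_pct(prs, threshold=20):
    """Percent of PRs whose description is under `threshold` characters.

    Assumes each PR dict has a 'body' field (possibly None), as returned
    by `gh pr list --json ... body ...`.
    """
    if not prs:
        return 0.0
    short = sum(1 for pr in prs if len(pr.get("body") or "") < threshold)
    return 100.0 * short / len(prs)
```

Keep Known Issues in mind when interpreting this number: a bimodal distribution (empty on trivial PRs, detailed on complex ones) is reasonable practice, not negligence.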
Output File
Outputs/2026-03-14_Engineer-Deep-Dives/00_MASTER-Engineering-Synthesis.html
engineers.json Maintenance
This file is the single source of truth for all engineer metadata. Everything downstream — Agent 1 prompts, report generation, the sprint dashboard, billing attribution — reads from it.
/Users/lokicheema/Desktop/Hemut Files/Dashboard/engineers.json
This is NOT inside the Outputs folder. Don't move it.
Schema — Fields Per Engineer
{
"id": "sumedh", // short key, lowercase, no spaces
"name": "Sumedh Kane",
"github_handle": "sumedhkane03", // primary handle (verify by checking commits)
"github_handle_alt": null, // secondary handle if exists
"cursor_email": "sumedh@hemut.com", // email for Cursor/Claude billing attribution
"claude_key_prefix": "sk-ant-...", // prefix in Anthropic billing to match spend
"role": "Founding Engineer",
"monthly_salary": 5000, // integer USD, 0 if unpaid/terminated
"tier": 1, // 1-5 performance tier (1 = top)
"status": "active", // active | terminated | unresolved | external
"notes": "" // alt emails, billing quirks, anything unusual
}
Weekly Maintenance Checklist
- New hire this week? Add entry with all fields filled.
- Termination this week? Change status to "terminated", set monthly_salary to 0.
- Salary change? Update monthly_salary.
- GitHub handle confirmed? Update github_handle and add old one to github_handle_alt.
- Tier change based on corrected metrics? Update tier.
Known Handle Quirks — Permanent Reference
# Federico Lora
Commits from: lora.fed.03@gmail.com (personal) + federico@hemut.com
Both emails → handle: lorafed
# Must search BOTH emails in billing attribution
# Pranav Guda — TWO ACCOUNTS
Primary: PranavGuda (pranav@hemut.com)
Dev/merge: hemoot (developer@hemut.com)
# Pull data for BOTH handles and merge in Agent 1
# Andrew FE Zhang — service account confusion
hemut-swe = zhang@hemut.com = Andrew FE Zhang's service account
Personal handle: zhemut
# Add hemut-swe to skip_logins. Use zhemut for pull.
# TWO ANDREW ZHANGS — completely different people
zhemut = Andrew FE Zhang (founding eng, $5K/mo, zhemut GitHub)
zhangandrew2 = A2Z2 (summer intern, $500/mo, separate person)
# Never merge their data. Always confirm by email.
# Darius Mahjoob — bundled account history
Pre-Oct 2025 activity: under shared Hemut account, unattributable
Individual GitHub history starts: Oct 2025 under DMahjoob
# Use Linear join date as true start date proxy for pre-Oct work
# Bao Tran
GitHub: BaoT1301
Email: bao@hemut.com
# Separate from Pranav — different person entirely
Rate Limit Management
GitHub's authenticated rate limit is 5,000 requests/hour. A full 26-engineer run uses ~1,000–1,300 calls — well within budget. These numbers let you estimate safely.
Call Budget Per Engineer
| Engineer Type | PR Count | Sampling | Est. API Calls |
|---|---|---|---|
| Small | <20 PRs | All PRs, full detail | 15–25 calls |
| Medium | 20–100 PRs | Every 3rd PR for detail | 40–80 calls |
| Large | >100 PRs | Every 5th PR for detail | 60–100 calls |
| Zero/Inactive | 0 PRs | Repo checks only | 5–10 calls |
Total Budget Breakdown (Full 26-Person Run)
| Group | Count | Calls each | Total |
|---|---|---|---|
| Active engineers (medium/large) | 14–16 | ~60–80 | ~900–1,200 |
| Partial/zero engineers | 10–12 | ~5–10 | ~100 |
| TOTAL | 26 | — | ~1,000–1,300 |
Parallelization Rules
# SAFE: Run up to 6 Agent 1s simultaneously
# UNSAFE: Running 7+ — will hit rate limits mid-pull
# If rate limited:
# → Agent will receive HTTP 403 with "rate limit exceeded"
# → Do NOT cancel. Wait for retry. The limit resets one hour after the window opened; check the reset timestamp.
# → Check reset time: gh api rate_limit --jq '.resources.core.reset'
# → Convert unix timestamp to local time to know when to retry
# Stagger if needed: start each Agent 1 with a 30-second gap
# This distributes the burst load
# Check remaining limit before starting a batch:
gh api rate_limit --jq '.resources.core | {remaining, reset}'
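To turn the unix `reset` timestamp into a readable time, a small Python helper works. This is illustrative only; feed it the number printed by the `gh api rate_limit` command above.

```python
from datetime import datetime, timezone

def reset_time(epoch):
    """Convert the rate-limit `reset` unix timestamp into readable UTC
    and machine-local times."""
    utc = datetime.fromtimestamp(epoch, tz=timezone.utc)
    local = datetime.fromtimestamp(epoch)  # uses this machine's local timezone
    return utc.strftime("%H:%M:%S UTC"), local.strftime("%H:%M:%S local")
```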
Sampling Rules
>100 PRs → sample every 5th PR for files/reviews/commits detail
50–100 PRs → sample every 3rd PR
<50 PRs → pull detail for all PRs
Sampling applies only to the per-PR deep-detail calls (files, reviews, commits). Always pull the full PR list metadata for every PR.
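The sampling rule can be sketched as a small helper (the function name is hypothetical, not from collect_github.py):

```python
def detail_indices(total_prs):
    """Indices of PRs to pull full detail for (files/reviews/commits).

    Mirrors the sampling rules above: >100 PRs -> every 5th,
    50-100 -> every 3rd, <50 -> all. The full PR list metadata
    is always pulled for every PR regardless of sampling.
    """
    if total_prs > 100:
        step = 5
    elif total_prs >= 50:
        step = 3
    else:
        step = 1
    return list(range(0, total_prs, step))
```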
Weekly Sprint Dashboard Update
The Sprint Dashboard is the primary weekly tool — updated every sprint, even when you're not running full deep-dives. This is the 45-minute version of the workflow.
Files to Update
Outputs/2026-03-14_AI-Spend-Analytics/Sprint-Dashboard.html
Outputs/2026-03-14_AI-Spend-Analytics/Engineer-90Day-Breakdown.html
Pull Sprint PR Data
# PRs merged this sprint (run per repo, replace SPRINT_START with ISO date)
gh pr list \
--repo Hemut2025/Hemut-Prod \
--state merged \
--json author,title,additions,deletions,mergedAt,labels \
--limit 500 \
| jq '[.[] | select(.mergedAt >= "SPRINT_START")]'
# Aggregate by author: append to the pipeline above (total PRs per engineer this sprint)
| jq 'group_by(.author.login) | map({login: .[0].author.login, count: length})'
# CI pass rate this sprint (per repo)
gh api "repos/Hemut2025/Hemut-Prod/actions/runs?per_page=100" \
--jq '[.workflow_runs[] | {actor: .actor.login, conclusion: .conclusion, created_at: .created_at}]'
# CI pass rate formula:
# success_runs / (success_runs + failure_runs) — exclude skipped/cancelled
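The pass-rate formula above can be sketched as a helper (illustrative; assumes the run objects produced by the `--jq` query above, each with a `conclusion` field):

```python
def ci_pass_rate(runs):
    """CI pass rate: success / (success + failure), excluding
    skipped and cancelled runs, per the formula above."""
    success = sum(1 for r in runs if r.get("conclusion") == "success")
    failure = sum(1 for r in runs if r.get("conclusion") == "failure")
    counted = success + failure
    return success / counted if counted else None
```

Excluding skipped/cancelled runs matters: counting them was exactly the formula mismatch that made the Sprint Dashboard and Agent 1 pulls contradict each other.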
Pull AI Spend Data
# Anthropic Console:
console.anthropic.com → Billing → Usage → Export CSV
# Match rows by cursor_email or claude_key_prefix from engineers.json
# Cursor seats:
cursor.sh → Team Settings → Billing → Export usage
# Match by cursor_email from engineers.json
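The matching logic can be sketched as a hypothetical helper. The billing row keys (`email`, `api_key`, `cost_usd`) are assumptions, not the real export headers; adjust them to the actual CSV. It assumes engineers.json parses to a list of objects matching the schema in this runbook.

```python
def attribute_spend(billing_rows, engineers):
    """Attribute billing rows to engineers via cursor_email or
    claude_key_prefix from engineers.json.

    billing_rows: iterable of dicts (e.g. csv.DictReader over the export).
    Row keys 'email', 'api_key', 'cost_usd' are ASSUMED names.
    """
    totals = {e["id"]: 0.0 for e in engineers}
    for row in billing_rows:
        for e in engineers:
            if (row.get("email") == e.get("cursor_email")
                    or (e.get("claude_key_prefix")
                        and str(row.get("api_key", "")).startswith(e["claude_key_prefix"]))):
                totals[e["id"]] += float(row.get("cost_usd") or 0)
                break  # attribute each row to at most one engineer
    return totals
```

Remember the handle quirks section: some engineers bill under multiple emails, so a single cursor_email match can undercount.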
What to Update in SPRINT_DATA
// In Sprint-Dashboard.html, find the SPRINT_DATA constant and update:
const SPRINT_DATA = {
engineers: [
{
id: "sumedh",
sprint: {
prs: 12, // ← PRs merged this sprint
aiSpend: 48.50, // ← AI spend this sprint in USD
ciPassRate: 0.94 // ← 0.0–1.0
},
ninety: {
prs: 87, // ← rolling 90-day PR total
aiSpend: 312.00, // ← rolling 90-day AI spend
dpr: 6.7 // ← dollars per PR (aiSpend / prs)
},
sprintNote: "" // ← new signals or flags this sprint
},
// ... repeat for all engineers ...
]
}
Sprint-Dashboard.html and Engineer-90Day-Breakdown.html are independent HTML files with their own embedded data objects. Updating one does not update the other.
Output Folder Structure
All engineering analytics live in one canonical folder. Do not rename or move it — the folder name is the baseline date, not the current date.
Do NOT rename the folder each sprint. Overwrite files in place. The folder name 2026-03-14 is the baseline/creation date, not the update date. Use git history if you need to recover a prior version.
Known Issues & Lessons Learned
Hard-won lessons from Sprint 13. Each one cost time to discover. Don't repeat them.
Issue 1: Missing repo coverage. The original Sprint Dashboard tracked only Hemut-Prod and a few other repos. Missing repos caused 5× undercounts for Gabriele, 2× for Darius, 1.5× for Ashish and A2Z2. Engineers appeared low-output when they were actually active — just in repos the dashboard didn't know about.
Issue 2: Tenure misattribution. Engineers who joined mid-window showed "dark periods" at the start of the 90-day window that were actually just their pre-hire period. Reporting a new hire as "50% inactive" is misleading and incorrect.
Issue 3: Bundled account history. Before Hemut issued individual GitHub accounts, the team used a shared account. Darius's work before Oct 2025 is under the bundled account and cannot be individually attributed via GitHub. His GitHub history under DMahjoob starts October 2025.
Issue 4: Combined pull-and-report agents fail. A single agent that both pulls data AND generates the report fails at >50 PRs. The combined context (all PR data + chart generation + HTML writing) exceeds the model's effective context window, causing mid-run errors that lose all previously pulled data. You then have to start from scratch.
Issue 5: Over-parallelization. Running >6 Agent 1s simultaneously causes GitHub rate limit hits mid-pull, so agents either fail or produce partial data. Partial data is worse than no data because it's hard to detect.
Issue 6: Service-account double counting. The hemut-swe GitHub account (zhang@hemut.com) belongs to Andrew FE Zhang but is a service/merge account. If you pull data for both zhemut and hemut-swe, you double-count his output. If you pull for neither, you miss his work. Use only zhemut.
Issue 7: Bimodal PR descriptions. Several engineers have 0-character descriptions on routine PRs but excellent, detailed descriptions on complex ones. Reporting "0 avg description length" misrepresents their behavior. The bimodal distribution is actually a reasonable practice (don't over-document trivial changes).
Issue 8: Inconsistent CI pass-rate formulas. The Sprint Dashboard used a different CI pass rate calculation than the Agent 1 pulls, making the two documents contradict each other. The dashboard counted all run statuses; the agent excluded skipped/cancelled runs. This made the same engineer appear at different pass rates in different views.
Quick Reference — All 26 Handles
Print this and keep it next to your keyboard. Sorted by report number. Verify handles against engineers.json before each run.
zhangandrew2 = A2Z2 Andrew Zhang — SWE INTERN, T3, works on fleet/Hemut-Prod, $500/mo
zhemut = Andrew FE Zhang — FOUNDING ENGINEER, T1, works on RFP Module, $5,000/mo
engineers.json is the source of truth — always look up the handle there. Never resolve from memory. The handle swap caused the worst data integrity failure in dashboard history (146 PRs attributed to wrong person).
| # | Name | GitHub Handle | Email | $/mo | Status |
|---|---|---|---|---|---|
| 01 | Sumedh Kane | sumedhkane03 | sumedh@hemut.com | $5,000 | Active |
| 02 | Ashish Lathkar | Ashishlathkar77 | ashish@hemut.com | $5,000 | Active |
| 03 | A2Z2 Andrew Zhang | zhangandrew2 | a2z2@hemut.com | $500 | Active |
| 04 | Avetis Avagyan | Avetis-Av | avetis@hemut.com | $5,000 | Active |
| 05 | Federico Lora | lorafed | federico@hemut.com | $5,000 | Active (DARK) |
| 06 | Pranav Guda | PranavGuda / hemoot | pranav@hemut.com | $5,000 | Active |
| 07 | Aaryan Jadhav | AaryanJ45 | aaryan@hemut.com | $500 | Active |
| 08 | Harshiv Thakkar | harshiv49 | harshiv@hemut.com | $5,000 | Active |
| 09 | Andrew FE Zhang | zhemut | andrewzhang@hemut.com | $5,000 | Active |
| 10 | Kalpan Bariya | kalpan-hemut | kalpan@hemut.com | $5,000 | Active |
| 11 | Darius Mahjoob | DMahjoob | darius@hemut.com | $500 | Active (remote) |
| 12 | Gabriele Ghione | gabrieleghioneusc | gabriele@hemut.com | $1,000 | Active |
| 13 | Tarun Vadapalli | TeleVision05 | tarun@hemut.com | $5,000 | Active |
| 14 | Tyler Kim | tjsook | tyler@hemut.com | $0 | Unresolved |
| 15 | Justin Jiang | justinjiang37 | justin.jiang@hemut.com | $1,000 | Active |
| 16 | Evan Adami | EvanAd7 | evan@hemut.com | $0 | Part-time |
| 17 | Bao Tran | BaoT1301 | bao@hemut.com | $0 | Unresolved |
| 18 | Sahana | sahanashetty11 | sahana@hemut.com | $0 | TERMINATED |
| 19 | Xavier | 5029xavier | tian.alex98@gmail.com | $0 | External |
| 20 | Ani | aniruthsivakumar00 | ani@daydreams.digital | $0 | External |
| 21 | Alex Kruchten | — | alex@hemut.com | $5,000 | Active (Bain) |
| 22 | Len Moran | — | len@hemut.com | $0 | Active (UCLA) |
| 23 | Aidan Tack | — | tack@hemut.com | $1,500 | Active |
| 24 | George Cole | gt-c | george@hemut.com | $0 | Unresolved |
| 25 | Jan | jmbassi | jan@hemut.com | $0 | Unresolved |
| 26 | Nishant Singh | — | nishant@hemut.com | $0 | Active |
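The strict lookup the warning above demands can be sketched as a hypothetical helper (not the pipeline's actual code), assuming engineers.json parses to a list of objects matching the schema in this runbook:

```python
def resolve_handle(github_handle, engineers):
    """Map a GitHub handle to exactly one engineer entry.

    Checks both github_handle and github_handle_alt, and raises on
    unknown or ambiguous handles rather than guessing. This guards
    against mixups like zhemut vs zhangandrew2.
    """
    matches = [e for e in engineers
               if github_handle in (e.get("github_handle"), e.get("github_handle_alt"))]
    if len(matches) != 1:
        raise ValueError(f"handle {github_handle!r} matched {len(matches)} engineers; "
                         "check engineers.json before attributing data")
    return matches[0]
```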
Status Key
Known Gotchas — Read Before Debugging
These are the failure modes discovered the hard way. Each one is a time sink if you hit it cold. Check this section before spending more than 5 minutes debugging any pipeline failure.
G1: gh auth status false negative. On this machine, gh auth status exits code 1 due to a keyring warning about the hemoot account — even when the active token is fully functional. If you see "not authenticated" during collection, this is almost certainly a false negative.
G2: Concurrent collection race. Two simultaneous processes write to the same checkpoint files (data/github_{id}.json). The second process overwrites the first's valid data with blank data. Result: 0 PRs for all engineers, but the file is still present so you don't notice immediately.
G3: Silent empty results from the commits field. The commits field in gh pr list --json triggers GitHub's GraphQL 500K node limit for repos with large PR histories. The entire query silently returns empty data — no error, no warning, just 0 results. This was the single largest data integrity failure in dashboard history.
G4: Empty github.json despite apparent success. If collection appears to complete but the PR verification check shows 0, check the file size: ls -lh data/github.json. Expected size is >100 KB. A 19 KB file means collection ran but wrote empty data — typically caused by a race condition or auth failure that was not surfaced as an error.
G5: Linear shows 0/0 on engineer pages. This means the data merge step did not wire Linear data into github_scored.json. The merge step runs but may not populate the output file with Linear fields if it fails silently.
G6: Silent scoring and coaching steps. Step 2.8 (GPT-4o-mini PR scoring) is silent for 3–5 minutes. Step 2.9 (Perplexity coaching) is silent for 2–3 minutes. These look like hangs but are batch API calls that produce no intermediate output.
G7: Raw floats in rendered HTML. A dollar value rendering with floating-point noise (e.g., $3,447.2700000000023) means a format_currency() call was missed at an HTML injection point in generate_dashboard.py. Raw Python floats leaking into HTML look deeply unprofessional and would be visible to engineers in 1:1s.
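A minimal sketch of what such a format_currency() helper looks like (illustrative only; the real implementation lives in generate_dashboard.py):

```python
def format_currency(value):
    """Render a dollar amount as $X,XXX.XX, rounding away float
    noise like 3447.2700000000023 before it reaches the HTML."""
    return f"${value:,.2f}"
```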
G8: Stale scoring cache. The scoring cache persists between runs to save API cost. After any change to score_prs.py logic, the cache contains scores computed with the old broken logic — always re-run with --rebuild-cache after touching score_prs.py.
QA Protocol — Mandatory After Every Deploy
Wrong numbers in this dashboard mean wrong performance assessments that affect real people's comp and retention. A coaching item that cites a fabricated PR count or a spend number with floating-point noise will be read by an engineer as a factual statement about their work. Run QA after every single deploy.
QA Process — 5 Parallel Haiku Agents
Run these 5 batches simultaneously. Use Claude Haiku (not Sonnet/Opus) — it is fast, cheap, and sufficient for number-comparison verification. Each agent checks 3–4 engineer pages against the source data files.
| Agent | Pages to Check | Source Data Files |
|---|---|---|
| Batch 1 | sumedh.html, ashish.html, a2z2.html, avetis.html | data/github_scored.json, data/ai_spend.json |
| Batch 2 | federico.html, pranav.html, aaryan.html, harshiv.html | data/github_scored.json, data/ai_spend.json |
| Batch 3 | andrewfe.html, kalpan.html, darius.html, gabriele.html | data/github_scored.json, data/ai_spend.json |
| Batch 4 | tarun.html, tyler.html, justin.html, evan.html | data/github_scored.json, data/ai_spend.json |
| Batch 5 | sprint.html, team.html, spend.html | data/github_scored.json, data/ai_spend.json, data/linear_data.json |
What to Verify Per Engineer Page
- Total PRs: matches data/github_scored.json → engineer → total_prs
- AI spend: shows as $X,XXX.XX (no raw floats like $3,447.2700000000023)
- Linear done/total: shows real numbers, not 0/0 with a "not synced" warning
- Weighted rank: matches github_scored.json → weighted_rank
- Model mix: shows real percentages (not all 0%) — should sum to ~100%
- Coaching items: cite real PR counts from source data, no fabricated course names, no engineer's own name as action subject
- Sprint table: count in header (N) matches KPI tile count
⚠️ ENGINEER PAGE RULE — Never show on individual engineer pages
Individual engineer pages (public/engineers/*.html) must NOT display:
• Monthly or annual salary
• Tier number (T1/T2/T3)
• Output rank (#1 of 25, #14 of 25, etc.)
• Peer compensation comparisons
These metrics are shown to engineers in 1:1s. Displaying them hurts morale. Keep these on sprint.html, team.html, spend.html (manager-only views) instead.
Fast QA Prompt (copy-paste for Haiku agent)
# Paste this to a Claude Haiku agent (one per batch)
Use Haiku model. Read:
1. public/engineers/sumedh.html
2. public/engineers/ashish.html
3. public/engineers/a2z2.html
4. data/github_scored.json
For each engineer, verify:
- Total PRs in HTML matches github_scored.json total_prs
- AI spend is formatted as $X,XXX.XX (no raw floats)
- Linear done/total shows non-zero values
- Weighted rank matches github_scored.json weighted_rank
- Model mix percentages are not all 0%
Output: one line per engineer in format:
[NAME]: PASS or FAIL — [brief note if FAIL]
After QA — Fix and Redeploy
# Fix rendering bug → fast redeploy (no re-scoring needed)
./run.sh --skip-github --deploy
# Fix data bug → check source files in data/ first, then redeploy
# If score_prs.py was changed: --rebuild-cache required
./run.sh --skip-github --rebuild-cache --deploy
What's Built — Current System State (March 15, 2026)
Reference for understanding what each script does and where to look when something breaks.
Pipeline Scripts
| Script | What it does | Run.sh step | Notes |
|---|---|---|---|
| collect_github.py | Pulls all PR data from GitHub API for all engineers | Step 1 (manual, agent) | REBUILT. Parallel batches (5 workers), per-engineer checkpointing, timestamped progress, --resume flag, auth uses gh auth token |
| score_prs.py | Scores each PR by complexity using GPT-4o-mini | Step 2.8 | Caches results. Use --rebuild-cache after any logic change. Deployment PRs (staging→main) score fixed 1.5/XS. |
| merge_all_sources.py | Merges GitHub, Linear, Sentry, AI spend into one data object per engineer | Step 2.7 | Produces github_scored.json. Linear field is completed (not done). |
| generate_coaching.py | Generates Perplexity + GPT-4o-mini coaching items per engineer | Step 2.9 | NEW (Mar 15). 5-layer surgical prompt. Cached by gap signature. Labels items [INDIVIDUAL ACTION] vs [PROCESS FIX]. Must pass total_prs and ticket_linked_pct explicitly. |
| generate_dashboard.py | Generates all HTML files in public/ | Step 2.10 | 101+ HTML files. format_currency() must be applied at every dollar injection point. |
Output Files in public/
public/
├── index.html ← dashboard home / nav hub
├── engineers/
│ ├── sumedh.html ← per-engineer sprint page (25 files)
│ ├── ashish.html
│ └── ... (one per engineer) ...
├── sprint.html ← team-wide sprint view
├── team.html ← org tier map, spend treemap
├── spend.html ← AI spend analytics
├── sentry.html ← production error health
├── linear.html ← issue completion rates
├── code-quality.html ← PR quality metrics
├── WEEKLY-RUNBOOK.html ← this document (also in Dashboard root)
└── archive/ ← superseded legacy files
NOTE: public/**/*.html is gitignored — regenerated on every pipeline run.
Save committed docs (runbook, playbooks) to Dashboard/ root, not public/.
Quarterly Artifacts (Manual — Do NOT Auto-Regenerate)
The 14 engineer deep-dive HTML reports and 00_MASTER-Engineering-Synthesis.html are quarterly artifacts saved to the Dashboard root. Never include them in the weekly pipeline. Never save them to public/ (gitignored). Their value is depth and curation — the weekly pipeline would overwrite them with thinner auto-generated versions.
Dashboard/
├── 00_MASTER-Engineering-Synthesis.html ← quarterly org narrative
├── 01_Sumedh-Kane-Deep-Dive.html ← quarterly deep-dive (14 files)
├── 02_Ashish-Lathkar-Deep-Dive.html
└── ... (one per engineer) ...
Key Data Files
| File | Contents | Updated by |
|---|---|---|
| data/github.json | Raw PR data for all engineers | collect_github.py (manual step 1) |
| data/github_scored.json | Scored + merged data (source of truth for HTML generation) | score_prs.py + merge_all_sources.py |
| data/ai_spend.json | Per-engineer AI spend from Anthropic billing CSV | process_csv.py (Step 2.2) |
| data/linear_data.json | Linear ticket completion rates per engineer | collect_linear.py |
| data/coaching_cache.json | Cached Perplexity + GPT coaching outputs (keyed by gap signature) | generate_coaching.py |
| engineers.json | Single source of truth for all engineer metadata | Manual — update on every hire/termination |
Vercel Deployment
# Deploy is handled by run.sh --deploy flag
# Manual deploy if needed:
cd "/Users/lokicheema/Desktop/Hemut Files/Dashboard"
vercel --prod --yes
# Live URL:
https://public-delta-tawny.vercel.app/
# Runbook URL:
https://public-delta-tawny.vercel.app/WEEKLY-RUNBOOK.html
# After updating WEEKLY-RUNBOOK.html — copy to public/ and deploy:
cp WEEKLY-RUNBOOK.html public/WEEKLY-RUNBOOK.html
vercel --prod --yes