HubSpot Contact Scoring Auditor
AI-powered audit that analyzes your HubSpot lead scoring model's accuracy, identifies false positives and negatives, and recommends specific calibration adjustments.
The audit runs as a 24-node n8n workflow with 4 agents (Fetcher, Assembler, Analyst, Formatter), started by a monthly Schedule Trigger (1st of the month, 10:00 UTC) or a manual Webhook. The Fetcher paginates the HubSpot API for contacts with a lead_score and their associated deals with outcomes. The Assembler pre-computes the confusion matrix, per-segment accuracy, and score distributions. The Analyst produces a 5-dimension scoring model audit with a Model Health classification. The Formatter generates a Notion audit report and a Slack summary with conditional urgency routing. The architecture is single-model AGGREGATE: both LLM agents (Analyst and Formatter) run on the same model, at $0.08-$0.10/run. The blueprint came from a marketing ops team that realized their lead scoring model had not been recalibrated in 18 months. The auditor evaluates scoring accuracy against actual conversion data and flags where the model has drifted.
Last updated March 16, 2026
CRM migrations and system integrations introduce data quality debt that compounds over months. Duplicate records, orphaned contacts, and inconsistent field mappings degrade every downstream workflow. Automated governance catches drift before it corrupts pipeline reporting.
Four Agents. Five Dimensions. Monthly Scoring Model Governance.
Step 1 — Fetcher
Schedule + Code
Schedule Trigger fires monthly (1st of month 10:00 UTC) or manual Webhook for on-demand audits. Config Loader reads LOOKBACK_DAYS, SCORE_FIELD, CRITICAL_ACCURACY_THRESHOLD, MIN_CONTACTS, NOTION_DATABASE_ID, SLACK_CHANNEL. Fetcher paginates HubSpot API for contacts with lead_score property and associated deals with outcomes (won/lost/open).
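To make the Fetcher's pagination concrete, here is a minimal sketch of a cursor-following loop. It is not the shipped node code. The page shape (`results` plus `paging.next.after`) is assumed to match the cursor-style responses of HubSpot's CRM list endpoints, and `iter_contacts`, `fake_fetch`, and `max_pages` are hypothetical names used only for illustration. The HTTP call is injected as a function so the loop can be exercised without credentials.

```python
from typing import Callable, Iterator, Optional

# Assumed page shape: {"results": [...], "paging": {"next": {"after": "<cursor>"}}}
Page = dict
FetchPage = Callable[[Optional[str]], Page]


def iter_contacts(fetch_page: FetchPage, max_pages: int = 1000) -> Iterator[dict]:
    """Yield every record by following the cursor until no next page exists."""
    after: Optional[str] = None
    for _ in range(max_pages):  # hard cap guards against a cursor that never ends
        page = fetch_page(after)
        yield from page.get("results", [])
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            return
    raise RuntimeError("max_pages reached before the cursor was exhausted")


def fake_fetch(after: Optional[str]) -> Page:
    """Stand-in for the real HTTP request: two pages of made-up contacts."""
    if after is None:
        return {
            "results": [{"id": "1", "properties": {"lead_score": "82"}}],
            "paging": {"next": {"after": "cursor-2"}},
        }
    return {"results": [{"id": "2", "properties": {"lead_score": "35"}}]}


contacts = list(iter_contacts(fake_fetch))
```

In a real run, `fetch_page` would wrap the authenticated request and pass the cursor back as the `after` query parameter. The page cap is a defensive choice, not something the blueprint documents.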
Step 2 — Assembler
Code-only
The Assembler pre-computes all math before any LLM call: confusion matrix (true positives, true negatives, false positives, false negatives), overall accuracy, per-segment accuracy by industry and persona, score distribution analysis, and threshold calibration metrics. A Data Threshold Gate enforces a minimum of 50 contacts before proceeding to the Analyst.
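The confusion matrix and the Data Threshold Gate can be sketched in a few lines. This is an illustration under stated assumptions, not the blueprint's code: a contact counts as a predicted win when its score is at or above a cutoff, open deals are skipped because they have no outcome yet, and the gate counts only resolved contacts. The field names `lead_score` and `outcome` and the functions `build_matrix` and `passes_data_gate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int = 0  # scored at or above the cutoff, deal won
    tn: int = 0  # scored below the cutoff, deal lost
    fp: int = 0  # scored at or above the cutoff, deal lost
    fn: int = 0  # scored below the cutoff, deal won

    @property
    def total(self) -> int:
        return self.tp + self.tn + self.fp + self.fn

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / self.total if self.total else 0.0


def build_matrix(contacts: list, cutoff: float) -> ConfusionMatrix:
    """Tally predictions against deal outcomes; open deals are ignored (assumption)."""
    matrix = ConfusionMatrix()
    for contact in contacts:
        if contact["outcome"] == "open":
            continue
        predicted_win = contact["lead_score"] >= cutoff
        won = contact["outcome"] == "won"
        if predicted_win and won:
            matrix.tp += 1
        elif predicted_win:
            matrix.fp += 1
        elif won:
            matrix.fn += 1
        else:
            matrix.tn += 1
    return matrix


def passes_data_gate(matrix: ConfusionMatrix, min_contacts: int = 50) -> bool:
    """Block the LLM call when there is too little resolved data to audit."""
    return matrix.total >= min_contacts


sample = [
    {"lead_score": 80, "outcome": "won"},
    {"lead_score": 75, "outcome": "lost"},
    {"lead_score": 20, "outcome": "lost"},
    {"lead_score": 30, "outcome": "won"},
    {"lead_score": 90, "outcome": "open"},
]
matrix = build_matrix(sample, cutoff=50)
```

With this sample the matrix holds one contact in each cell, so accuracy is 0.5, and the gate fails at the default minimum of 50.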
Step 3 — Analyst
Tier 2 Classification
This step exists because raw data alone is not enough. The Analyst receives ONE aggregate call with pre-computed metrics and produces a 5-dimension scoring model audit: false_positives (overvalued contacts), false_negatives (undervalued contacts), threshold_calibration (optimal MQL/SQL cutoff), segment_blind_spots (systematic failures by industry/persona), feature_decay (scoring criteria that no longer correlate). It classifies Model Health as HEALTHY (>75%), NEEDS_TUNING (60-75%), or CRITICAL (<60%).
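The Model Health bands map directly to a small function. The sketch below assumes accuracy is expressed as a percentage and that the band edges are read literally, so exactly 75% falls in NEEDS_TUNING and exactly 60% does too. The function name is hypothetical.

```python
def classify_model_health(accuracy_pct: float) -> str:
    """Map overall accuracy (0-100) to the documented Model Health bands."""
    if accuracy_pct > 75:
        return "HEALTHY"
    if accuracy_pct >= 60:
        return "NEEDS_TUNING"
    return "CRITICAL"


labels = [classify_model_health(a) for a in (82.0, 75.0, 60.0, 59.9)]
```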
Step 4 — Formatter
Tier 2 Classification
Without this step, upstream analysis sits idle. The Formatter generates a Notion audit report page (executive summary, per-dimension analysis, calibration recommendations) and a Slack summary message. Conditional urgency routing: accuracy below CRITICAL_ACCURACY_THRESHOLD (default 60%) triggers an urgent calibration alert in Slack in addition to the standard summary.
We price by pipeline complexity, not integration count. A $349 blueprint reflects 3x more prompt engineering than a $199 one.
That's the full pipeline. Here's what it intentionally does NOT do — and why those boundaries exist.
What It Does NOT Do
Does not score individual leads — that is what Inbound Lead Qualifier does
Does not monitor account health — that is what Account Health Intelligence Agent does
Does not coach sales reps — that is what Sales Rep Performance Coach does
Does not modify HubSpot lead scores or contact data — read-only audit with Notion and Slack output
Does not scrape external websites — all data from HubSpot API
Does not analyze individual deal outcomes — provides aggregate scoring model accuracy assessment
With those boundaries clear, here's everything that ships when you purchase.
The Complete Customer Success Bundle
7 files.
The technical specifications below are ITP-measured, not estimated.
Tested. Measured. Documented.
Every metric is Independent Test Protocol (ITP)-measured. The HubSpot Contact Scoring Auditor turns your lead scoring data into a monthly accuracy audit — pre-computing confusion matrix and per-segment accuracy, then generating a 5-dimension scoring model audit with Model Health classification and calibration recommendations at $0.08-$0.10/run.
Workflow Nodes
24
Blueprint Quality Standard
12/12 PASS
Agent Architecture
4 agents: Fetcher (code-only), Assembler (code-only), Analyst (Sonnet 4.6), Formatter (Sonnet 4.6)
Required Credentials
Anthropic API, HubSpot (OAuth2), Notion (httpHeaderAuth), Slack (httpHeaderAuth)
Bundle Contents
7 files
Cost per Run
$0.08-$0.10 (ITP-measured)
ITP Milestones
8/8 variations, 14/14 milestones PASS
n8n Compatibility
2.7.5
Tested on n8n v2.7.5, March 2026
HubSpot Contact Scoring Auditor v1.0.0: Technical Reference
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Architecture: 24 n8n nodes, 4 agents (Fetcher → Assembler → Analyst → Formatter)
Trigger: Monthly schedule (1st of month, 10:00 UTC) or manual Webhook
Input: HubSpot API (contacts with lead_score + associated deal outcomes)
Intelligence: Sonnet 4.6 (Analyst 5-dimension audit + Formatter report/alert)
Output: Notion (audit report page) + Slack (summary + conditional urgent alert)
Cost: $0.08-$0.10/run (ITP-measured average)
ITP: 8 variations, 14/14 milestones PASS
BQS: 12/12 PASS
Tool A: HubSpot (input: contacts + deal outcomes via OAuth2 API)
Tool B: Notion (output: audit report page via httpHeaderAuth API)
Tool C: Slack (output: summary + calibration alert via Bot Token)
Intelligence: 5-dimension audit taxonomy + AGGREGATE pattern (single Analyst call)
Cost Value: 0.09
What You'll Need
Platform
n8n 2.7.5+
Est. Monthly API Cost
$0.08-$0.10/month (monthly runs) + HubSpot/Notion/Slack included tiers
Credentials Required
- Anthropic API
- HubSpot (OAuth2)
- Notion (httpHeaderAuth, Bearer prefix)
- Slack (httpHeaderAuth, Bearer prefix, chat:write scope)
Services
- HubSpot account (OAuth2 with contacts and deals scopes, lead scoring enabled)
- Notion workspace (integration token with Bearer prefix)
- Slack workspace (Bot Token with chat:write scope)
- Anthropic API key
Setup Track
Quick Start
~15 min
All credentials live, n8n running
Full Setup
1–2 hrs
Needs API config + tables
From Scratch
2–4 hrs
No n8n, no credentials
HubSpot Contact Scoring Auditor v1.0.0
$249
one-time purchase
What you get:
- ✓ ITP-tested 24-node n8n workflow, ready to import and deploy
- ✓ Monthly Schedule Trigger (1st of month 10:00 UTC) or manual Webhook for on-demand audits
- ✓ HubSpot API pagination for contacts with lead_score and associated deal outcomes
- ✓ Pre-computed confusion matrix, per-segment accuracy, and score distribution analysis
- ✓ 5-dimension scoring audit: false_positives, false_negatives, threshold_calibration, segment_blind_spots, feature_decay
- ✓ Model Health classification: HEALTHY (>75%), NEEDS_TUNING (60-75%), CRITICAL (<60%)
- ✓ Specific calibration adjustment recommendations grounded in data
- ✓ Notion audit report page with executive summary and per-dimension analysis
- ✓ Slack summary with conditional urgency routing (accuracy below threshold triggers urgent alert)
- ✓ AGGREGATE architecture: one Analyst call and one Formatter call, $0.08-$0.10/run regardless of contact count
- ✓ Single-model design: Analyst and Formatter both run on Sonnet 4.6, no Opus required
- ✓ ITP 8 variations, 14/14 milestones, $0.08-$0.10/run measured
- ✓ All sales final after download
Frequently Asked Questions
How does it differ from Account Health Intelligence Agent?
Different units and taxonomies. Account Health Intelligence Agent (AHIA) monitors per-account health from HubSpot engagement signals and classifies accounts as at_risk, declining, stable, or growing. HubSpot Contact Scoring Auditor (HCSA) audits the lead scoring model itself for accuracy and recommends calibration adjustments.
What are the five audit dimensions?
false_positives: high-scored contacts that lost or never converted, revealing what the model overvalues. false_negatives: low-scored contacts that won, revealing what the model undervalues. threshold_calibration: the optimal score cutoff for MQL/SQL routing. segment_blind_spots: systematic scoring failures by industry or persona. feature_decay: scoring criteria that no longer correlate with outcomes.
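For the threshold_calibration dimension, one plausible way to pre-compute a candidate cutoff is to sweep a grid of scores and keep the one with the highest accuracy. The blueprint does not say which method it uses, so treat this as an assumption-laden sketch. `best_cutoff`, the `(score, won)` tuple shape, and the 10-point grid are all illustrative choices.

```python
def best_cutoff(scored_outcomes, candidates):
    """Return (cutoff, accuracy) with the highest accuracy; ties go to the lowest cutoff."""
    if not scored_outcomes:
        raise ValueError("need at least one resolved contact")
    best = (None, -1.0)
    for cutoff in sorted(candidates):
        # A prediction is correct when "score >= cutoff" matches "deal was won".
        correct = sum((score >= cutoff) == won for score, won in scored_outcomes)
        accuracy = correct / len(scored_outcomes)
        if accuracy > best[1]:
            best = (cutoff, accuracy)
    return best


# Made-up (lead_score, deal_won) pairs for illustration only.
history = [(90, True), (80, True), (70, False), (60, True), (40, False), (30, False)]
cutoff, accuracy = best_cutoff(history, range(0, 101, 10))
```

On this sample the sweep settles on a cutoff of 50, which classifies five of the six contacts correctly. Accuracy is only one possible objective. A team that cares more about missed winners than wasted follow-ups might weight false negatives more heavily.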
What is Model Health classification?
HEALTHY (accuracy above 75%) means the scoring model is performing well with minor adjustments recommended. NEEDS_TUNING (60-75%) means measurable accuracy gaps exist and specific recalibration is recommended. CRITICAL (below 60%) means the model is misrouting leads consistently and triggers an urgent calibration alert in Slack.
How does it relate to Inbound Lead Qualifier?
Meta-level analysis. Inbound Lead Qualifier (ILQ) scores individual inbound leads against an ICP using lead_utility_score. HubSpot Contact Scoring Auditor (HCSA) audits the scoring model that ILQ and similar tools rely on.
Why is it so cheap at $0.08-$0.10/run?
AGGREGATE architecture. The Assembler pre-computes all math (confusion matrix, accuracy, per-segment breakdowns, score distributions) in code-only nodes before any LLM call. The Analyst receives ONE call with aggregate metrics — not per-contact analysis.
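The cost claim rests on the payload staying the same size no matter how many contacts are fetched. The sketch below shows the idea with per-segment accuracy and a score histogram. It assumes scores in the 0-100 range, a fixed cutoff of 50, and an `industry` field for segmentation. `build_aggregate_payload` and all field names are hypothetical, not the blueprint's actual schema.

```python
import json
from collections import Counter, defaultdict


def build_aggregate_payload(contacts, cutoff=50, bucket_width=20):
    """Reduce any number of contacts to a fixed-shape summary for one LLM call."""
    buckets = Counter()
    correct = defaultdict(int)
    resolved = defaultdict(int)
    for c in contacts:
        # Score histogram; the top bucket is closed so a score of 100 lands in 80-100.
        low = min(int(c["lead_score"]) // bucket_width * bucket_width, 100 - bucket_width)
        buckets[f"{low}-{low + bucket_width}"] += 1
        if c["outcome"] in ("won", "lost"):  # open deals have no outcome to check
            resolved[c["industry"]] += 1
            predicted_win = c["lead_score"] >= cutoff
            correct[c["industry"]] += predicted_win == (c["outcome"] == "won")
    return {
        "contacts": len(contacts),
        "score_distribution": dict(sorted(buckets.items())),
        "segment_accuracy": {seg: round(correct[seg] / n, 3) for seg, n in resolved.items()},
    }


sample = [
    {"lead_score": 85, "industry": "saas", "outcome": "won"},
    {"lead_score": 70, "industry": "saas", "outcome": "lost"},
    {"lead_score": 25, "industry": "retail", "outcome": "lost"},
    {"lead_score": 40, "industry": "retail", "outcome": "lost"},
]
small = build_aggregate_payload(sample)
large = build_aggregate_payload(sample * 1000)  # 4,000 contacts, same shape
size_growth = len(json.dumps(large)) - len(json.dumps(small))
```

Going from 4 contacts to 4,000 changes only the counts inside the payload, so the serialized prompt grows by a handful of characters. The number of keys, and so the number of LLM calls, stays fixed.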
How many contacts can it handle?
The Fetcher paginates the HubSpot API to collect all contacts with a lead_score property and associated deals. The Assembler computes metrics in code — no LLM token limit applies to the data processing. The Analyst receives aggregate metrics, not raw contact data.
Does it use web scraping?
No. All data comes from the HubSpot API: contact records with lead_score properties and associated deal records with outcomes. No web_search, no external data sources, no scraping.
What triggers the urgent calibration alert?
When the overall scoring model accuracy falls below the CRITICAL_ACCURACY_THRESHOLD (default 60%), the Formatter sends an additional urgent calibration alert to Slack alongside the standard summary. This threshold is configurable in the Config Loader.
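The routing rule is simple enough to sketch: always send the summary, and add the alert only when accuracy is under the threshold. The sketch assumes the threshold and channel arrive from the Config Loader as plain values. `slack_messages`, the message wording, and the `#marketing-ops` channel are hypothetical. The `channel` and `text` keys follow the usual shape of a Slack message payload.

```python
def slack_messages(accuracy_pct, summary, critical_threshold=60.0, channel="#marketing-ops"):
    """Return the Slack payloads for one run: summary always, alert only when critical."""
    messages = [{"channel": channel, "text": summary}]
    if accuracy_pct < critical_threshold:
        alert = (
            f"Urgent calibration alert: scoring accuracy is {accuracy_pct:.1f}%, "
            f"below the {critical_threshold:.0f}% threshold."
        )
        messages.append({"channel": channel, "text": alert})
    return messages


healthy_run = slack_messages(78.4, "Monthly scoring audit: HEALTHY")
critical_run = slack_messages(54.2, "Monthly scoring audit: CRITICAL")
```

Raising `critical_threshold` makes the alert fire sooner. A run at exactly the threshold does not trigger it, because the comparison is strict.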
Is there a refund policy?
All sales are final after download. Review the Blueprint Dependency Matrix and prerequisites before purchase.
How does this compare to HubSpot's built-in lead scoring?
HubSpot's native scoring uses point-based rules you configure manually. This blueprint audits your existing scoring by comparing scores against actual deal outcomes — it tells you whether your scoring rules are predicting reality.
Related Blueprints
Account Health Intelligence Agent
Weekly AI health briefs for every account.
Inbound Lead Qualifier
Qualify inbound form leads with a 3-agent ILQ scoring pipeline — web research, 4-criteria scoring, and automatic Pipedrive routing.
Sales Rep Performance Coach
Weekly AI coaching briefs for every sales rep.