HubSpot Contact Scoring Auditor

AI-powered audit that analyzes your HubSpot lead scoring model's accuracy, identifies false positives and negatives, and recommends specific calibration adjustments.

24-node n8n workflow with 4 agents (Fetcher, Assembler, Analyst, Formatter). Monthly Schedule Trigger (1st of month, 10:00 UTC) or manual Webhook. The Fetcher paginates the HubSpot API for contacts with lead_score and associated deals with outcomes. The Assembler pre-computes the confusion matrix, per-segment accuracy, and score distributions. The Analyst (Sonnet 4.6) produces a 5-dimension scoring model audit with a Model Health classification. The Formatter (Sonnet 4.6) generates the Notion audit report and a Slack summary with conditional urgency routing. SINGLE-MODEL AGGREGATE: two Sonnet 4.6 calls per run, $0.08-$0.10/run. This blueprint came from a marketing ops team that realized their lead scoring model had not been recalibrated in 18 months. The auditor evaluates scoring accuracy against actual conversion data and flags where the model has drifted.

Last updated March 16, 2026

CRM migrations and system integrations introduce data quality debt that compounds over months. Duplicate records, orphaned contacts, and inconsistent field mappings degrade every downstream workflow. Automated governance catches drift before it corrupts pipeline reporting.

Trigger (Monthly) → 01 Fetcher (HubSpot API) → 02 Assembler (Confusion Matrix) → 03 Analyst (5-Dim Audit) → 04 Formatter (Report + Alert) → Notion (Audit Report) + Slack (Summary)

Four Agents. Five Dimensions. Monthly Scoring Model Governance.


Step 1: Fetcher

Schedule + Code

The Schedule Trigger fires monthly (1st of month, 10:00 UTC), or a manual Webhook runs on-demand audits. The Config Loader reads LOOKBACK_DAYS, SCORE_FIELD, CRITICAL_ACCURACY_THRESHOLD, MIN_CONTACTS, NOTION_DATABASE_ID, and SLACK_CHANNEL. The Fetcher paginates the HubSpot API for contacts with a lead_score property and their associated deals with outcomes (won/lost/open).
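The pagination loop can be sketched as follows. This is an illustrative sketch, not the shipped node code: HubSpot's CRM v3 list endpoints page via `paging.next.after`, and the page-fetching call is injected here (as `fetchPage`) so the loop stays independent of any HTTP client.

```javascript
// Sketch of the Fetcher's pagination loop (assumed shape — the blueprint's
// actual Code node may differ). `fetchPage(after)` is expected to call
// e.g. GET /crm/v3/objects/contacts?limit=100&properties=lead_score&after=...
// and return the parsed JSON body.
async function fetchAllContacts(fetchPage) {
  const contacts = [];
  let after; // undefined on the first page
  do {
    const page = await fetchPage(after);
    for (const c of page.results) {
      // Keep only contacts that actually carry a lead_score property
      if (c.properties && c.properties.lead_score != null) contacts.push(c);
    }
    // HubSpot signals more pages via paging.next.after
    after = page.paging && page.paging.next ? page.paging.next.after : undefined;
  } while (after);
  return contacts;
}
```

Injecting `fetchPage` also makes the loop trivially testable with a stub before pointing it at live credentials.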


Step 2: Assembler

Code-only

What does the Assembler actually decide? It pre-computes all math before any LLM call: the confusion matrix (true positives, true negatives, false positives, false negatives), overall accuracy, per-segment accuracy by industry and persona, score distribution analysis, and threshold calibration metrics. The Data Threshold Gate enforces a minimum of 50 contacts before proceeding to the Analyst.
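The confusion-matrix math reduces to a few lines. A minimal sketch, assuming each contact has been flattened to a numeric `score` and a deal `outcome` of `"won"`, `"lost"`, or `"open"` (field names are assumptions, not the shipped schema); a contact is "predicted positive" when its score meets the MQL threshold, "actual positive" when its deal was won:

```javascript
// Illustrative confusion-matrix computation (not the blueprint's exact node code).
function confusionMatrix(contacts, threshold) {
  const m = { tp: 0, tn: 0, fp: 0, fn: 0 };
  for (const { score, outcome } of contacts) {
    if (outcome === "open") continue; // open deals carry no ground truth yet
    const predicted = score >= threshold; // model says "will convert"
    const actual = outcome === "won";     // reality
    if (predicted && actual) m.tp++;
    else if (!predicted && !actual) m.tn++;
    else if (predicted && !actual) m.fp++; // overvalued contact
    else m.fn++;                           // undervalued contact
  }
  const decided = m.tp + m.tn + m.fp + m.fn;
  m.accuracy = decided ? (m.tp + m.tn) / decided : 0;
  return m;
}
```

Doing this in a code-only node is exactly why cost stays flat: the LLM later sees four counters and an accuracy figure, not thousands of contact records.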


Step 3: Analyst

Tier 2 Classification

This step exists because raw data alone is not enough. Sonnet 4.6 receives ONE aggregate call with the pre-computed metrics and produces a 5-dimension scoring model audit: false_positives (overvalued contacts), false_negatives (undervalued contacts), threshold_calibration (optimal MQL/SQL cutoff), segment_blind_spots (systematic failures by industry/persona), and feature_decay (scoring criteria that no longer correlate). It then classifies Model Health: HEALTHY (>75%), NEEDS_TUNING (60-75%), CRITICAL (<60%).
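The Model Health banding above is a simple threshold mapping. A sketch, with one stated assumption: exactly 75% lands in NEEDS_TUNING and exactly 60% lands in NEEDS_TUNING, since the bands are written as >75% and <60%:

```javascript
// Model Health banding per the blueprint's stated thresholds.
// Boundary handling (75% and 60% exactly) is an assumption.
function modelHealth(accuracy) {
  if (accuracy > 0.75) return "HEALTHY";       // minor adjustments only
  if (accuracy >= 0.60) return "NEEDS_TUNING"; // measurable gaps, recalibrate
  return "CRITICAL";                            // consistently misrouting leads
}
```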


Step 4: Formatter

Tier 2 Classification

Without this step, upstream analysis sits idle. Sonnet 4.6 generates the Notion audit report page (executive summary, per-dimension analysis, calibration recommendations) and the Slack summary message. Conditional urgency routing: accuracy below CRITICAL_ACCURACY_THRESHOLD (default 60%) triggers an urgent calibration alert in Slack in addition to the standard summary.

We price by pipeline complexity, not integration count. A $349 blueprint reflects 3x more prompt engineering than a $199 one.
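The urgency routing is a conditional fan-out, not a replacement: the standard summary always ships, and the urgent alert is appended only below the critical threshold. A minimal sketch (message shapes are illustrative; the default mirrors the Config Loader's 60%):

```javascript
// Illustrative routing decision — not the shipped Formatter node code.
function slackMessages(accuracy, criticalThreshold = 0.60) {
  const msgs = [{ kind: "summary" }]; // standard summary always sent
  if (accuracy < criticalThreshold) {
    msgs.push({ kind: "urgent_calibration_alert" }); // extra alert when critical
  }
  return msgs;
}
```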

That's the full pipeline. Here's what it intentionally does NOT do — and why those boundaries exist.

What It Does NOT Do

× Does not score individual leads — that is what Inbound Lead Qualifier does

× Does not monitor account health — that is what Account Health Intelligence Agent does

× Does not coach sales reps — that is what Sales Rep Performance Coach does

× Does not modify HubSpot lead scores or contact data — read-only audit with Notion and Slack output

× Does not scrape external websites — all data comes from the HubSpot API

× Does not analyze individual deal outcomes — provides an aggregate scoring model accuracy assessment

With those boundaries clear, here's everything that ships when you purchase.

The Complete Customer Success Bundle

7 files.

CHANGELOG.md — Version history
README.md — Setup and configuration guide
TDD.md — Technical Design Document
hubspot_contact_scoring_auditor_v1.0.0.json — n8n workflow (main pipeline)
itp-results.md — Inspection test results
system_prompts/analyst_system_prompt.md — Analyst system prompt
system_prompts/formatter_system_prompt.md — Formatter system prompt

The technical specifications below are ITP-measured, not estimated.

Tested. Measured. Documented.

Every metric is Independent Test Protocol (ITP)-measured. The HubSpot Contact Scoring Auditor turns your lead scoring data into a monthly accuracy audit — pre-computing confusion matrix and per-segment accuracy, then generating a 5-dimension scoring model audit with Model Health classification and calibration recommendations at $0.08-$0.10/run.

Workflow Nodes

24

Blueprint Quality Standard

12/12 PASS

Agent Architecture

4 agents: Fetcher (code-only), Assembler (code-only), Analyst (Sonnet 4.6), Formatter (Sonnet 4.6)

Required Credentials

Anthropic API, HubSpot (OAuth2), Notion (httpHeaderAuth), Slack (httpHeaderAuth)

Bundle Contents

7 files

Cost per Run

$0.08-$0.10 (ITP-measured)

ITP Milestones

8/8 variations, 14/14 milestones PASS

n8n Compatibility

2.7.5

Tested on n8n v2.7.5, March 2026

HubSpot Contact Scoring Auditor v1.0.0 — Technical Reference
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Architecture: 24 n8n nodes, 4 agents (Fetcher → Assembler → Analyst → Formatter)
Trigger:      Monthly schedule (1st of month, 10:00 UTC) or manual Webhook
Input:        HubSpot API — contacts with lead_score + associated deal outcomes
Intelligence: Sonnet 4.6 (Analyst 5-dimension audit + Formatter report/alert)
Output:       Notion (audit report page) + Slack (summary + conditional urgent alert)
Cost:         $0.08-$0.10/run (ITP-measured average)
ITP:          8 variations, 14/14 milestones PASS
BQS:          12/12 PASS
Tool A:       HubSpot (input — contacts + deal outcomes via OAuth2 API)
Tool B:       Notion (output — audit report page via httpHeaderAuth API)
Tool C:       Slack (output — summary + calibration alert via Bot Token)
Intelligence: 5-dimension audit taxonomy + AGGREGATE pattern (single Analyst call)
Cost Value:   0.09

What You'll Need

Platform

n8n 2.7.5+

Est. Monthly API Cost

$0.08-$0.10/month (one scheduled run per month); HubSpot, Notion, and Slack usage fits within their included tiers

Credentials Required

  • Anthropic API
  • HubSpot (OAuth2)
  • Notion (httpHeaderAuth, Bearer prefix)
  • Slack (httpHeaderAuth, Bearer prefix, chat:write scope)

Services

  • HubSpot account (OAuth2 with contacts and deals scopes, lead scoring enabled)
  • Notion workspace (integration token with Bearer prefix)
  • Slack workspace (Bot Token with chat:write scope)
  • Anthropic API key

Setup Track

Quick Start

~15 min

All credentials live, n8n running

Full Setup

1–2 hrs

Needs API config + tables

From Scratch

2–4 hrs

No n8n, no credentials

HubSpot Contact Scoring Auditor v1.0.0

$249

one-time purchase

What you get:

  • ITP-tested 24-node n8n workflow — import and deploy
  • Monthly Schedule Trigger (1st of month 10:00 UTC) or manual Webhook for on-demand audits
  • HubSpot API pagination for contacts with lead_score and associated deal outcomes
  • Pre-computed confusion matrix, per-segment accuracy, and score distribution analysis
  • 5-dimension scoring audit: false_positives, false_negatives, threshold_calibration, segment_blind_spots, feature_decay
  • Model Health classification: HEALTHY (>75%), NEEDS_TUNING (60-75%), CRITICAL (<60%)
  • Specific calibration adjustment recommendations grounded in data
  • Notion audit report page with executive summary and per-dimension analysis
  • Slack summary with conditional urgency routing (accuracy below threshold triggers urgent alert)
  • AGGREGATE architecture: single Analyst + Formatter calls — $0.08-$0.10/run regardless of contact count
  • Two Sonnet 4.6 calls: no Opus required
  • ITP 8 variations, 14/14 milestones, $0.08-$0.10/run measured
  • All sales final after download

Frequently Asked Questions

How does it differ from Account Health Intelligence Agent?

Different units and taxonomies. AHIA monitors per-account health from HubSpot engagement signals — at_risk, declining, stable, growing. HCSA audits the lead scoring model itself for accuracy and recommends calibration adjustments.

What are the five audit dimensions?

false_positives — high-scored contacts that lost or never converted, revealing what the model overvalues. false_negatives — low-scored contacts that won, revealing what the model undervalues. threshold_calibration — optimal score cutoff for MQL/SQL routing. segment_blind_spots — systematic failures by industry or persona. feature_decay — scoring criteria that no longer correlate with outcomes.

What is Model Health classification?

HEALTHY (accuracy above 75%) means the scoring model is performing well with minor adjustments recommended. NEEDS_TUNING (60-75%) means measurable accuracy gaps exist and specific recalibration is recommended. CRITICAL (below 60%) means the model is misrouting leads consistently and triggers an urgent calibration alert in Slack.

How does it relate to Inbound Lead Qualifier?

Meta-level analysis. ILQ scores individual inbound leads against an ICP using lead_utility_score. HCSA audits the scoring model that ILQ and similar tools rely on.

Why is it so cheap at $0.08-$0.10/run?

AGGREGATE architecture. The Assembler pre-computes all math (confusion matrix, accuracy, per-segment breakdowns, score distributions) in code-only nodes before any LLM call. The Analyst receives ONE call with aggregate metrics — not per-contact analysis.

How many contacts can it handle?

The Fetcher paginates the HubSpot API to collect all contacts with a lead_score property and associated deals. The Assembler computes metrics in code — no LLM token limit applies to the data processing. The Analyst receives aggregate metrics, not raw contact data.

Does it use web scraping?

No. All data comes from the HubSpot API: contact records with lead_score properties and associated deal records with outcomes. No web_search, no external data sources, no scraping.

What triggers the urgent calibration alert?

When the overall scoring model accuracy falls below the CRITICAL_ACCURACY_THRESHOLD (default 60%), the Formatter sends an additional urgent calibration alert to Slack alongside the standard summary. This threshold is configurable in the Config Loader.

Is there a refund policy?

All sales are final after download. Review the Blueprint Dependency Matrix and prerequisites before purchase.

How does this compare to HubSpot's built-in lead scoring?

HubSpot's native scoring uses point-based rules you configure manually. This blueprint audits your existing scoring by comparing scores against actual deal outcomes — it tells you whether your scoring rules are predicting reality.

Read the full guide →

Related Blueprints