HubSpot Contact Scoring Auditor

AI-powered audit that analyzes your HubSpot lead scoring model's accuracy, identifies false positives and negatives, and recommends specific calibration adjustments.

A 24-node n8n workflow with four agents (Fetcher, Assembler, Analyst, Formatter). It runs on a monthly Schedule Trigger (1st of the month, 10:00 UTC) or a manual Webhook. The Fetcher paginates the HubSpot API for contacts with a lead_score property and their associated deals with outcomes. The Assembler pre-computes the confusion matrix, per-segment accuracy, and score distributions. The Analyst (Sonnet 4.6) produces a 5-dimension scoring model audit with a Model Health classification. The Formatter (Sonnet 4.6) generates a Notion audit report and a Slack summary with conditional urgency routing. SINGLE-MODEL AGGREGATE: two Sonnet 4.6 calls, $0.08-$0.10/run.

Pipeline diagram: Trigger (Monthly) → 01 Fetcher (HubSpot API) → 02 Assembler (Confusion Matrix) → 03 Analyst (5-Dim Audit) → 04 Formatter (Report + Alert) → Notion (Audit Report) + Slack (Summary)

Four Agents. Five Dimensions. Monthly Scoring Model Governance.

Step 1: Fetcher

Schedule + Code

The Schedule Trigger fires monthly (1st of the month, 10:00 UTC); a manual Webhook starts on-demand audits. The Config Loader reads LOOKBACK_DAYS, SCORE_FIELD, CRITICAL_ACCURACY_THRESHOLD, MIN_CONTACTS, NOTION_DATABASE_ID, and SLACK_CHANNEL. The Fetcher then paginates the HubSpot API for contacts with a lead_score property and their associated deals with outcomes (won/lost/open).
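The pagination step can be pictured with a short sketch. It is illustrative Python, not the workflow's own n8n code, and it assumes the API returns pages shaped like HubSpot CRM v3 list responses (a `results` array plus an optional `paging.next.after` cursor). The names `iter_contacts` and `fake_fetch_page` are hypothetical, and the HTTP call is replaced by an in-memory stand-in so the loop can be run without credentials.

```python
from typing import Callable, Iterator, Optional

# Assumed page shape: {"results": [...], "paging": {"next": {"after": "<cursor>"}}}
Page = dict
FetchPage = Callable[[Optional[str]], Page]


def iter_contacts(fetch_page: FetchPage) -> Iterator[dict]:
    """Yield every contact by following the `after` cursor until none is returned."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page.get("results", [])
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            break


def fake_fetch_page(after: Optional[str]) -> Page:
    """Stand-in for the HTTP request; serves two pages of made-up contacts."""
    pages = {
        None: {
            "results": [
                {"id": "1", "properties": {"lead_score": "82"}},
                {"id": "2", "properties": {"lead_score": "35"}},
            ],
            "paging": {"next": {"after": "2"}},
        },
        "2": {"results": [{"id": "3", "properties": {"lead_score": "61"}}]},
    }
    return pages[after]


contacts = list(iter_contacts(fake_fetch_page))
contact_ids = [c["id"] for c in contacts]
print(contact_ids)
```

Passing the page-fetching function in as an argument keeps the cursor loop separate from authentication and rate-limit handling, which a real Fetcher would need to add.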

Step 2: Assembler

Code-only

Pre-computes all math before any LLM call: the confusion matrix (true positives, true negatives, false positives, false negatives), overall accuracy, per-segment accuracy by industry and persona, score distribution analysis, and threshold calibration metrics. A Data Threshold Gate requires at least 50 contacts before the run proceeds to the Analyst.
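A minimal sketch of this kind of pre-computation follows, in illustrative Python rather than the workflow's own code. Several details are assumptions, not taken from the product: a contact counts as "predicted to convert" when its score is at or above a hypothetical cutoff of 50, the ground truth is a won deal, contacts with open deals are skipped, and the gate counts all fetched contacts.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_CONTACTS = 50   # gate size stated in the text
SCORE_CUTOFF = 50   # hypothetical cutoff; the source does not specify one


@dataclass
class Contact:
    score: float
    outcome: str    # "won", "lost", or "open"
    industry: str


def confusion_matrix(contacts, cutoff=SCORE_CUTOFF):
    """Count TP/TN/FP/FN over contacts whose deal has closed."""
    cm = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for c in contacts:
        if c.outcome == "open":
            continue  # no ground truth yet
        predicted, actual = c.score >= cutoff, c.outcome == "won"
        if predicted and actual:
            cm["tp"] += 1
        elif predicted:
            cm["fp"] += 1
        elif actual:
            cm["fn"] += 1
        else:
            cm["tn"] += 1
    return cm


def accuracy(cm):
    """Share of closed-deal contacts the score classified correctly."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


def accuracy_by_industry(contacts, cutoff=SCORE_CUTOFF):
    """Per-segment accuracy, grouping contacts by industry."""
    groups = defaultdict(list)
    for c in contacts:
        groups[c.industry].append(c)
    return {k: accuracy(confusion_matrix(v, cutoff)) for k, v in groups.items()}


def passes_gate(contacts, minimum=MIN_CONTACTS):
    """Data threshold gate: refuse to audit a sample that is too small."""
    return len(contacts) >= minimum


sample = [
    Contact(80, "won", "saas"), Contact(75, "lost", "saas"),
    Contact(20, "lost", "retail"), Contact(30, "won", "retail"),
    Contact(90, "won", "saas"), Contact(60, "open", "retail"),
]
cm = confusion_matrix(sample)
print(cm, accuracy(cm), accuracy_by_industry(sample), passes_gate(sample))
```

Because everything here is plain arithmetic, the numbers handed to the LLM do not depend on model behavior; the six-contact sample would be stopped at the gate.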

Step 3: Analyst

Tier 2 Classification

Sonnet 4.6 receives ONE aggregate call containing the pre-computed metrics and produces a 5-dimension scoring model audit: false_positives (overvalued contacts), false_negatives (undervalued contacts), threshold_calibration (optimal MQL/SQL cutoff), segment_blind_spots (systematic failures by industry/persona), and feature_decay (scoring criteria that no longer correlate). It then classifies Model Health by overall accuracy: HEALTHY (>75%), NEEDS_TUNING (60-75%), CRITICAL (<60%).
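The Model Health bands can be written out as a small function. In the product the classification comes from the Analyst prompt, so this is only a sketch of the stated bands in illustrative Python. Treating exactly 75% and exactly 60% as NEEDS_TUNING follows from reading ">75%" and "<60%" literally; `classify_model_health` is a hypothetical name.

```python
def classify_model_health(accuracy_pct: float) -> str:
    """Map overall accuracy (in percent) to the three Model Health bands."""
    if accuracy_pct > 75:
        return "HEALTHY"
    if accuracy_pct >= 60:
        return "NEEDS_TUNING"
    return "CRITICAL"


for pct in (82.0, 75.0, 60.0, 59.9):
    print(pct, classify_model_health(pct))
```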

Step 4: Formatter

Tier 2 Classification

Sonnet 4.6 generates a Notion audit report page (executive summary, per-dimension analysis, calibration recommendations) and a Slack summary message. Conditional urgency routing: when accuracy falls below CRITICAL_ACCURACY_THRESHOLD (default 60%), an urgent calibration alert is posted to Slack in addition to the standard summary.
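The routing rule amounts to one comparison. The sketch below is illustrative Python under stated assumptions: `build_slack_messages` is a hypothetical helper, the message wording is invented, and posting the summary before the alert is a guess at the ordering. Each dictionary carries the `channel` and `text` fields that a Slack message post expects.

```python
CRITICAL_ACCURACY_THRESHOLD = 60.0  # default stated in the text, in percent


def build_slack_messages(accuracy_pct, summary, channel,
                         threshold=CRITICAL_ACCURACY_THRESHOLD):
    """Always return the summary; add an urgent alert when accuracy is below threshold."""
    messages = [{"channel": channel, "text": summary}]
    if accuracy_pct < threshold:
        alert = (f"URGENT: scoring model accuracy is {accuracy_pct:.1f}%, "
                 f"below the {threshold:.0f}% threshold. Calibration needed.")
        messages.append({"channel": channel, "text": alert})
    return messages


healthy = build_slack_messages(81.0, "Monthly audit: HEALTHY", "#revops")
critical = build_slack_messages(54.2, "Monthly audit: CRITICAL", "#revops")
print(len(healthy), len(critical))
```

The comparison is strict, so an accuracy of exactly 60% produces the summary only.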

What It Does NOT Do

  • Does not score individual leads; that is what Inbound Lead Qualifier does
  • Does not monitor account health; that is what Account Health Intelligence Agent does
  • Does not coach sales reps; that is what Sales Rep Performance Coach does
  • Does not modify HubSpot lead scores or contact data; it is a read-only audit with Notion and Slack output
  • Does not scrape external websites; all data comes from the HubSpot API
  • Does not analyze individual deal outcomes; it provides an aggregate scoring model accuracy assessment

The Complete Customer Success Bundle

7 files — workflow JSON, system prompts, TDD, and complete documentation.

  • hubspot_contact_scoring_auditor_v1_0_0.json: The 24-node n8n workflow
  • README.md: 10-minute setup guide with HubSpot, Notion, Slack, and Anthropic configuration
  • docs/TDD.md: Technical Design Document with 5-dimension audit taxonomy and AGGREGATE pattern
  • system_prompts/analyst_system_prompt.md: Analyst prompt (5-dimension scoring audit, Model Health classification, calibration recommendations)
  • system_prompts/formatter_system_prompt.md: Formatter prompt (Notion audit report blocks, Slack summary, conditional urgent alert)
  • CHANGELOG.md: Version history

Tested. Measured. Documented.

Every metric is ITP-measured. The HubSpot Contact Scoring Auditor turns your lead scoring data into a monthly accuracy audit — pre-computing confusion matrix and per-segment accuracy, then generating a 5-dimension scoring model audit with Model Health classification and calibration recommendations at $0.08-$0.10/run.

Workflow Nodes

24

Blueprint Quality Standard

12/12 PASS

Agent Architecture

4 agents: Fetcher (code-only), Assembler (code-only), Analyst (Sonnet 4.6), Formatter (Sonnet 4.6)

Required Credentials

Anthropic API, HubSpot (OAuth2), Notion (httpHeaderAuth), Slack (httpHeaderAuth)

Bundle Contents

7 files

Cost per Run

$0.08-$0.10 (ITP-measured)

ITP Milestones

8/8 variations, 14/14 milestones PASS

n8n Compatibility

2.7.5

HubSpot Contact Scoring Auditor v1.0.0: Technical Reference
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Architecture: 24 n8n nodes, 4 agents (Fetcher → Assembler → Analyst → Formatter)
Trigger:      Monthly schedule (1st of month, 10:00 UTC) or manual Webhook
Input:        HubSpot API: contacts with lead_score + associated deal outcomes
Intelligence: Sonnet 4.6 (Analyst 5-dimension audit + Formatter report/alert)
Output:       Notion (audit report page) + Slack (summary + conditional urgent alert)
Cost:         $0.08-$0.10/run (ITP-measured average)
ITP:          8 variations, 14/14 milestones PASS
BQS:          12/12 PASS
Tool A:       HubSpot (input: contacts + deal outcomes via OAuth2 API)
Tool B:       Notion (output: audit report page via httpHeaderAuth API)
Tool C:       Slack (output: summary + calibration alert via Bot Token)
Intelligence: 5-dimension audit taxonomy + AGGREGATE pattern (single Analyst call)
Cost Value:   0.09

What You'll Need

Platform

n8n 2.7.5+

Est. Monthly API Cost

$0.08-$0.10/month (monthly runs) + HubSpot/Notion/Slack included tiers

Credentials Required

  • Anthropic API
  • HubSpot (OAuth2)
  • Notion (httpHeaderAuth, Bearer prefix)
  • Slack (httpHeaderAuth, Bearer prefix, chat:write scope)

Services

  • HubSpot account (OAuth2 with contacts and deals scopes, lead scoring enabled)
  • Notion workspace (integration token with Bearer prefix)
  • Slack workspace (Bot Token with chat:write scope)
  • Anthropic API key

Setup Track

Quick Start

~15 min

All credentials live, n8n running

Full Setup

1–2 hrs

Needs API config + tables

From Scratch

2–4 hrs

No n8n, no credentials

HubSpot Contact Scoring Auditor v1.0.0

$249

one-time purchase

What you get:

  • Production-ready 24-node n8n workflow — import and deploy
  • Monthly Schedule Trigger (1st of month 10:00 UTC) or manual Webhook for on-demand audits
  • HubSpot API pagination for contacts with lead_score and associated deal outcomes
  • Pre-computed confusion matrix, per-segment accuracy, and score distribution analysis
  • 5-dimension scoring audit: false_positives, false_negatives, threshold_calibration, segment_blind_spots, feature_decay
  • Model Health classification: HEALTHY (>75%), NEEDS_TUNING (60-75%), CRITICAL (<60%)
  • Specific calibration adjustment recommendations grounded in data
  • Notion audit report page with executive summary and per-dimension analysis
  • Slack summary with conditional urgency routing (accuracy below threshold triggers urgent alert)
  • AGGREGATE architecture: single Analyst + Formatter calls — $0.08-$0.10/run regardless of contact count
  • Dual Sonnet 4.6: no Opus required
  • ITP 8 variations, 14/14 milestones, $0.08-$0.10/run measured
  • All sales final after download

Frequently Asked Questions

How does it differ from Account Health Intelligence Agent?

Different units and taxonomies. AHIA monitors per-account health from HubSpot engagement signals — at_risk, declining, stable, growing. HCSA audits the lead scoring model itself for accuracy and recommends calibration adjustments. AHIA tells you which accounts need attention; HCSA tells you whether your scoring model is routing leads correctly.

What are the five audit dimensions?

false_positives — high-scored contacts that lost or never converted, revealing what the model overvalues. false_negatives — low-scored contacts that won, revealing what the model undervalues. threshold_calibration — optimal score cutoff for MQL/SQL routing. segment_blind_spots — industries or personas where the model systematically fails. feature_decay — scoring criteria that were predictive but no longer correlate with outcomes.
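One way to picture threshold_calibration is a sweep over candidate cutoffs. The sketch below is illustrative Python and makes a loud assumption: "optimal" is taken to mean the cutoff that maximizes plain accuracy, which the source does not state (a real audit might weigh false positives and false negatives differently). `best_cutoff` and the sample data are hypothetical.

```python
def best_cutoff(scored_outcomes, candidates):
    """Return (cutoff, accuracy) with the highest accuracy; ties go to the lowest cutoff.

    scored_outcomes: list of (score, won) pairs for contacts with closed deals.
    """
    if not scored_outcomes:
        raise ValueError("need at least one closed-deal contact")
    best = None
    for cutoff in sorted(candidates):
        # A prediction is correct when "score at or above cutoff" matches "deal won".
        correct = sum((score >= cutoff) == won for score, won in scored_outcomes)
        acc = correct / len(scored_outcomes)
        if best is None or acc > best[1]:
            best = (cutoff, acc)
    return best


history = [(90, True), (80, True), (70, False), (55, True), (40, False), (30, False)]
cutoff, acc = best_cutoff(history, candidates=[30, 50, 60, 75])
print(cutoff, round(acc, 3))
```

On this made-up sample, cutoffs of 50 and 75 both classify five of six contacts correctly, and the tie-break keeps the lower one.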

What is Model Health classification?

HEALTHY (accuracy above 75%) means the scoring model is performing well with minor adjustments recommended. NEEDS_TUNING (60-75%) means measurable accuracy gaps exist and specific recalibration is recommended. CRITICAL (below 60%) means the model is misrouting leads at scale and triggers an urgent calibration alert in Slack.

How does it relate to Inbound Lead Qualifier?

Meta-level analysis. ILQ scores individual inbound leads against an ICP using lead_utility_score. HCSA audits the scoring model that ILQ and similar tools rely on. ILQ uses the scorer; HCSA scores the scorer. Together they close the loop: ILQ qualifies leads, HCSA ensures the qualification model remains accurate over time.

Why is it so cheap at $0.08-$0.10/run?

AGGREGATE architecture. The Assembler pre-computes all math (confusion matrix, accuracy, per-segment breakdowns, score distributions) in code-only nodes before any LLM call. The Analyst receives ONE call with aggregate metrics — not per-contact analysis. The Formatter also receives one call. Two Sonnet 4.6 calls total regardless of contact count. 12 monthly runs = $0.96-$1.20/year in LLM costs.
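The annual figure is simple multiplication, shown here in illustrative Python. Amounts are kept in whole cents to avoid floating-point drift; `annual_cost_cents` is a hypothetical helper, and the per-run range is the one quoted above.

```python
RUNS_PER_YEAR = 12             # one scheduled run per month
COST_PER_RUN_CENTS = (8, 10)   # $0.08-$0.10 per run (two Sonnet 4.6 calls)


def annual_cost_cents(runs=RUNS_PER_YEAR, per_run=COST_PER_RUN_CENTS):
    """Scale the per-run cost range to a year of scheduled runs."""
    low, high = per_run
    return low * runs, high * runs


low, high = annual_cost_cents()
print(f"${low / 100:.2f}-${high / 100:.2f} per year")  # prints "$0.96-$1.20 per year"
```

Manual Webhook runs are extra: each on-demand audit adds one more per-run cost to the total.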

How many contacts can it handle?

The Fetcher paginates the HubSpot API to collect all contacts with a lead_score property and associated deals. The Assembler computes metrics in code — no LLM token limit applies to the data processing. The Analyst receives aggregate metrics, not raw contact data. Practical limit is your HubSpot API rate limit, not the LLM context window.

Does it use web scraping?

No. All data comes from the HubSpot API: contact records with lead_score properties and associated deal records with outcomes. No web_search, no external data sources, no scraping. This makes the pipeline fast, reliable, and deterministic.

What triggers the urgent calibration alert?

When the overall scoring model accuracy falls below the CRITICAL_ACCURACY_THRESHOLD (default 60%), the Formatter sends an additional urgent calibration alert to Slack alongside the standard summary. This threshold is configurable in the Config Loader.

Is there a refund policy?

All sales are final after download. Review the Blueprint Dependency Matrix and prerequisites before purchase. Questions? Contact support@forgeworkflows.com before buying. Full terms at forgeworkflows.com/legal.

Related Blueprints