We upgraded from a static 6-factor lead score to a three-tier composite that layers email engagement and firmographics (AUM, headcount) onto the static score, projecting a 5-10% conversion uplift.
3,889
Tier A leads (V1)
50/30/20
V2 weights (Static / Behavioral / Firmographic)
5-10%
Projected conversion uplift (MVV)
94%
Classification accuracy (200-reply test set)
CHAPTER 01
The initial lead scoring model produced 3,889 Tier A leads from a database of financial industry contacts sourced from 13-F filings, FINRA records, LinkedIn exports, and direct prospecting. The model was a 6-factor weighted composite: firm type (30%), title seniority (25%), firm size proxy (15%), email quality (15%), data source (10%), and ICP segment match (5%). Every factor was computed from static enrichment data available at lead collection time.
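The V1 composite can be sketched as a straightforward weighted sum. The weights are from the text; the factor names and the assumption that each factor is pre-normalized to a 0-100 sub-score are illustrative.

```python
# Sketch of the V1 static composite. Weights are from the text; factor
# names are illustrative, and each factor is assumed pre-normalized to 0-100.
V1_WEIGHTS = {
    "firm_type": 0.30,
    "title_seniority": 0.25,
    "firm_size_proxy": 0.15,
    "email_quality": 0.15,
    "data_source": 0.10,
    "icp_segment_match": 0.05,
}

def v1_static_score(factors: dict[str, float]) -> float:
    """Weighted sum of the six static factors, clamped to [0, 100]."""
    score = sum(V1_WEIGHTS[name] * factors.get(name, 0.0) for name in V1_WEIGHTS)
    return max(0.0, min(100.0, score))

# Example: a strong lead on every dimension except data source.
lead = {
    "firm_type": 90, "title_seniority": 95, "firm_size_proxy": 70,
    "email_quality": 85, "data_source": 50, "icp_segment_match": 100,
}
print(round(v1_static_score(lead), 1))
```

Because every input is available at collection time, this score is computed once per lead and never changes, which is exactly the limitation described below.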
The static model had a fundamental limitation that became clear as outreach began. Two leads with identical composite scores of 85 could behave completely differently in an email campaign. A CIO at a $5B RIA who opened the email, clicked through to the pricing page, and spent 4 minutes on the site had demonstrated far more purchase intent than a Director at a $2B fund who never opened. The static model assigned both Tier A and treated them identically in campaign sequencing.
The second limitation was firmographic blindness. The firm size proxy used heuristics derived from company name patterns because AUM data was not yet integrated. A firm named "Capital Management" might manage $50M or $50B; the heuristics produced scores with no discriminating power on the AUM dimension.
CHAPTER 02
V2 replaced the single-dimension composite with a three-tier weighted formula: V2 Score = (Static × 0.50) + (Behavioral × 0.30) + (Firmographic × 0.20), clamped to [0, 100]. The Static component was the V1 6-factor score, unchanged. The Behavioral component was a weighted combination of email open rate (40%), email click rate (35%), and website visit depth (25%). The Firmographic component combined AUM score (50%) and estimated headcount score (50%), sourced from SEC IAPD and company enrichment data.
The Behavioral score defaulted to 50 (neutral) for leads with no engagement data. The Firmographic score defaulted to 55 for leads whose companies could not be matched in IAPD data. This design reflected a deliberate asymmetry: the model should not penalize data absence, because the absence was a function of pipeline completeness, not lead quality.
The AUM scoring curve was designed with knowledge of the buyer profile. Firms managing $100M to $1B scored 70 points. Firms managing $10B+ scored 85 points. Firms below $100M scored 40 points. The headcount scoring used a non-monotonic curve: firms with 50 to 250 employees scored highest at 75 points because they had buying agility and dedicated investment teams.
ARCHITECTURE OVERVIEW
Pipeline stages: INGEST → FEATURES → TRAIN → SERVE
Components: Python 3.12 (XGBoost, logistic regression) · ClickHouse 26.3 · PostgreSQL (engagement_metrics) · PostHog (website visits) · model versions v1 / v2 / v3
Production predictions feed back into the training set on a continuous retraining cadence.
CHAPTER 03
The engagement data pipeline consumed webhooks from the Instantly email platform. The webhook contract handled three event types: email_opened, email_clicked, and email_bounced. Each event carried the lead's email address, campaign ID, timestamp, and, for click events, the clicked URL. The receiver performed HMAC signature verification before processing any event. Verified events triggered upserts to the ClickHouse engagement_metrics table.
The SEC IAPD firmographic enrichment required fuzzy company name matching because lead records used self-reported company names while IAPD records used legal entity names. The matching pipeline normalized both sides and then ran Levenshtein distance matching with a threshold of 0.15 normalized edit distance. The target enrichment rate was 70% of leads matched to an IAPD record.
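The matching step can be sketched as normalize-then-compare. The 0.15 threshold on normalized edit distance is from the text; the normalization rules and suffix list are illustrative assumptions.

```python
import re

# Sketch of the fuzzy name-matching step. The 0.15 threshold is from
# the text; normalization rules and the suffix list are assumptions.
SUFFIXES = {"llc", "lp", "llp", "inc", "ltd", "co", "corp"}

def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and drop legal-entity suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def is_match(lead_name: str, iapd_name: str, threshold: float = 0.15) -> bool:
    a, b = normalize(lead_name), normalize(iapd_name)
    if not a or not b:
        return False
    return levenshtein(a, b) / max(len(a), len(b)) <= threshold
```

Normalizing both sides first does most of the work; the edit-distance threshold then absorbs residual spelling variation between self-reported and legal entity names.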
The machine learning training path was designed for the post-launch phase. Phase 1 used logistic regression with engagement quality as a proxy label. Phase 2 introduced conversion labels once 10 or more paid conversions were recorded. XGBoost was chosen for Phase 2 training given the tabular feature structure.
CHAPTER 04
The V1 model produced 3,889 Tier A leads, approximately 24% of the total lead database. The V2 model, once engagement data was flowing, was designed to identify which Tier A leads were demonstrating active interest. The weighted formula was validated against a representative example: a CIO at a $5B RIA with a V1 static score of 87 and behavioral score of 68 scored V2 composite 80.3, maintaining Tier A status. A Director at a smaller firm with no engagement and unknown AUM would score 66.5, correctly dropping to Tier B.
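The arithmetic in both examples can be checked by inverting the composite formula. The implied component scores below are derived from the stated composites; they are not given in the source.

```python
# Invert V2 = 0.5*static + 0.3*behavioral + 0.2*firmographic to check
# the two worked examples. The implied values are derived, not stated.

# CIO example: static 87, behavioral 68, composite 80.3
implied_firmographic = (80.3 - 0.5 * 87 - 0.3 * 68) / 0.2
print(round(implied_firmographic, 1))  # implied firmographic component

# Director example: behavioral default 50, firmographic default 55,
# composite 66.5
implied_static = (66.5 - 0.3 * 50 - 0.2 * 55) / 0.5
print(round(implied_static, 1))        # implied static component
```

Both implied values land in plausible ranges for the curves described earlier, which is a useful internal consistency check on the worked example.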
CHAPTER 05
DECISION · 01
The most significant design decision was the default scoring for missing data. The initial proposal penalized leads with no behavioral data by assigning a Behavioral score of 0, effectively demoting cold leads before they had any opportunity to demonstrate interest. Setting the default to 50 (neutral) ensured the V2 score change on first contact was driven by actual engagement behavior.
DECISION · 02
The headcount non-monotonic curve took several iterations. The first version used a monotonically increasing score where larger firm = higher score. This overvalued enterprise accounts that had long procurement cycles incompatible with the direct sales motion. The revised curve added a penalty for very large firms.
DECISION · 03
The decision to build engagement capture on top of Instantly webhooks rather than polling the Instantly API was driven by latency requirements. A lead who clicks through to the pricing page at 2:00 PM should have their behavioral score updated before the next sales outreach step at 2:30 PM.