Uplift Targeting Engine
Persuadables, Not Conversions
A causal uplift-modeling engine that predicts the per-user incremental effect of an intervention, so it targets the persuadables rather than the users who would convert anyway. Meta-learners are trained on experiment data, evaluated with Qini and policy value, and shipped as a treat/don't-treat decision API.
System architecture
Build spec
- Learners: S / T / X / R, from scratch on XGBoost
- Data: Hillstrom 64k · Criteo 25.3M rows
- Eval: Qini · uplift@decile · policy value
- Cross-check: econml NonParamDML, 0.96 Spearman
- Result: Policy 0.127 vs random 0.121 at 30% budget
Problem
Most ML predicts outcomes (will this user convert?), which over-targets the sure things and the lost causes and wastes budget. The question that actually pays is causal: if I spend one unit of budget, on whom does it change the outcome? An outcome classifier cannot answer that, because it scores who will convert, not whose outcome the spend changes.
Approach
S, T, X, and R meta-learners are implemented from scratch over XGBoost base learners, each estimating per-user uplift (CATE). A data layer loads real experiment data (Hillstrom, Criteo) or simulates an RCT with a known ground-truth effect, and the evaluation layer grades models with Qini curves, uplift-at-decile, and policy value versus random and treat-all baselines. A cross-check validates the hand-rolled R-learner against econml's NonParamDML on the same split. A budget-aware FastAPI /score endpoint maps a budget fraction to a threshold drawn from a held-out uplift distribution, with a Streamlit UI for single-user scoring and budget sweeps.
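As a rough illustration of the simplest of the four learners, the sketch below shows the T-learner idea: fit one outcome model on treated users, another on control users, and subtract their predictions. It is not the project's code. `SegmentMean` is a hypothetical, dependency-free stand-in for the XGBoost base learner, and the class and method names are assumptions; any model exposing `fit(X, y)` and `predict(X)` could be passed in its place.

```python
from collections import defaultdict


class SegmentMean:
    """Toy base learner (stand-in for XGBoost): predicts the mean training
    outcome for each distinct feature tuple, else the global mean."""

    def fit(self, X, y):
        sums, counts = defaultdict(float), defaultdict(int)
        for x, yi in zip(X, y):
            key = tuple(x)
            sums[key] += yi
            counts[key] += 1
        self.global_mean = sum(y) / len(y)
        self.means = {k: sums[k] / counts[k] for k in sums}
        return self

    def predict(self, X):
        return [self.means.get(tuple(x), self.global_mean) for x in X]


class TLearner:
    """T-learner: two separate outcome models, uplift = mu1(x) - mu0(x)."""

    def __init__(self, make_model):
        # make_model is any zero-argument factory returning a fit/predict model.
        self.model_treated = make_model()
        self.model_control = make_model()

    def fit(self, X, treated, y):
        rows = list(zip(X, treated, y))
        X1 = [x for x, t, _ in rows if t]
        y1 = [yi for _, t, yi in rows if t]
        X0 = [x for x, t, _ in rows if not t]
        y0 = [yi for _, t, yi in rows if not t]
        self.model_treated.fit(X1, y1)
        self.model_control.fit(X0, y0)
        return self

    def predict_uplift(self, X):
        mu1 = self.model_treated.predict(X)
        mu0 = self.model_control.predict(X)
        return [a - b for a, b in zip(mu1, mu0)]
```

The S, X, and R learners differ in how they combine the base models, but they produce the same kind of per-user uplift score that the evaluation and API layers consume.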
Impact
On the 64k-row Hillstrom email experiment the uplift policy beats random targeting at a fixed 30% budget (policy value 0.127 vs 0.121), and the from-scratch R-learner matches econml at 0.96 rank correlation. On a simulated RCT with known truth, all learners recover the ranking (Spearman up to 0.951) at near-zero ATE error, which validates the evaluator before it is trusted on real data.
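The source does not say how policy value is estimated, so the following is only one plausible sketch. It assumes randomized data with a known treatment probability and uses an inverse-propensity-weighted estimate of the mean outcome under "treat the top `budget` fraction by uplift score". The function name, arguments, and the 0.5 default are assumptions for illustration.

```python
def policy_value(scores, treated, outcomes, budget, p_treat=0.5):
    """IPW estimate of the mean outcome if only the top `budget` fraction
    of users (ranked by uplift score) were treated.

    Assumes treatment was randomized with known probability `p_treat`.
    """
    n = len(scores)
    k = int(round(budget * n))  # how many users the budget covers
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    recommend = [False] * n
    for i in order[:k]:
        recommend[i] = True

    total = 0.0
    for i in range(n):
        # Only users whose logged treatment matches the policy contribute,
        # reweighted by the probability of that logged treatment.
        if bool(treated[i]) == recommend[i]:
            prob = p_treat if recommend[i] else 1.0 - p_treat
            total += outcomes[i] / prob
    return total / n
```

Running the same function on a shuffled score vector at the same budget gives a random-targeting baseline to compare against.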
Decisions & tradeoffs
Implement the learners from scratch
The meta-learner mechanics are hand-rolled over XGBoost rather than imported wholesale, because those mechanics are what interviewers probe. econml and causalml stay as optional cross-check dependencies, not the core.
Cross-check against econml
A per-user causal target can never be observed, so it cannot be graded directly. Instead, the scratch R-learner and econml's NonParamDML are trained on the same split, and their agreement at 0.96 rank correlation is the evidence that the estimator is correct.
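The agreement metric itself is simple to state. A minimal sketch of Spearman rank correlation between two vectors of per-user uplift estimates follows, written without dependencies; the function names are illustrative, and a project would more likely call a library routine.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(a, b):
    """Pearson correlation of the rank vectors (undefined if either is constant)."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5
```

Rank correlation fits this cross-check because targeting depends on the ordering of users by uplift, not on the absolute scale of the estimates.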
Budget-derived decision threshold
Rather than a fixed cutoff, the API derives a threshold from a stashed held-out uplift distribution, so the treat/don't-treat decision follows directly from how many users the budget can afford.
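One way to read the budget-to-threshold mapping is as a quantile lookup on the held-out scores. The sketch below assumes the budget is a fraction of users to treat and that the stashed distribution is a plain list of uplift scores; `threshold_for_budget` and `decide` are hypothetical names, not the API's actual functions.

```python
def threshold_for_budget(heldout_uplift, budget):
    """Return the uplift cutoff such that roughly a `budget` fraction of
    held-out users score at or above it."""
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be a fraction in [0, 1]")
    ranked = sorted(heldout_uplift, reverse=True)
    k = int(round(budget * len(ranked)))
    if k == 0:
        return float("inf")  # zero budget: nobody clears the bar
    return ranked[k - 1]


def decide(uplift, threshold):
    """Treat a user only if their predicted uplift clears the threshold."""
    return uplift >= threshold
```

With tied scores at the cutoff, slightly more than the budgeted fraction can clear the threshold, so a production version would need a tie-breaking rule.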
System notes
- S, T, X, and R meta-learners hand-rolled over XGBoost bases, the mechanics interviewers probe
- The from-scratch R-learner is cross-checked against Microsoft econml at 0.96 Spearman, evidence of correctness and not just plausibility
- The API derives its treat threshold from a held-out uplift distribution stashed in the model bundle, so the cutoff answers how many you can afford to treat
- The evaluator is validated on a simulated RCT with known effect before it is trusted on hidden-truth real data
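The simulated-RCT check in the last note can be sketched as follows. The data-generating process here (one uniform feature, a 20% baseline rate, true uplift of `0.1 * x`, 50/50 assignment) is entirely an assumption for illustration and is not the project's simulator. Because the true effect is known by construction, any estimate can be graded against it.

```python
import random


def simulate_rct(n, seed=0):
    """Generate (x, treated, outcome, true_uplift) rows from a toy RCT."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        tau = 0.1 * x                      # known per-user uplift
        treated = rng.random() < 0.5       # randomized 50/50 assignment
        p_convert = 0.2 + (tau if treated else 0.0)
        outcome = 1 if rng.random() < p_convert else 0
        rows.append((x, treated, outcome, tau))
    return rows


def diff_in_means(rows):
    """ATE estimate: mean treated outcome minus mean control outcome."""
    y1 = [y for _, t, y, _ in rows if t]
    y0 = [y for _, t, y, _ in rows if not t]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


rows = simulate_rct(20000)
true_ate = sum(tau for *_, tau in rows) / len(rows)
estimated_ate = diff_in_means(rows)
print(f"true ATE {true_ate:.4f}, estimated ATE {estimated_ate:.4f}")
```

The same rows also carry the true per-user uplift, so a learner's ranking can be compared with the truth by rank correlation, which is not possible on real experiment data.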
Stack
Meta-learners · XGBoost · econml · Qini curves · FastAPI · Streamlit