Results: Exact matching
Estimate... 1777.6
AI SE...... 34.725
T-stat..... 51.191
p.val...... < 2.22e-16
Original number of observations.............. 2707
Original number of treated obs............... 698
Matched number of observations............... 12
Matched number of observations (unweighted). 12
Number of obs dropped by 'exact' or 'caliper' 2695
Results: Nearest neighbor
Inverse variance
Estimate... -526.95
AI SE...... 223.06
T-stat..... -2.3623
p.val...... 0.01816
Original number of observations.............. 2707
Original number of treated obs............... 698
Matched number of observations............... 2707
Matched number of observations (unweighted). 2711
Results: Nearest neighbor
Mahalanobis
Estimate... -492.82
AI SE...... 223.55
T-stat..... -2.2046
p.val...... 0.027485
Original number of observations.............. 2707
Original number of treated obs............... 698
Matched number of observations............... 2707
Matched number of observations (unweighted). 2708
Results: Nearest neighbor
Propensity score
Estimate... -201.03
AI SE...... 275.76
T-stat..... -0.72898
p.val...... 0.46601
Original number of observations.............. 2707
Original number of treated obs............... 698
Matched number of observations............... 2707
Matched number of observations (unweighted). 14795
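The blocks above are summaries in the style of R's `Matching::Match` output under different distance metrics. As a rough illustration of what one-to-one nearest-neighbor matching with a Mahalanobis distance actually computes, here is a minimal Python sketch on synthetic data (the data, sample sizes, and a constant treatment effect of 5 are all invented for illustration; this is not the original estimation code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data: two covariates, constant treatment effect of 5, noiseless outcomes
n_t, n_c = 200, 2000
X_t = rng.normal(1.0, 1.0, size=(n_t, 2))   # treated covariates
X_c = rng.normal(0.0, 1.0, size=(n_c, 2))   # control covariates
tau = 5.0
y_t = X_t.sum(axis=1) + tau
y_c = X_c.sum(axis=1)

# Mahalanobis distance weights coordinates by the inverse covariance
# of the pooled covariate matrix
S_inv = np.linalg.inv(np.cov(np.vstack([X_t, X_c]).T))

def nearest_control(x):
    """Index of the control observation closest to x in Mahalanobis distance."""
    d = X_c - x
    return int(np.argmin(np.einsum('ij,jk,ik->i', d, S_inv, d)))

# ATT estimate: average treated-minus-matched-control outcome difference
matches = np.array([nearest_control(x) for x in X_t])
att = float(np.mean(y_t - y_c[matches]))
print(round(att, 3))
```

With many controls and close matches, the estimate sits near the true effect of 5; any residual gap is matching bias from imperfect covariate overlap.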
Why are there such large differences between the linear (unweighted) regression and the other approaches?
The problem is a lack of common support. Without weighting, the treated group looks very different from the control group, and standard (unweighted) OLS does nothing to account for this.
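To see the common-support problem concretely, here is a hedged Python sketch (synthetic data, invented numbers): the treated units sit where controls are scarce, so a naive unweighted comparison of means is badly biased, while matching each treated unit to a comparable control is not:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented setup: one confounder x, true treatment effect of 2
n_t, n_c = 300, 3000
x_t = rng.normal(2.0, 0.5, n_t)     # treated concentrated at high x
x_c = rng.normal(0.0, 1.5, n_c)     # controls mostly at low x, thin overlap at high x
tau = 2.0
y_t = 3.0 * x_t + tau + rng.normal(0, 0.1, n_t)
y_c = 3.0 * x_c + rng.normal(0, 0.1, n_c)

# Naive (unweighted) difference in means: biased because x differs sharply by group
naive = float(y_t.mean() - y_c.mean())

# Nearest-neighbor matching on x compares each treated unit only to a similar control
matches = np.abs(x_c[None, :] - x_t[:, None]).argmin(axis=1)
matched = float(np.mean(y_t - y_c[matches]))

print(round(naive, 2), round(matched, 2))
```

Here the naive estimate absorbs the confounder's effect (roughly 2 + 3 times the gap in mean x), while the matched estimate stays near the true effect of 2.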
So what have we learned?
Key assumptions for causal inference so far
Selection on observables
Common support
Causal effect assuming selection on observables
If we assume selection on observables holds, then we only need to condition on the relevant covariates to identify a causal effect. But we still need to ensure common support.
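Under selection on observables, conditioning can be as simple as subclassification: compare treated and control means within strata of the covariate, keep only strata containing both groups (common support), and average over the treated distribution. A minimal sketch with invented data (one confounder, true effect of 2):

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented data-generating process: one confounder x, true treatment effect of 2
n = 5000
x = rng.uniform(0, 4, n)
d = (rng.uniform(size=n) < x / 4).astype(int)   # treatment more likely at high x
y = 3.0 * x + 2.0 * d + rng.normal(0, 0.1, n)

# Subclassify on x, take within-stratum mean differences, weight by treated counts,
# and drop strata lacking either treated or control units (enforcing common support)
bins = np.digitize(x, np.linspace(0, 4, 21))
effects, weights = [], []
for b in np.unique(bins):
    yt = y[(bins == b) & (d == 1)]
    yc = y[(bins == b) & (d == 0)]
    if len(yt) > 0 and len(yc) > 0:
        effects.append(yt.mean() - yc.mean())
        weights.append(len(yt))
att = float(np.average(effects, weights=weights))
print(round(att, 2))
```

Because treatment depends on x only through the observed covariate, the within-stratum comparisons are apples-to-apples and the weighted average recovers the treatment effect on the treated.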