Assumptions 1 and 2 are sometimes grouped into a single "only through" condition: the instrument affects the outcome only through the endogenous variable.
Conley et al. (2010) and “plausible exogeneity”: the union of confidence intervals approach
Kippersluis and Rietveld (2018), “Beyond Plausibly Exogenous”
This assumption only says that your instrument is correlated with the endogenous variable. It says nothing about the strength of that correlation.
Recall our schooling and wages equation, \[y = \beta S + \epsilon.\] The bias in IV can be approximated as:
\[Bias_{IV} \approx \frac{Cov(S, \epsilon)}{V(S)} \frac{1}{F+1} = Bias_{OLS} \frac{1}{F+1}\]
where \(F\) is the first-stage F-statistic on the instrument(s).
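To get a feel for the magnitudes, the approximation can be evaluated at a few values of the first-stage F-statistic \(F\). This is a worked illustration of the formula only, not a result from any data:
\[F = 1: \; Bias_{IV} \approx \tfrac{1}{2} Bias_{OLS}, \qquad F = 10: \; Bias_{IV} \approx \tfrac{1}{11} Bias_{OLS} \approx 0.09 \, Bias_{OLS}, \qquad F = 100: \; Bias_{IV} \approx \tfrac{1}{101} Bias_{OLS} \approx 0.01 \, Bias_{OLS}\]
As \(F \to 0\) the IV bias approaches the OLS bias, and it shrinks toward zero as the first stage gets stronger.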
Single endogenous variable
Many endogenous variables
Recall that the true treatment effect is 5.25.
Call:
lm(formula = y ~ d, data = iv.dat)
Residuals:
    Min      1Q  Median      3Q     Max
-3.8321 -0.6666 -0.0163  0.6960  3.2710
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.08349    0.01936   107.6   <2e-16 ***
dTRUE        6.15342    0.02887   213.2   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.015 on 4998 degrees of freedom
Multiple R-squared:  0.9009,  Adjusted R-squared:  0.9009
F-statistic: 4.544e+04 on 1 and 4998 DF,  p-value: < 2.2e-16
TSLS estimation, Dep. Var.: y, Endo.: d, Instr.: z
Second stage: Dep. Var.: y
Observations: 5,000
Standard-errors: IID
            Estimate Std. Error t value  Pr(>|t|)
(Intercept)  2.49826   0.028810 86.7151 < 2.2e-16 ***
fit_dTRUE    5.23088   0.053642 97.5145 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 1.11415   Adj. R2: 0.880626
F-test (1st stage), dTRUE: stat = 2,677.0, p < 2.2e-16, on 1 and 4,998 DoF.
               Wu-Hausman: stat =   614.1, p < 2.2e-16, on 1 and 4,997 DoF.
Call:
lm(formula = d ~ z, data = iv.dat)
Residuals:
     Min       1Q   Median       3Q      Max
-1.00680 -0.33096 -0.02621  0.33441  1.12780
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.451323   0.005678   79.48   <2e-16 ***
z           0.147650   0.002854   51.74   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4015 on 4998 degrees of freedom
Multiple R-squared:  0.3488,  Adjusted R-squared:  0.3487
F-statistic:  2677 on 1 and 4998 DF,  p-value: < 2.2e-16
Call:
lm(formula = y ~ z, data = iv.dat)
Residuals:
    Min      1Q  Median      3Q     Max
-7.8509 -2.2071 -0.0964  2.1609  8.1848
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.85908    0.04011  121.16   <2e-16 ***
z            0.77234    0.02016   38.32   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.836 on 4998 degrees of freedom
Multiple R-squared:  0.2271,  Adjusted R-squared:  0.2269
F-statistic:  1468 on 1 and 4998 DF,  p-value: < 2.2e-16
Call:
lm(formula = y ~ d.hat, data = iv.dat)
Residuals:
    Min      1Q  Median      3Q     Max
-7.8509 -2.2071 -0.0964  2.1609  8.1848
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.49826    0.07332   34.08   <2e-16 ***
d.hat        5.23088    0.13651   38.32   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.836 on 4998 degrees of freedom
Multiple R-squared:  0.2271,  Adjusted R-squared:  0.2269
F-statistic:  1468 on 1 and 4998 DF,  p-value: < 2.2e-16
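The outputs above line up in a useful way: the reduced-form coefficient divided by the first-stage coefficient, 0.77234 / 0.147650 ≈ 5.23, matches both the TSLS estimate and the manual second stage on d.hat. The manual second stage reports a standard error of 0.13651, though, against 0.053642 from the TSLS routine. The manual version computes its residuals from d.hat rather than d, so its standard errors should not be used. The sketch below reproduces the three equivalent point estimates in base R. The data-generating process, `sim.dat`, and all variable names in it are assumptions made for illustration. They are not the code behind `iv.dat`.

```r
# Hypothetical data-generating process (NOT the one behind iv.dat above).
set.seed(123)
n <- 5000
z <- rnorm(n)                          # instrument
u <- rnorm(n)                          # unobserved confounder
d <- as.numeric(z + u + rnorm(n) > 0)  # endogenous binary treatment
y <- 2 + 5.25 * d + u + rnorm(n)       # outcome; true effect assumed to be 5.25
sim.dat <- data.frame(y, d, z)

# (1) Reduced-form coefficient divided by first-stage coefficient
first.stage  <- lm(d ~ z, data = sim.dat)
reduced.form <- lm(y ~ z, data = sim.dat)
iv.ratio <- coef(reduced.form)[["z"]] / coef(first.stage)[["z"]]

# (2) Manual two-stage least squares: regress y on the first-stage fitted values
#     (point estimate is fine; the reported standard errors are not valid)
sim.dat$d.hat <- fitted(first.stage)
second.stage  <- lm(y ~ d.hat, data = sim.dat)
iv.manual <- coef(second.stage)[["d.hat"]]

# (3) Ratio of sample covariances
iv.cov <- cov(sim.dat$y, sim.dat$z) / cov(sim.dat$d, sim.dat$z)

# All three should agree up to floating-point error
print(c(ratio = iv.ratio, manual = iv.manual, cov = iv.cov))
```

With a single instrument and a single endogenous variable, the three calculations are algebraically the same estimator, so any difference between them is numerical noise.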
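Assumption: Denote the effect of our instrument on treatment by \(\pi_{1i}\). Monotonicity states that either \(\pi_{1i} \geq 0 \text{ } \forall i\) or \(\pi_{1i} \leq 0 \text{ } \forall i\). The instrument may leave some units' treatment unchanged, but it never moves units in opposite directions (no defiers).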
\[\delta_{IV} = \frac{E[Y_{i} | Z_{i}=1] - E[Y_{i} | Z_{i}=0]}{E[D_{i} | Z_{i}=1] - E[D_{i} | Z_{i}=0]}=E[Y_{i}(1) - Y_{i}(0) | \text{complier}]\]
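The complier interpretation can be checked in a small simulation. The sketch below is an illustration under assumed values: the type shares, the effect sizes, and all variable names are hypothetical. It builds a population of compliers, always-takers, and never-takers (no defiers, so monotonicity holds by construction), gives each group a different treatment effect, and computes the Wald ratio from the formula above.

```r
# Hypothetical population; shares and effect sizes are illustrative assumptions.
set.seed(42)
n <- 100000
z <- rbinom(n, 1, 0.5)  # binary instrument, randomly assigned

# Compliance types; no defiers, so monotonicity holds by construction
type <- sample(c("complier", "always", "never"), n,
               replace = TRUE, prob = c(0.5, 0.25, 0.25))

# Treatment: always-takers take it, never-takers do not, compliers follow z
d <- ifelse(type == "always", 1, ifelse(type == "never", 0, z))

# Heterogeneous treatment effects by type (compliers: 2)
effect <- ifelse(type == "complier", 2, ifelse(type == "always", 4, 1))

y0 <- rnorm(n)        # untreated potential outcome
y  <- y0 + effect * d # observed outcome

# Wald estimator: difference in mean outcomes over difference in mean treatment
wald <- (mean(y[z == 1]) - mean(y[z == 0])) /
        (mean(d[z == 1]) - mean(d[z == 0]))

# Average effect among compliers in this simulated population
complier.effect <- mean(effect[type == "complier"])

print(c(wald = wald, complier.effect = complier.effect))
```

The Wald ratio should land close to the complier effect, not the population-average effect. Always-takers and never-takers do not change treatment status with the instrument, so they drop out of both the numerator and the denominator.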