class: center, middle, inverse, title-slide

.title[
# Module 2: Demand for Cigarettes and Instrumental Variables
]
.subtitle[
## Part 3: Application of IV to Demand Estimation
]
.author[
### Ian McCarthy | Emory University
]
.date[
### Econ 470 & HLTH 470
]

---

<!-- Adjust some CSS code for font size and maintain R code font size -->
<style type="text/css">
.remark-slide-content {
    font-size: 30px;
    padding: 1em 2em 1em 2em;
}
.remark-code {
    font-size: 15px;
}
.remark-inline-code {
    font-size: 20px;
}
</style>

<!-- Set R options for how code chunks are displayed and load packages -->

# Naive estimate

Clearly a strong relationship between prices and sales. For example, just from OLS:

```
##
## Call:
## lm(formula = ln_sales ~ ln_price, data = cig.data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1.23899 -0.17057  0.02239  0.18605  1.13866
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  4.689838   0.007209  650.55   <2e-16 ***
## ln_price    -0.420307   0.006464  -65.02   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3073 on 2497 degrees of freedom
## Multiple R-squared:  0.6287, Adjusted R-squared:  0.6285
## F-statistic:  4228 on 1 and 2497 DF,  p-value: < 2.2e-16
```

---

# Is this causal?

- But is that the true demand curve?
- Aren't other things changing that tend to reduce cigarette sales?
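---

# Is this causal? A simulated illustration

To see why "other things changing" matters, here is a minimal sketch using simulated data rather than `cig.data`. Everything in it is an assumption made for illustration: the variable names, the sample size, the "true" elasticity of -0.5, and the idea that an unobserved factor (labeled `sentiment`) both raises prices and lowers sales. Under those assumed signs, naive OLS lands away from the true value; with other signs the bias could go the other way.

```r
set.seed(470)
n <- 2000
true_elasticity <- -0.5

# Simulated data (hypothetical, for illustration only)
sentiment <- rnorm(n)   # unobserved anti-smoking sentiment
tax       <- rnorm(n)   # tax variation unrelated to sentiment
ln_price  <- 0.5 * tax + 0.5 * sentiment + rnorm(n, sd = 0.5)
ln_sales  <- 5 + true_elasticity * ln_price - 0.4 * sentiment + rnorm(n, sd = 0.3)

# Naive OLS omits sentiment, so the price coefficient absorbs part of its effect
ols <- lm(ln_sales ~ ln_price)

# Infeasible benchmark: control for the (normally unobserved) confounder
controlled <- lm(ln_sales ~ ln_price + sentiment)

print(c(true         = true_elasticity,
        naive_ols    = unname(coef(ols)["ln_price"]),
        with_control = unname(coef(controlled)["ln_price"])))
```

In practice the confounder is not observed, so the benchmark regression is unavailable. That gap is what motivates looking for an instrument.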
---

# Tax as an IV

```r
# Plot the mean real tax per pack across states in each year
cig.data %>%
  ggplot(aes(x=Year,y=total_tax_cpi)) +
  stat_summary(fun="mean",geom="line") +
  labs(
    x="Year",
    y="Tax per Pack ($)",
    title="Cigarette Taxes in 2010 Real Dollars"
  ) + theme_bw() +
  scale_x_continuous(breaks=seq(1970, 2020, 5))
```
.plot-callout[
<img src="02-smoking3_files/figure-html/cig-tax-callout-1.png" style="display: block; margin: auto;" />
]

---

# Tax as an IV

<img src="02-smoking3_files/figure-html/cig-tax-output-1.png" style="display: block; margin: auto;" />

---

# IV Results

```
##
## Call:
## ivreg(formula = ln_sales ~ ln_price | total_tax_cpi, data = cig.data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1.24595 -0.23048  0.02863  0.23548  1.30999
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  4.805691   0.009703  495.29   <2e-16 ***
## ln_price    -0.619142   0.011128  -55.64   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3608 on 2497 degrees of freedom
## Multiple R-Squared: 0.488, Adjusted R-squared: 0.4878
## Wald test:  3096 on 1 and 2497 DF,  p-value: < 2.2e-16
```

---

# Two-stage equivalence

```r
# First stage: regress the endogenous variable (price) on the instrument (tax)
step1 <- lm(ln_price ~ total_tax_cpi, data=cig.data)
pricehat <- predict(step1)
# Second stage: replace price with its first-stage prediction.
# The point estimate matches ivreg, but these standard errors are not valid for inference.
step2 <- lm(ln_sales ~ pricehat, data=cig.data)
summary(step2)
```

```
##
## Call:
## lm(formula = ln_sales ~ pricehat, data = cig.data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1.10960 -0.17805  0.01867  0.18697  1.14907
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  4.805691   0.008195  586.41   <2e-16 ***
## pricehat    -0.619142   0.009399  -65.87   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3048 on 2497 degrees of freedom
## Multiple R-squared:  0.6348, Adjusted R-squared:  0.6346
## F-statistic:  4339 on 1 and 2497 DF,  p-value: < 2.2e-16
```

---

# Different specifications

<table style="text-align:center"><tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td colspan="6">Log Sales per Capita</td></tr>
<tr><td style="text-align:left"></td><td colspan="3">OLS</td><td colspan="3">IV</td></tr>
<tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td><td>(3)</td><td>(4)</td><td>(5)</td><td>(6)</td></tr>
<tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left">Log Price</td><td>-0.953<sup>***</sup></td><td>-0.921<sup>***</sup></td><td>-1.213<sup>***</sup></td><td>-1.072<sup>***</sup></td><td>-1.036<sup>***</sup></td><td>-1.523<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.012)</td><td>(0.008)</td><td>(0.034)</td><td>(0.014)</td><td>(0.010)</td><td>(0.041)</td></tr>
<tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left">State FE</td><td>No</td><td>Yes</td><td>Yes</td><td>No</td><td>Yes</td><td>Yes</td></tr>
<tr><td style="text-align:left">Year FE</td><td>No</td><td>No</td><td>Yes</td><td>No</td><td>No</td><td>Yes</td></tr>
<tr><td style="text-align:left">Observations</td><td>2,499</td><td>2,499</td><td>2,499</td><td>2,499</td><td>2,499</td><td>2,499</td></tr>
<tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"><em>Note:</em></td><td colspan="6" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
</table>

---

# Test the IV

<table style="text-align:center"><tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td colspan="3">Log Price</td><td colspan="3">Log Sales</td></tr>
<tr><td
style="text-align:left"></td><td colspan="3">First Stage</td><td colspan="3">Reduced Form</td></tr>
<tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td><td>(3)</td><td>(4)</td><td>(5)</td><td>(6)</td></tr>
<tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left">Tax per Pack</td><td>0.444<sup>***</sup></td><td>0.474<sup>***</sup></td><td>0.187<sup>***</sup></td><td>-0.476<sup>***</sup></td><td>-0.491<sup>***</sup></td><td>-0.284<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.006)</td><td>(0.006)</td><td>(0.002)</td><td>(0.007)</td><td>(0.006)</td><td>(0.007)</td></tr>
<tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left">State FE</td><td>No</td><td>Yes</td><td>Yes</td><td>No</td><td>Yes</td><td>Yes</td></tr>
<tr><td style="text-align:left">Year FE</td><td>No</td><td>No</td><td>Yes</td><td>No</td><td>No</td><td>Yes</td></tr>
<tr><td style="text-align:left">Observations</td><td>2,499</td><td>2,499</td><td>2,499</td><td>2,499</td><td>2,499</td><td>2,499</td></tr>
<tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"><em>Note:</em></td><td colspan="6" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
</table>

---

# Summary

1. Most elasticity estimates are close to -1 (about -0.9 to -1.1 in the specifications without year fixed effects)
2. Larger elasticities in magnitude (about -1.2 to -1.5) when including year fixed effects
3. Perhaps not too outlandish given more recent evidence: [NBER Working Paper](https://www.nber.org/papers/w22251.pdf).

---

# Some other IV issues

1. IV estimators are biased in finite samples, even though they are consistent. Performance in small samples is questionable.
2. IV estimators provide an estimate of a Local Average Treatment Effect (LATE), which is the same as the ATT only under additional conditions or assumptions.
3. What about lots of instruments?
The finite-sample problem becomes more important, and we may try other estimators such as the jackknife IV estimator (JIVE).<br>

--

<br>
The National Bureau of Economic Research (NBER) has a great resource [here](https://www.nber.org/econometrics_minicourse_2018/2018si_methods.pdf) for understanding instruments in practice.

---

# Quick IV Review

1. When do we consider IV as a potential identification strategy?
2. What are the main IV assumptions (and what do they mean)?
3. How do we test for those assumptions?
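---

# Quick IV Review: a simulated check

The review questions can be made concrete with a small simulation. The sketch below uses made-up data, not `cig.data`; the variable names, sample size, and coefficients (including the "true" elasticity of -0.5) are assumptions chosen for illustration. It checks instrument relevance with the first-stage F-statistic and confirms that, with a single instrument, the ratio of the reduced form to the first stage equals the manual two-stage estimate. The exclusion restriction is not directly testable with a single instrument and has to be argued.

```r
set.seed(470)
n <- 2000

# Simulated data (hypothetical, for illustration only)
tax   <- rnorm(n)   # instrument: shifts price, unrelated to the demand shock
shock <- rnorm(n)   # unobserved demand shock that also moves price
ln_price <- 0.5 * tax + 0.5 * shock + rnorm(n, sd = 0.5)
ln_sales <- 5 - 0.5 * ln_price - 0.4 * shock + rnorm(n, sd = 0.3)

# Relevance: the instrument should strongly predict price
first_stage <- lm(ln_price ~ tax)
f_stat <- unname(summary(first_stage)$fstatistic["value"])

# Reduced form: effect of the instrument on the outcome
reduced_form <- lm(ln_sales ~ tax)

# Wald ratio: reduced form divided by first stage
iv_ratio <- unname(coef(reduced_form)["tax"] / coef(first_stage)["tax"])

# Manual two-stage estimate (point estimate only; its standard errors are not valid)
price_hat <- fitted(first_stage)
iv_2sls <- unname(coef(lm(ln_sales ~ price_hat))["price_hat"])

print(c(first_stage_F = f_stat, wald_ratio = iv_ratio, two_stage = iv_2sls))
```

The same ratio shows up in the regression tables: the reduced-form coefficient of -0.476 divided by the first-stage coefficient of 0.444 gives about -1.07, the IV estimate in column (4).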