class: center, middle, inverse, title-slide

# Module 2: Demand for Cigarettes and Instrumental Variables
## Part 1: CDC Data on Smoking and Cigarettes
### Ian McCarthy | Emory University
### Econ 470 & HLTH 470

---

<!-- Adjust some CSS code for font size and maintain R code font size -->
<style type="text/css">
.remark-slide-content {
  font-size: 30px;
  padding: 1em 2em 1em 2em;
}
.remark-code {
  font-size: 15px;
}
.remark-inline-code {
  font-size: 20px;
}
</style>

<!-- Set R options for how code chunks are displayed and load packages -->

# History of Smoking

.center[
![:scale 700px](https://media.giphy.com/media/TdQfYM6KpYQUM/giphy.gif)
]

---
# History of Smoking

- Widespread smoking began in the late 1800s
- Lung cancer became more common after the 1930s
- First evidence of a link in the 1950s
- Surgeon General's report in 1964
- Very important in causal inference! ([Section 5.1.1](https://mixtape.scunning.com/matching-and-subclassification.html#some-background) of Causal Inference Mixtape)

---
# Why it matters

1. Extreme public health concerns
    - Lung cancer prevalence
    - Fetal and infant health

2. Economic questions
    - Is it an information problem?
    - Externalities (second-hand smoke)
    - Moral hazard due to insurance

---
# In our case

We want to focus on estimating demand for cigarettes. By this, I mean estimating the price elasticity of demand.

--

<br>
We'll show that standard OLS isn't going to do this very well.

<!-- New Section -->
---
class: inverse, center, middle
name: smoking_data

# Cigarette Data

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=1055px></html>

---
# The Data

- Data from [CDC Tax Burden on Tobacco](https://data.cdc.gov/Policy/The-Tax-Burden-on-Tobacco-1970-2018/7nwe-3aj9/data)
- Visit the GitHub repository for other info: [Tobacco GitHub repository](https://github.com/imccart/CDC-Tobacco)
- Supplemented with CPI data, also in the GitHub repo.
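
As a rough illustration of why the CPI data matter, here is a minimal sketch of converting a nominal price per pack into 2010 real dollars. The toy values, the `index` column, and the deflation formula are assumptions for illustration only; the actual construction of `price_cpi` is in the GitHub repository.

```r
# Hypothetical toy data: nominal price per pack and a CPI index by year.
# The numbers are made up and are NOT the actual CDC or CPI series.
toy <- data.frame(
  Year          = c(1970, 1990, 2010),
  cost_per_pack = c(0.40, 1.50, 5.00),
  index         = c(40, 130, 218)
)

# CPI value in the base year (2010), used to express prices in 2010 dollars
base_index <- toy$index[toy$Year == 2010]

# Real price = nominal price * (base-year CPI / current-year CPI)
toy$price_cpi <- toy$cost_per_pack * (base_index / toy$index)

print(toy)
```

In the base year the real and nominal prices coincide, while earlier nominal prices are scaled up to reflect inflation since then.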
---
# Summary stats

We're interested in cigarette prices and sales, so let's focus our summaries on those variables.

```r
# Summary statistics for sales and prices, rendered as an HTML table
stargazer(as.data.frame(cig.data %>% select(sales_per_capita, price_cpi, cost_per_pack)), type="html")
```

<table style="text-align:center"><tr><td colspan="8" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left">Statistic</td><td>N</td><td>Mean</td><td>St. Dev.</td><td>Min</td><td>Pctl(25)</td><td>Pctl(75)</td><td>Max</td></tr>
<tr><td colspan="8" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left">sales_per_capita</td><td>2,499</td><td>95.150</td><td>41.133</td><td>12.500</td><td>63.050</td><td>122.400</td><td>296.200</td></tr>
<tr><td style="text-align:left">price_cpi</td><td>2,499</td><td>3.396</td><td>1.641</td><td>1.307</td><td>2.088</td><td>4.520</td><td>9.651</td></tr>
<tr><td style="text-align:left">cost_per_pack</td><td>2,499</td><td>2.678</td><td>2.238</td><td>0.287</td><td>0.780</td><td>4.237</td><td>10.376</td></tr>
<tr><td colspan="8" style="border-bottom: 1px solid black"></td></tr></table>

---
# Cigarette Sales

```r
# Average packs per capita in each year, drawn as a line.
# `fun` replaces the `fun.y` argument, which is deprecated as of ggplot2 3.3.0.
cig.data %>%
  ggplot(aes(x=Year,y=sales_per_capita)) +
  stat_summary(fun="mean",geom="line") +
  labs(
    x="Year",
    y="Packs per Capita",
    title="Cigarette Sales"
  ) + theme_bw() +
  scale_x_continuous(breaks=seq(1970, 2020, 5))
```

.plot-callout[
<img src="02-smoking1_files/figure-html/cig-sales-callout-1.png" style="display: block; margin: auto;" />
]

---
# Cigarette Sales

<img src="02-smoking1_files/figure-html/cig-sales-output-1.png" style="display: block; margin: auto;" />

---
# Cigarette Prices

```r
# Average real price per pack in each year, drawn as a line.
# `fun` replaces the `fun.y` argument, which is deprecated as of ggplot2 3.3.0.
cig.data %>%
  ggplot(aes(x=Year,y=price_cpi)) +
  stat_summary(fun="mean",geom="line") +
  labs(
    x="Year",
    y="Price per Pack ($)",
    title="Cigarette Prices in 2010 Real Dollars"
  ) + theme_bw() +
  scale_x_continuous(breaks=seq(1970, 2020, 5))
```

.plot-callout[
<img src="02-smoking1_files/figure-html/cig-price-callout-1.png" style="display: block; margin: auto;" />
]

---
# Cigarette Prices

<img src="02-smoking1_files/figure-html/cig-price-output-1.png" style="display: block; margin: auto;" />
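
To make explicit what the `stat_summary()` calls in the plots above compute, here is a minimal base R sketch that averages a variable within each year. The toy panel, including the `state` column and its values, is hypothetical and assumes one row per state and year.

```r
# Hypothetical toy panel: two states observed in two years (values are made up)
toy_panel <- data.frame(
  state     = c("A", "B", "A", "B"),
  Year      = c(1970, 1970, 1971, 1971),
  price_cpi = c(2.0, 3.0, 2.5, 3.5)
)

# Average price across states within each year; this is the quantity that
# stat_summary() with the mean function plots at each value of Year
yearly_mean <- aggregate(price_cpi ~ Year, data = toy_panel, FUN = mean)

print(yearly_mean)
```

Each point on the plotted line is therefore an unweighted average over the observations in that year, not a population-weighted national figure.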