Medicare Advantage Data

Ian McCarthy | Emory University

Medicare Advantage Data

  • Now we need to work with the final dataset, ‘final_ma_data’

Summary stats

Focus on enrollments and star ratings:

R Code
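library(tidyverse)     # read_rds(), select(), %>%
library(here)          # here()
library(modelsummary)  # datasummary(), All(), Histogram
ma.data <- read_rds(here("data/final_ma_data.rds"))
sum.vars <- ma.data %>% select("MA Enrollment" = avg_enrollment, "MA Eligibles" = avg_eligibles, "Star Rating" = Star_Rating)
datasummary(All(sum.vars) ~ Mean + SD + Histogram, data=sum.vars)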
                   Mean        SD   Histogram
MA Enrollment    365.17   1636.62
MA Eligibles   36797.77  95219.09
Star Rating        3.29      0.80   ▁▁▇▇▇▅▄▁

Clean the data

Limit to plans with:

  • Observed enrollments, > 10
  • First year of star rating (2009)
  • Observed star rating
# Keep 2009 plans with non-missing enrollment and a non-missing Part C score
ma.data.clean <- ma.data %>%
  filter(!is.na(avg_enrollment) & year==2009 & !is.na(partc_score))

Calculate raw average rating

ma.data.clean <- ma.data.clean %>%
  mutate(raw_rating=rowMeans(
    cbind(breastcancer_screen,rectalcancer_screen,cv_cholscreen,diabetes_cholscreen,
          glaucoma_test,monitoring,flu_vaccine,pn_vaccine,physical_health,
          mental_health,osteo_test,physical_monitor,primaryaccess,
          hospital_followup,depression_followup,nodelays,carequickly,
          overallrating_care,overallrating_plan,calltime,
          doctor_communicate,customer_service,osteo_manage,
          diabetes_eye,diabetes_kidney,diabetes_bloodsugar,
          diabetes_chol,antidepressant,bloodpressure,ra_manage,
          copd_test,betablocker,bladder,falling,appeals_timely,
          appeals_review),
    na.rm=TRUE)) %>%
  select(contractid, planid, fips, avg_enrollment, state, county, raw_rating, partc_score,
         avg_eligibles, avg_enrolled, premium_partc, risk_ab, Star_Rating,
         bid, avg_ffscost, ma_rate)
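
To see what `rowMeans(..., na.rm=TRUE)` does in this step, the toy example below averages three made-up quality measures (`m1`, `m2`, `m3` are hypothetical stand-ins, not columns of the real data). Each row's mean uses only the measures that are observed, and a row with every measure missing comes back as `NaN` rather than `NA`.

```r
# Toy data: three hypothetical measure columns, some missing
toy <- data.frame(
  m1 = c(4, 2, NA, NA),
  m2 = c(3, NA, 5, NA),
  m3 = c(5, 4, NA, NA)
)

# Row-wise mean over whatever measures are observed in each row
toy$raw_rating <- rowMeans(cbind(toy$m1, toy$m2, toy$m3), na.rm = TRUE)

# Rows 1-3 average 3, 2, and 1 observed measures; row 4 has none
print(toy$raw_rating)
```

Because contracts report different subsets of measures, the raw average is not computed over the same inputs for every plan, which is worth keeping in mind when comparing `raw_rating` across contracts.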

Distribution of star ratings

R Code
ma.data.clean %>% 
  ggplot(aes(x=as.factor(Star_Rating))) + 
  geom_bar() +
  labs(
    x="Star Rating",
    y="Count of Plans",
    title="Frequency Distribution of Star Ratings"
  ) + theme_bw()

Enrollments and star ratings

R Code
summary(lm(avg_enrollment~factor(Star_Rating), data=ma.data.clean))

Call:
lm(formula = avg_enrollment ~ factor(Star_Rating), data = ma.data.clean)

Residuals:
   Min     1Q Median     3Q    Max 
  -622   -382   -202    -49  69095 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)               71.75      41.77   1.718  0.08589 .  
factor(Star_Rating)2      26.17      49.24   0.531  0.59513    
factor(Star_Rating)2.5   182.83      45.70   4.001 6.33e-05 ***
factor(Star_Rating)3     422.24      48.54   8.699  < 2e-16 ***
factor(Star_Rating)3.5   449.20      52.61   8.539  < 2e-16 ***
factor(Star_Rating)4     562.33      57.53   9.774  < 2e-16 ***
factor(Star_Rating)4.5   266.88      83.96   3.179  0.00148 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1503 on 18979 degrees of freedom
Multiple R-squared:  0.01415,   Adjusted R-squared:  0.01384 
F-statistic: 45.41 on 6 and 18979 DF,  p-value: < 2.2e-16
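
Because `Star_Rating` enters as a factor, each coefficient is the difference in mean enrollment relative to the omitted (lowest) rating category, so the intercept plus a coefficient recovers that category's mean. For example, the implied mean for 4-star plans above is 71.75 + 562.33 = 634.08. The sketch below checks this identity on made-up data; `toy`, `enroll`, and `stars` are hypothetical names.

```r
# Made-up data: enrollment for plans at three star ratings
toy <- data.frame(
  stars  = c(2, 2, 3, 3, 4, 4),
  enroll = c(50, 150, 400, 600, 500, 900)
)

fit <- lm(enroll ~ factor(stars), data = toy)

# Intercept = mean of the omitted category (stars == 2);
# intercept + coefficient = mean of each other category
implied     <- unname(c(coef(fit)[1], coef(fit)[1] + coef(fit)[-1]))
group_means <- as.numeric(tapply(toy$enroll, toy$stars, mean))

print(implied)
print(group_means)
```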

Potential endogeneity

  • The star rating is a measure of quality
  • Quality may be endogenous to enrollment (how?)
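
One possible answer, offered only as an illustration: an unobserved plan characteristic (say, management quality or network breadth) could raise both the star rating and enrollment, so OLS attributes part of its effect to the rating. The simulation below is a sketch under made-up parameters; `unobs`, `quality`, and `enroll` are hypothetical and unrelated to the real data.

```r
set.seed(1234)
n <- 5000

unobs   <- rnorm(n)                     # unobserved plan characteristic
quality <- 3 + 0.5 * unobs + rnorm(n)   # rating partly driven by it
# True effect of quality on enrollment is 100 by construction
enroll  <- 100 * quality + 200 * unobs + rnorm(n, sd = 50)

# Naive regression omits the unobserved characteristic
naive  <- unname(coef(lm(enroll ~ quality))["quality"])
# Infeasible benchmark that controls for it
oracle <- unname(coef(lm(enroll ~ quality + unobs))["quality"])

print(c(naive = naive, oracle = oracle))
```

Under these assumptions the naive coefficient overstates the true effect, while the benchmark regression recovers something close to 100. Reverse causality (larger plans investing more in quality) would be another channel with a similar consequence.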

Premiums and quality

R Code
ma.data.clean %>% 
  ggplot(aes(x=partc_score, y=premium_partc)) + 
  geom_point() +
  labs(
    x="Part C Score",
    y="Premium",
    title="Premiums and Quality"
  ) + theme_bw()

Premiums and quality

R Code
summary(lm(premium_partc~factor(Star_Rating), data=ma.data.clean))

Call:
lm(formula = premium_partc ~ factor(Star_Rating), data = ma.data.clean)

Residuals:
   Min     1Q Median     3Q    Max 
-58.58 -23.56  -5.75  14.37 338.52 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)               3.850      1.847   2.084   0.0372 *  
factor(Star_Rating)2      1.901      2.015   0.943   0.3455    
factor(Star_Rating)2.5   19.710      1.898  10.386   <2e-16 ***
factor(Star_Rating)3     22.581      1.965  11.490   <2e-16 ***
factor(Star_Rating)3.5   38.672      2.048  18.879   <2e-16 ***
factor(Star_Rating)4     54.726      2.231  24.531   <2e-16 ***
factor(Star_Rating)4.5   49.494      2.787  17.757   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 33.46 on 12892 degrees of freedom
  (6087 observations deleted due to missingness)
Multiple R-squared:  0.1348,    Adjusted R-squared:  0.1344 
F-statistic: 334.9 on 6 and 12892 DF,  p-value: < 2.2e-16
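
Note that this regression uses 12,899 plans (12,892 residual degrees of freedom plus 7 coefficients), versus 18,986 in the enrollment regression, because `lm()` drops the 6,087 rows with a missing premium. A minimal sketch of how to confirm the estimation sample, using made-up data with hypothetical names (`toy`, `premium`, `stars`):

```r
# Made-up data: two plans have no reported premium
toy <- data.frame(
  premium = c(0, 25, NA, 40, NA, 60),
  stars   = c(2, 2, 3, 3, 4, 4)
)

fit <- lm(premium ~ factor(stars), data = toy)

n_used    <- nobs(fit)                 # rows actually used in estimation
n_dropped <- sum(is.na(toy$premium))   # rows dropped for a missing outcome

print(c(used = n_used, dropped = n_dropped))
```

If plans with missing premiums differ systematically in quality, the two regressions are not describing the same set of plans, so it may be worth tabulating missingness by star rating before comparing them.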