Showing posts with label Missing Data. Show all posts

Tuesday, July 11, 2017

Adjusting for Opioid OD Rates for PA Counties

Two posts ago I spoke about how educational attainment (defined as having a bachelor's degree or higher) and the % white in Pennsylvania counties account for 86% of the variability in Trump's vote in the state.  Once in a while new county-level data is released and I add it to the regression model.  The Pennsylvania Health Cost Containment Council came out with the opioid overdose rate for people age 15 years and up for every county in the state, in cases per 100,000.  Cambria County was ranked 5th in the state, while Philadelphia was ranked first even after adjusting for population.

After looking at the table of rates for each county I noticed that there were 25 counties with NR (not reported) as the rate.  These are counties with fewer than 10 OD cases.  With so few cases, a county with a small population could show a really high OD rate while one with a large population would show a really low one, so a difference of one or two cases could change the rate a lot.  This would make the rates less stable.  Another reason why the rates for counties with low numbers of ODs are not reported is to protect the privacy of those who have overdosed.
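
To make the stability point concrete, here is a minimal sketch of how much a single additional case moves the rate per 100,000.  The two populations are hypothetical values chosen for illustration, not figures from the report, and `rate_per_100k` is just a helper name used here.

```python
def rate_per_100k(cases, population):
    """Cases per 100,000 residents in the population of interest."""
    return cases * 100_000 / population

# Hypothetical age 15+ populations (not from the report).
for population in (5_000, 1_000_000):
    # Change in the rate when the case count goes from 4 to 5.
    change = rate_per_100k(5, population) - rate_per_100k(4, population)
    print(f"population {population:,}: one more case shifts the rate by {change:.1f}")
```

In the small hypothetical county one extra case shifts the rate far more than in the large one, which is the instability described above.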

The missing rates for the 25 counties could be filled in using an imputation technique.  The simplest method would be to use the midpoint of the range of possible cases for a missing rate (0 to 10 cases, so 5 cases) and compute the rate by dividing by the population of interest.  For these counties the differences in the rates would then be determined entirely by differences in population.  This is a crude method of replacing the missing rates with plausible values.  There are more sophisticated imputation methods based on other descriptive values for the counties, but one cannot really draw firm conclusions from these imputed values.
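
The midpoint method described above might be sketched as follows.  This is only an illustration: the function name, the county names and the populations are all hypothetical, and the 0 to 10 range is taken from the description above.

```python
def midpoint_imputed_rate(population, low_cases=0, high_cases=10, per=100_000):
    """Impute a rate from the midpoint of the suppressed case range."""
    if population <= 0:
        raise ValueError("population must be positive")
    midpoint_cases = (low_cases + high_cases) / 2  # 5 cases for a 0 to 10 range
    return midpoint_cases * per / population

# Hypothetical counties and age 15+ populations (not from the report).
hypothetical_counties = {"County A": 5_000, "County B": 40_000}

for name, population in hypothetical_counties.items():
    rate = midpoint_imputed_rate(population)
    print(f"{name}: imputed rate of {rate:.1f} per 100,000")
```

Because every county gets the same 5 imputed cases, the imputed rates differ only through the population in the denominator.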

Because of the missing data I thought it best not to add the overdose rate to the model for Trump's % of the vote.  Below is a sampling of the rankings for the 42 counties with reported rates.  In parentheses are the total OD cases on which the rates are based.

1. Philadelphia 47.3 (603)
2. Lackawanna 41.3 (74)
3. Delaware 40.4 (186)
4. Beaver 40.2 (57)
5. Cambria 39.2 (45)
6. Westmoreland 38.2 (116)
20. Blair 30.7 (32)
40. Indiana 18.9 (14)
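
Since each rate is cases per 100,000 people age 15 and up, the rate and case count together imply an approximate population.  The sketch below back-calculates it for the counties listed above.  The results are only rough because the published rates are rounded, and `implied_population` is a helper name used here for illustration.

```python
# (rate per 100,000, total OD cases) as listed above.
sampled_counties = {
    "Philadelphia": (47.3, 603),
    "Lackawanna": (41.3, 74),
    "Delaware": (40.4, 186),
    "Beaver": (40.2, 57),
    "Cambria": (39.2, 45),
    "Westmoreland": (38.2, 116),
    "Blair": (30.7, 32),
    "Indiana": (18.9, 14),
}

def implied_population(rate_per_100k, cases, per=100_000):
    """Invert rate = cases / population * per to recover the population."""
    return cases / rate_per_100k * per

for county, (rate, cases) in sampled_counties.items():
    population = implied_population(rate, cases)
    print(f"{county}: roughly {population:,.0f} residents age 15 and up")
```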

**Related Posts** 

Education and Race Account for Outliers in Trump's Vote at the State and PA County Level

How is Washington DC an outlier? Let's count the ways. (Repost from Data Driven Journalism)

Ivory Tower Science and the Rest of Us