After looking at the table of rates for each county, I noticed that 25 counties had NR (not reported) as the rate. These are counties with fewer than 10 OD cases. With so few cases, a county with a small population would have a very high OD rate while one with a large population would have a very low one, and a change of just one or two cases could shift the rate considerably. This makes the rates less stable. Another reason why the rates for counties with low numbers of ODs are not reported is to protect the privacy of those who overdosed.
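To make the stability point concrete, here is a minimal sketch of how much a single additional case moves the rate when counts are small versus large. It assumes rates are expressed per 100,000 residents, which the table does not state, and the populations and case counts are hypothetical, not real county figures.

```python
PER = 100_000  # assumed rate denominator; not stated in the source table


def rate_per(cases, population, per=PER):
    """Crude rate: cases per `per` residents."""
    return cases * per / population


def relative_change_from_one_more_case(cases, population):
    """Fractional change in the rate if one more case were recorded."""
    before = rate_per(cases, population)
    after = rate_per(cases + 1, population)
    return (after - before) / before


# Hypothetical counties for illustration only.
small_count = relative_change_from_one_more_case(cases=5, population=20_000)
large_count = relative_change_from_one_more_case(cases=500, population=1_500_000)

print(f"5 cases -> 6 cases: rate changes by {small_count:.0%}")
print(f"500 cases -> 501 cases: rate changes by {large_count:.1%}")
```

The relative change depends only on the case count (it is 1 divided by the number of cases), which is why rates built on fewer than 10 cases swing so much from one period to the next.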
The missing rates for these 25 counties could be imputed. The simplest method would be to use the midpoint of the range of possible cases for a missing rate (0 to 10 cases) and compute the rate by dividing by the county's population. For these counties, the differences in the rates would be determined entirely by differences in population. This is a crude method of replacing the missing rates with plausible values. There are more sophisticated imputation methods based on other descriptive values for the counties, but one cannot really draw firm conclusions from imputed values.
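The midpoint approach can be sketched in a few lines. This is only an illustration under stated assumptions: the rate is taken to be per 100,000 residents (the table does not say), the function name `midpoint_rate` is made up for this example, and the populations are hypothetical placeholders rather than real county figures.

```python
PER = 100_000  # assumed rate denominator; not stated in the source table


def midpoint_rate(population, low=0, high=10, per=PER):
    """Impute a rate using the midpoint of the suppressed case range."""
    midpoint_cases = (low + high) / 2  # 5 cases for the 0-to-10 range
    return midpoint_cases * per / population


# Hypothetical populations for illustration only.
example_populations = {"County A": 5_000, "County B": 40_000}

imputed = {name: midpoint_rate(pop) for name, pop in example_populations.items()}

for name, rate in imputed.items():
    print(f"{name}: {rate:.1f} per {PER:,}")
```

Because every suppressed county is assigned the same case count, the imputed rates differ only through the population in the denominator, which is what makes the method so crude.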
Because of the missing data, I thought it best not to add the OD rate to the model for Trump's % of the vote. Below is a sampling of the rankings of rates for the 42 counties with reported rates. In parentheses are the total OD cases on which the rates are based.
1. Philadelphia 47.3 (603)
2. Lackawanna 41.3 (74)
3. Delaware 40.4 (186)
4. Beaver 40.2 (57)
5. Cambria 39.2 (45)
6. Westmoreland 38.2 (116)
20. Blair 30.7 (32)
40. Indiana 18.9 (14)