I was wondering how strong the correlation is between how players do in the home run derby and how many home runs they hit during the regular season. I looked at how many home runs they hit during the first round (because that round has the largest number of players (8), and they have about the same number of pitches thrown to them) and compared those to the total number of home runs each player hit that year. I looked at 2011 and 2012; the 2013 season is not finished yet. The results are summarized in the table at the bottom.
It is hard to tell just by looking at those numbers whether there is any correlation between the derby and season totals. This is why we do correlational studies and draw scatterplots with a best-fit straight line to see the overall trend between the two. The plot below, for 2011, seems to suggest that there is a negative association between the first round derby totals and the season HR totals. However, the R-squared statistic says that the line accounts for only 6.5% of the variability in the data, and the relationship is not statistically significant (p greater than 0.05).
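For anyone who wants to try this kind of check themselves, here is a minimal Python sketch of the calculation. The player totals in it are made-up placeholders, not the real derby or season figures, and the function names are only illustrative. It also gets its p-value from an exact permutation test rather than the t-test that regression software usually reports, so it should not be expected to reproduce the exact numbers behind the plots.

```python
import itertools
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys):
    """Two-sided exact permutation p-value for the correlation.

    Tries every reordering of ys, so it is only practical for small
    samples such as an 8-player first round (8! = 40,320 orderings).
    """
    observed = abs(pearson_r(xs, ys))
    as_extreme = 0
    total = 0
    for shuffled in itertools.permutations(ys):
        total += 1
        # Small tolerance so floating-point ties count as "as extreme".
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# HYPOTHETICAL placeholder numbers, NOT the real 2011 or 2012 data.
derby_hr = [5, 9, 2, 7, 4, 11, 3, 6]
season_hr = [31, 27, 38, 24, 33, 29, 22, 36]

r = pearson_r(derby_hr, season_hr)
r_squared = r ** 2  # share of variability explained by the best-fit line
p_value = permutation_p_value(derby_hr, season_hr)

print(f"r = {r:.3f}, R-squared = {r_squared:.1%}, p = {p_value:.3f}")
```

The sign of r gives the direction of the trend, and squaring it gives the percentage of variability that a straight line accounts for. The p-value is the share of reshuffled pairings that produce a correlation at least as strong as the observed one.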
I then looked at the numbers from the first round of the 2012 HR derby. This graph suggests an even weaker correlation, this time positive, between the derby and season totals. It accounts for 0.7% of the variability and likewise was not significant. Three batters were in both years' derbies: Cano (2011 winner), Fielder (2012 winner), and Kemp.
Yes, the sample sizes are small for these correlations, but combining the data for the two years does not yield a significant result either. With the percentage of variability accounted for being so low, it is unlikely that a meaningful relationship would be found. Perhaps a different era will look different.
I looked at 1998, the year Mark McGwire broke Roger Maris' season home run record with 70 and many players were taking steroids, as Jose Canseco (not there) revealed. In this home run derby there were 10 players, with Ken Griffey Jr. winning with 19 HR and Mark McGwire hitting 4 HR. The graph for 1998 shows a slight positive relationship accounting for 3.8% of the variability. This relationship is still not statistically significant. A much larger sample would be needed to show that an effect this small really exists. Combining all three years produces almost no correlation, accounting for 0.6% of the variance, much like the 2012 correlation. There does not appear to be a real relationship between home run derby totals and regular season HRs. The conditions are too different or the sample is biased. Players' performances fluctuate from day to day. It's impossible to tell the impact of performance-enhancing drugs from this analysis.
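To give a rough sense of how much larger the sample would have to be, here is a small Python sketch using the standard Fisher z-transformation approximation for the sample size needed to detect a correlation. The 5% two-sided significance level and 80% power are conventional assumptions of this sketch, not choices stated in the analysis above, and the function name is only illustrative.

```python
import math
from statistics import NormalDist


def sample_size_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect a true correlation r (two-sided test).

    Uses the Fisher z-transformation approximation:
        n = ((z_alpha/2 + z_power) / atanh(|r|))**2 + 3
    alpha and power defaults are ASSUMED conventional values.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = ((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3
    return math.ceil(n)


# An R-squared of 3.8% (the 1998 figure) corresponds to |r| = sqrt(0.038).
r_1998 = math.sqrt(0.038)
needed = sample_size_for_correlation(r_1998)
print(f"|r| = {r_1998:.3f}, players needed: about {needed}")
```

Under these assumptions the answer comes out at around two hundred players, far more than the 8 to 10 who take part in any one derby.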
In Latin their name is Ericii
Nate Silver of fivethirtyeight.com worked with baseball statistics before he turned to modeling poll data to predict elections. After working at the New York Times, where he stumped the pundits on the election results, he has been hired by ESPN/ABC to do statistical modeling in politics and sports. He says that it's better to act as a fox than as a hedgehog, because the hedgehog does the same thing over and over again while the fox is more clever.
Silver's analogy caught my attention because the plural Latin word for hedgehog is ericii. It has derivatives in Spanish (erizos) and French (hérissons). In Italian the word is Ricci (I know Christina's publicist might not be happy about me revealing this). I feel that I cover a wide variety of topics on this blog and approach them from a variety of angles. Alex Rodriguez was exactly in the middle of the graph in 1998, before becoming baseball's highest-paid player at $25 million a year, and he is now facing a big suspension for substance abuse. He may be more of a hedgehog.