If you've been watching the playoffs on FOX, you'll notice that rather than present a given player's regular-season statistics, they've been mostly showing us their statistics either for all playoff games in their career, or just for the 2007 post-season. Is that trivia, or is it an actual statistic? For instance, David Ortiz hits better in the post-season than during the regular season. OK, one number is higher than the other, but that could just be random variation. Does he really hit better during the playoffs?
Why does this even matter? There is conventional wisdom in baseball that certain players hit better in clutch situations -- for instance, when men are on base. This is why RBIs (runs batted in) are treated as a statistic, rather than as trivia. Some young Turks (e.g., Billy Beane of the Oakland A's) have argued vigorously that RBIs don't tell you anything about the batter -- they tell you about the people who bat in front of him (that is, they are good at getting on base). Statistically, it is said, few to no ballplayers hit better with men on and 2 outs.
So what about in the post-season?
I couldn't find Ortiz's lifetime post-season stats, so I compared this post-season, during which he's been phenomenally hot (.773 on-base percentage through the weekend -- I did this math last night during the game, so I didn't include last night's game), with the 2007 regular season, during which he was just hot (.445 on-base percentage).
There are probably several ways to do the math. I used a formula to compare two independent proportions (see the math below). I found that his OBP is significantly better this post-season than during the regular season. So that's at least one example...
Here's the math.
You need to calculate a t statistic, which is the difference between the two means (.773 and .445) divided by the standard deviation of the difference between those two means. The first part is easy, but the latter part is complicated by the fact that we're dealing with ratios. That formula is:
square root of: (P1*(1-P1)/N1 + P2*(1-P2)/N2)
where P1 = .773 and N1 = 22 (post-season at-bats - 1), and P2 = .445 and N2 = 659 (regular season at-bats - 1).
t = 3.59, which gives a p value of less than .01.
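For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same calculation. It pairs each proportion with the sample size from the same set of games, and it looks up the p value from a normal distribution, which is an assumption of this sketch rather than something the formula requires. The function name `two_proportion_test` is made up for illustration, and the exact value you get depends on how the inputs are paired and rounded.

```python
import math
from statistics import NormalDist  # standard library, Python 3.8+


def two_proportion_test(p1, n1, p2, n2):
    """Return (statistic, two-sided p value) for the difference p1 - p2.

    Hypothetical helper for illustration. Uses the unpooled formula
    sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2) for the spread of the difference.
    """
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    stat = (p1 - p2) / se
    # Assumption: treat the statistic as approximately normal.
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value


# Post-season OBP with the post-season N, regular-season OBP with its own N.
stat, p_value = two_proportion_test(0.773, 22, 0.445, 659)
print(f"statistic = {stat:.2f}, p = {p_value:.4f}")
```

With only 22 post-season trips to the plate, the normal approximation is rough, so treat the p value as a ballpark figure.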
I was also considering checking just how unusual Colorado's winning streak is, but that's where my knowledge of statistics broke down (maybe we'll learn how to do that next semester). If anybody has comments or corrections on the stats above or can produce other MLB-related math, please post it in the comments.