It's a standard observation that when a team does poorly, the coach -- or in the case of baseball, the manager -- is fired, even though it wasn't the manager dropping balls, throwing the wrong direction or striking out.

Of course, there are purported examples of team leaders who seem to produce teams better than the sum of their parts. Bill Belichick seems to be one, even modulo the cheating scandals. Cito Gaston is credited with transforming the Blue Jays from a sub-.500 team into a powerhouse not once but twice, his best claim to excellence being this season, in which he took over halfway through the year.

But what is it they do that matters?

Even if one accepts that managers matter, the question remains: how do they matter? They don't actually play the game. Perhaps some give very good pep talks, but one would hope that the world's best players would already be trying their hardest, pep talk or no.

In baseball, one thing the manager controls is the lineup: who plays, and the order in which they bat. While managers have their own different strategies, most lineups follow a basic pattern, the core of which is to put one's best players first.

There are two reasons I can think of for doing this. First, players at the top of the lineup tend to bat more times during a game, so it makes sense to have your best players there. The other reason is to string hits together, so that a hit is more likely to come while runners are already on base.

The downside of this strategy is that innings in which the bottom of the lineup bats tend to be very boring. Wouldn't it make sense to spread out the best hitters so that in any given inning, there is a decent chance of getting some hits?

How can we answer this question?

To answer this question, I put together a simple model. I created a team of four .300 hitters and five .250 hitters. At every at-bat, a player's chance of reaching base was exactly their batting average (a .300 hitter reached base 30% of the time). All hits were singles. Base-runners always moved up two bases on a hit.

I tested two lineups: one with the best players at the top, and one with the best players alternating with the poorer hitters.

This model ignores many issues, such as base-stealing, double plays, walks, etc. It also ignores the obvious fact that you'd rather have your best power hitters bat behind people who get on base, making those home runs count for more. But I think if batting order has a strong effect on team performance, it would still show up in the model.
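The original model was built in Excel (see the comments), so the Python below is only a minimal sketch of a model with the same stated rules, not the spreadsheet itself. The function names are hypothetical, and several details are assumptions the post does not spell out: every game is exactly nine innings, the batting order carries over from one inning to the next, and the alternating lineup starts with a weaker hitter.

```python
import random

GOOD, POOR = 0.300, 0.250  # chance of reaching base, per the post

# Lineups as lists of on-base probabilities, in batting order.
stacked = [GOOD] * 4 + [POOR] * 5
# Assumption: the post does not give the exact alternating order.
alternating = [POOR, GOOD, POOR, GOOD, POOR, GOOD, POOR, GOOD, POOR]
best_last = [POOR] * 5 + [GOOD] * 4


def simulate_game(lineup, rng, innings=9):
    """Return the runs one lineup scores in a single game."""
    runs = 0
    batter = 0  # assumption: the order carries over between innings
    for _ in range(innings):
        outs = 0
        first = second = third = False
        while outs < 3:
            if rng.random() < lineup[batter]:
                # Every hit is a single; runners move up two bases,
                # so runners on second and third both score.
                runs += second + third
                # Runner on first goes to third, batter takes first.
                third, second, first = first, False, True
            else:
                outs += 1
            batter = (batter + 1) % len(lineup)
    return runs


def simulate_season(lineup, rng, games=162):
    """Return the total runs scored over one season."""
    return sum(simulate_game(lineup, rng) for _ in range(games))


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so runs are repeatable
    for name, lineup in [("stacked", stacked),
                         ("alternating", alternating),
                         ("best last", best_last)]:
        totals = [simulate_season(lineup, rng) for _ in range(20)]
        print(name, sum(totals) / len(totals))
```

Because the sketch fills in details the post leaves open, its season averages should not be expected to match the Excel figures exactly.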

Question Answered

I ran the model on each of the lineups for twenty full 162-game seasons. The results surprised me. The lineup with the best players interspersed scored nearly as many runs in the average season (302 1/4) as the lineup with the best players stacked at the top of the order (309 1/2). Some may note that the traditional lineup did score on average about 7 more runs per season, but the difference was not actually statistically significant, meaning that the two lineups were in a statistical tie.
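The post does not say which significance test was used. One common choice for comparing two sets of twenty season totals is Welch's t-test, sketched below with only the standard library. The function name `welch_t` is hypothetical, and the choice of test is an assumption.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom).

    Welch's test compares two means without assuming equal variances.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Variance of each sample mean (sample variance divided by n).
    v_a = statistics.variance(sample_a) / n_a
    v_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df
```

To use it, pass in the twenty season totals for each lineup. With twenty seasons per lineup, an absolute t value of roughly 2 or more corresponds to significance at the usual two-sided 5% level.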

Thus, it doesn't appear that stringing hits together is any better than spacing them out.

One prediction did come true, however. Putting your best hitters at the front of the lineup is better than putting them at the end (291 1/2 runs per season), presumably because the front end of the lineup bats more times in a season. Although the difference was statistically significant, it still amounted to only 1 run every 9 games, which is less than I would have guessed.
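As a quick check on the arithmetic, the season averages quoted in the post can be converted to per-game gaps. The variable and function names here are illustrative only.

```python
GAMES_PER_SEASON = 162

# Average runs per season reported in the post.
top = 309.5      # best hitters at the top of the order
spread = 302.25  # best hitters interspersed
bottom = 291.5   # best hitters at the end of the order


def games_per_extra_run(better, worse, games=GAMES_PER_SEASON):
    """Return how many games it takes, on average, to gain one run."""
    return games / (better - worse)


# Top vs. bottom: an 18-run gap per season, about one run every 9 games.
print(top - bottom, games_per_extra_run(top, bottom))

# Top vs. interspersed: a 7.25-run gap per season,
# roughly one run every 22 games.
print(top - spread, games_per_extra_run(top, spread))
```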

Thus, the decisions a manager makes about the lineup do matter, but perhaps not very much.

Parting thoughts

This was a rather simple model. I'm considering putting together one that does incorporate walks, steals and extra-base hits in time for the World Series in order to pick the best lineup for the Red Sox (still not sure how to handle sacrifice flies or double-plays, though). This brings up an obvious question: do real managers rely on instinct, or do they hire consultants to program models like the one I used here?

In the pre-Billy Beane/Bill James world, I would have said there was no chance they used models. But these days management is getting much more sophisticated.

## 3 comments:

Hey Josh -- baseball teams most definitely do very sophisticated statistical analyses. See http://freakonomics.blogs.nytimes.com/2008/04/01/bill-james-answers-all-your-baseball-questions/, for example. It's pretty interesting.

Also, it's rather odd to use statistics like that for a model... just run it 100,000 times! For a model this simple, you could also just do the algebra to get the true expected values :-)

Hey Tim --

Could have run it 10,000 times, but I wrote it on my home computer without Matlab. Which means I did it in Excel, which is slow!

I figured Bill James would have some interesting answers to these questions, but I didn't get around to looking him up. I was having too much fun writing my own model... something which I realize is probably passé for you, but isn't something I do very often :)

Cuban players are better than American players, aren't they?
