Friday, October 31, 2008

A Lesson on Lineup Optimization, With Help From Everybody's Favorite 16-Bit Baseball Simulator

Lineup optimization is on the minds of many managers, GMs, and coaches at every level of baseball (and really any sport). In baseball, lineup optimization can be reduced to a statistical science, thanks to the ridiculous amount of data provided by 30 teams playing 162 games every year for the last 105 years (staying in the World Series era). I would like to take some time to explain a few of the researchers' findings through the example of one of my favorite video games of all time, Ken Griffey Jr. Presents Major League Baseball for SNES.




START

All data on lineup optimization is taken from The Book, by Tom Tango, Mitchell Lichtman, and Andrew Dolphin.

Today, we'll be looking at the lineup of my favorite team, the Milwaukee Brewers. One of the quirks of Ken Griffey Jr. Presents Major League Baseball is the use of fake names for all the players. Nintendo did not acquire the MLBPA license for this game, and thus could not use any real names (other than Griffey Jr.'s). This will be interesting, because after the lineup is optimized, I will go back and replace the fake names with the real ones so we can compare stats using lineup optimization tools on the web. Rosters are from the 1993 season. Also, note that the Brewers are still in the AL at this time, so they still use the DH.

Let's take a look at the current lineup.

FORMAT:
NAME AVG/HR/RBI "BAT/POW/SPD/DEF"

Bat = Contact rating, Pow = power, SPD = speed, DEF = defense (basically arm)

RF J. DRAKE     .310/9/48   9/4/10/8
CF S. TEMPLAR   .258/8/51   7/4/7/9
3B N. SOLO      .274/7/79   8/4/5/6
LF J. STEED     .267/30/97  7/9/5/6
DH E. PEEL      .249/13/60  5/7/4/3
1B K. GALE      .264/19/70  6/7/5/5
CA J. BOND      .257/7/40   6/4/4/7
SS J. ROCKFORD  .244/3/30   5/4/9/8
2B P. MARLOWE   .238/2/36   5/4/5/6
BENCH
CA G. IRONSIDE  .198/4/25   4/6/5/5
IF S. SLADE     .228/5/36   4/4/5/8
IF T. PRISONER  .269/11/57  7/7/4/6
IF P. COLUMBO   .269/1/33   7/3/5/5
OF R. STEEL     .183/6/29   3/8/4/6
IF P. MAGNUM    .319/0/1    7/2/6/8

So that's what we have to work with here. Clearly not a whole lot: only one 20+ HR hitter, and only four with 10+. There's not a whole lot of team speed either, with only two players rated above an "8" in speed (basically the cutoff needed to beat out any infield hit against a half-decent defense). Now that we have our team, we need to think about what we want in our batting order. Here are some points that we should consider:

1. The best hitters should bat the most times.
2. The best hitters should bat the most often with men on base.
3. The worst hitters should bat the fewest times.

That's the simple way to look at it. Point 2 is often why the best hitter is placed in the #3 spot in the lineup. However, statistical analysis of MLB over multiple seasons shows that this does not give the most leverage to the best hitter. This sounds odd, because you would expect the #3 hitter to often have runners on base in front of him and the chance to drive them in. However, when you realize that even the league leaders in OBP are rarely above .450 and almost never above .500 (except for one Barry Bonds), the chances are actually very high that the #3 hitter will bat with nobody on base and 2 outs in the first inning, a very low leverage situation. This is why it is better to have the best hitter in the #4 spot, where he is either batting with runners on base in front of him or leading off an inning, another relatively high leverage situation. According to The Book, in order to maximize leverage for a lineup:
"Your three best hitters should bat somewhere in the #1, #2, and #4 slots. Your fourth- and fifth-best hitters should occupy the #3 and #5 slots. The #1 and #2 slots will have players with more walks than those in the #4 and #5 slots. From slot #6 through #9, put the players in descending order of quality."
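To put a rough number on the point about the #3 hitter, here is a minimal sketch. It assumes each of the first two hitters either reaches base with probability equal to his OBP or makes an out, and it ignores double plays, caught stealing, and errors. The function name and the .350 OBP are illustrative choices of mine, not figures from The Book; the .450 is the league-leader ceiling mentioned above.

```python
def p_two_out_nobody_on(obp_first, obp_second):
    """Chance the #3 hitter bats in the 1st inning with two outs and the
    bases empty, under the simplifying assumptions described above."""
    return (1 - obp_first) * (1 - obp_second)


# Two ordinary table-setters (assumed .350 OBP each).
print(round(p_two_out_nobody_on(0.350, 0.350), 4))
# Two league-leader types at .450 OBP each.
print(round(p_two_out_nobody_on(0.450, 0.450), 4))
```

Even with two elite on-base guys at the top, this simple model has the #3 hitter coming up with two outs and nobody on roughly 30% of the time in the first inning.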

OK, so now we have to decide how we will measure hitter quality. Simply using BAT + POW + (SPD/5) would give a simple measurement (speed is relatively indeterminate in a hitter's contribution but should not be ignored), but through personal experience in this game I know that a BAT 5 POW 5 player is much better than a BAT 9 POW 1 player. This makes sense in the real world of baseball, given that players like Adam Dunn and Jack Cust are far more productive than guys like, say, David Eckstein, who are known as "contact" hitters but can barely hit the ball out of the infield. However, in this game contact and power are somewhat related, so we can't let POW completely outweigh BAT. I suggest:

BAT + 1.33POW + .2SPD + .5DEF

Defense is also quite important, especially in a video game in which the players are much smaller than they are in real life relative to the field and thus have to cover much more ground.

So using this equation, we get the following values:

   NAME         HITTING  TOTAL
RF J. DRAKE     16.32    20.32
CF S. TEMPLAR   13.72    18.22
3B N. SOLO      14.32    17.32
LF J. STEED     19.97    22.97
DH E. PEEL      15.11    16.61
1B K. GALE      16.31    18.81
CA J. BOND      12.12    15.62
SS J. ROCKFORD  12.12    16.12
2B P. MARLOWE   11.32    14.32
BENCH
CA G. IRONSIDE  12.98    15.48
IF S. SLADE     10.32    14.32
IF T. PRISONER  17.11    20.11
IF P. COLUMBO   11.99    14.49
OF R. STEEL     14.44    17.44
IF P. MAGNUM    10.86    14.86
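The two values listed for each player can be reproduced with a short script. This is only a sketch of the formula as I've written it: the first number leaves DEF out, the second includes it. The function and variable names are my own, and the ratings are copied from the roster listing earlier in the post.

```python
# (BAT, POW, SPD, DEF) ratings copied from the roster listing.
RATINGS = {
    "J. DRAKE": (9, 4, 10, 8),
    "S. TEMPLAR": (7, 4, 7, 9),
    "N. SOLO": (8, 4, 5, 6),
    "J. STEED": (7, 9, 5, 6),
    "E. PEEL": (5, 7, 4, 3),
    "K. GALE": (6, 7, 5, 5),
    "J. BOND": (6, 4, 4, 7),
    "J. ROCKFORD": (5, 4, 9, 8),
    "P. MARLOWE": (5, 4, 5, 6),
    "G. IRONSIDE": (4, 6, 5, 5),
    "S. SLADE": (4, 4, 5, 8),
    "T. PRISONER": (7, 7, 4, 6),
    "P. COLUMBO": (7, 3, 5, 5),
    "R. STEEL": (3, 8, 4, 6),
    "P. MAGNUM": (7, 2, 6, 8),
}


def hitting_value(bat, pow_, spd):
    """BAT + 1.33*POW + 0.2*SPD (no defense)."""
    return bat + 1.33 * pow_ + 0.2 * spd


def total_value(bat, pow_, spd, def_):
    """Hitting value plus 0.5*DEF."""
    return hitting_value(bat, pow_, spd) + 0.5 * def_


for name, (bat, pow_, spd, def_) in RATINGS.items():
    hit = hitting_value(bat, pow_, spd)
    tot = total_value(bat, pow_, spd, def_)
    print(f"{name:<12} {hit:6.2f} {tot:6.2f}")
```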


Now, let's make a lineup. The first column is pure hitting value. In the second column, defensive value is included. Because the defense rating applies regardless of position in this game, we are free to simply take the top 9 total values. This gives us the following 9:
LF J. STEED     19.97    22.97
RF J. DRAKE     16.32    20.32
IF T. PRISONER  17.11    20.11
1B K. GALE      16.31    18.81
CF S. TEMPLAR   13.72    18.22
OF R. STEEL     14.44    17.44
3B N. SOLO      14.32    17.32
DH E. PEEL      15.11    16.61
SS J. ROCKFORD  12.12    16.12

First, we need our top 3 batters by hitting value. They are Steed (19.97), Prisoner (17.11), and Drake (16.32). In order to decide who goes at 1, 2, and 4, let's look at the value of the HR at each position. The average run value of the HR at the 1, 2, and 4 spots, respectively, is 1.291, 1.349, and 1.436. Therefore, power among these 3 hitters should increase as we move down the order. That means that Drake will lead off (4 POW), Prisoner will bat 2nd (7 POW), and Steed will bat cleanup (9 POW).
Now we need the next two best hitters. They are Gale and Peel. Because these two hitters have equal POW numbers, the better BAT player will go in the 5 spot, because the average run values of the single, double, and triple are slightly higher there. So Gale will bat 5th (6 BAT) and Peel will bat 3rd (5 BAT). After that, we merely go in descending order of hitting value. So #6 is Steel, #7 is Solo, #8 is Templar, and #9 is Rockford. Here is the lineup, with the real player each name corresponds to and his OPS+ for the '93 season.

1. J. DRAKE = Darryl Hamilton 109
2. T. PRISONER = Kevin Seitzer 119
3. E. PEEL = Kevin Reimer 87
4. J. STEED = Greg Vaughn 128
5. K. GALE = John Jaha 103
6. R. STEEL = Tom Brunansky 58
7. N. SOLO = B.J. Surhoff 91
8. S. TEMPLAR = Robin Yount 90
9. J. ROCKFORD = Pat Listach 73
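The slotting procedure used to get this order can be sketched in a few lines. This is my own simplification of the rule from The Book, not a general optimizer: the top three hitters fill slots 1, 2, and 4 in ascending POW, the next two fill slots 3 and 5 with the better BAT hitter 5th (which only makes sense here because their POW is equal), and the rest go in descending hitting value. The names `STARTERS` and `build_lineup` are hypothetical.

```python
# (BAT, POW, SPD) for the nine selected starters, from the roster listing.
STARTERS = {
    "J. STEED": (7, 9, 5),
    "J. DRAKE": (9, 4, 10),
    "T. PRISONER": (7, 7, 4),
    "K. GALE": (6, 7, 5),
    "S. TEMPLAR": (7, 4, 7),
    "R. STEEL": (3, 8, 4),
    "N. SOLO": (8, 4, 5),
    "E. PEEL": (5, 7, 4),
    "J. ROCKFORD": (5, 4, 9),
}


def hitting_value(bat, pow_, spd):
    """BAT + 1.33*POW + 0.2*SPD, the hitting-only score from the post."""
    return bat + 1.33 * pow_ + 0.2 * spd


def build_lineup(players):
    """Return the nine names in batting order, slots 1 through 9."""
    ranked = sorted(players, key=lambda n: hitting_value(*players[n]), reverse=True)
    top3, next2, rest = ranked[:3], ranked[3:5], ranked[5:]
    # Slots 1, 2, 4: ascending power, since the HR gains value down the order.
    first, second, fourth = sorted(top3, key=lambda n: players[n][1])
    # Slots 3 and 5: the better contact hitter bats 5th.
    third, fifth = sorted(next2, key=lambda n: players[n][0])
    # Slots 6-9: whoever is left, best hitter first.
    return [first, second, third, fourth, fifth] + rest


for slot, name in enumerate(build_lineup(STARTERS), start=1):
    print(f"{slot}. {name}")
```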

It would appear that I weighted POW slightly high, but this still worked pretty well overall. Since Brunansky appears to be an outlier, let's replace him with the next highest total value on the list, which would be J. BOND, or Dave Nilsson, who had a 93 OPS+. His hitting value is tied with Rockford's for the lowest of the nine, so he'll bat 9th and everybody who was below Steel will move up a spot. So our final lineup is:

1. J. DRAKE = Darryl Hamilton 109
2. T. PRISONER = Kevin Seitzer 119
3. E. PEEL = Kevin Reimer 87
4. J. STEED = Greg Vaughn 128
5. K. GALE = John Jaha 103
6. N. SOLO = B.J. Surhoff 91
7. S. TEMPLAR = Robin Yount 90
8. J. ROCKFORD = Pat Listach 73
9. J. BOND = Dave Nilsson 93

Finally, let's take a look at the expected runs/game for both the original lineup and the final lineup, using the lineup analysis tool at http://www.baseballmusings.com/cgi-bin/LineupAnalysis.py

Original lineup: 4.436 runs/game
Final lineup: 4.632 runs/game

So we've added .196 runs/game, which over the course of a 162-game season adds 31.752 runs. At 10 runs/win, that adds about 3 wins to our total! That could very well be the difference between making or missing the playoffs, so we should be very happy with our results!
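For anyone who wants to check the arithmetic, here it is as a tiny script. The runs/game figures are the ones reported by the lineup tool above; 10 runs per win is the rule of thumb used in this post, and the function name is just my own.

```python
ORIGINAL_RPG = 4.436  # original lineup, runs/game
FINAL_RPG = 4.632     # final lineup, runs/game
GAMES = 162
RUNS_PER_WIN = 10     # rule-of-thumb conversion used in the post


def added_wins(before, after, games=GAMES, runs_per_win=RUNS_PER_WIN):
    """Extra wins over a season from a change in runs/game."""
    return (after - before) * games / runs_per_win


extra_runs = (FINAL_RPG - ORIGINAL_RPG) * GAMES
print(round(extra_runs, 3))                              # season runs added
print(round(added_wins(ORIGINAL_RPG, FINAL_RPG), 2))     # season wins added
```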

Now wasn't that fun?

Friday, October 24, 2008

Addendum: The people on Jim Rome's show also have no idea what they're talking about.

Mike Sando (or whatever his name is) needs to shut up. Long swings don't hit home runs. Did any of these guys actually play the game? Have they ever really watched a game? Hitting the ball to the right side to score a run either reduces win probability or is a wash. Your goal, as a hitter, with a runner on third base is not the ground ball to the right side that scores a run. The goal is a hit. That runner from third, with less than two outs, is going to score a high enough percentage of the time that the groundout, in most situations, either does not affect win probability or reduces it.

Check out the game log from last night.
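Pitcher: B Myers | Batter: C Pena | Inn: 1 | Outs: 0 | Bases: _23 | Score: 1-0
Carlos Pena grounded out to shortstop (Grounder). Akinori Iwamura scored. B.J. Upton advanced to 3B.
LI: 1.38 | RE: 1.99 | WE: 68.9% | WPA: .005 | RE24: -0.05

Pitcher: B Myers | Batter: E Longoria | Inn: 1 | Outs: 1 | Bases: __3 | Score: 2-0
Evan Longoria grounded out to shortstop (Grounder). B.J. Upton scored.
LI: 1.21 | RE: 0.94 | WE: 70.9% | WPA: .021 | RE24: 0.16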

Those are the two plays from the first inning in which the Rays scored on groundouts. Total WPA is .026. This means that these two plays, combined, added 2.6% to the Rays' win probability. Really, the play that actually mattered this inning was this one:
Pitcher: B Myers | Batter: B Upton | Inn: 1 | Outs: 0 | Bases: 1__ | Score: 0-0
B.J. Upton singled to right (Liner). Akinori Iwamura advanced to 3B on error. B.J. Upton advanced to 2B. Error by Jayson Werth.
LI: 1.42 | RE: 0.88 | WE: 68.4% | WPA: .101 | RE24: 1.10

This play was essentially a double, although it was poorly played by Werth in RF. For all intents and purposes of win probability, it was a double. The play had a WPA of .101, meaning it added 10.1% to the Rays' chances of winning. Basically, B.J. Upton did all the work that let those undesirable groundouts behind him have slightly positive results. "Big Ball" will always beat "Small Ball," and don't let people like Rome and Sando make you think otherwise.
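Just to line the numbers up side by side, here is a quick comparison of the WPA values quoted from the game log. The variable names are mine; the figures are straight from the log entries above.

```python
# WPA values taken from the first-inning game log entries.
groundout_wpa = [0.005, 0.021]  # Pena and Longoria run-scoring groundouts
upton_wpa = 0.101               # Upton's single plus the Werth error

total_groundout_wpa = sum(groundout_wpa)
ratio = upton_wpa / total_groundout_wpa

print(round(total_groundout_wpa, 3))  # combined WPA of the two groundouts
print(round(ratio, 1))                # how many times larger Upton's play was
```

Upton's one swing was worth nearly four times as much win probability as the two "productive" groundouts put together.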



Also, people are making a big deal about this play:

Pitcher: B Myers | Batter: J Bartlett | Inn: 4 | Outs: 1 | Bases: 1_3 | Score: 4-0
Jason Bartlett sacrificed to pitcher (Bunt Grounder). Cliff Floyd scored. Rocco Baldelli advanced to 2B.
LI: 0.87 | RE: 1.19 | WE: 91.1% | WPA: .016 | RE24: 0.14

Again, WPA is only .016. This play made the Rays 1.6% more likely to win this game. The only reason that this is a marginally good play, in my eyes, is that Jason Bartlett can't hit his way out of a wet paper bag. The thing is, with a 3 run lead, adding one extra run doesn't make a big difference. You want to pile on the runs and create a legitimate rally, and while the bunt scores the run, it's a complete rally killer, whereas letting Bartlett swing away has a very high chance of the run scoring anyway, whether it's on his at bat or the at bats following his.

Once again, Jim Rome has no idea what he's talking about.

Don't get me wrong, I'm a big fan of Jim Rome. He's funny, and occasionally smart, and almost always entertaining. But this time, he's just plain wrong.

Basically, Rome summed up Game 2 of the World Series by saying that the Rays are "built to do it to you with pitching, speed, and defense," and more to the point, that "they don't beat you with the long ball."

Why don't you ask the White Sox and the Red Sox whether or not the Rays beat you with the long ball? Of course, just looking at those 11 playoff games in which the Rays hit 22 home runs (2 HR/g) gives us a ridiculously small sample size. Sure, the Rays' 180 team homers aren't quite as great as the 235 put up by the White Sox or even the 214 put up by their World Series opponent, the Phillies, but that still amounts to about 1.11 HR/g and ranked 4th in the American League.

No team in Major League Baseball can sustain winning ways without consistent power production. It simply can't happen. The Angels found that out the hard way when their ridiculous, unsustainable clutch hitting over the regular season fizzled against the Red Sox. The Rays will not win this series if their bats don't show up and show up soon, because against a team with as much power hitting as the Phillies, they will need to be able to score runs in bunches. If Joe Maddon decides to play "small ball" as sportscasters such as Jim Rome seem to suggest, they will be running themselves into the ground by not giving their power hitters the chances that they need.

Wednesday, October 15, 2008

Inside the mind of a statistically-inclined baseball aficionado.

I love baseball. I'm never happier than when I'm around a baseball field. I've played organized baseball since I was 5 and was probably introduced to the game much earlier than that. I've never been the most talented player on any team I've played on, but I've never let that dampen my interest in the game. I currently play for the University of Wisconsin Club Baseball B team, and I play first base and right field.

I follow Major League Baseball intensely, and I am a big fan of the Milwaukee Brewers, but even when they're not playing, I always find something in the game to hold my interest. I now enjoy playing Fantasy Baseball, as well as watching any MLB game that happens to be on the TV. When I first got into Fantasy Baseball, 5 years ago, I was introduced to such crazy stats as "OPS" and "WHIP." At first, I was skeptical. I first thought that these stats were de-humanizing the game and that the game was random enough and based enough on things like "heart" that those stats couldn't possibly mean any more than your typical stats like batting average, and that the most valuable player was simply the one who hit the most home runs or had the highest batting average.

However, after aging a bit and learning much, much more about statistics and mathematics, I've come to understand and love these statistics. If you're a skeptic like I used to be, I have two resources that I heartily recommend. First, check out the following website: www.fangraphs.com - this website has projections for every player and also a very cool scoreboard feature which shows the win probability of each team based on run expectancies and other crazy things. To learn more about that, check out my second suggested reading: The Book. To get a sneak peek before you invest, check out www.insidethebook.com - this book uses intense mathematical and statistical analysis to show which strategies and outcomes are better, and shows the best way for a GM or manager to optimize his team.

Expect a lot of posts here about my own personal baseball experiences, both as a coach and a player, as well as about MLB and the MLB media. I can't wait to tear the people offering the MVP award to Ryan Howard a new one.