Understanding Sabermetrics: Finding MLB Wins in Data

This article is an excerpt from the Shortform summary of "Moneyball" by Michael Lewis.

What are MLB sabermetrics? What are the steps to understanding sabermetrics and what kind of statistics are used?

Sabermetrics, the statistical analysis of baseball, is a fairly new discipline pioneered by the writer Bill James. It was made famous by Oakland A’s GM Billy Beane, who adopted it, and by Moneyball, the book that chronicles how the A’s put it to use.

Understanding Sabermetrics: Removing Luck from Baseball Analysis

James begins publishing annual treatises in which he highlights the uselessness of commonly cited baseball statistics like errors and runs batted in (RBIs) and makes the case for a more rigorous, data-driven approach. Another of James’s key insights is that much of the data that managers, scouts, and GMs use to evaluate players’ skill is based on luck and subjective opinion. In publishing his annual baseball abstracts, James attempts to strip as much luck out of baseball analysis as possible.

James singles out the absurdity of the statistic most revered by baseball’s old guard: the RBI. A batter earns an RBI when they get a hit that results in another player scoring a run. By definition, it is a product of luck: a batter has to have the good fortune to be at bat when other players are already on base, and the batter will, of course, have had nothing to do with those teammates getting on base. A batter who hits a single with a runner in scoring position and a batter who hits a single with no one on base have performed the same athletic feat, but only the first can be credited with an RBI.

The Runs Created Formula

James wants to see exactly which plays contribute most to a team’s offensive production. He sets out to construct a model to predict how many runs a team might expect to score, based on their accumulated walks, singles, doubles, stolen bases, and other offense-generating plays. He pores over statistics from previous baseball seasons to see which components of offense are most correlated with runs. This allows him to assign a statistical “weight” to each, in effect determining how many runs (or portions of a run) a walk, hit, or stolen base is worth. Weighting individual plays by their run value comes to be a defining part of sabermetrics.

He distills it down to a “Runs Created” formula:

Runs Created = ((Hits + Walks) × Total Bases) / (At Bats + Walks)

After statistical analysis, James finds that this formula is remarkably effective in predicting a team’s run production. It shows that walks are a valuable (yet overlooked) part of offense, while batting average matters hardly at all.
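As a rough illustration, the formula above can be written as a small Python function. This is only a sketch: the function name and the season totals are hypothetical numbers chosen for the example, not figures from the book.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: ((H + BB) * TB) / (AB + BB)."""
    opportunities = at_bats + walks
    if opportunities == 0:
        return 0.0  # avoid dividing by zero for an empty stat line
    return (hits + walks) * total_bases / opportunities


# Hypothetical team season totals (illustrative only).
baseline = runs_created(hits=1450, walks=600, total_bases=2400, at_bats=5500)

# Same team, but with 100 extra walks and nothing else changed.
more_walks = runs_created(hits=1450, walks=700, total_bases=2400, at_bats=5500)

print(f"baseline estimate:   {baseline:.0f} runs")
print(f"with 100 more walks: {more_walks:.0f} runs")
```

With these made-up totals, the estimate rises from about 807 runs to about 832 when only the walk total increases, which is the kind of effect James argues the traditional statistics miss.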

Understanding Sabermetrics: Undervalued Vs. Overvalued Data

In 1995, as Billy Beane is working his way up the ranks of the Oakland organization, the new owners of the A’s demand that the team drastically reduce payroll. The team must now economize, making sure that every dollar spent on players contributes to on-field success. 

This occurs during an era of skyrocketing player salaries, driven by the rise of free agency, which profoundly changes the economics of professional baseball. Rich teams like the New York Yankees and the Boston Red Sox are able to spend seemingly unlimited sums to acquire the biggest stars on the free agent market. But relatively poor teams like the A’s are forced to take a new approach to player acquisition, looking for players who are undervalued by the market and can be acquired cheaply.

At this time, many of baseball’s most widely used statistics for evaluating players are coming under fire. Critics (including the A’s general manager and the baseball writer Bill James) note, for example, that team batting average is overrated because it overlooks the importance of walks to a team’s total offensive output. The RBI draws the same objection as before: it rewards a batter for the good fortune of coming to the plate when teammates are already on base.

James contrasts this with statistical measures that had been overlooked by professional baseball. On-base percentage, he argues, has a better correlation with run production than batting average, because it accounts for all the ways a player can get on base and contribute to the offense, including walks.

Despite the evidence for on-base percentage, players with high batting averages and low error counts are highly valued, while players with lots of walks and a high on-base percentage (but who aren’t flashy sluggers with high batting averages) are deeply undervalued. Teams continue making poor (and expensive) personnel decisions based on flimsy information, even though better data is available.

The world of professional baseball is hostile to this statistics-based criticism and refuses to take seriously ideas from which it would clearly benefit. But Bill James does have one devoted reader who soon puts his ideas into practice at baseball’s highest level: Billy Beane.
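To make the contrast concrete, here is a minimal sketch comparing the two measures for two invented players. The on-base percentage calculation uses the standard definition, (hits + walks + hit by pitch) / (at bats + walks + hit by pitch + sacrifice flies), which is not spelled out in the text above; both stat lines are hypothetical.

```python
def batting_average(hits, at_bats):
    """Hits per at bat; walks are ignored entirely."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard OBP: times on base divided by plate appearances counted for OBP."""
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return (hits + walks + hit_by_pitch) / chances if chances else 0.0


# Two hypothetical hitters (illustrative numbers only).
free_swinger = dict(hits=170, walks=30, hit_by_pitch=2, at_bats=550, sac_flies=5)
patient_hitter = dict(hits=140, walks=100, hit_by_pitch=5, at_bats=500, sac_flies=5)

for name, p in [("free swinger", free_swinger), ("patient hitter", patient_hitter)]:
    avg = batting_average(p["hits"], p["at_bats"])
    obp = on_base_percentage(**p)
    print(f"{name:15s} AVG {avg:.3f}  OBP {obp:.3f}")
```

In this invented example the free swinger has the higher batting average (.309 against .280), but the patient hitter reaches base far more often (.402 against .344). That is the sort of player a market fixated on batting average would tend to underprice.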

A Value-Driven Philosophy

As a cash-strapped club, the Oakland A’s adopt a practice of developing players on their own through the draft and then trading them or letting them go a few years later as they reach free agency, when such players become too expensive to retain. After the 2001 season, they lose some major stars, through trades and free agency, whose contracts they are no longer able to afford. The need to replace those stars cheaply makes this an ideal moment to lean on sabermetrics.

Despite the team’s lack of financial resources, the A’s are remarkably successful on the field under Billy’s leadership. The 2001 A’s finish 102-60. The 2002 iteration of the team goes on to a 103-59 record, good for first place in the American League West Division and second overall in the league, despite having what appears to be an inferior roster. 

They achieve this through shrewd, value-driven management of their most important asset: the players. No one on the team is treated as irreplaceable. Paul DePodesta, Billy’s assistant, is able to quantify the net runs each player contributes to the team (by adding runs through offense and preventing them through defense). In this framework, every action taken by a player has an expected run value.

Taking this a step further, DePodesta calculates that the team will likely need to win 95 games to make the playoffs. To win 95 games, they will need to have a net run differential of approximately +135. And this makes the team’s task a lot clearer. They need to find a combination of players whose net run production will offset the loss of the departed stars Jason Isringhausen, Johnny Damon, and Jason Giambi, plugging those holes to get the run differential they need.
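The text does not say how DePodesta converts a run differential into a win total. One widely used heuristic for that conversion is Bill James’s Pythagorean expectation, which estimates winning percentage as runs scored squared divided by the sum of runs scored squared and runs allowed squared. The sketch below applies it to a hypothetical team; the run totals of 800 scored and 665 allowed are assumptions picked only because they differ by 135.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from Bill James's Pythagorean expectation.

    win% = RS^e / (RS^e + RA^e), with e = 2 in the classic version.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        return games / 2  # no runs either way: treat as a .500 team
    return games * rs / (rs + ra)


# Hypothetical season totals with a +135 run differential (assumed values).
runs_scored, runs_allowed = 800, 665

expected = pythagorean_wins(runs_scored, runs_allowed)
print(f"run differential: {runs_scored - runs_allowed:+d}")
print(f"expected wins:    {expected:.1f}")
```

Under these assumed totals, the heuristic gives just under 96 expected wins, in the same neighborhood as the 95-win target. The exact figure depends on the overall scoring level, so treat it as a consistency check on the reasoning rather than a reconstruction of the A’s own model.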

Applying sabermetrics depends on statistical analysis and a manager’s ability to make calculated decisions based on data. The A’s brought sabermetrics into the mainstream baseball world, and they were met with criticism despite their successful season.
