Challenging WAR and Other Statistics as Era-Adjustment Tools

This article is a casual version of my paper “Challenging Nostalgia and Performance Metrics in Baseball,” published in Chance, which showed, among other things, that wins above replacement (WAR) and the wide class of “versus your peers” statistics are incapable of accurately comparing players across eras. In particular, it showed that WAR exhibits a very strong bias toward baseball players who played in earlier seasons. A collection of resources and an interactive web app within this framework can be viewed here.

How We Came To This Conclusion

In our research, we split baseball data into two time periods and show that WAR disproportionately includes players from the older era in its all-time rankings. Specifically, the older group is defined by players who started their careers in 1950 or before, and the newer group is defined by players who started their careers after 1950. The split date of 1950 corresponds to the US Census that is closest to the integration of baseball in 1947. Prior to 1947, Major League Baseball was a largely all-white, segregated sports league, but it has since slowly but surely integrated in America and has steadily risen in popularity abroad. All the while, the world's population has continued to grow. Simply put, there are far more people in the baseball-eligible talent pool post-1950 than before.

We find that roughly 20% of the “realistic historic talent pool” belongs to the pre-1950 group. By “realistic historic talent pool” we mean the cumulative population of men ages 20-29 collected every 10 years arising from baseball playing countries (men ages 20-29 serve as a proxy for a concept of talent pool that is otherwise not well-defined). Before 1950, this population is basically just white American men. After 1950, this population includes all American men, as well as men from a plethora of baseball-playing countries. Read the rest of this entry »

A Rule Change Idea Too Fun for MLB

If you’re reading this, you are surely a baseball fan, and as such, you’re probably aware that Major League Baseball is putting lots of options on the table when it comes to rule changes to shake up the game and make it more interesting. We’ve already seen the intentional walk become automatic and the limiting of mound visits. MLB also reached an agreement with the Atlantic League to experiment with some other ideas, such as robot umpires, a three-batter minimum for pitchers, starting extra innings with runners on base, moving the mound back, and banning the shift. Some of these ended up being adopted in the majors on a temporary basis for the pandemic-shortened 2020 season and may end up getting implemented more permanently, depending on how the upcoming CBA negotiations go.

But I have an idea that I think is better than any of these. It’s a small rule change, but it would radically change the game. Too radical even for this change-happy commissioner, I think. And here it is:

On a ball in play, a runner who reaches home can decide to continue on to first base and keep running.

Now, before I explain why I find this rule change so appealing, let me first get the logistics out of the way. How could it be determined if a player has decided to go to first or not? For this part, it would have to operate the same as a batter running to first base. (To be clear, I don’t think it should be a force play at first base, though it would still be fun if it were.) Read the rest of this entry »

RE+: Factoring Player & Team Hitting Ability Into Run Expectancy and the True Value of a Stolen Base

There are 24 different “states” in baseball. The three bases can be filled in eight different ways, and there can be 0, 1, or 2 outs at any given moment. Each of these 24 base-out states has an expected run value associated with it. Each value represents the average number of runs that the team is expected to score by the end of the inning. These values change each season depending on the run environment, but they generally don’t vary much.

2019 Average Run Expectancy by State
STATE 0 outs 1 out 2 outs
000 0.53 0.29 0.11
100 0.94 0.56 0.24
010 1.17 0.72 0.33
001 1.43 1.00 0.38
110 1.55 1.00 0.46
101 1.80 1.23 0.54
011 2.04 1.42 0.60
111 2.32 1.63 0.77
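As a rough illustration of how the table above can be used, the sketch below stores the 2019 values in a dictionary and computes the league-average break-even success rate for a steal of second with only a runner on first. The function names and the simplifying assumption (a caught stealing adds one out and empties the bases) are mine and are not part of the RE+ method itself.

```python
# 2019 average run expectancy from the table above.
# Key: base state as a string of 1st/2nd/3rd occupancy ("100" = runner on first).
# Value: expected runs with 0, 1, and 2 outs.
RUN_EXPECTANCY = {
    "000": (0.53, 0.29, 0.11),
    "100": (0.94, 0.56, 0.24),
    "010": (1.17, 0.72, 0.33),
    "001": (1.43, 1.00, 0.38),
    "110": (1.55, 1.00, 0.46),
    "101": (1.80, 1.23, 0.54),
    "011": (2.04, 1.42, 0.60),
    "111": (2.32, 1.63, 0.77),
}


def run_expectancy(bases, outs):
    """Expected runs for the rest of the inning; a third out ends the inning."""
    if outs >= 3:
        return 0.0
    return RUN_EXPECTANCY[bases][outs]


def break_even_steal_of_second(outs):
    """Success rate at which stealing second from "100" neither adds nor costs runs."""
    stay = run_expectancy("100", outs)        # runner holds at first
    success = run_expectancy("010", outs)     # runner reaches second safely
    caught = run_expectancy("000", outs + 1)  # runner is thrown out
    # Solve p * success + (1 - p) * caught = stay for p.
    return (stay - caught) / (success - caught)


for outs in range(3):
    print(f"{outs} outs: break-even success rate = {break_even_steal_of_second(outs):.3f}")
```

With these league averages the break-even rate lands between roughly 73% and 74% in all three out states. A hitter-specific version would swap in different values for the same lookups.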

Consider the following situation: Lorenzo Cain is on first base with two outs. Now consider two possible hitters, one being Christian Yelich and the other being Ryan Braun. According to the 2019 averages, the run expectancy in this base-out state was 0.24, regardless of the hitter. While both players had impressive seasons, Yelich is unquestionably the superior player at this point in time.

2019 Player Comparison
Player wOBA ISO
Ryan Braun .354 .220
Christian Yelich .442 .342

As a result of their differences, the run expectancy should be higher when Yelich is at the plate. Consequently, the benefit Milwaukee gets from Cain attempting to steal second base should be adjusted as well. Why is this the case? Given Braun’s inferior power and hitting ability, there is more to gain from Cain putting himself in scoring position, but more importantly, there is less to lose if he were to get caught. On the other hand, Yelich is much more likely to drive the ball. With Yelich at the plate, the increase in run expectancy from a stolen base is slightly smaller than if Braun were hitting. However, the decrease in run expectancy from being caught is significantly greater. This is why we need RE+. Read the rest of this entry »

An Extra Inning Runner Study

The 2020 season brought unprecedented rule changes, one of the most puzzling among them being the “extra inning runner.” Ostensibly in an effort to reduce the spread of COVID-19 and speed up play, commissioner Rob Manfred decreed that once a game progresses past the ninth inning, a runner would be placed at second base to begin the frame.

Manfred’s blatantly obvious motives turned baseball fans — a demographic notorious for their acceptance of changes to the national pastime — against it. If there is any defense to be made for the addition of the extraneous runner, it’s that shorter games helped save pitchers’ arms in what’s already been an utterly brutal season for pitcher injuries.

This seismic rule change also created a correspondingly large shift in how teams strategized after a game surpassed nine innings. Teams, even the more sabermetrically inclined among them, began to employ traditional tactics. In order to determine how clubs played with a free runner, I charted every extra inning of the 2020 season. Read the rest of this entry »

Julio Teheran Might Need to Re-Invent Himself

Julio Teheran’s career has been one largely defined by consistency. Over his seven full seasons in Atlanta (2013-19), he never made fewer than 30 starts or threw fewer than 174 innings, with ERAs between 2.89 and 4.49. Arguably the most defining element of his reliability was how he consistently out-performed his peripheral numbers. In each of those seasons, Teheran considerably out-pitched both his FIP and xFIP, often by close to a full run.

Julio Teheran Previous Full Seasons
Season ERA FIP xFIP
2013 3.20 3.69 3.76
2014 2.89 3.49 3.72
2015 4.04 4.40 4.19
2016 3.21 3.69 4.13
2017 4.49 4.95 4.96
2018 3.94 4.83 4.72
2019 3.81 4.66 5.26
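To make the size of that gap concrete, the short sketch below uses only the numbers in the table above and computes how far his ERA sat below each estimator in every season. The columns are taken to be season, ERA, FIP, and xFIP, which is an assumption based on the order in which the 2020 figures are quoted; the variable and function names are illustrative.

```python
# (season, ERA, FIP, xFIP) for Teheran's full seasons in Atlanta, from the table above.
SEASONS = [
    (2013, 3.20, 3.69, 3.76),
    (2014, 2.89, 3.49, 3.72),
    (2015, 4.04, 4.40, 4.19),
    (2016, 3.21, 3.69, 4.13),
    (2017, 4.49, 4.95, 4.96),
    (2018, 3.94, 4.83, 4.72),
    (2019, 3.81, 4.66, 5.26),
]


def gaps(era, fip, xfip):
    """Return (FIP - ERA, xFIP - ERA); a positive gap means ERA beat the estimator."""
    return round(fip - era, 2), round(xfip - era, 2)


for season, era, fip, xfip in SEASONS:
    fip_gap, xfip_gap = gaps(era, fip, xfip)
    print(f"{season}: FIP - ERA = {fip_gap:+.2f}, xFIP - ERA = {xfip_gap:+.2f}")
```

Every gap is positive, and the largest, xFIP minus ERA in 2019, is 1.45 runs.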

On the surface, it would appear that Teheran was already declining significantly over the previous three seasons, even if his ERAs failed to reflect such a story. As with many veteran starters, the easy explanation for such a decline would be diminished stuff, but his 22.4% strikeout rate in 2018 was the best of his career to date, with his 21.5% in 2019 not far behind. Teheran’s decline in Atlanta was instead marked predominantly by a notable loss of control: his walk rate jumped from 5.4% in 2016 to 8.9% in 2017, then to 11.6% and 11.0% in 2018 and 2019, respectively. Teheran’s streak of consistent results came to a screeching halt in 2020 to the tune of a 10.05 ERA, 8.62 FIP, and 6.35 xFIP, a recipe that culminated in -0.9 fWAR.

But the most worrying sign for Teheran is that this is not a continuation of the previous problem. His walk rate for 2020 was 10.7%, still worse than his career average, but a slight improvement on the previous two seasons. What’s particularly alarming is that his strikeout rate plummeted to just 13.4%, while no pitcher in baseball with more than 100 batters faced had a whiff rate lower than Teheran’s 14.6%. Teheran was hit only slightly harder than before; while his 38.7% hard-hit rate was notably higher than 2018’s 36.7% and 2019’s 35.4%, his average exit velocity allowed was only slightly worse than league average at 89.0 mph, identical to his 2019 season. When paired with his inability to miss bats, though, this high volume of hard contact led to disastrous results. Read the rest of this entry »

Mike Port on Umpiring, Rule Changes, and Analytics

Mike Port’s professional baseball career spanned more than four decades, from 1969 to 2011. When his aspirations to play in the big leagues ended with an injury shortly after signing with San Diego, he accepted a position in the Padres’ minor league system and worked his way up to the role of farm director. In 1977, he began a 14-year stint with the California Angels, during which he was promoted to general manager in September 1984. Following 18 months as the first president of the Arizona Fall League, Mr. Port migrated to the East Coast to begin a 12-year run with the Boston Red Sox as an assistant GM and held the acting general manager title during the 2002 season.

He was named Major League Baseball’s Vice President of Umpiring in August 2005 and remained in that position through the 2011 campaign. I conducted a telephone interview with Mr. Port in September of 2020 in which we discussed the general manager’s role and responsibilities (to be included in my upcoming book, Hardball Architects: Volume 2). Our chat drifted into topics such as umpiring, instant replay, and various rule changes that have been implemented in the past decade.


DB: A number of rule changes were implemented for the 2020 season – the three-batter minimum for pitchers, seven-inning double-headers, extra innings starting with a runner on second base, designated hitter in the National League. It remains to be seen which rules will stay on the books.

MP: I was told in early September that it’s “under consideration” for Major League Baseball to make all games seven innings. Certainly they’re going to forego a lot of concession revenue. As one former pitching specialist told me, “They’re playing these seven-inning double headers. Well, that’s still fourteen innings in one day. So, you’re getting the games in, but is it at some expense to the people on your staff?” Read the rest of this entry »

Introducing Probabilistic Pitch Scores and xWhiff Metrics

With the advent of the Statcast era, a lot of research has been done in attempts to measure the effectiveness of a particular pitch based on its flight characteristics. As has been noted in the past, quantifying a pitcher’s stuff and command is no easy task. However, over the past few months I have worked to build my own models in an attempt to evaluate the “filth” of any given pitch, taking more of a probability-based approach. I introduce to you my Probabilistic Pitch Scores and xWhiff metrics.

When evaluating the quality of a particular pitch, I focused my interest on three different binary outcome variables: whether or not the batter swung at a pitch, whether or not the batter whiffed on a pitch, and whether or not a pitch was thrown for a strike. Thus, my goal was to train three different types of classification models corresponding to each of these variables: a swing, a miss, and a called strike. For the actual outcomes of these models, I was less interested in the model’s decision and more interested in the predicted probability. For example, if a batter swings on a pitch with given flight characteristics, what is the probability that he will whiff? These probabilities were utilized as the basis of my metrics.
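The difference between a model's decision and its predicted probability can be sketched with a tiny logistic regression. Everything here is an assumption made for illustration: the single feature (`movement_in`, inches of movement), the eight made-up training pitches, and the choice of logistic regression itself. The excerpt does not say which model types or features were actually used.

```python
import math

# Made-up toy data (NOT real Statcast data): one hypothetical feature per swing,
# and whether the batter whiffed (1) or made contact (0).
movement_in = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
whiffed = [0, 0, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(whiff) = sigmoid(w * standardized_x + b) by batch gradient descent."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    zs = [(x - mean) / std for x in xs]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * z + b) - y for z, y in zip(zs, ys)]
        w -= lr * sum(e * z for e, z in zip(errors, zs)) / len(zs)
        b -= lr * sum(errors) / len(zs)
    return w, b, mean, std


def whiff_probability(model, x):
    """Predicted probability of a whiff for a pitch with feature value x."""
    w, b, mean, std = model
    return sigmoid(w * (x - mean) / std + b)


model = fit_logistic(movement_in, whiffed)
for x in (4.0, 9.0, 15.0):
    p = whiff_probability(model, x)
    decision = "whiff" if p >= 0.5 else "contact"
    # The hard decision discards information that the probability keeps.
    print(f"movement = {x:4.1f} in, P(whiff) = {p:.2f}, decision = {decision}")
```

Two pitches can receive the same decision while carrying quite different probabilities, which is why a probability is the more useful building block for a pitch-level score.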

Read the rest of this entry »

Leverage and Pitcher Quality Through the Eyes of Managers

Much criticism has been leveled at baseball managers and their inability to see past the archetypal dominant closer who pitches only in save situations. Writers in the statistical community have observed and critiqued the many flaws of the save statistic, and of how it’s perceived by fans, managers, and baseball decision-makers, since at least 2008 [1]. Accumulating saves is a function of opportunity and degree of difficulty, and it is certainly not the best way to get at a relief pitcher’s ability to get outs. More objective measures such as ERA and its estimators, like Fielding Independent Pitching (FIP) and Skill-Interactive Earned Run Average (SIERA), are better ways to evaluate a pitcher’s talent, and Win Probability Added (WPA) is better for measuring a pitcher’s importance to winning specific games. This criticism has definitely been heard in the intervening years by the people running ball clubs, which can be shown by the number of pitchers who are getting saves on each team and the variance of save totals for a given team.

A team with high variance in its save totals has one player who accumulates a lot of saves and some number who have very few, whereas lower variance represents a more even distribution of saves among pitchers. This variance metric is strongly negatively correlated (-0.74) with the number of pitchers who record a save for a team in a given season. This means that the more pitchers who record a save on a team, the more likely the distribution is to be equitable and the weaker the insistence on using the best pitcher only in save situations. Based on this analysis, the capital “C” Closer peaked in the majors somewhere between 2008 and 2011. A rather precipitous drop occurred in 2016, and the trend has continued downward to the point where last year saw the most equitable distribution of saves among teams since 1987, excluding the strike-shortened 1994 campaign. Read the rest of this entry »
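The variance measure described above can be illustrated with two hypothetical bullpens. The save totals below are invented for the example and do not belong to any real team, and the use of population variance is an assumption, since the excerpt does not say whether population or sample variance was computed.

```python
from statistics import pvariance

# Hypothetical save totals per pitcher; both bullpens total 45 saves.
closer_centric = [40, 3, 2]      # one dominant closer, a few stray saves
committee = [12, 11, 10, 8, 4]   # saves spread across the bullpen

for name, saves in (("closer-centric", closer_centric), ("committee", committee)):
    # Fewer pitchers sharing the same save total tends to mean a higher variance.
    print(f"{name}: {len(saves)} pitchers with a save, variance = {pvariance(saves):.1f}")
```

The closer-centric bullpen has far higher variance with fewer pitchers recording a save, which is the direction of the negative correlation reported above.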

Did Sinkers Make the Comeback That Was Promised?

Back in mid-February — a truly different time for all of us — I applied to work at Statcast as an intern, a position that was unfortunately canceled in mid-March. However, I wrote an answer to a question that I intended to turn into an article at some point during the season. At the time, I figured I might write it in May, but a delayed season meant a delay to my piece as well.

Who would like to consider the curious case of Alex Presley? His career was relatively muted; he played for five teams in eight seasons as a fourth outfielder, only once cracking 100 games played and only once posting an OPS over .800. His Baseball-Reference page has him sporting a White Sox hat despite never having played a regular-season game for the team. He had a .620 OPS across 55 minor league games in 2018 and was released before he could make his way into July. He has the ignoble mark of having the second-lowest career WAR of any MLB player born in Monroe, Louisiana, finishing sixth out of seven players (Chuck Finley laps the field, and Presley finished above only Wayne Cage’s 0.1 career WAR). But Alex Presley was never supposed to be a star; rather, he was the now-forgotten harbinger of the launch angle revolution.

In the past several seasons, launch angle has been all the rage. The ability to capture new data has made an impact on the scene, due in large part to fascinating statistics such as pitch movement, exit velocity, launch angle, and improved defensive metrics like OAA. During that time, teams have been changing their launch angle drastically as well. In 2015, the earliest year with Statcast data, the measured average launch angle for all of baseball was 10.1 degrees. By 2019, it was 12.2.

In 2015, Pirates pitchers, the leaders of the sinkerball revolution, allowed an average launch angle of only 6.9 degrees. Since then, only three team seasons have come in below 7.0 (the Rockies twice and the Cardinals once), and in 2018 and 2019, only three were under 10.0 and none were under 9.0. Additionally, even though the Pirates led the league in Barrel% by over a percentage point in 2015, the top teams in Barrel% in 2019 were much closer to a league-average launch angle. The data makes it clear: Launch angle is certainly going up. Read the rest of this entry »

Controlling Launch Angle To Limit Damage

Successful pitchers limit damage by minimizing the quality of contact they allow. How they can best do that remains up for debate, as pitchers tend to focus on some combination of deception, movement, and location to try and miss barrels. I propose that the most important pitcher-influenced variable to quality of contact is Launch Angle, and understanding and influencing it ought to be a priority for all pitchers. It is clear that Exit Velocity is the single most important predictor of a batter’s success, but that relationship cannot be manipulated much, if at all, by any pitcher. Across baseball, batters’ Exit Velocity distributions are much tighter than their Launch Angle distributions. This means pitchers are likely better able to directly influence Launch Angle than Exit Velocity, which is quite “sticky” around the mean for a given hitter. No amount of talent on the mound can rob Giancarlo Stanton of the strength that produces 120+ mph homers, but that doesn’t mean his production cannot be neutralized. Alex Chamberlain of RotoGraphs recently explored this idea at great length, coming to much the same conclusion.

This, to me, demands a new pitching approach centered around what I call “Launch Angle Deflection,” or the attempt to induce weak contact and get outs by “deflecting” batted balls to extreme (and therefore suboptimal) launch angles. A recent thread by Tom Tango illustrates this quite well, where each line represents an 8-degree “group” of launch angles. At either end of the launch angle spectrum, batted balls closer to the edge produce lower wOBA at all Exit Velocities. Read the rest of this entry »