The world champion Boston Red Sox headline the list of teams whose win-loss record (97-65) was more than four games worse than both fWAR and rWAR expected. The knee-jerk response for many would be to scoff at WAR for declaring the World Series winners to be worse than they actually were. In fact, the exact opposite is true. Both fWAR (106-56) and rWAR (104-58) rated the Red Sox as the best team in baseball and projected a record much better than the one the team actually posted. As we all know, WAR’s bullish stance on the Red Sox was borne out in the playoffs, which ended with the Sox defeating the St. Louis Cardinals in a six-game Series that wasn’t as close as its length made it look.
The feel-good “worst to first” narrative of this year’s Boston team was a media favorite, and there was substantial outcry when John Farrell finished second in Manager of the Year voting to a former Red Sox skipper who’d somehow turned up in “flyover country.” Whether Farrell’s contributions as a clubhouse leader were responsible for several Red Sox turning in stellar individual performances is up for debate. The fact that WAR rated the Sox so well, however, reveals that Farrell was in fact working with a fantastic baseball team. Unfortunately for fans of baseball “magic,” the reason for the success of the 2013 Red Sox is clearly that its players (as measured objectively against all of the other players in baseball) were really, really good. The question we’re left with, then, is why the Sox were “only” able to win 97 games.
It’s highly doubtful that anyone would argue that the scrappy, bearded ruffians in possession of the best record in baseball lacked the infamous “will to win.” A quick look at Boston’s Pythagorean record (100-62), which the team also fell short of, made me wonder whether the Red Sox tended to win in blowouts and/or lose close games. Indeed, the team had the highest run differential in the major leagues, and their record in games decided by one run was 21-21, a .500 mark that sits 99 points of winning percentage below their overall .599. In fact, among the 15 major league teams with winning records, only two had a worse record in one-run games than the Sox. One of those teams was the Detroit Tigers, who also substantially underperformed their WAR expectations (11 wins short by fWAR, 8 short by rWAR).
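As a rough check on the figures above, the short Python sketch below computes a winning-percentage gap in “points” (thousandths) and a Pythagorean win expectation. The function names are illustrative, and the exponent of 2 is the classic textbook form of the formula; published Pythagorean records often use a slightly different exponent, so this is not necessarily how the 100-62 figure was derived. The run totals in the last line are made-up inputs, not Boston’s actual totals.

```python
def win_pct(wins: int, losses: int) -> float:
    """Winning percentage as a fraction between 0 and 1."""
    return wins / (wins + losses)


def gap_in_points(overall: tuple[int, int], subset: tuple[int, int]) -> int:
    """Overall winning percentage minus a subset's, in thousandths ("points")."""
    return round((win_pct(*overall) - win_pct(*subset)) * 1000)


def pythagorean_wins(runs_scored: int, runs_allowed: int,
                     games: int = 162, exponent: float = 2.0) -> float:
    """Expected wins from run totals via the Pythagorean expectation."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Records quoted in the text: 97-65 overall, 21-21 in one-run games.
print(gap_in_points((97, 65), (21, 21)))

# Hypothetical run totals, for illustration only.
print(round(pythagorean_wins(800, 700), 1))
```

The first line printed is 99, matching the gap quoted for the Red Sox. The same `gap_in_points` helper can be applied to any other team’s overall and one-run records.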
The Red Sox’ comparatively bad performance in close games defies conventional wisdom: the team had a well-regarded manager, a great closer (Koji Uehara), and a reputation for grinding out tough wins. It’s possible that the team was merely unlucky, and that this carried over to their record in close games. A closer look at the team’s relief pitching, however, yields some startling results given the reputation of Boston’s bullpen and the fact that the team was so successful. Boston’s relief pitchers saved only 33 games, good for 28th in the major leagues. Moreover, the Red Sox blew 24 saves, making their differential between saves and blown saves far worse than that of any other team in baseball except the historically bad Houston Astros. Of course, the fact that 64 of Boston’s wins came in non-save situations (the most in baseball) speaks to the team’s tendency to score a lot of runs and win in blowouts. In fact, the Red Sox won nearly as many games (48) by four or more runs as the Houston Astros won the entire season (51).
Not unrelated to the surprising close-game performance of the bullpen, the team’s underwhelming record relative to its players’ individual performances raises the question of whether John Farrell’s strategic decision-making was as sound as many believe. Of course, everything worked out in the end for both the Red Sox and the national media. They were the most outstanding team, won the World Series, and so who cares if they could have been even better? At any rate, WAR was not surprised that the Red Sox turned out better than everyone else.
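The bullpen arithmetic above can be reproduced in a few lines. This is a minimal sketch using only the totals quoted in the text; the variable names are illustrative, and the conversion rate assumes that save opportunities equal saves plus blown saves.

```python
# Totals quoted in the text for the 2013 Red Sox.
wins = 97
saves = 33
blown_saves = 24

# Wins that did not end with a recorded save.
non_save_wins = wins - saves

# Saves minus blown saves, the "differential" discussed above.
save_differential = saves - blown_saves

# Assumption: save opportunities = saves + blown saves.
save_pct = saves / (saves + blown_saves)

print(non_save_wins, save_differential, f"{save_pct:.1%}")
```

Under that assumption, the bullpen converted a little under 58 percent of its save chances, and the 64 non-save wins follow directly from 97 wins minus 33 saves.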
Interestingly, the worst underperformer (relative to fWAR and rWAR) in baseball was another elite American League team, the Detroit Tigers. On its face, the Tigers’ 93-69 regular season record does seem a bit soft, given the team’s explosive scoring ability (2nd in the AL in runs) and excellent pitching (2nd in the AL in runs allowed per game). A quick look at Detroit’s WAR totals and Pythagorean record lends credence to the suspicion that something “went wrong,” or, considering the team’s respectable playoff run and narrow ALCS loss, “slightly awry.” For example, the Tigers’ combined individual fWAR produced an eye-popping expected win-loss record of 104-58, a historically great mark equaled only once in the history of the franchise. While not as extreme, the team’s rWAR prediction (101-61) and Pythagorean expectation (99-63) were also notably above the Tigers’ actual win total.
As with the Red Sox, the Tigers’ comparatively weak win-loss record can be partially attributed to the team’s performance in close games. Simply put, Detroit was horrendous in games decided by one run, going 20-26 in those contests, a .435 winning percentage that sits about 139 points below their overall .574. Only one team in baseball with a winning record, the Baltimore Orioles (celebrated just a year ago for their “ability” to pull out close wins!), fared worse in one-run games than the 2013 Tigers.
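The comparison running through the Red Sox and Tigers discussion, actual wins set against each expected total with a four-win margin of error, can be sketched as follows. The win totals are the ones quoted in the text; the function names and the dictionary layout are illustrative rather than anything the WAR sites provide.

```python
MARGIN = 4  # the four-win margin of error used in the text

# Win totals quoted in the text: actual, fWAR-expected, rWAR-expected, Pythagorean.
teams = {
    "Red Sox": {"actual": 97, "fWAR": 106, "rWAR": 104, "Pythagorean": 100},
    "Tigers": {"actual": 93, "fWAR": 104, "rWAR": 101, "Pythagorean": 99},
}


def shortfalls(record: dict[str, int]) -> dict[str, int]:
    """Actual wins minus each expected total (negative means underperformed)."""
    return {k: record["actual"] - v for k, v in record.items() if k != "actual"}


def underperformed_war(record: dict[str, int], margin: int = MARGIN) -> bool:
    """True when actual wins trail both WAR expectations by more than the margin."""
    diffs = shortfalls(record)
    return diffs["fWAR"] < -margin and diffs["rWAR"] < -margin


for name, record in teams.items():
    print(name, shortfalls(record), underperformed_war(record))
```

Both teams are flagged, and the Tigers’ shortfalls of 11 wins (fWAR) and 8 wins (rWAR) are the largest of the two. Treating a team as an underperformer only when it misses both WAR expectations mirrors the criterion stated at the top of the post.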
The Colorado Rockies (74-88 actual record) and Los Angeles Angels of Anaheim (78-84) were notable in that both teams, according to fWAR and rWAR, “should” have had winning records in 2013. Alas, WAR’s overestimation of the Angels is not likely due to a massive Brian Kenny-led conspiracy to make Mike Trout look like the best player in baseball, although it is perversely tempting to find the perceived shortcomings of an analytical tool in the fact that its archetypical player (Trout) happens to play for a team it overestimated. WAR values certain things in certain ways, the argument would go, and the perfect sabermetric storm created someone who exploits the formula to perfection without actually being that good. Before I digress further, I’ll note that the most commonly cited “victim” of Troutism (M. Cabrera) played for a team (Detroit) overrated more heavily by both fWAR and rWAR than any other team in baseball.
In reality, the Angels were a fringe underperformer, having finished five games short of rWAR’s expected 83-79, just one game outside a four-win margin of error. The team slightly underperformed its Pythagorean record (81-81) as well, making it likely that simple bad luck had a lot to do with Anaheim not quite meeting the standard set by its players’ individual performances. One interesting team statistic that stood out in the Angels’ season was that despite being a solidly mediocre team (and, according to fWAR and rWAR, an above-average one), Anaheim owned one of the worst home-road splits in the major leagues. Even with the built-in advantage of playing on the West Coast, the team was one of only six in all of baseball that did not perform better at home than it did on the road.
The Colorado Rockies underperformed more significantly. Had the team played to its WAR expectation (82-80 by both measures), Colorado would have finished the 2013 season with the second-best record in its division. The team boasted the National League batting champion (M. Cuddyer), legitimate MVP-caliber stars (T. Tulowitzki and C. Gonzalez), and some intriguing young pitching talent. Instead, the Rox limped into last place with a 74-88 record and spent the postseason at home for the 18th time in their 21 years as a franchise.
Colorado represented a curious case for me. Their Pythagorean record (76-86), while slightly inflated, basically predicted how well the team performed in 2013. Immediately, I began to wonder if WAR had a history of overestimating the Rockies, and whether that had anything to do with the extremely hitter-friendly Coors Field. I hypothesized that the WAR calculation’s park adjustment for Coors was potentially soft, and that as a result the team (particularly through its hitters) would look better on paper than it actually was. After all, countless condescending articles have been written about Rockies stars over the years, with writers discounting the players’ greatness because they played most of their careers in Denver. Right now, we can best see the results of the “Coors-inflation” bias in the curiously limp advocacy (even by numbers people) of Larry Walker as a Hall of Fame candidate, despite his favorable overall statistical comparison to all-time greats like Joe DiMaggio and Duke Snider.
I looked, then, at the Rockies’ combined individual WAR totals from the past ten seasons and compared their expected finishes with their actual and Pythagorean records. Over the span of twenty “WAR seasons” (ten each for fWAR and rWAR) from 2003-2012, the Rockies underperformed fourteen times, and in six of those fourteen the shortfall exceeded a four-win margin of error. Things got even more interesting when I put rWAR and fWAR side by side: Baseball Reference rated the Rockies’ hitters higher than Fangraphs did in every year but one from 2003-2012. Over the entire period, Baseball Reference judged the Rockies’ hitters to be worth about 2 extra WAR per year on average. Conversely, Fangraphs rated the Rockies’ pitchers much higher than B-R did. In every year but two, the Rox’ pitcher fWAR was higher than their pitcher rWAR, for an average difference of almost 4 wins per year. The data made me wonder, and future research might address, whether the formulae of both fWAR and rWAR overrate the Rockies, but for opposite reasons.
I expected the Rockies’ Pythagorean win expectations from 2003-2012 to be roughly on track with their actual records (as in 2013), but what I found deepened the mystery even further. Like both methods of WAR calculation, Pythagoras seems to consistently overrate the Rockies. Over the ten-season period, the Rockies performed up to their Pythagorean win expectation only twice.
Apart from a general suspicion that “Coors has something to do with it,” I don’t have an answer for why the Rockies tend to underperform the expectations of advanced metrics. It is clear enough, though, that in the interest of getting a more statistically accurate picture of the team’s performance, the arbiters of Colorado’s park-weighting numbers should at least consider examining their methods of calculation.
The Chicago Cubs are cursed, and drag the White Sox down with them at every possible opportunity. *
* More on these two teams in an addendum to this post, which was just getting too long.