As a premise, generally speaking, I'm not a sabermetrics or Moneyball fan, and certainly not a proponent of using recompiled data to drive in-game or day-to-day strategy.
I think a lot of the recompiled data, like WHIP and WAR, is contrived because of huge differences in playing conditions, limited sample sizes, and real-life situations that go unaccounted for. I also think a lot of that kind of stuff is outcome oriented, i.e., it supports an existing favored premise, rather than scientific, i.e., providing a foundation for a neutral premise.
As an example, I believe good players on good teams get their offensive stats skewed downward because those teams usually draw the best opposing pitching. Bad players on good teams get upticks because they are protected in the lineup. Meanwhile, it seems to me that good players on bad teams get a statistical boost because they face the worst opposing pitching. Data like WHIP and WAR doesn't seem able to incorporate that kind of reality.
Nevertheless, I think mining data may help in understanding some situations, and settling bar bets is one of them.
When Strasburg was shut down, my conversation partner noted that pitchers of yesteryear threw until their "arms fell off," and blamed the shutdown, and the Nationals putting their Series hopes at risk, on irrational pitch counts versus bloated salaries. I maintained that starters don't really pitch any less than they used to: it seems to me they go deeper into counts than they ever had to, because each at bat is more closely scrutinized given all the money riding on games and championships, and because teams now recognize what they hadn't before, that bullpens are the soft underbellies of most teams.
What is probably really at issue for injuries isn't even tracked by statisticians: the number of curveballs a guy throws, the quality of his form and his instruction, and his off-day regimen. That's for another day.
But holding all those other things equal, how does one show that a 1950s pitcher X who made 36 starts averaging 8 innings per start really threw about the same number of pitches in a season as a 2000s pitcher Y who made 31 starts averaging 6 2/3 innings, subject to some reserve for a variable number of postseason appearances? The problem is that reliable pitch counts from the earlier era either don't exist or are hard to find. So you take data you do have, which you assume correlate closely with the number you want, and track that instead to get a handle on the unknown.
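Here is a minimal sketch of that proxy arithmetic in Python. The pitches-per-inning rates are purely illustrative assumptions on my part (no measured values exist for the 1950s), and the function name is mine, not anything standard; the point is only to show how the comparison would be set up.

```python
# Rough proxy arithmetic for comparing season pitch totals across eras.
# The pitches-per-inning rates below are illustrative assumptions, not data.

def estimated_season_pitches(starts, innings_per_start, pitches_per_inning):
    """Season pitch total implied by workload and an assumed per-inning pitch rate."""
    return starts * innings_per_start * pitches_per_inning

# Hypothetical 1950s starter: 36 starts, 8 innings per start, assumed leaner counts.
fifties = estimated_season_pitches(36, 8.0, pitches_per_inning=14.0)

# Hypothetical 2000s starter: 31 starts, 6 2/3 innings per start, assumed deeper counts.
modern = estimated_season_pitches(31, 6 + 2 / 3, pitches_per_inning=16.5)

print(f"1950s starter estimate: {fifties:,.0f} pitches")
print(f"2000s starter estimate: {modern:,.0f} pitches")

# How much higher would the modern per-inning rate need to be for the two
# season totals to come out equal, given the innings gap above?
equalizing_factor = (36 * 8.0) / (31 * (6 + 2 / 3))
print(f"Modern rate would need to be about {equalizing_factor:.2f}x the 1950s rate")
```

Under those made-up rates the totals land in the same neighborhood, which is the shape of the argument; whether the real per-inning gap is that large is exactly the unknown the league-wide proxies below are meant to stand in for.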
I tried to evaluate this with my own metrics to see if I could crudely assess what was going on in terms of pitch counts, and came up with the information below, derived from data on Baseball-reference.com for the period 1951 to 1960 versus the period 2002 to 2011. I used only the National League, because its rules are consistent across both periods, and compiled league averages for each decade.
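For the compilation step, here is a minimal sketch of how the decade averages could be produced, assuming the NL season lines have already been exported from Baseball-Reference into a CSV. The file name and column names are hypothetical placeholders, not the site's actual export format.

```python
import pandas as pd

# Hypothetical export: one row per NL season with the league-wide rates used below.
# Assumed columns: year, runs_per_game, batting_avg, obp, walks_per_game,
# strikeouts_per_game, stolen_bases_per_game.
nl = pd.read_csv("nl_season_averages.csv")

def decade_means(df, start, end):
    """Average each league-wide rate over the seasons start..end inclusive."""
    window = df[(df["year"] >= start) & (df["year"] <= end)]
    return window.drop(columns="year").mean()

comparison = pd.DataFrame({
    "1951-1960": decade_means(nl, 1951, 1960),
    "2002-2011": decade_means(nl, 2002, 2011),
})
print(comparison.round(3))
```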
I was surprised to find that most things about baseball had not changed, no matter how much we think they have, despite philosophical changes, technical changes, the merging of umpiring crews, play at altitude, pitch counting, bandbox ballparks, radar guns, and what have you.
Runs per game per team: 1950s 4.414, 2000s 4.504
Batting average: 1950s .2596, 2000s .2604
OBP: 1950s .3276, 2000s .3299
Walks per game: 1950s 6.660, 2000s 6.456
If you only had that information, since everything looks about the same, you might be tempted to conclude that more or less the same number of pitches were thrown per game then as now, just spread today across larger staffs with more specialized roles to save the starters' pitches.
But I also discovered this:
Stolen bases per game: 1950s .647, 2000s 1.128
Strikeouts per game: 1950s 9.24, 2000s 13.71 (with a strong upward trend within both periods)
Strikeout-to-walk ratio: 1950s 1.39, 2000s 2.06 (with a strong upward trend within both periods)
I am inclined to believe that information tends to show that pitchers weren't worse, or relatively worse compared to hitters, in the later era, given that OBP is virtually spot on. Rather, hitters are taking more pitches to allow runners to steal, and they are trying harder to work counts to get into opponents' bullpens. A side effect of that strategy is more strikeouts, but the ramifications of those strikeouts are somewhat attenuated because contact becomes less crucial in a regime of common stealing.
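To put rough per-game pitch numbers on that reading, here is a sketch that applies a commonly cited back-of-the-envelope pitch estimator (roughly 3.3 pitches per plate appearance plus extra for strikeouts and walks, a formula often attributed to Tom Tango) to the walk and strikeout rates above. The plate-appearance figure is my own assumption, not something from the data above, so treat the absolute totals as illustrative; only the era-to-era comparison is meaningful here.

```python
# Back-of-the-envelope pitch estimator (often attributed to Tom Tango):
# pitches ~= 3.3*PA + 1.5*SO + 2.2*BB.
# PA per game is an assumed figure held constant across both eras; the BB and
# SO rates are the posted league averages, so the output is illustrative only.

def estimated_pitches_per_game(pa, so, bb):
    """Basic pitch-count estimate from plate appearances, strikeouts, and walks."""
    return 3.3 * pa + 1.5 * so + 2.2 * bb

ASSUMED_PA_PER_GAME = 76.0  # assumption, not taken from the figures above

fifties = estimated_pitches_per_game(ASSUMED_PA_PER_GAME, so=9.24, bb=6.660)
modern = estimated_pitches_per_game(ASSUMED_PA_PER_GAME, so=13.71, bb=6.456)

print(f"1950s estimate: {fifties:.0f} pitches per game")
print(f"2000s estimate: {modern:.0f} pitches per game")
print(f"Change: {100 * (modern / fifties - 1):+.1f}%")
```

Under that assumption the two eras come out within a few percent of each other, which is at least consistent with the "same pitches, fewer innings per starter" reading, though both the estimator and the plate-appearance figure are stand-ins rather than measurements.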
What does the panel think?