Elite pitchers and league run averages

  • Elite pitchers and league run averages

    I'm posting this here instead of the stats forum because it does look for trends in pitcher performance from 1901 to 2011. (Also I figure if I posted it in the stats forum no one who needed to would read it.)

    Recently in this forum there's been quite a bit of discussion about whether or not dominant batters dominate more when the league run average is high, and whether dominant pitchers dominate more when the league run average is low. While interesting and thought-provoking, the discussion has been short on data.

    I took the 200 highest ERA+ scores from 1901 to 2011 and paired each with the MLB run average for that year. I ran a correlation to see if there was a relationship between a low league run average and a high ERA+. (Just a reminder: counterintuitively for me, a high ERA+ means a low ERA, so a negative correlation would be a sign that a low league run average contributes to dominance.)

    I found there was effectively no relationship. The correlation coefficient between ERA+ dominance and league run average was -0.011. I thought perhaps the proliferation of teams in the recent high-scoring era might throw things off, so I randomly deleted half of the entries from 1993 onward. This did strengthen the relationship minimally: the new correlation coefficient was -0.014.
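    For anyone who wants to rerun this kind of check, here is a minimal sketch of the correlation step. It uses only eight rows copied from the table in this post, so it will not reproduce the -0.011 figure, which comes from all 200 rows. The function name `pearson` and the variable names are my own choices for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# A few (ERA+, league run average) rows copied from the table in this post.
# This excerpt is only for illustration; it is NOT the full 200-row sample.
rows = [
    (219, 4.99),  # Cy Young, 1901
    (253, 3.62),  # Mordecai Brown, 1906
    (279, 3.86),  # Dutch Leonard, 1914
    (217, 4.81),  # Lefty Grove, 1931
    (258, 3.42),  # Bob Gibson, 1968
    (229, 4.33),  # Dwight Gooden, 1985
    (271, 4.92),  # Greg Maddux, 1994
    (291, 5.14),  # Pedro Martinez, 2000
]
era_plus = [row[0] for row in rows]
league_runs = [row[1] for row in rows]

r = pearson(era_plus, league_runs)
print(f"r on this excerpt = {r:.3f}")
```

    To repeat the full study, replace `rows` with all 200 (ERA+, LGRUNS) pairs from the table.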

    A couple of caveats: I chose ERA+ not because I think it's the best stat, but because I think a very high ERA+ corresponds to what people who discuss the issue have in mind when they talk about pitching dominance: someone with a teeny weeny ERA compared to everyone else. Gibson in 68, Maddux in 95. It's also conceptually simple and easy to get hold of on BBREF. (That's why I started with pitchers, because a pretty good stat came to hand easily.)

    More importantly, limiting the study to the top 200 necessarily reduced the amount of variation, so there wasn't a lot of difference to be correlated. However, in the discussions we were talking about the elite of the elite and under what conditions they thrived. It's the human condition: If you want to distinguish among elite results, you have to look carefully. Since the correlation was so low, I'm not worried about its being due to small sample size or lack of variation within the sample.
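    The restricted-variation caveat can be illustrated with a small simulation. This is a sketch on synthetic data, not on the ERA+ table: I assume a population of 5000 pitcher-seasons with a built-in correlation of about -0.5 between a league-conditions variable and a dominance score, then keep only the top 200 scores. Every number here (the -0.5, the 5000, the seed) is an assumption chosen for illustration.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_R = -0.5            # assumed population correlation (illustrative only)
N = 5000                 # assumed number of synthetic pitcher-seasons

# Standardized synthetic variables with correlation TRUE_R by construction.
league = [rng.gauss(0, 1) for _ in range(N)]
score = [TRUE_R * x + sqrt(1 - TRUE_R ** 2) * rng.gauss(0, 1) for x in league]

r_full = pearson(league, score)

# Keep only the 200 highest scores, mimicking a "top 200" selection.
top = sorted(zip(score, league), reverse=True)[:200]
r_top = pearson([lg for _, lg in top], [sc for sc, _ in top])

print(f"full population r = {r_full:.3f}")
print(f"top-200 only    r = {r_top:.3f}")
```

    Under these assumptions the top-200 correlation comes out noticeably closer to zero than the full-population one. How much shrinkage applies to the real ERA+ data is not something this sketch can say.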

    Even more importantly, by definition, years without a top-200 pitcher did not appear in the study. However, there did not seem to be any pattern to this. There were a lot more such seasons in the dead ball era than in the 20s, but there were also more in the 30s than in the 20s. (I think the large number in the dead ball era was due to the Gochnauer effect: There were more dominant seasons because there were a lot of lousy ballplayers who were easy to dominate.) The 60s had a bunch, but so did the 70s and 90s. The gaps and clusters seemed, frankly, to depend on individuals rather than surrounding conditions. Maddux is on top with a 170 one year and a 271 the next. When Grove and Gomez show up, you get higher scores. Here are the data, so you can see for yourselves:


    Code:
    Player	ERA+	Year	LGRUNS
    Cy Young	219	1901	4.99
    Jack Taylor	206	1902	4.43
    Ed Siever	195	1902	4.43
    Rube Waddell	178	1902	4.43
    Noodles Hahn	169	1902	4.43
    Cy Young	164	1902	4.43
    Joe McGinnity	168	1904	3.73
    Rube Waddell	165	1904	3.73
    Christy Mathewson	230	1905	3.9
    Ed Reulbach	209	1905	3.9
    Rube Waddell	179	1905	3.9
    Mordecai Brown	253	1906	3.62
    Jack Pfiester	174	1906	3.62
    Doc White	167	1906	3.62
    Jack Pfiester	216	1907	3.53
    Carl Lundgren	213	1907	3.53
    Mordecai Brown	179	1907	3.53
    Addie Joss	204	1908	3.38
    Cy Young	193	1908	3.38
    Christy Mathewson	168	1908	3.38
    Christy Mathewson	222	1909	3.55
    Mordecai Brown	193	1909	3.55
    Orval Overall	179	1909	3.55
    Harry Krause	174	1909	3.55
    Ed Walsh	167	1909	3.55
    Ed Walsh	189	1910	3.84
    Walter Johnson	183	1910	3.84
    Jack Coombs	182	1910	3.84
    Vean Gregg	189	1911	4.51
    Walter Johnson	173	1911	4.51
    Christy Mathewson	167	1911	4.51
    Walter Johnson	240	1912	4.52
    Smoky Joe Wood	179	1912	4.52
    Jeff Tesreau	173	1912	4.52
    Walter Johnson	259	1913	4.04
    Eddie Cicotte	186	1913	4.04
    Dutch Leonard	279	1914	3.86
    Russ Ford	180	1914	3.86
    Claude Hendrix	173	1914	3.86
    Walter Johnson	164	1914	3.86
    Pete Alexander	225	1915	3.81
    Walter Johnson	191	1915	3.81
    Smoky Joe Wood	188	1915	3.81
    Fred Toney	182	1915	3.81
    Ernie Shore	170	1915	3.81
    Pete Alexander	170	1916	3.56
    Rube Marquard	169	1916	3.56
    Fred Anderson	178	1917	3.58
    Eddie Cicotte	174	1917	3.58
    Walter Johnson	214	1918	3.63
    Stan Coveleski	164	1918	3.63
    Walter Johnson	215	1919	3.87
    Eddie Cicotte	176	1919	3.87
    Pete Alexander	166	1919	3.87
    Pete Alexander	166	1920	4.36
    Red Faber	170	1921	4.85
    Dolf Luque	201	1923	4.81
    Dazzy Vance	174	1924	4.75
    Lefty Grove	165	1926	4.63
    Wilcy Moore	171	1927	4.75
    Ray Kremer	168	1927	4.75
    Dazzy Vance	190	1928	4.72
    Dazzy Vance	189	1930	5.55
    Lefty Grove	185	1930	5.55
    Lefty Grove	217	1931	4.81
    Carl Hubbell	193	1933	4.48
    Lon Warneke	165	1933	4.48
    Lefty Gomez	176	1934	4.9
    Mel Harder	173	1934	4.9
    Carl Hubbell	168	1934	4.9
    Lefty Grove	175	1935	4.9
    Lefty Grove	189	1936	5.2
    Carl Hubbell	169	1936	5.2
    Lefty Gomez	193	1937	4.87
    Monty Stratton	193	1937	4.87
    Johnny Allen	176	1937	4.87
    Lefty Grove	185	1939	4.82
    Ted Lyons	173	1939	4.82
    Bucky Walters	170	1939	4.82
    Bobo Newsom	168	1940	4.68
    Thornton Lee	174	1941	4.49
    Mort Cooper	192	1942	4.08
    Ted Lyons	171	1942	4.08
    Spud Chandler	198	1943	3.91
    Max Lanier	178	1943	3.91
    Dizzy Trout	167	1944	4.17
    Hal Newhouser	195	1945	4.18
    Al Benton	175	1945	4.18
    Hal Newhouser	190	1946	4.01
    Howie Pollet	165	1946	4.01
    Warren Spahn	170	1947	4.35
    Ewell Blackwell	168	1947	4.35
    Harry Brecheen	182	1948	4.57
    Gene Bearden	168	1948	4.57
    Mike Garcia	170	1949	4.61
    Warren Spahn	188	1953	4.61
    John Antonelli	178	1954	4.38
    Billy Pierce	200	1955	4.49
    Herb Score	166	1956	4.45
    Whitey Ford	177	1958	4.28
    Hoyt Wilhelm	173	1959	4.38
    Hank Aguirre	185	1962	4.46
    Dick Ellsworth	167	1963	3.95
    Dean Chance	200	1964	4.04
    Sandy Koufax	186	1964	4.04
    Joe Horlen	184	1964	4.04
    Whitey Ford	170	1964	4.04
    Juan Marichal	169	1965	3.99
    Sandy Koufax	190	1966	3.99
    Juan Marichal	167	1966	3.99
    Phil Niekro	179	1967	3.77
    Bob Gibson	258	1968	3.42
    Luis Tiant	186	1968	3.42
    Sam McDowell	165	1968	3.42
    Juan Marichal	168	1969	4.07
    Tom Seaver	165	1969	4.07
    Steve Carlton	164	1969	4.07
    Tom Seaver	194	1971	3.89
    Wilbur Wood	189	1971	3.89
    Vida Blue	183	1971	3.89
    Steve Carlton	182	1972	3.69
    Luis Tiant	169	1972	3.69
    Gaylord Perry	168	1972	3.69
    Tom Seaver	175	1973	4.21
    Buzz Capra	166	1974	4.12
    Jim Palmer	169	1975	4.21
    John Candelaria	169	1977	4.47
    Ron Guidry	208	1978	4.1
    Jon Matlack	165	1978	4.1
    Nolan Ryan	195	1981	4
    Dave Righetti	174	1981	4
    Dwight Gooden	229	1985	4.33
    John Tudor	185	1985	4.33
    Dave Stieb	171	1985	4.33
    Orel Hershiser	171	1985	4.33
    Roger Clemens	169	1986	4.41
    Jimmy Key	164	1987	4.72
    Allan Anderson	166	1988	4.14
    Bret Saberhagen	180	1989	4.13
    Roger Clemens	211	1990	4.26
    Danny Darwin	169	1990	4.26
    Roger Clemens	165	1991	4.31
    Roger Clemens	174	1992	4.12
    Greg Maddux	166	1992	4.12
    Kevin Appier	164	1992	4.12
    Kevin Appier	179	1993	4.6
    Greg Maddux	170	1993	4.6
    Greg Maddux	271	1994	4.92
    Marvin Freeman	179	1994	4.92
    Roger Clemens	176	1994	4.92
    David Cone	171	1994	4.92
    Steve Ontiveros	167	1994	4.92
    Greg Maddux	260	1995	4.85
    Randy Johnson	193	1995	4.85
    Tim Wakefield	165	1995	4.85
    Kevin Brown	215	1996	5.04
    Juan Guzman	171	1996	5.04
    Roger Clemens	222	1997	4.77
    Pedro Martinez	219	1997	4.77
    Randy Johnson	197	1997	4.77
    Greg Maddux	189	1997	4.77
    Greg Maddux	187	1998	4.79
    Roger Clemens	174	1998	4.79
    Al Leiter	170	1998	4.79
    Tom Glavine	168	1998	4.79
    Kevin Brown	164	1998	4.79
    Pedro Martinez	243	1999	5.08
    Randy Johnson	184	1999	5.08
    Kevin Millwood	167	1999	5.08
    Pedro Martinez	291	2000	5.14
    Randy Johnson	181	2000	5.14
    Jeff D'Amico	171	2000	5.14
    Kevin Brown	167	2000	5.14
    Randy Johnson	188	2001	4.78
    Pedro Martinez	202	2002	4.62
    Randy Johnson	195	2002	4.62
    Derek Lowe	177	2002	4.62
    Pedro Martinez	211	2003	4.73
    Jason Schmidt	180	2003	4.73
    Mark Prior	179	2003	4.73
    Kevin Brown	169	2003	4.73
    Brandon Webb	165	2003	4.73
    Tim Hudson	165	2003	4.73
    Johan Santana	182	2004	4.81
    Randy Johnson	176	2004	4.81
    Jake Peavy	171	2004	4.81
    Roger Clemens	226	2005	4.59
    Andy Pettitte	177	2005	4.59
    Tim Lincecum	168	2008	4.65
    Cliff Lee	167	2008	4.65
    Johan Santana	166	2008	4.65
    Zack Greinke	205	2009	4.61
    Chris Carpenter	182	2009	4.61
    Felix Hernandez	171	2009	4.61
    Tim Lincecum	171	2009	4.61
    Clay Buchholz	187	2010	4.38
    Josh Johnson	180	2010	4.38
    Felix Hernandez	174	2010	4.38
    Roy Halladay	167	2010	4.38
    J Verlander	172	2011	4.28
    Last edited by Jackaroo Dave; 10-25-2012, 03:28 AM.
    Indeed the first step toward finding out is to acknowledge you do not satisfactorily know already; so that no blight can so surely arrest all intellectual growth as the blight of cocksureness.--CS Peirce

  • #2
    Here is another quick study. The issue came up whether elite pitchers or normal pitchers gained more from a favorable change in conditions or lost more in an unfavorable change. In 1903, the league average in runs scored was 4.44. In the next year it dropped to 3.73. There were 52 Major League pitchers who pitched enough innings to qualify for the ERA title in each year. I ran a correlation between their ERA in 1903 and the difference between their ERAs in 03 and 04. Since a higher value would mean a greater drop in ERA, a positive correlation would mean that those with higher ERAs benefited more, and the elite pitchers benefited less.

    That is what happened: there was a 0.663 correlation between ERA in 1903 and ERA improvement in 1904. The higher the ERA in 03, the greater the improvement in 04.
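    The calculation described above can be sketched as follows. The five pitchers and their ERAs are made up for illustration; they are not the 52 qualifiers from 1903 and 1904, and the names `era_first`, `era_second`, and `pearson` are my own.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical ERAs for five pitchers who qualified in both seasons.
era_first = [2.10, 2.60, 3.00, 3.40, 3.90]
era_second = [2.05, 2.30, 2.70, 2.65, 3.00]

# Improvement = first-year ERA minus second-year ERA, so a bigger
# number means a bigger drop in ERA.
improvement = [a - b for a, b in zip(era_first, era_second)]

# A positive r means pitchers with higher first-year ERAs improved more.
r = pearson(era_first, improvement)
print(f"r(first-year ERA, improvement) = {r:.3f}")
```

    With real data, `era_first` and `era_second` would hold the two seasons' ERAs for the same pitchers in the same order.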

    Running another correlation on pitching runs gave similar results. Here a lower difference between a pitcher's pitching runs in 03 and in 04 means a greater improvement, with negative differences better than positive ones. Since the weaker pitchers had lower pitching runs in 03, a positive correlation again indicates that the elites were not doing as well as the ordinary pitchers. And again this proved to be the case. There was a 0.587 correlation between pitching runs in 03 and the difference between 03 and 04.

    Between 1968 and 1969 there was a similar shift, but in the opposite direction: League runs rose from 3.42 to 4.07 per game. Conditions changed to favor the batters. But this time the mediocre pitchers were hurt less by the change than the elite. Once more with 52 qualifying pitchers, the correlations were positive, 0.614 for ERA and 0.517 for pitching runs.

    There are two reasons to believe that the effects are not as strong as the correlation coefficients imply. Taken at face value, the coefficients say that from about 27 to 44% of the variation between pitchers in the year-to-year change in ERA or pitching runs is accounted for by their level of performance in the first year. First of all, the worst pitchers in the second year did not complete enough innings to appear in the study. Secondly, because performance tends to regress to the mean, we would expect the weaker pitchers to do somewhat better the next year and the better pitchers to do somewhat worse absent any effect whatsoever. How much of the observed effect is due to these influences, I do not know. I should, but I don't. At any rate, it is not enough to reverse the relationship. That would entail regressing to the mean and then past it.
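    The regression-to-the-mean point can be illustrated with a simulation in which the change in conditions helps every pitcher by exactly the same amount. This is a sketch on synthetic data: the talent spread, the luck spread, the size of the league shift, and the sample size are all assumptions I picked for illustration, not estimates from 1903-04 or 1968-69.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(1)   # fixed seed so the run is repeatable
N = 5000                 # assumed number of synthetic pitchers
TALENT_SD = 0.5          # assumed spread of true ability, in ERA points
LUCK_SD = 0.5            # assumed spread of single-season luck
LEAGUE_SHIFT = 0.7       # assumed ERA drop applied equally to everyone

# Each pitcher has a fixed true ERA; each season adds independent luck.
talent = [3.0 + rng.gauss(0, TALENT_SD) for _ in range(N)]
era_year1 = [t + rng.gauss(0, LUCK_SD) for t in talent]
era_year2 = [t - LEAGUE_SHIFT + rng.gauss(0, LUCK_SD) for t in talent]

# Improvement = year-1 ERA minus year-2 ERA, as in the study above.
improvement = [a - b for a, b in zip(era_year1, era_year2)]

r = pearson(era_year1, improvement)
print(f"r with no differential effect at all = {r:.3f}")
```

    With equal talent and luck spreads, the expected correlation is sqrt(luck variance / (2 * (talent variance + luck variance))) = 0.5, even though no pitcher benefits more than any other. The real ratio of luck to talent is unknown here, so this shows only that regression to the mean can account for a sizable positive correlation, not how much of the observed 0.5 to 0.66 it explains.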

    So based on the data covered so far, it seems unlikely that either a favorable change in conditions or an unfavorable one especially benefits elite pitchers relative to their lesser colleagues.
    Last edited by Jackaroo Dave; 10-25-2012, 03:23 AM.
