Given the skepticism about (and general interest in) how I get my pitching rankings/ratings, I've decided to post the exact methodology I use to evaluate pitchers, both for full disclosure and to seek feedback.
This isn't a "simple" method, so it's going to take a bit to explain. If you're interested, please bear with me - I'll work through a concrete pitching record to illustrate.
Here is the pitching record I will analyze:
Tom Glavine
Focus your attention on his 2001 season. It's fairly instructive so we'll use it.
STEP ONE - PURE DIPS ANALYSIS
Statistics of interest at this point in the analysis:
IPOuts - 659
H - 213
K - 116
BB - 97
HR - 24
HBP - 2
WP - 2
BK - 0
PkO - 9
A - 40
From the IPOut, H, HR and K data, we know that Glavine put 732 balls into play (IPOuts + H - K - HR).
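As a quick check on that count, here is a minimal Python sketch of the balls-in-play formula. The function name is purely illustrative and not part of the method itself.

```python
def balls_in_play(ip_outs, hits, strikeouts, home_runs):
    """Balls in play = outs recorded + hits allowed - strikeouts - home runs."""
    return ip_outs + hits - strikeouts - home_runs

# Glavine 2001: 659 IPOuts, 213 H, 116 K, 24 HR
glavine_bip = balls_in_play(659, 213, 116, 24)
print(glavine_bip)  # 732
```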
I track pitcher assists because I am working on the assumption that if a pitcher gets a "touch" on defense, the ball must have been hit poorly, and that must be considered a success for the pitcher. The BABIP is probably WAY lower for balls on which a pitcher could potentially make a play anyway. Note that pick-offs almost always result in an out...a pitcher-assisted out, so from this point forward, we consider pickoffs as part of pitcher assists.
In 2001 in the NL, pitchers recorded 2841 assists on 67240 balls in play. The National League batting average on balls in play when not fielded by the pitcher is .312, so given normal skill at preventing BIP hits and an average defense, we expect Glavine to allow 216 in-play hits (.312 * 692 BIP not fielded by Glavine, i.e. his 732 BIP minus his 40 assists).
Doubles make up a little less than 23% of the in-play hits, so 50 of his 216 in-play hits should be 2B.
Triples make up 2.4% of the in play hits in 2001 in the NL, so 5 of his 216 hits should be 3B.
That leaves 161 singles.
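The expected-hit breakdown above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are made up here, and 0.23 stands in for "a little less than 23%" because the exact doubles share isn't given.

```python
def expected_in_play_hits(bip, pitcher_assists, league_babip,
                          double_share, triple_share):
    """Return (hits, singles, doubles, triples) expected of an average
    pitcher in front of an average defense."""
    bip_not_pitcher = bip - pitcher_assists       # balls the pitcher did not field
    hits = round(league_babip * bip_not_pitcher)  # expected in-play hits
    doubles = round(double_share * hits)
    triples = round(triple_share * hits)
    singles = hits - doubles - triples            # whatever is left over
    return hits, singles, doubles, triples

# ASSUMPTION: 0.23 approximates the doubles share quoted in the text.
print(expected_in_play_hits(732, 40, 0.312, 0.23, 0.024))  # (216, 161, 50, 5)
```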
We also have to adjust his HRA to account for the impact Turner Field had on HR rates. I calculated a 5-year weighted park-HR adjustment of 1.00296 for 2001 (that's including all of the other parks in which the Braves played), meaning that the net average park in which the Braves played their games slightly increased the odds of a HR relative to the rest of the NL.
If we divide his 24 HRs by 1.00296 we get 23.9 or essentially still 24 so no change there.
Glavine actually allowed 189 in-play hits (213 H - 24 HR). With an expectation to allow (216-189) 27 more hits than he actually allowed, he now has 27 fewer defense-independent IPOuts.
That gives him this defense independent line:
IPOuts - 632
1B - 161
2B - 50
3B - 5
HR - 24
BB - 97
K - 116
HBP - 2
WP - 2
BK - 0
PkO - 9
A - 40
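For anyone who wants to reproduce the out adjustment in that line, here is a rough Python sketch. The helper name is hypothetical; it just restates the arithmetic above (every extra expected hit costs one out).

```python
def defense_independent_outs(ip_outs, hits, home_runs, expected_in_play_hits):
    """Swap actual in-play hits for expected ones; each extra hit removes an out."""
    actual_in_play_hits = hits - home_runs
    extra_hits = expected_in_play_hits - actual_in_play_hits
    return ip_outs - extra_hits, extra_hits

# Glavine 2001: 659 IPOuts, 213 H, 24 HR, 216 expected in-play hits
print(defense_independent_outs(659, 213, 24, 216))  # (632, 27)
```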
STEP TWO - ACCOUNTING FOR REAL IMPACTS ON BALLS IN PLAY
How much of the difference between league average BIP hit rates and Glavine's actual performance is the Braves' defense and how much is Glavine?
The Braves as a team allowed 4419 balls in play, of which 190 were fielded by pitchers for an assist leaving 4229 BIP of interest and 1210 in play hits. But Glavine affected that figure by his own pitching, so we need to take out 692 non-pitcher-assisted BIP and 189 BIP hits leaving us with these team totals:
3537 BIP
1021 In-Play Hits
.289 BABIP on non-pitcher-assisted BIP
If Glavine were league average, pitching in front of this brilliant team defense, he'd record (.289*692 BIP) 200 in-play hits. He allowed 11 fewer than that.
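The team-minus-pitcher calculation can be sketched like this in Python. The function and variable names are illustrative assumptions; the inputs are the Braves and Glavine totals quoted above.

```python
def team_babip_without_pitcher(team_bip, team_pitcher_assisted, team_in_play_hits,
                               pitcher_bip, pitcher_in_play_hits):
    """BABIP of the rest of the staff on balls not fielded by a pitcher."""
    rest_bip = team_bip - team_pitcher_assisted - pitcher_bip
    rest_hits = team_in_play_hits - pitcher_in_play_hits
    return rest_hits / rest_bip

babip = team_babip_without_pitcher(4419, 190, 1210, 692, 189)
expected_hits = round(babip * 692)   # average pitcher, this defense
hits_saved = expected_hits - 189     # Glavine's actual in-play hits were 189
print(round(babip, 3), expected_hits, hits_saved)  # 0.289 200 11
```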
Now we need to "give back" some of the hits we added above.
8 singles and 3 doubles get taken off his existing defense-independent line to account for the fact that he shaved 11 hits off the board.
That leaves his hit counts and out counts at:
IPOuts - 643
1B - 153
2B - 47
3B - 5
HR - 24
The rest of the stats stay the same.
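Here is a small Python sketch of that "give back" step. It takes the 8-single / 3-double split as given rather than deriving it, and the dictionary keys and function name are just illustrative choices.

```python
def give_back_hits(line, singles_back, doubles_back):
    """Remove credited hits from the line and convert them back into outs."""
    adjusted = dict(line)  # copy so the Step One line is left untouched
    adjusted["1B"] -= singles_back
    adjusted["2B"] -= doubles_back
    adjusted["IPOuts"] += singles_back + doubles_back
    return adjusted

step_one_line = {"IPOuts": 632, "1B": 161, "2B": 50, "3B": 5, "HR": 24}
step_two_line = give_back_hits(step_one_line, 8, 3)
print(step_two_line)
# {'IPOuts': 643, '1B': 153, '2B': 47, '3B': 5, 'HR': 24}
```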
STEP THREE - TURNING THE NEW LINE INTO RC
Pitcher Assists are treated as automatic outs (given the usual linear weight for an out).
The original PCA analysis used less accurate linear weights (LWs) than the ones I just finished calculating from a multi-linear regression. I expect to implement this methodology with the more accurate LW data in the near future, so I'll use the better LWs in this example.
In 2001, the following LWs applied in the NL:
In-Play Out (this includes both FC and straight out) - -0.257
K - -0.281
1B - 0.460
2B - 0.771
3B - 1.060
HR - 1.370
BB (including both unintentional and intentional) - 0.291
HBP - 0.328
WP - 0.279
BK - 0.292
To get RA (above average) for Glavine, we just multiply the number of each of these events by their LW:
In-Play Outs: (643 - 116) = 527 * -0.257 = -135.44 +
K: 116 * -0.281 = -32.60 +
1B: 153 * 0.460 = 70.38 +
2B: 47 * 0.771 = 36.24 +
3B: 5 * 1.060 = 5.30 +
HR: 24 * 1.370 = 32.88 +
BB: 97 * 0.291 = 28.23 +
HBP: 2 * 0.328 = 0.66 +
WP: 2 * 0.279 = 0.56
Add those all up and you get 6.21. In 2001, this method asserts that Tom Glavine was 6.21 runs allowed worse than the National League average pitcher (a far cry from his defense-assisted 124 ERA+).
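The linear-weights sum can be checked with a short Python sketch. The dictionary layout and function name are assumptions made for illustration; the weights and event counts are the ones listed above. Note that summing the unrounded products gives about 6.20, while the 6.21 figure comes from adding the terms after rounding each to two decimals.

```python
# 2001 NL linear weights as listed in the post
LW_2001_NL = {
    "out": -0.257, "K": -0.281, "1B": 0.460, "2B": 0.771, "3B": 1.060,
    "HR": 1.370, "BB": 0.291, "HBP": 0.328, "WP": 0.279, "BK": 0.292,
}

def runs_above_average(events, weights):
    """Sum of (event count * linear weight) over all events."""
    return sum(count * weights[name] for name, count in events.items())

glavine_events = {
    "out": 643 - 116,  # in-play outs = defense-neutral IPOuts minus strikeouts
    "K": 116, "1B": 153, "2B": 47, "3B": 5, "HR": 24,
    "BB": 97, "HBP": 2, "WP": 2, "BK": 0,
}

raaa = runs_above_average(glavine_events, LW_2001_NL)
print(f"{raaa:.2f}")  # 6.20 (6.21 when each term is rounded first)
```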
If you want an ERA-like number, you can convert the 6.21 RAAA to a total RA metric. The NL RA/Out is 0.176 (that does include errors), so in Glavine's defense-neutral 643 IPOuts a league-average pitcher would have allowed 113 defense-neutral runs (we're expecting a league-average rate of runs produced by errors here). Our estimate is that he actually allowed 119.21 defense-neutral runs, for a DNRA (defense-neutral run average) of 5.01 (.186 * 27 outs) against a league average DNRA of 4.75.
His DNRA+ would be (4.75 / 5.01) * 100 or 95. A slightly below average pitcher in 2001.
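A rough Python sketch of the RAAA-to-DNRA conversion follows. The function names are hypothetical, and the sketch carries the league-average run total unrounded (113.168 rather than 113), which still lands on the same two-decimal DNRA.

```python
def dnra(raaa, ip_outs, league_ra_per_out):
    """Defense-neutral run average: runs per 27 outs."""
    league_runs = league_ra_per_out * ip_outs  # what an average pitcher allows
    return (league_runs + raaa) / ip_outs * 27

def dnra_plus(pitcher_dnra, league_ra_per_out):
    """League DNRA relative to the pitcher's DNRA, scaled so 100 is average."""
    return league_ra_per_out * 27 / pitcher_dnra * 100

glavine_dnra = dnra(6.21, 643, 0.176)
print(f"{glavine_dnra:.2f}", round(dnra_plus(glavine_dnra, 0.176)))  # 5.01 95
```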
There are two notable flaws with this that I aim to correct in future renditions where the data allows.
I didn't adjust for the pitcher's tendency to generate groundballs/flyballs and the changes that produces in the average result of a ball in play, especially the change in the double play rate.
I didn't account for the pitcher's impact on extra base hits on balls in play compared to his team's extra base hit rates. I didn't originally have that data, and I still don't have it prior to 1957, so the standing assumption must be that all pitchers allow the same rate of extra base hits per in-play hit prior to 1957. In the PBP era I can change the method to account for sensible impacts on ball-in-play XBH rates, though I expect those tweaks to be small most of the time.
Thoughts?
Comments?
Criticisms? Any ideas you have that might improve the method, I'm open to suggestions.

