Another Example of Answering the Question (League Quality)
I've been meditating on the issue of league quality measurement and I have come up with an interpretation that I think I should spend some time explaining just to see what some of the other critical thinkers in baseball have to say about it.
As I understand it, there are three popular approaches to adjusting for timeline/league quality changes.
1) Think about what kinds of events happen in the weakest of leagues and look for them in any league to measure its quality relative to the strongest of leagues.
Bill James wrote down in one of his abstracts something like a dozen different kinds of things that happen a lot in bad baseball leagues and rarely in good ones. That list included Errors, "rare events" (like triple plays, baserunning outs, mistakes of aggression, base hits on pop-ups (why do you think we call those Texas Leaguers?) etc), passed balls, wild pitches, hit batsmen etc.
2) Find some measure of a player's performance (wOBA, OPS, RC27, wins created)...and get a group of players who played in both one league and another...then see what the average difference in performance is and assume that the cumulative (weighted average) difference between performances of those players represents the difference in quality of the league...the result of doing this is a series of linear adjustments to production rates in each league...we're assuming that those leagues follow roughly the same distributions of performance but that those distributions are shifted linearly to account for the depth of the league.
3) Take your measure of player performance, find the distributions of said performances in each league and force all of those distributions to match either by curve fitting or by use of the odds ratio method or by simple normalization...the underlying assumption being that players at the Xth percentile in one league will perform at the Xth percentile in every league...that differences in league quality will be expressed by differences in the shape of the distribution of performance.
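To make method 2 concrete, here is a minimal Python sketch of the "common players, weighted average difference" idea. Everything specific in it is an assumption made for illustration: the sample players, the use of wOBA as the rate stat, and the choice to weight each player by the smaller of his two plate appearance totals. It is not anyone's published method.

```python
def league_gap(common_players):
    """Weighted average difference in a rate stat for players seen in both leagues.

    common_players: list of (rate_in_A, pa_in_A, rate_in_B, pa_in_B) tuples.
    Each player is weighted by the smaller of his two PA totals
    (an assumption; other weightings are defensible).
    """
    total_weight = 0.0
    weighted_diff = 0.0
    for rate_a, pa_a, rate_b, pa_b in common_players:
        weight = min(pa_a, pa_b)
        weighted_diff += weight * (rate_b - rate_a)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no playing time in common")
    return weighted_diff / total_weight

# Hypothetical players: (wOBA in league A, PA in A, wOBA in league B, PA in B)
sample = [
    (0.350, 400, 0.320, 200),
    (0.330, 100, 0.310, 500),
    (0.300, 300, 0.290, 300),
]
gap = league_gap(sample)
print(round(gap, 4))  # negative: the group hit worse in league B
```

A negative gap says the same hitters produced less in league B, which method 2 reads as league B being the tougher league. The single number is then applied as a linear shift to everyone's rate.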
I think there is some truth to all of those methods but that none of them adequately and directly answers the question being asked. Let me phrase the real question we're all trying to answer in new terms that I don't think have been expressed.
Imagine for a moment that we had a perfect measure of the real ability of every baseball player in the world. In the real world we don't have this because all of our measures of performance are poisoned by the environments in which they are taken, the injuries of the players, the specter of random chance, the limitations of our "sampling" methods etc. But assuming we could put a number to how talented each player is/was, a measure of league quality would tell us by what percentage his real production rate would change when he moved from one league to another. If we knew numerically how good each player was at any given time, then we'd know how talented the league was as a whole and we'd be able to relate the player to his league.
Our problem then is that we can't have an absolutely perfect measure of ability.
This to me implies that league quality is BOTH a linear adjustment AND an exercise in curve fitting...really it's a three step process.
1) Find a good measurement of real-world production (our initial guess at how talented a player is at any moment in time).
2) Perform some form of curve fitting...I personally believe the way to do this is to do a total normalization, wherein the shape of the distribution is altered until it is normal and then the properties of the bell curve are used to force equality in the distributions of every league, but Tom Tango thinks curve fitting is best done by using the odds ratio method...there are arguments for both (yes Tom...using the odds ratio method is curve fitting...the odds ratio method changes the shape of the curve by applying a ratio factor to all of the raw data until the average matches some initial curve). The purpose of this step is to improve our estimate of league-relative player ability. Total normalization is the best way to factor out, as much as possible, the variable of temporal context.
3) NOW...use your normalized measure of performance to compare leagues linearly as described in method 2 above...the concept of finding all players common to two leagues and seeing how their performance differed as a group only works if your measure of performance has been normalized so that the two leagues are on the same scale!
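The three steps above can be sketched roughly in Python. This is an illustration under stated assumptions, not the exact procedure described: "total normalization" is approximated here by a rank-based normal score within each league, the common-player comparison is unweighted, and the player names and rates are hypothetical.

```python
from statistics import NormalDist

def normal_scores(rates):
    """Step 2: map each raw rate to a normal score based on rank within ONE league.

    rates: dict of player -> raw production rate (step 1's measure).
    Ties are not handled specially; tied players get adjacent ranks.
    """
    ordered = sorted(rates, key=rates.get)
    n = len(ordered)
    std_normal = NormalDist()
    # (i + 0.5) / n keeps every percentile strictly between 0 and 1
    return {p: std_normal.inv_cdf((i + 0.5) / n) for i, p in enumerate(ordered)}

def normalized_gap(league_a, league_b):
    """Step 3: average change in normal score for players common to both leagues."""
    za, zb = normal_scores(league_a), normal_scores(league_b)
    common = sorted(set(za) & set(zb))
    if not common:
        raise ValueError("no common players")
    return sum(zb[p] - za[p] for p in common) / len(common)

# Hypothetical leagues; p3 and p4 played in both
league_a = {"p1": 0.300, "p2": 0.320, "p3": 0.340, "p4": 0.360}
league_b = {"p3": 0.300, "p4": 0.330, "q1": 0.350, "q2": 0.370}
gap = normalized_gap(league_a, league_b)
print(round(gap, 3))  # in standard-deviation units of the normalized scale
```

Because both leagues are first forced onto the same bell curve, the gap for the common players is expressed on one scale. In this made-up case the two players who topped league A sit at the bottom of league B, so the gap is negative.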
League quality is not just a subjective linear adjustment...it's not just a curve fitting exercise (we all know that the 75th percentile player in 1944 won't be the 75th percentile in 1984...that's intuitively obvious...and that reductio ad absurdum should be enough to prove that there is a linear adjustment required too)...it's both. Done in a specific order. Or it's not answering the question.
That's where I stand at the moment...I'm curious what we think about this?
I've just tried this approach in my latest blog posting. I imagine I'm not the first to do this, so I'd love to hear about other results.
Originally Posted by SABR Matt
Unfortunately, I can't find data for HBPs, PBs, WPs, etc. for the minor leagues. I used what I had, though.
Nice work, IBL. Obviously, if we had better data from the IBL PBP database we could compare rates for the rare events from the IBL with rare events from the major leagues and see what kind of gaps we're talking about.
The IBL ballplayers are stealing WAAAAAYYY too many bases but to be fair, those SB rates are fairly comparable to the SB rates recorded in the earliest years of Major League Baseball.
BTW...I was glad to see you include DER...one of the first telltale signs of a bad league (in terms of fielding) is the BABIP...in the MLB it's around .300, in AAA it's .310, in low A it can be .340.
Actually, I didn't see much variation in BABIP. I don't have my files in front of me right now, but when I do I'll show you the graph. Most of the variation in DER - assuming I've calculated it correctly! - seems to be due to errors.
The way some of us calculate BABIP is 1-DER.
I'm not sure what the conventional formula is for DER, but I would think that BABIP is typically H/(AB-HR-K). If you were to use that version, the errors would have no impact on BABIP as they are simply treated as outs.
BTW IBL, I really liked the graphs on your blog. Very insightful.
Last edited by weskelton; 10-12-2007 at 09:29 AM.
(H-HR) / (AB - HR - K) actually. And even with that formula...BABIP should still be higher in worse leagues.
It would be inexcusable to not include SF in the denominator.
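Putting the last few posts together, the formula with HR removed from the hits and SF added to the denominator can be written as a small Python helper. The stat line in the example is made up, and treating a missing SF count as zero is an assumption made for convenience.

```python
def babip(h, hr, ab, k, sf=0):
    """Batting average on balls in play: (H - HR) / (AB - HR - K + SF).

    Reaches on error count as at-bats without a hit, so under this
    formula they are treated the same as outs.
    """
    balls_in_play = ab - hr - k + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play

# Hypothetical season line
season = babip(h=150, hr=20, ab=500, k=100, sf=5)
print(round(season, 3))
```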
And, like I said, I find the whole concept of treating an error the same as an out for the batter as, well.... if a groundball to 3B that was thrown away because Ichiro is speeding down the line is a "bad" thing for the batter, because it was a mistake by the fielder, what do you call a fastball down the middle that Juan Pierre hits for a HR? We don't call that an E-1, do we?
No, it's a "good" thing for the batter to get on base, regardless if it was a mistake pitch, mistake catch or mistake throw.
But we're trying to measure batter skill, Tango...and other than the batter being able to make contact and force the defense to make plays, is it really a skill when Edgar Martinez reaches base on a throwing error?
To some extent, yes. The distribution of reaching on error is not random. While there is a lot of noise in it, it's not 100% random. It's not luck that Ichiro reaches base on error more than Manny or Frank Thomas. He's fast, a lefty, and a ground ball hitter.
No one would argue that some players don't create more errors than others...but you'd have us give Edgar Martinez credit for his ROE...why not just give the hitters credit for their ROE above expectation (per BIP)?
I may well have the "wrong" formula for DER - I poked around and found several different versions, all of which included errors.
If you're trying to assess the effectiveness of the defense why would you not include errors? What's the sense in counting them as outs, when they represent successes in reaching base or advancing?
No...DER does include errors...Tom Tango considered BABIP as 1-DER because he thinks you should count errors for the batters too.
"why not just give the hitters credit for their ROE above expectation (per BIP)"
That'll work out to the same thing. If the league average ROE per BIP is .020, then you are simply dropping everyone's "TangoBABIP" by 20 points downward.
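A quick Python check of that claim, as a sketch: the two hitters and their rates are hypothetical, and it assumes ROE and hits are both measured per ball in play so the rates can simply be added.

```python
LEAGUE_ROE_PER_BIP = 0.020  # league-average figure quoted in the post above

def full_credit(babip, roe_per_bip):
    """Count every reach-on-error as a success for the batter."""
    return babip + roe_per_bip

def excess_credit(babip, roe_per_bip, league_rate=LEAGUE_ROE_PER_BIP):
    """Credit only the ROE above the league-average rate."""
    return babip + (roe_per_bip - league_rate)

# Hypothetical hitters: (BABIP without ROE, ROE per BIP)
hitters = {"fast_lefty": (0.320, 0.030), "slow_righty": (0.300, 0.012)}

for name, (b, r) in hitters.items():
    gap = full_credit(b, r) - excess_credit(b, r)
    print(name, round(gap, 3))  # the same constant for every hitter
```

The difference between the two versions is the league rate for every hitter, so the spread between any two hitters is identical under either bookkeeping.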
Keep in mind all I have for these leagues is raw errors, not ROE.
I gather your formula for BABIP is (H-HR+ROE)/(AB-HR-K+SF)?
When you say some players reach base on error more often than others, do you mean relative to their at-bats or relative to their hits? I'd be surprised if some had a significantly (and repeatably) higher errors-to-hits ratio than others.
The leader in ROE per non-HR hit (min. 500 non-HR hits), since 1957:
15.7 ROE per 150 non-HR hits
At the bottom:
3.6 ROE per 150 non-HR hits
League average since 1957 is 8.8.
Being 78% above league average or 60% below league average is something, no? There's some 70 players out of the 1000 or so who qualified that were 40% above or 40% below league average.
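The percentages follow directly from the three rates quoted above; a short Python check (the function name is just for illustration):

```python
LEAGUE_AVG = 8.8  # ROE per 150 non-HR hits, league average since 1957

def pct_vs_league(rate, league_avg=LEAGUE_AVG):
    """Percent above (+) or below (-) league average."""
    return 100.0 * (rate / league_avg - 1.0)

leader = pct_vs_league(15.7)   # top of the list
trailer = pct_vs_league(3.6)   # bottom of the list
print(round(leader), round(trailer))
```

This gives roughly +78% for the leader and about -59% for the trailer, in line with the rounded "60% below" figure.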
Yes, that's pretty impressive.
Originally Posted by Tango Tiger
Still, what I think you're seeing with the top major leaguers is an ability of exceptional batters not just to "hit it where they ain't", but also to "hit it where it's hard to field". What I think we're seeing with high overall league error rates in the minors is at the opposite end of the defensive ability scale - not balls hit where it's hard to play them, but routine plays that the sub-major-leaguers flub: dropped catches, wild throws, bobbled grounders.
That is, I suspect that the further you go down the ability ladder, the more errors reflect unprofessional fielding rather than skillful batting. Hence, overall higher error rates in overall weaker leagues.
I'm not disputing the league trend, since errors are a result of the batter, pitcher, fielder, and park. As is EVERY other batted ball event you can imagine.
What I am saying is that the relationship between reaching base on error and the identity of the batter is not random. It is certainly more random than HR/batter, and double/batter, and single/batter. But, it is not random.
Can anyone suggest where I can get such data? That is, lists of all the players who moved from league A to league B, and their stats in each league? Is there an easy way to gather this information?
Originally Posted by SABR Matt