|
#1
|
|||
|
|||
|
Expected BABIP
I see a lot of people using the formula LD% + .120. It makes sense: the more line drives you hit, the higher your BABIP should be. What I don't understand is why it includes just LD%. Given two players with the same LD rate, shouldn't the one who hits more GB than FB have a higher BABIP?
I might be doing this totally wrong, but I looked at all players with at least 300 plate appearances in 2006 and got a correlation of r = .50, I believe, between BABIP and the expected BABIP formula. The best fit I got was .763 * LD% + .265 * GB% + .131 * FB%. I guess what I'm getting at is: wouldn't using all three rates give us a better expected BABIP than just using the LD rate?
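For anyone who wants to try this kind of fit outside a spreadsheet, here is a minimal sketch of a no-intercept least-squares fit of BABIP on the three batted-ball rates, plus the correlation between fitted and actual BABIP. The function name `fit_expected_babip` is hypothetical, and the numbers fed to it are synthetic placeholders, not real 2006 player data, so the printed weights say nothing about actual hitters.

```python
import numpy as np

def fit_expected_babip(ld, gb, fb, babip):
    """Fit BABIP ~ a*LD% + b*GB% + c*FB% with no intercept.

    Returns the three weights and the correlation between the
    fitted values and the observed BABIP.
    """
    X = np.column_stack([ld, gb, fb])
    weights, *_ = np.linalg.lstsq(X, babip, rcond=None)
    predicted = X @ weights
    r = np.corrcoef(predicted, babip)[0, 1]
    return weights, r

# Synthetic placeholder data (NOT real player stats), only to show the call.
rng = np.random.default_rng(0)
n = 200
ld = rng.uniform(0.15, 0.25, n)
gb = rng.uniform(0.35, 0.55, n)
fb = 1.0 - ld - gb
made_up_weights = np.array([0.75, 0.25, 0.13])  # assumption for the demo
babip = np.column_stack([ld, gb, fb]) @ made_up_weights + rng.normal(0, 0.02, n)

weights, r = fit_expected_babip(ld, gb, fb, babip)
print("weights (LD, GB, FB):", weights)
print("correlation:", r)
```

Because the three rates sum to 1, fitting them without an intercept carries the same information as fitting two of them with an intercept; the weights just read more directly as "hit value per batted-ball type".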
|
#2
|
||||
|
||||
|
You are of course correct that using just LD% is too simple. The main reason it is used as a rough rule of thumb is that the other trajectories tend to be more stable. GB% and FB% don't vary as much for an individual hitter over time as LD% does (LD% is more subject to luck-driven fluctuation).
|
|
#3
|
|||
|
|||
|
That's another thing I was wondering about.
What's the year-to-year correlation for each of the three? LD% doesn't correlate well, I'd assume.
|
#4
|
||||
|
||||
|
LD% tends to hop around a lot more because LDs are rare compared to GB and FB. GB and FB tend to stay in proportion to each other unless the batter is going through a change in his hitting style (or the pitcher a change in his pitching style).
Jose Vidro this year has gone from being an average hitter in terms of GB/FB to an extreme GB hitter (a conscious choice on his part to combat the problems Safeco Field causes for flyball hitters). But generally GB/FB is pretty stable.
|
#5
|
|||
|
|||
|
Quote:
Thanks for your response! I am working on something and will give you credit.
|
#6
|
|||
|
|||
|
Both I believe
|
|
#7
|
||||
|
||||
|
Quote:
What exactly are you working on?
__________________
2009 World Series Champions, The New York Yankees |
|
#8
|
|||
|
|||
|
I was just putting together a "sabermetric spreadsheet" to help fantasy players. It'll have BaseRuns, PrOPS, BABIP, etc. in one easy place.
As a total non-math person who has to rely on Excel's CORREL and LINEST functions to do everything for me, I'm finding what I'm seeing to be very interesting. Here are the correlation coefficients I'm getting between various formulae and BABIP (of batters with 50 or more PA in 2007, other than pitchers).

I'll use these abbreviations:
FB% = flyball rate, including both popups and homers (in other words, [1-LD%-GB%])
OFFB% = outfield fly ball rate, not including popups but including homers
InPlayOFFB% = in-play outfield fly ball rate, not including popups or homers
HR/F = percentage of flyballs that are homers

Each line below gives the correlation with BABIP, then the formula that produced it:
.427 = LD% + .12
.464 = (.743 * LD%) + (.251 * GB%) + (.137 * FB%)
.465 = (.569 * LD%) + (.075 * GB%) - (.058 * InPlayOFFB%) + .179
.470 = (.756 * LD%) + (.247 * GB%) + (.166 * InPlayOFFB%)
.490 = (.716 * LD%) + (.246 * GB%) + (.174 * OFFB%)
.504 = (.958 * LD%) + (.491 * GB%) + (.452 * OFFB%) - .2468
.535 = (.715 * LD%) + (.247 * GB%) + (.132 * InPlayOFFB%) + (.214 * HR/F)
.546 = (.922 * LD%) + (.453 * GB%) + (.386 * InPlayOFFB%) + (.306 * HR/F) - .214

So in other words, I'm getting substantially better results when I subtract the homers out of the flyballs, and then add them back in separately under a different weight -- even though the homers themselves are of course not included in the BABIP. Does this finding make sense? Are the guys with high HR/F ratios also hitting more doubles as opposed to outfield outs, or something like that?
Last edited by Mean Dean; 11-13-2007 at 09:04 PM.
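To make the formulas in this post easier to compare, here is a small sketch that evaluates three of them (the LD% + .12 rule, the three-rate fit, and the best-correlating HR/F version) for a single batter. The coefficients come straight from the post; the function names and the batter's rates are hypothetical and only there to show the arithmetic.

```python
def xbabip_ld_only(ld):
    """LD% + .12 rule of thumb (correlation .427 in the post)."""
    return ld + 0.12

def xbabip_three_rate(ld, gb, fb):
    """Three-rate fit (correlation .464 in the post)."""
    return 0.743 * ld + 0.251 * gb + 0.137 * fb

def xbabip_hr_split(ld, gb, in_play_offb, hr_per_f):
    """Fit with homers split out of fly balls (correlation .546 in the post)."""
    return (0.922 * ld + 0.453 * gb + 0.386 * in_play_offb
            + 0.306 * hr_per_f - 0.214)

# Hypothetical batter, rates as fractions (not a real player).
ld, gb = 0.20, 0.44
fb = 1.0 - ld - gb          # FB% as defined in the post: 1 - LD% - GB%
in_play_offb = 0.28         # assumed value for illustration
hr_per_f = 0.10             # assumed value for illustration

print(round(xbabip_ld_only(ld), 4))                                # 0.32
print(round(xbabip_three_rate(ld, gb, fb), 4))                     # 0.3084
print(round(xbabip_hr_split(ld, gb, in_play_offb, hr_per_f), 4))   # 0.3084
```

For this made-up batter the two multi-rate formulas land close together; the difference between them only shows up across a population, which is what the correlation figures are measuring.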
|
#9
|
|||
|
|||
|
Update: Trying to factor in speed, I thought triples per plate appearance might be a good measure. I got a correlation of .591 between BABIP and this formula:
(.865 * LD%) + (.446 * GB%) + (.449 * InPlayOFFB%) + (.267 * HR/F) + (1.609 * (3B/PA)) - .232
I also tried doing it with Bill James' Speed Score, but it's a lot more complicated to figure out and the results weren't any better. I am totally just throwing things at the wall, but I do think I'm getting somewhere regardless!
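As a rough illustration of how much the triples term moves the estimate, here is a sketch of the formula above as a function. The coefficients are the ones from this post; the function name and the sample batter are hypothetical.

```python
def xbabip_with_triples(ld, gb, in_play_offb, hr_per_f, triples, pa):
    """Expected BABIP using the post's fit with a 3B/PA speed term."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    return (0.865 * ld + 0.446 * gb + 0.449 * in_play_offb
            + 0.267 * hr_per_f + 1.609 * (triples / pa) - 0.232)

# Hypothetical batter (not a real player): same rates, with and without triples.
fast = xbabip_with_triples(0.20, 0.44, 0.28, 0.10, triples=6, pa=600)
slow = xbabip_with_triples(0.20, 0.44, 0.28, 0.10, triples=0, pa=600)
print(round(fast, 5))         # 0.30575
print(round(slow, 5))         # 0.28966
print(round(fast - slow, 5))  # 0.01609
```

Under this fit, a hypothetical 6 triples in 600 PA is worth about 16 points of expected BABIP relative to none.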
|
#10
|
|||
|
|||
|
Baseball Prospectus stats list pop-ups separately from fly balls, which would help fine-tune the formula, as pop-ups have a virtually zero percent chance of being a hit, compared to outfield flies.
One other problem with line drive data is the judgment of the scorer on what is a line drive and what is a fly ball. Using data from minorleaguesplits.dom I did find park factors of up to 20% for line drives, which virtually disappeared with LD+FB. I would suggest normalizing LD data by park (scorer) to minimize potential bias.
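One simple way to do that normalization, sketched under stated assumptions: divide the raw LD% by an effective park factor that blends the home park's LD factor with a neutral 1.0 for road games. The function name, the 50/50 home/road split, and treating road parks as neutral are all assumptions for illustration, not an established method from this thread.

```python
def park_adjusted_ld_rate(ld_rate, park_factor, home_share=0.5):
    """Scale LD% by a park (scorer) factor.

    park_factor: LD factor for the home park, 1.0 = neutral
                 (1.20 would mean 20% more LDs scored there).
    home_share:  fraction of the batter's batted balls hit at home
                 (assumed 0.5 by default; road parks assumed neutral).
    """
    if park_factor <= 0:
        raise ValueError("park_factor must be positive")
    effective = home_share * park_factor + (1.0 - home_share) * 1.0
    return ld_rate / effective

# Hypothetical batter with a 22% raw LD rate in a park with a 1.20 LD factor.
print(round(park_adjusted_ld_rate(0.22, 1.20), 4))  # 0.2
```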
|
#11
|
||||
|
||||
|
That's an interesting (and important) point. The scoring of what's a LD and what's a fly ball seems to be susceptible to who's doing the scoring. Personally, I think whether it's a LD or a FB should be scientifically derived from how long it was in the air as a ratio to how far away it landed...both of which wouldn't be that hard to start calculating for each batted ball.
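A minimal sketch of that idea: compute hang time divided by landing distance and call anything under a cutoff a line drive. The cutoff is passed in by the caller because no agreed threshold is given in this thread; the 0.012 s/ft used in the example is a made-up value, as are the function names and sample balls.

```python
def flight_ratio(hang_time_s, distance_ft):
    """Seconds in the air per foot travelled (lower = flatter trajectory)."""
    if distance_ft <= 0:
        raise ValueError("distance_ft must be positive")
    return hang_time_s / distance_ft

def classify_air_ball(hang_time_s, distance_ft, ld_max_ratio):
    """Label a ball in the air 'LD' or 'FB' using a caller-supplied cutoff."""
    ratio = flight_ratio(hang_time_s, distance_ft)
    return "LD" if ratio <= ld_max_ratio else "FB"

CUTOFF = 0.012  # hypothetical threshold in s/ft, for illustration only

print(classify_air_ball(2.6, 250.0, CUTOFF))  # LD
print(classify_air_ball(3.9, 250.0, CUTOFF))  # FB
```

The point of the design is that the same two measurements give the same label no matter who is scoring; only the cutoff is a judgment call, and it is made once.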
|
|
#12
|
||||
|
||||
|
Stopwatches are too low-tech apparently.
__________________
Author of THE BOOK -- Playing The Percentages In Baseball |
|
#13
|
|||
|
|||
|
Quote:
(1 - HR/F) * (1 - IF/F) * (1 - LD% - GB%)
Quote:
And wouldn't know how to. (Is it possible, though, that the LD/FB determinations for major league games are at least more consistent than for minor league ones, with more scorers involved?)
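The expression at the top of this post can be written as a small function. This is only a sketch of the expression as posted; the function name and sample rates are hypothetical, and whether the product form is exact depends on how a given data source defines the HR/F and IF/F denominators.

```python
def in_play_offb_rate(ld, gb, hr_per_f, if_per_f):
    """In-play outfield fly rate: (1 - HR/F) * (1 - IF/F) * (1 - LD% - GB%)."""
    fb = 1.0 - ld - gb  # all fly balls, including popups and homers
    return (1.0 - hr_per_f) * (1.0 - if_per_f) * fb

# Hypothetical rates (not a real player).
print(round(in_play_offb_rate(0.20, 0.44, 0.10, 0.10), 4))  # 0.2916
```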
|
#14
|
|||
|
|||
|
Quote:
|
|
#15
|
|||
|
|||
|
I'm with you on this one even if no one else is, Tango.
|
|
#16
|
||||
|
||||
|
Too hard to implement a stopwatch system. You'd need a guy with a REALLY fast reaction time to be clicking the button in perfect time to the flight of the ball and you'd need one of those guys watching every game in baseball.
|
|
#17
|
|||
|
|||
|
Do players with great SPEED tend to have BABIP a bit higher than guys who take 5-seconds to get down to 1st?????
|
|
#18
|
|||
|
|||
|
I could be wrong, but I would definitely think guys with speed would have higher BABIPs, just based on the fact that they'd beat out a bunch of hits that others wouldn't.
|
|
#19
|
||||
|
||||
|
Quote:
HittrackerOnline.com has no problems recording this for HR.
|
#20
|
|||
|
|||
|
Quote:
But even a stopwatch in hand would be doable. We use stopwatches all the time in my work to time mating bouts among butterflies, and those can be as short as a half-second or so. There's measurement error, of course, but using a stopwatch is certainly more precise than a subjective "soft", "medium," "hard" classification scheme.
I know you folks have asked him about this before, but I submitted a question for John Dewan about this in a recent interview. Figured it didn't hurt to keep bringing it up. He didn't seem particularly enthusiastic about it. Here's his response:
"A: We're not using a stop watch, if that's what you mean. But we factor in virtually the same thing by utilizing both the speed (soft, medium, hard) and the type (bunt, fly, liner, fliner, grounder) of batted ball. Can we get more precise at some point? Probably."
-Justin
|
#21
|
||||
|
||||
|
And I don't know about mating butterflies, but you can get pretty good at hitting the stopwatch upon contact of the ball, since you can see the release of the object from the pitcher's hand. It's not like it's a starter's gun at a race, where you are reacting to something. In short, if a batter can time his swing to make contact with a pitch, we should be able to time our fingers to make the same contact. So, we'll be off by 0.1 or 0.2 seconds. I can live with that, if it means I know if a ball was in the air for 3.9 or 2.6 seconds.
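The arithmetic behind that last point can be checked in a few lines. This is a sketch that assumes each timing can be off by at most the stated error in either direction; the function name is hypothetical.

```python
def could_be_confused(t1, t2, max_error):
    """True if two measured times could come from the same true hang time,
    given each reading may be off by up to max_error seconds."""
    return abs(t1 - t2) <= 2 * max_error

# The 2.6 s and 3.9 s example, with the worst-case 0.2 s error per reading.
print(could_be_confused(2.6, 3.9, 0.2))  # False: a 1.3 s gap vs 0.4 s of slack
print(could_be_confused(2.6, 2.9, 0.2))  # True: a 0.3 s gap is within the slack
```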
|
#22
|
||||
|
||||
|
Video recording software is not a stopwatch, Tom.
Video recording software comes with a timer based on frames per second. At 30 or 60 FPS, you will get a pretty good idea of how long it took a ball to go from bat to fielder, absolutely. That's not a stopwatch...that's the method I'd propose using. It still requires a person to replay each game...every day...all 2600 games that get played in each season...and record the time-to-zone info for each batted ball. You got hundreds of thousands of dollars to hire people to sit and do that every year? |
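For a sense of the precision involved, here is a sketch of turning frame counts into a hang time. The function names and frame numbers are hypothetical; the only inputs are the frame at contact, the frame where the ball lands or is caught, and the recording's frame rate.

```python
def hang_time_from_frames(contact_frame, landing_frame, fps):
    """Seconds between bat contact and landing, from video frame indices."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    if landing_frame < contact_frame:
        raise ValueError("landing_frame must not precede contact_frame")
    return (landing_frame - contact_frame) / fps

def timing_resolution(fps):
    """Smallest time step the video can resolve, in seconds (one frame)."""
    return 1.0 / fps

# Hypothetical clip: contact at frame 120, ball lands at frame 237, 30 FPS.
print(round(hang_time_from_frames(120, 237, 30), 3))  # 3.9
print(round(timing_resolution(30), 4))                # 0.0333
print(round(timing_resolution(60), 4))                # 0.0167
```

Even at 30 FPS the per-frame resolution is a few hundredths of a second, so the limiting factor is the labor of tagging the frames, not the clock.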
|
#23
|
||||
|
||||
|
BIS does that very thing, minus the timer, for whatever reason.
|
#24
|
||||
|
||||
|
And then charges forty billion dollars to access that data.
You got hundreds of thousands of dollars to afford to hire people to get that information WITHOUT making that information inaccessible to me and most other sabermetricians? |
|
#25
|
||||
|
||||
|
BIS does make the data available for private use for a few hundred dollars.