Baseball Fever  

  #1  
Old 08-20-2007, 07:18 AM
TheAnswer1313
Registered User
 
Join Date: Oct 2006
Posts: 46
Expected BABIP

I see a lot of people using the formula LD% + .120. It makes sense: the more line drives you hit, the higher your BABIP should be. What I don't understand is why it includes only LD%. Given two players with the same LD rate, shouldn't the one who hits more GB than FB have a higher BABIP?

I might be doing this totally wrong, but I looked at all players with at least 300 plate appearances in 2006 and got a correlation of r = .50, I believe, between BABIP and the expected BABIP formula.

The best fit I got was the formula = .763LD% + .265GB% + .131FB%

I guess what I'm getting at is: wouldn't using all three rates give us a better expected BABIP than just using the LD rate?
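
To make the comparison concrete, here is a minimal Python sketch that evaluates both formulas from this post for two batters with the same LD rate. The batter lines are invented for illustration; only the coefficients come from the post, and the function names are hypothetical.

```python
def xbabip_ld_only(ld):
    """Rule-of-thumb expected BABIP: LD% + .120 (rates as fractions)."""
    return ld + 0.120


def xbabip_three_rates(ld, gb, fb):
    """Best-fit formula from this post: .763*LD% + .265*GB% + .131*FB%."""
    return 0.763 * ld + 0.265 * gb + 0.131 * fb


# Hypothetical batters: same LD rate, different GB/FB mix.
ground_ball_hitter = dict(ld=0.20, gb=0.50, fb=0.30)
fly_ball_hitter = dict(ld=0.20, gb=0.30, fb=0.50)

for name, b in [("GB hitter", ground_ball_hitter), ("FB hitter", fly_ball_hitter)]:
    print(name, round(xbabip_ld_only(b["ld"]), 3), round(xbabip_three_rates(**b), 3))
```

With these made-up inputs the LD-only formula gives both batters .320, while the three-rate formula gives the ground-ball hitter about .324 and the fly-ball hitter about .298.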
  #2  
Old 08-20-2007, 07:36 AM
SABR Matt
Hunter of Objective Truth
 
Join Date: May 2005
Location: Where all students live...nowhere.
Posts: 8,710
You are of course correct that using just LD% is too simple. The main reason it is used as a rough rule of thumb is that the other trajectories tend to be more stable. GB% and FB% don't vary as much for an individual hitter over time as LD% does (LD% is more subject to fluctuations of luck).
  #3  
Old 08-20-2007, 07:48 AM
TheAnswer1313
Registered User
 
Join Date: Oct 2006
Posts: 46
That's another thing I was wondering about.

What's the year-to-year correlation for each of the three? LD% doesn't correlate well, I'd assume.
  #4  
Old 08-20-2007, 07:54 AM
SABR Matt
Hunter of Objective Truth
 
Join Date: May 2005
Location: Where all students live...nowhere.
Posts: 8,710
LD% tends to hop around a lot more because LDs are rare compared to GB and FB. GB and FB tend to stay in proportion to each other unless the batter is going through a change in his hitting style (or the pitcher a change in his pitching style).

Jose Vidro this year has gone from being an average hitter in terms of GB/FB to an extreme GB hitter (a conscious choice on his part to combat the problems Safeco Field causes for flyball hitters). But generally GB/FB is pretty stable.
  #5  
Old 11-10-2007, 08:28 AM
Mean Dean
Registered User
 
Join Date: Nov 2007
Posts: 15
Quote:
Originally Posted by TheAnswer1313
The best fit I got was the formula = .763LD% + .265GB% + .131FB%
Is this FB% counting homers, popups, both or neither?

Thanks for your response! I am working on something and will give you credit
  #6  
Old 11-10-2007, 04:14 PM
TheAnswer1313
Registered User
 
Join Date: Oct 2006
Posts: 46
Both, I believe.
  #7  
Old 11-10-2007, 05:36 PM
Mariano_Rivera
Joba Rules
 
Join Date: Mar 2006
Posts: 5,836
Quote:
Originally Posted by Mean Dean View Post
Is this FB% counting homers, popups, both or neither?

Thanks for your response! I am working on something and will give you credit
FB% counts all FBs, including HRs, popups, and outfield flyballs.

What exactly are you working on?
  #8  
Old 11-10-2007, 06:29 PM
Mean Dean
Registered User
 
Join Date: Nov 2007
Posts: 15
I was just putting together a "sabermetric spreadsheet" to help fantasy players. It'll have BaseRuns, PrOPS, BABIP, etc. in one easy place.

As a total non-math person who has to rely on Excel's CORREL and LINEST functions to do everything for me, I'm finding what I'm seeing to be very interesting. Here are the correlation coefficients I'm getting between various formulae and BABIP (for batters with 50 or more PA in 2007, excluding pitchers). I'll use these abbreviations:

FB% for flyball rate, including both popups and homers (in other words, [1-LD%-GB%])
OFFB% for outfield fly ball rate, not including popups but including homers
InPlayOFFB% for in-play outfield fly ball rate, not including popups or homers
HR/F for percentage of flyballs that are homers

.427 = LD% + .12
.464 = (.743 * LD%) + (.251 * GB%) + (.137 * FB%)
.465 = (.569 * LD%) + (.075 * GB%) - (.058 * InPlayOFFB%) + .179
.470 = (.756 * LD%) + (.247 * GB%) + (.166 * InPlayOFFB%)
.490 = (.716 * LD%) + (.246 * GB%) + (.174 * OFFB%)
.504 = (.958 * LD%) + (.491 * GB%) + (.452 * OFFB%) - .2468
.535 = (.715 * LD%) + (.247 * GB%) + (.132 * InPlayOFFB%) + (.214 * HR/F)
.546 = (.922 * LD%) + (.453 * GB%) + (.386 * InPlayOFFB%) + (.306 * HR/F) - .214

So in other words, I'm getting substantially better results when I subtract the homers out of the flyballs, and then add them back in separately under a different weight -- even though the homers themselves are of course not included in the BABIP.

Does this finding make sense? Are the guys with high HR/F ratios also hitting more doubles as opposed to outfield outs, or something like that?
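
For readers who want to reproduce the comparison outside Excel, here is a rough Python sketch of the scoring step: compute a formula's predictions, then take the Pearson correlation with actual BABIP, which is the quantity CORREL returns. The batter rows are made up and are not real 2007 data; the function names are hypothetical, and the weights are copied from the .535 line above rather than re-fitted (LINEST would do the fitting).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient, the quantity Excel's CORREL returns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def xbabip(ld, gb, inplay_offb, hr_per_fb):
    """.715*LD% + .247*GB% + .132*InPlayOFFB% + .214*HR/F (from the list above)."""
    return 0.715 * ld + 0.247 * gb + 0.132 * inplay_offb + 0.214 * hr_per_fb


# Made-up rows: (LD%, GB%, InPlayOFFB%, HR/F, actual BABIP). Not real players.
batters = [
    (0.18, 0.45, 0.28, 0.08, 0.290),
    (0.22, 0.40, 0.27, 0.15, 0.325),
    (0.20, 0.50, 0.22, 0.05, 0.310),
    (0.16, 0.38, 0.33, 0.12, 0.275),
    (0.24, 0.44, 0.24, 0.10, 0.340),
]
predicted = [xbabip(ld, gb, offb, hrf) for ld, gb, offb, hrf, _ in batters]
actual = [row[4] for row in batters]
print(round(pearson_r(predicted, actual), 3))
```

Swapping in a different formula for `xbabip` and rerunning on the same rows gives a like-for-like comparison of the kind listed above.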

Last edited by Mean Dean; 11-13-2007 at 09:04 PM.
  #9  
Old 11-13-2007, 09:02 PM
Mean Dean
Registered User
 
Join Date: Nov 2007
Posts: 15
Update: Trying to factor in speed, I thought triples per plate appearance might be a good measure. I got a correlation of .591 between BABIP and this formula:

(.865 * LD%) + (.446 * GB%) + (.449 * InPlayOFFB%) + (.267 * HR/F) + (1.609 * (3B/PA)) - .232

I also tried doing it with Bill James' Speed Score, but it's a lot more complicated to figure out and the results weren't any better.

I am totally just throwing things at the wall, but I do think I'm getting somewhere regardless!
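
As a quick illustration of how much the triples term moves the estimate, here is a small Python sketch of the formula above. The batter line is hypothetical and the function name is an assumption; only the coefficients come from this post.

```python
def xbabip_with_speed(ld, gb, inplay_offb, hr_per_fb, triples, pa):
    """Formula from this post, with triples per plate appearance as a speed proxy."""
    return (0.865 * ld + 0.446 * gb + 0.449 * inplay_offb
            + 0.267 * hr_per_fb + 1.609 * (triples / pa) - 0.232)


# Hypothetical batter lines, identical except for the number of triples.
slow = xbabip_with_speed(0.20, 0.45, 0.25, 0.10, triples=1, pa=600)
fast = xbabip_with_speed(0.20, 0.45, 0.25, 0.10, triples=10, pa=600)
print(round(slow, 3), round(fast, 3), round(fast - slow, 3))
```

Under this formula, nine extra triples in 600 PA raise the expected BABIP by about .024, all else equal.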
  #10  
Old 11-13-2007, 10:30 PM
StillFlash
Registered User
 
Join Date: Aug 2007
Location: Johnstiwn Pa
Posts: 123
Baseball Prospectus stats list pop-ups separately from fly balls, which would help fine-tune the formula, as pops have virtually zero chance of being a hit, compared to outfield flies.

One other problem with line drive data is the judgment of the scorer on what is a line drive and what is a fly ball. Using data from minorleaguesplits.com, I did find park factors of up to 20% for line drives, which virtually disappeared with LD+FB. I would suggest normalizing LD data for park (scorer) to minimize a potential bias.
  #11  
Old 11-13-2007, 11:01 PM
SABR Matt
Hunter of Objective Truth
 
Join Date: May 2005
Location: Where all students live...nowhere.
Posts: 8,710
That's an interesting (and important) point. The scoring of what's a LD and what's a fly ball seems to be susceptible to who's doing the scoring. Personally, I think whether it's a LD or a FB should be scientifically derived from how long it was in the air as a ratio to how far away it landed...both of which wouldn't be that hard to start calculating for each batted ball.
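
A toy Python sketch of that idea follows. It computes the ratio of hang time to landing distance and labels the ball from it. The cutoff value, the function names, and the sample batted balls are all hypothetical placeholders; nothing here is a calibrated or published definition.

```python
def hang_time_per_foot(hang_time_s, distance_ft):
    """Seconds in the air per foot of distance traveled."""
    return hang_time_s / distance_ft


def classify_trajectory(hang_time_s, distance_ft, cutoff=0.012):
    """Label a batted ball 'LD' or 'FB' from its hang-time-to-distance ratio.

    The cutoff is a hypothetical placeholder, not a calibrated value.
    """
    ratio = hang_time_per_foot(hang_time_s, distance_ft)
    return "FB" if ratio > cutoff else "LD"


# Invented examples: a liner landing 250 ft away after 2.0 s,
# and a fly ball landing 300 ft away after 4.5 s.
print(classify_trajectory(2.0, 250))
print(classify_trajectory(4.5, 300))
```

The point of a rule like this is only that the same two measurements always produce the same label, whoever is scoring.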
  #12  
Old 11-14-2007, 07:16 AM
Tango Tiger
Registered User
 
Join Date: Mar 2006
Posts: 2,015
Stopwatches are too low-tech apparently.
  #13  
Old 11-14-2007, 09:12 AM
Mean Dean
Registered User
 
Join Date: Nov 2007
Posts: 15
Quote:
Originally Posted by StillFlash View Post
Baseball Prospectus stats list pop-ups separately from fly balls, which would help fine-tune the formula, as pops have virtually zero chance of being a hit, compared to outfield flies.
Right; I already took them out. Or in other words, my InPlayOFFB% stat is:

(1 - HR/F) * (1 - IF/F) * (1 - LD% - GB%)
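
Written out as a small Python function, the definition above looks like this. The function name and the sample inputs are assumptions for illustration; the sketch follows the formula exactly as posted, which multiplies the two factors rather than subtracting both shares at once (1 - HR/F - IF/F), so the two versions differ slightly.

```python
def in_play_offb_rate(ld, gb, hr_per_fb, if_per_fb):
    """InPlayOFFB% = (1 - HR/F) * (1 - IF/F) * (1 - LD% - GB%), as posted."""
    fb = 1.0 - ld - gb  # all fly balls, including popups and homers
    return (1.0 - hr_per_fb) * (1.0 - if_per_fb) * fb


# Hypothetical inputs, all expressed as fractions.
print(round(in_play_offb_rate(ld=0.20, gb=0.45, hr_per_fb=0.10, if_per_fb=0.12), 4))
```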

Quote:
One other problem with line drive data is the judgment of the scorer on what is a line drive and what is a fly ball. Using data from minorleaguesplits.com, I did find park factors of up to 20% for line drives, which virtually disappeared with LD+FB. I would suggest normalizing LD data for park (scorer) to minimize a potential bias.
That, I did not do. And wouldn't know how to.

(Is it possible, though, that the LD/FB determinations for major league games are at least more consistent than for minor league ones, with more scorers involved?)
  #14  
Old 11-14-2007, 03:03 PM
StillFlash
Registered User
 
Join Date: Aug 2007
Location: Johnstiwn Pa
Posts: 123
Quote:
One other problem with line drive data is the judgment of the scorer on what is a line drive and what is a fly ball. Using data from minorleaguesplits.com, I did find park factors of up to 20% for line drives, which virtually disappeared with LD+FB. I would suggest normalizing LD data for park (scorer) to minimize a potential bias.
Quote:
Originally Posted by Mean Dean View Post
That, I did not do. And wouldn't know how to.
(Is it possible, though, that the LD/FB determinations for major league games are at least more consistent than for minor league ones, with more scorers involved?)
You would calculate park effect on LD% & FB% the same as other stats - but first you have to know the LD% at home and on the road for each team. minorleaguesplits.com spiders their data from the MILB.com website, and uses their own pattern recognition software to analyze the play by play narrative for each game. MLB.com gameday uses the same data format as the minor league version. Unfortunately, I have not yet seen anyone who has calculated major league home/road for those particular percentages.
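
A rough Python sketch of that calculation, assuming the simplest form of a park factor: the LD rate recorded in a team's home games divided by the rate recorded in its road games, counting both teams' batted balls. The counts are invented, the function names are hypothetical, and the half-home-games adjustment is a simplifying assumption rather than an established method from this thread.

```python
def rate(line_drives, batted_balls):
    """Line drives as a fraction of batted balls."""
    return line_drives / batted_balls


def ld_park_factor(home_ld, home_bip, road_ld, road_bip):
    """Home LD% divided by road LD%. Values above 1.0 suggest the home
    park (or its scorer) records more line drives than other parks do."""
    return rate(home_ld, home_bip) / rate(road_ld, road_bip)


def park_adjusted_ld(raw_ld, park_factor):
    """Assume half of a batter's games are at home, so only half of the
    park effect applies to his season rate."""
    return raw_ld / ((1.0 + park_factor) / 2.0)


# Invented counts: 960 LD in 4000 batted balls at home, 800 in 4000 on the road.
pf = ld_park_factor(home_ld=960, home_bip=4000, road_ld=800, road_bip=4000)
print(round(pf, 2), round(park_adjusted_ld(0.22, pf), 3))
```

With these made-up counts the park factor is 1.20, in line with the 20% effects mentioned earlier in the thread, and a raw .220 LD rate is scaled down to .200.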
  #15  
Old 11-14-2007, 09:53 PM
mikefast
Registered User
 
Join Date: Aug 2007
Location: Austin, TX
Posts: 117
Quote:
Originally Posted by Tango Tiger View Post
Stopwatches are too low-tech apparently.
I'm with you on this one even if no one else is, Tango.
  #16  
Old 11-14-2007, 10:37 PM
SABR Matt
Hunter of Objective Truth
 
Join Date: May 2005
Location: Where all students live...nowhere.
Posts: 8,710
Too hard to implement a stopwatch system. You'd need a guy with a REALLY fast reaction time to be clicking the button in perfect time to the flight of the ball and you'd need one of those guys watching every game in baseball.
  #17  
Old 11-15-2007, 03:12 AM
Don Quixote
Registered User
 
Join Date: Jul 2007
Posts: 26
Do players with great speed tend to have a BABIP a bit higher than guys who take 5 seconds to get down to first?
  #18  
Old 11-15-2007, 06:18 AM
TheAnswer1313
Registered User
 
Join Date: Oct 2006
Posts: 46
I could be wrong, but I would definitely think guys with speed would have higher BABIPs, just based on the fact that they'd beat out a bunch of hits that others wouldn't.
  #19  
Old 11-15-2007, 08:11 AM
Tango Tiger
Registered User
 
Join Date: Mar 2006
Posts: 2,015
Quote:
Originally Posted by SABR Matt View Post
Too hard to implement a stopwatch system. You'd need a guy with a REALLY fast reaction time to be clicking the button in perfect time to the flight of the ball and you'd need one of those guys watching every game in baseball.
No you don't. You can record this off the video recording.

HittrackerOnline.com has no problems recording this for HR.
  #20  
Old 11-15-2007, 12:22 PM
jinaz
Basement-Dwelling Blogger
 
Join Date: Oct 2007
Posts: 52
Quote:
Originally Posted by Tango Tiger View Post
No you don't. You can record this off the video recording.

HittrackerOnline.com has no problems recording this for HR.
In fairness, the "hang time" for a home run is quite a bit longer than the hang time for a line drive, so it's a lot easier to record. But I agree that it should be possible to do this without too much trouble, especially if you attach a reasonably high-resolution clock to your playback software and then do things frame-by-frame.

But even a stopwatch in hand would be doable. We use stopwatches all the time in my work to time mating bouts among butterflies, and those can be as short as a half-second or so. There's measurement error, of course, but using a stopwatch is certainly more precise than a subjective "soft", "medium," "hard" classification scheme.

I know you folks have asked him about this before, but I submitted a question for John Dewan about this in a recent interview. Figured it didn't hurt to keep bringing it up. He didn't seem particularly enthusiastic about it. Here's his response:
"A: We're not using a stop watch, if that's what you mean. But we factor in virtually the same thing by utilizing both the speed (soft, medium, hard) and the type (bunt, fly, liner, fliner, grounder) of batted ball. Can we get more precise at some point? Probably."
-Justin
  #21  
Old 11-15-2007, 01:50 PM
Tango Tiger
Registered User
 
Join Date: Mar 2006
Posts: 2,015
And I don't know about mating butterflies, but you can get pretty good at hitting the stopwatch upon contact of the ball, since you can see the release of the object from the pitcher's hand. It's not like it's a starter's gun at a race, where you are reacting to something. In short, if a batter can time his swing to make contact with a pitch, we should be able to time our fingers to make the same contact. So, we'll be off by 0.1 or 0.2 seconds. I can live with that, if it means I know if a ball was in the air for 3.9 or 2.6 seconds.
  #22  
Old 11-15-2007, 01:50 PM
SABR Matt
Hunter of Objective Truth
 
Join Date: May 2005
Location: Where all students live...nowhere.
Posts: 8,710
Video recording software is not a stopwatch, Tom.

Video recording software comes with a timer based on frames per second. At 30 or 60 FPS, you will get a pretty good idea of how long it took a ball to go from bat to fielder, absolutely. That's not a stopwatch...that's the method I'd propose using. It still requires a person to replay each game...every day...all 2600 games that get played in each season...and record the time-to-zone info for each batted ball. You got hundreds of thousands of dollars to hire people to sit and do that every year?
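
The frame-based timing described here reduces to simple arithmetic, sketched below in Python. The frame numbers are invented, the function names are hypothetical, and the one-frame-per-endpoint error is an assumption about how precisely contact and landing can be picked out on video.

```python
def hang_time_seconds(contact_frame, landing_frame, fps):
    """Elapsed time between two video frames at a constant frame rate."""
    return (landing_frame - contact_frame) / fps


def timing_resolution(fps):
    """Assume each endpoint can be off by up to one frame, so the elapsed
    time is uncertain by roughly two frame durations in the worst case."""
    return 2.0 / fps


# Hypothetical frame numbers read off a 30 FPS recording.
t = hang_time_seconds(contact_frame=1200, landing_frame=1278, fps=30)
print(round(t, 2), round(timing_resolution(30), 3))
```

At 30 FPS that worst-case uncertainty is under a tenth of a second, which is small next to the gap between a 2.6-second and a 3.9-second batted ball.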
  #23  
Old 11-15-2007, 02:05 PM
Tango Tiger
Registered User
 
Join Date: Mar 2006
Posts: 2,015
BIS does that very thing, minus the timer, for whatever reason.
  #24  
Old 11-15-2007, 02:47 PM
SABR Matt
Hunter of Objective Truth
 
Join Date: May 2005
Location: Where all students live...nowhere.
Posts: 8,710
And then charges forty billion dollars to access that data.

You got hundreds of thousands of dollars to afford to hire people to get that information WITHOUT making that information inaccessible to me and most other sabermetricians?
  #25  
Old 11-15-2007, 04:24 PM
Tango Tiger
Registered User
 
Join Date: Mar 2006
Posts: 2,015
BIS does make the data available for private use for a few hundred dollars.

All times are GMT -7.

Copyright © 2000-2008. All Rights Reserved.