Results 21 to 40 of 47

Thread: question about Extrapolated Runs

  1. #21
    It looks to me that he went half-way between logic/empirical and regression. If you are looking to lower your RMSE on the data you are basing your equation on, then regression will certainly help you... but it won't help with the out-of-sample data, where logic will win.

  2. #22
    So, here's what I did... I took the HG weights and scaled them proportionately, so that the difference between the sums of the weighted values for 1b/2b/3b/hr in XR and LWTS was zeroed out for the 1955-1997 time period. Here are the values I ended up with...

    1B: .48
    2B: .80
    3B: 1.11
    HR: 1.43

    I then substituted these weights in the XR formula and re-ran the tests that Furtado had originally done in the 1999 BBBA. The results...

    The revised XR formula suffered degradation from an RMSE of 20.9 to 21.3 for the entire 1955-1997 time period. Likewise, there was a comparable drop in accuracy for each of the individual decade tests 1955-59, 1960's, 1970's, 1980's and 1990-97.

    What does it all mean? Probably not very much, but it does seem to suggest that there was method to the madness in deriving the original XR weights, as strange as they seem. If nothing else, XR does give us a level of accuracy to shoot for in measuring run estimators for teams in the normal range.

  3. #23
    No, you are wrong.

    All the regression does is find the best fit for the data in the sample; it doesn't hold up out of sample. I once took the Red Sox data and ended up with a regression where the run value of the triple was higher than that of the HR.

    What you have to do is derive the components in your sample, and test it in the out-of-sample.
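    A minimal sketch of that procedure, assuming ordinary least squares via NumPy: fit the weights on one half of the data, then measure RMSE on the other half. The data here is synthetic and generated from assumed weights purely so the example runs; it is not real team data, and the column layout (1B, 2B, 3B, HR, outs) is an illustrative choice.

```python
import numpy as np

def fit_weights(events, runs):
    """Ordinary least squares with no intercept: runs ~ events @ weights."""
    weights, *_ = np.linalg.lstsq(events, runs, rcond=None)
    return weights

def rmse(events, runs, weights):
    """Root mean squared error of predicted vs. actual runs."""
    return float(np.sqrt(np.mean((events @ weights - runs) ** 2)))

# Synthetic, illustrative team-seasons only (NOT real baseball data).
rng = np.random.default_rng(0)
assumed_weights = np.array([0.47, 0.78, 1.09, 1.40, -0.10])  # 1B, 2B, 3B, HR, out
events = rng.poisson(lam=[950, 250, 35, 130, 4100], size=(400, 5)).astype(float)
runs = events @ assumed_weights + rng.normal(0, 20, size=400)

# Derive the weights in the sample, then test them out of sample.
train, test = slice(0, 200), slice(200, 400)
fitted = fit_weights(events[train], runs[train])
in_sample_rmse = rmse(events[train], runs[train], fitted)
out_of_sample_rmse = rmse(events[test], runs[test], fitted)
print(fitted, in_sample_rmse, out_of_sample_rmse)
```

    By construction the fitted weights cannot lose to any other weights on the half they were fitted to, which is why only the out-of-sample number tells you anything.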

    Otherwise, why not go all the way, and use just regression, where you'll end up with a run value of .66 for the double?

  4. #24

    See tables 2 and 4 above.

    If you are looking for the lowest RMSE, then you want the run value of the double as .63, just 0.15 runs more than a single!

    The only way to test is to develop the equation in your sample, and test out-of-sample.

    However, note that each coefficient in the regression itself has a fairly wide confidence interval.

    Finally, why test on aggregated team data, where you'll only have a few thousand data points, when you can test at the game or even inning level, where you will have tens of thousands of data points or more?

  5. #25
    Quote: Originally posted by Tango Tiger
    If you are looking for the lowest RMSE, then you want the run value of the double as .63, just 0.15 runs more than a single!
    What is it about regression that would give rise to the undervaluing of the double - assuming there are enough data points to yield highly significant coefficient estimates?

    Is it that regression is estimating the average value of each game event, whereas you're interested in the marginal values in a given offensive environment?

    Quote: Originally posted by Tango Tiger
    Finally, why test on aggregated team data, where you'll only have a few thousand data points, when you can test at the game or even inning level, where you will have tens of thousands of data points or more?
    I've been looking at generating weights for the Israel Baseball League, with only 6 teams and 41 games each. To get meaningful results, I have in fact gone down to the inning level, which gives me over 1600 independent data points. Since run scoring actually takes place on the inning level, not the game level, I expect it should also yield more accurate estimates of event values, since an event in inning 1 shouldn't be able to create runs in inning 2.
    The blog for Israel Baseball League analysis

  6. #26
    The point is that a few thousand data points don't give you a "significant" estimate, as you are trying to imply. The 95% confidence interval for the .62 run value of a double is probably something like .62 +/- .20 runs. Couple that with the .47 run value of the single (with its smaller interval), and it's statistically possible that the run value of the single is higher than that of the double, according to these regressions. (And possibly the triple higher than the HR.)

    The problem with team/season-level data is that you have two huge biases: park and team. Each "team" is not a random sample of games; it's the same players lumped together, so each team is biased by the identity of its players. On top of that, those players play half their games in the same park, which introduces another bias.

    It's basically a joke that we even bother to run regressions like this.

    As you correctly point out, runs are scored at the inning level. Why anyone would aggregate 1458 innings into a single data point is beyond me.

  7. #27
    Interesting. The article you linked to doesn't give confidence levels.

    For my IBL data, using a total of 1633 half-innings as separate data points, and the standard spreadsheet regression function, I get what appear to be significant coefficient values for all the non-rare game events. (Keep in mind that this is a much higher run scoring environment than the MLB.)

    Out: -.156 +- .010
    Single: .598 +- .016
    Double: .845 +- .035
    Triple: 1.221 +- .127
    HR: 1.435 +- .042
    Walk: .482 +- .018
    HBP: .520 +- .038
    Reach-on-error: .522 +- .062
    Error without reach: .302 +- .050
    Stolen base: .092 +- .026
    Caught stealing: -.281 +- .058
    GIDP: -.280 +- .056

    Obviously, some of those error terms are too high for comfort.

    I'm somewhat reassured, though, by the internal consistency among the figures: the same values for caught stealing and double plays, for example, and for hit-by-pitch and reach-on-error. The weight for triples may be dodgy, and I'm inclined to declare by fiat that a triple is the average of a double and a home run.

    A double comes out to .247 more than a single.
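    The arithmetic behind those two remarks, using only the coefficients listed above. The variable names are illustrative, and the "+-" figure is used exactly as quoted, without assuming what kind of error term it is.

```python
# Coefficients from the half-inning regression quoted above.
single, double, triple, home_run = 0.598, 0.845, 1.221, 1.435
triple_error_term = 0.127  # the "+-" figure quoted for the triple

# "Declare by fiat that a triple is the average of a double and a home run."
triple_by_fiat = (double + home_run) / 2

# Gap between the double and the single.
double_minus_single = double - single

# Does the fiat value fall inside the quoted +- range of the regression value?
within_error_term = abs(triple_by_fiat - triple) <= triple_error_term

print(round(triple_by_fiat, 3), round(double_minus_single, 3), within_error_term)
```

    The fiat value for the triple comes out to about 1.14, which sits inside the quoted range around 1.221.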

    Error terms for sacrifices and intentional walks are higher:

    Sac fly: .404 +- .069
    Sac hit: .105 +- .082
    Intentional walk (adjustment to walk values): -.168 +- .114

    Should I be suspicious about these numbers just because they come from an (admittedly imperfect) regression?

    On the other hand, the use of constant linear weights for game events is inherently inconsistent with the true nature of run creation. How much precision can we demand from a flawed method?

  8. #28
    Actually IBL...your values for the main offensive events look about right for a league with such a high run scoring rate.

  9. #29

    Looking at your weights, I'd be suspicious of how close the walk and the single are. Reached-on-error should be very close to, or higher than, a single, and not close to a HBP.

    How many runs per 27 outs does your league score? Looking at your weights, and my link above, I'll guess about 8?

  10. #30
    Quote: Originally posted by Tango Tiger
    No, you are wrong.
    Huh? I wasn't aware that I had actually said anything that I could be wrong about, as I was simply stating facts and the results of my experiment. But if you say so, then I probably am.

    My efforts here were just an exercise in curiosity. I wanted to see how much damage was done to the accuracy of XR by substituting the more conventional values into XR. I was optimistic that the difference would have been negligible, but I don't think I would call it that.

    I was working within the confines of the original tests more out of convenience for comparison (or maybe laziness). I don't disagree with anything you've said here about more effective ways to test the accuracy of estimators.

    Now about those 2B values...

    In "Curve Ball" there is a chapter on the development of event weights using Linear Regression. After starting with a series of weights simply from 1B,2B,3B,HR,BB and SB, they start adding other events into the mix. They are ultimately able to reduce the RMSE by adding in SF. But, it turns out that the SF has a positive weight greater than anything other than a 3B or HR, which is absolutely ludicrous. They suggest that it is most likely saying 1)something about the definition of the stat itself (i.e. the only official stat other than the HR that guarantees a run) or 2) something about the situational nature of the stat (runner on 3B, less than 2 out).

    Like the Iblemetrician, I too am curious about what the regression numbers might be telling us. I don't think that either 1) or 2) from above apply to the 2B. Perhaps there's something about the nature of teams that hit very few or very many 2B's. What do you think?

  11. #31
    I think there's a double-to-triple conversion going on that shows up in the regressions. Teams that hit few doubles might be more likely to have hit more triples (because what is a deep fly ball that is not a double? A HR or a 3B or an out), meaning it might actually be slightly more favorable to hit a few fewer doubles (reducing the run weight for them) and a few more triples (increasing the run rate for them).

  12. #32
    Bill, I was talking about "that there was method to the madness". In my opinion, it's just madness!


    As for the nature of teams that hit lots of doubles, repeat the study, going back to 1919-2007, and see what you get. Here, I'll do it, using only ab-h,1b,2b,3b,hr,bb against runs:
    1b: +.56
    2b: +.80
    3b: +1.50
    hr: +1.40
    bb: +.37
    out: -.12

    See what I mean? The 3B higher than a HR? The double value now shooting up to where it should be?

    The triple can be acting as a proxy for SB or baserunning, etc.


    As for the SF, you get a guaranteed run. It's like saying R=R. Technically speaking, you should be removing 1 run per HR and SF from the "R", since we know that's what those guys are worth at least.

  13. #33
    Good point about the reach-on-error value.

    As for runs per 27 outs, I get 7.85. So you got it pegged.

    Regarding 3B and HR, I get similar results when I do a game-by-game regression:

    1B: .609 +- .038
    2B: .802 +- .084
    3B: 1.369 +- .343
    HR: 1.222 +- .099
    BB: .518 +- .040
    Out: -.174 +- .020

    I had attributed it to two things: the small number of triples, and the triple proxying other park effects (parks with more triples have fewer home runs). But I don't have any evidence for that.

    I also notice now that the double coefficient is lower in those results.

  14. #34
    A triple is a double with speed (essentially, a "stolen base" while there's no batter at the plate), plus clears the bases of course.

    I find it more useful to do 2b+3b as "double", and simply force in a constant coefficient for the "stolen triple" of .30 runs. It'll save you some grief.
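    One way to read that suggestion, sketched with NumPy least squares: treat 2B+3B as a single "double" column, and force the .30 coefficient by subtracting .30 runs per triple from the run total before fitting. The function name, the choice of other columns, and the synthetic data are assumptions for illustration, not anyone's published method.

```python
import numpy as np

STOLEN_TRIPLE_BONUS = 0.30  # fixed by fiat, as suggested in the post

def fit_with_combined_doubles(singles, doubles, triples, homers, outs, runs):
    """Regress runs on 1B, (2B+3B), HR and outs, with the triple's extra
    value over a double held constant at STOLEN_TRIPLE_BONUS."""
    design = np.column_stack([singles, doubles + triples, homers, outs]).astype(float)
    # Remove the forced "stolen triple" runs so the regression never fits them.
    target = np.asarray(runs, dtype=float) - STOLEN_TRIPLE_BONUS * triples
    w_1b, w_2b, w_hr, w_out = np.linalg.lstsq(design, target, rcond=None)[0]
    return {"1B": w_1b, "2B": w_2b, "3B": w_2b + STOLEN_TRIPLE_BONUS,
            "HR": w_hr, "OUT": w_out}

# Synthetic, noiseless team lines (illustrative only) built from assumed weights.
rng = np.random.default_rng(1)
n = 60
singles = rng.integers(900, 1100, n)
doubles = rng.integers(220, 320, n)
triples = rng.integers(15, 60, n)
homers = rng.integers(90, 220, n)
outs = rng.integers(4000, 4300, n)
runs = 0.5 * singles + 0.8 * doubles + 1.1 * triples + 1.4 * homers - 0.1 * outs

weights = fit_with_combined_doubles(singles, doubles, triples, homers, outs, runs)
print(weights)
```

    The rare triple no longer gets a free coefficient of its own, so it cannot drift above the home run the way it did in the regressions earlier in the thread.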

  15. #35
    Back when I was trying to use regression to find linear weights (before I had PBP data) I had a whole series of cheats to get rid of rare events and find more stable weights for them.

    I literally counted doubles and triples as extra base hits and put the triples into the stolen base column too. I counted sac bunts as stolen bases and outs. Double plays as caught stealing plus out etc. My way of trying to reduce the number of event types to something the regression could support.
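    A rough sketch of that kind of recoding, under one reading of the post. The dict keys and the function name are hypothetical, and it assumes the OUT column is AB - H, so the batter's out on a double play is already counted there while a sacrifice bunt is not. The "etc." in the post is left unfilled.

```python
def collapse_rare_events(line):
    """Fold rare events into broader columns before running a regression.

    `line` is a dict of season totals. Assumption: line["OUT"] is AB - H.
    """
    # Pass through everything that is not being folded away.
    collapsed = {k: v for k, v in line.items()
                 if k not in ("2B", "3B", "SH", "GDP")}
    # Doubles and triples both count as extra-base hits.
    collapsed["XBH"] = line["2B"] + line["3B"]
    # Triples and sac bunts are also credited as stolen bases.
    collapsed["SB"] = line["SB"] + line["3B"] + line["SH"]
    # A double play is treated as a caught stealing plus the batter's out.
    collapsed["CS"] = line["CS"] + line["GDP"]
    # A sac bunt adds an out that AB - H does not already include.
    collapsed["OUT"] = line["OUT"] + line["SH"]
    return collapsed

# Hypothetical team line, for illustration only.
team = {"1B": 1000, "2B": 280, "3B": 30, "HR": 150, "BB": 500,
        "SB": 90, "CS": 40, "SH": 50, "GDP": 120, "OUT": 4000}
print(collapse_rare_events(team))
```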

  16. #36
    Quote: Originally posted by SABR Matt
    Back when I was trying to use regression to find linear weights (before I had PBP data)
    How would you go about finding linear weights using play-by-play data?

  17. #37

    Read that first. Markov modeling to find run expectancies given a starting base/out state is where linear weights research should begin if you have detailed PBP data.

  18. #38
    Thanks, Matt!

  19. #39
    Okay... one of the things that I got out of all that is that XR is not the best stat I could be using :-)

    How about Individual Base Runs? Can that be figured with just TB, and no 2B/3B info? The formula I found is:

    A = H + W + HB - HR - CS - DP
    B = .777S + 2.61D + 4.29T + 2.43HR + .03(W + HB - IW) - .747IW + 1.30SB + .13CS + 1.08SH + 1.81SF + .70DP -.04(AB-H)
    C = AB - H + SH + SF

    The other, simpler formulae on that page only require TB... but they don't include GIDP, which I definitely need included. Actually, I'm not even sure what I'm supposed to do once I get those A, B, and C -- is it (A+2.34PA)(B+2.58PA)/(B+C+7.98PA)+HR-.76PA ? -- but we can cross that bridge when we come to it. For now, I'd just like to know, is there a way to figure that formula with just TB?

  20. #40
    Mean Dean,

    Welcome back to the conversation. The proper construction of BaseRuns is...

    Base Runs = A*B/(B+C) +D

    Here's a version of BaseRuns from Wikipedia, which I believe was originally presented on David Smyth's BaseRuns Primer...

    A = H + BB + HBP - HR - .5*IBB

    B = [1.4*TB -.6*H -3*HR +.1*(BB+HBP-IBB) +.9*(SB-CS-GDP)] *1.1

    C = AB - H + CS + GDP

    D = HR
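    For anyone who wants to plug numbers in, here is the version above written out as a small Python function. The function and argument names are my own, and the team line in the usage example is hypothetical.

```python
def base_runs(AB, H, TB, HR, BB, HBP, IBB, SB, CS, GDP):
    """BaseRuns using the A, B, C, D factors quoted above."""
    A = H + BB + HBP - HR - 0.5 * IBB
    B = (1.4 * TB - 0.6 * H - 3 * HR
         + 0.1 * (BB + HBP - IBB)
         + 0.9 * (SB - CS - GDP)) * 1.1
    C = AB - H + CS + GDP
    D = HR
    return A * B / (B + C) + D

# Hypothetical team totals, for illustration only.
estimate = base_runs(AB=5500, H=1500, TB=2400, HR=180, BB=550,
                     HBP=50, IBB=40, SB=100, CS=40, GDP=120)
print(round(estimate, 1))
```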

    That being said, BaseRuns is really constructed to estimate team run scoring. The preferred method of evaluating individuals is to use BaseRuns against a team (or league) to generate a set of custom linear weights which are then applied to the stats of the individual player.

    Patriot recently posted this method for deriving the custom linear weight values...

    LW = ((B + C)(Ab + Ba) - AB(b + c))/(B + C)^2 + d

    Where A, B, C, and D are the total A, B, C, and D factors for the entity in question. So if you are doing the 1990 NL and there were 1800 home runs hit in that league, D = 1800.

    a, b, c, and d are the coefficients for the event in question in the A, B, C, and D factors. For example, a home run has a = 0, b = ~2 (or whatever coefficient the HR has in your B factor), c = 0, and d = 1.
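    A short sketch of that formula in Python. The league totals are hypothetical, and the per-event coefficients are read off the A, B, C, D version quoted earlier in this post (for a HR, b = (1.4*4 - .6 - 3)*1.1 = 2.2; for a single, b = (1.4 - .6)*1.1 = .88).

```python
def custom_linear_weight(A, B, C, a, b, c, d):
    """Run value of one event, given the entity's total A, B, C factors and
    the event's coefficients a, b, c, d in those factors."""
    return ((B + C) * (A * b + B * a) - A * B * (b + c)) / (B + C) ** 2 + d

# Hypothetical totals for the entity in question (illustrative only).
A, B, C = 1900.0, 2114.2, 4160.0

# (a, b, c, d) per event, read off the BaseRuns version quoted above.
coefficients = {
    "1B":  (1, 0.88, 0, 0),
    "HR":  (0, 2.2, 0, 1),
    "OUT": (0, 0.0, 1, 0),   # a batting out, AB - H
}

weights = {event: custom_linear_weight(A, B, C, *abcd)
           for event, abcd in coefficients.items()}
print({event: round(value, 3) for event, value in weights.items()})
```

    The formula is the change in A*B/(B+C) + D from adding one more of the event, so the weights move with the run environment instead of being fixed constants.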

    So now I'm really curious, what data set are you working with that would have GIDP for an individual but not 2B or 3B?
    Last edited by weskelton; 11-26-2007 at 07:49 PM. Reason: added Patriot's custom linear weights method
