# Thread: question about Extrapolated Runs

1. It looks to me like he went halfway between logic/empirical and regression. If you are looking to lower your RMSE on the data you are basing your equation on, then regression will certainly help you... but it won't help with out-of-sample data, where logic will win.

2. So, here's what I did... I took the HG weights and scaled them proportionately, so that the difference between the sums of the weighted values for 1b/2b/3b/hr in XR and LWTS was zeroed out for the 1955-1997 time period. Here are the values I ended up with...

1B: .48
2B: .80
3B: 1.11
HR: 1.43

I then substituted these weights in the XR formula and re-ran the tests that Furtado had originally done in the 1999 BBBA. The results...

The revised XR formula degraded from an RMSE of 20.9 to 21.3 over the entire 1955-1997 period. Likewise, there was a comparable drop in accuracy in each of the individual decade tests: 1955-59, the 1960s, the 1970s, the 1980s, and 1990-97.

What does it all mean? Probably not very much, but it does seem to suggest that there was method to the madness in deriving the original XR weights, as strange as they seem. If nothing else, XR does give us a level of accuracy to shoot for in measuring run estimators for teams in the normal range.
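For anyone who wants to replicate this kind of test, here is a minimal sketch of the RMSE calculation. The 1B/2B/3B/HR weights are the rescaled values from the post; the out and walk weights and the two team seasons are invented for illustration, not real 1955-1997 data.

```python
# Sketch: RMSE of a linear-weights run estimator against actual team runs.
# Team totals and the BB/OUT weights below are made up for illustration.

def estimate_runs(team, weights):
    """Apply a set of linear weights to one team's event totals."""
    return sum(weights[event] * count for event, count in team.items())

def rmse(teams, actual_runs, weights):
    """Root mean squared error of estimated vs. actual team runs."""
    n = len(teams)
    sq_err = sum((estimate_runs(t, weights) - r) ** 2
                 for t, r in zip(teams, actual_runs))
    return (sq_err / n) ** 0.5

# Rescaled hit weights from the post; BB and OUT values are assumptions.
weights = {"1B": 0.48, "2B": 0.80, "3B": 1.11, "HR": 1.43,
           "BB": 0.34, "OUT": -0.09}

# Two hypothetical team seasons: event totals and actual runs scored.
teams = [
    {"1B": 1000, "2B": 250, "3B": 40, "HR": 150, "BB": 550, "OUT": 4100},
    {"1B": 950, "2B": 230, "3B": 30, "HR": 120, "BB": 500, "OUT": 4150},
]
actual = [720, 640]

print(round(rmse(teams, actual, weights), 2))
```

The real test would run this over every team season in the 1955-1997 sample rather than two invented ones.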

3. No, you are wrong.

All the regression does is best-fit the data in the sample; it doesn't work on out-of-sample data. I once ran a regression on Red Sox data, and I ended up with a run value for the triple that was higher than for a HR.

What you have to do is derive the components in your sample, and test it in the out-of-sample.

Otherwise, why not go all the way and use pure regression, where you'll end up with a run value of .66 for the double?

4. http://www.knology.net/~johnfjarvis/runs_survey.html

See tables 2 and 4 above.

If you are looking for the lowest RMSE, then you want the run value of the double at .63, just .15 runs more than a single!

The only way to test, is to develop the equation in your sample, and test out-of-sample.

However, note that each coefficient in the regression itself has a fairly wide confidence interval.

Finally, why test on aggregated team data, where you'll only have a few thousand data points, when you can test at the game or even inning level, where you will have tens of thousands of data points or more?
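The develop-in-sample, test-out-of-sample procedure described above can be sketched on synthetic team-season data. The "true" weights, event-count distributions, and noise level here are all assumptions for illustration, not real MLB figures.

```python
# Sketch: fit linear weights on one synthetic sample, test on another.
import numpy as np

rng = np.random.default_rng(0)
TRUE_W = np.array([0.47, 0.78, 1.09, 1.40, 0.33, -0.10])  # 1B 2B 3B HR BB Out

def make_seasons(n):
    """Synthetic team seasons: noisy event counts plus noisy run totals."""
    means = np.array([1000.0, 240.0, 35.0, 140.0, 520.0, 4100.0])
    X = rng.normal(means, 0.08 * means, size=(n, 6))
    runs = X @ TRUE_W + rng.normal(0.0, 20.0, size=n)
    return X, runs

X_train, y_train = make_seasons(300)   # develop the equation here...
X_test, y_test = make_seasons(300)     # ...and test it out-of-sample here

w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

def rmse(X, y):
    return float(np.sqrt(np.mean((X @ w - y) ** 2)))

in_rmse, out_rmse = rmse(X_train, y_train), rmse(X_test, y_test)
print(np.round(w, 2), round(in_rmse, 1), round(out_rmse, 1))
```

Note that even in this clean setup the rare-event coefficients (triples especially) come back with far more scatter than the common ones, which is the thread's point in miniature.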

5. Originally Posted by Tango Tiger
If you are looking for the lowest RMSE, then you want the run value of the double at .63, just .15 runs more than a single!
What is it about regression that would give rise to the undervaluing of the double - assuming there are enough data points to yield highly significant coefficient estimates?

Is it that regression is estimating the average value of each game event, whereas you're interested in the marginal values in a given offensive environment?

Originally Posted by Tango Tiger
Finally, why test on aggregated team data, where you'll only have a few thousand data points, when you can test at the game or even inning level, where you will have tens of thousands of data points or more.
I've been looking at generating weights for the Israel Baseball League, with only 6 teams and 41 games each. To get meaningful results, I have in fact gone down to the inning level, which gives me over 1600 independent data points. Since run scoring actually takes place on the inning level, not the game level, I expect it should also yield more accurate estimates of event values, since an event in inning 1 shouldn't be able to create runs in inning 2.
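For anyone curious what inning-level data preparation looks like, a minimal sketch follows: one row of event counts per half-inning, with the runs scored in that half-inning as the response. The event strings are invented, not IBL data.

```python
# Sketch: turn half-inning event logs into a regression design matrix.
EVENTS = ["OUT", "1B", "2B", "3B", "HR", "BB"]

def inning_row(events):
    """Count each event type in one half-inning."""
    return [events.count(e) for e in EVENTS]

# (event list, runs scored) pairs -- invented half-innings for illustration.
half_innings = [
    (["1B", "2B", "OUT", "HR", "OUT", "OUT"], 3),
    (["OUT", "BB", "OUT", "OUT"], 0),
    (["HR", "OUT", "OUT", "1B", "OUT"], 1),
]

X = [inning_row(ev) for ev, _ in half_innings]   # design matrix
y = [runs for _, runs in half_innings]           # response vector
print(X[0], y)
```

With 1600+ such rows, X and y go straight into any least-squares routine, and no event can leak credit across inning boundaries.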

6. The point is that a few thousand data points doesn't give you a "significant" estimate, as you are trying to imply. The range for the .62 run value of a double at the 95% confidence level is probably something like .62 ± .20 runs. Couple that with the .47 run value of the single (with its smaller interval), and it's statistically possible that the run value of the single is higher than the double, according to these regressions. (And the triple higher than the HR, possibly.)

The problem with team/seasonal-level data is that you have two huge biases: park and team. Each "team" is not a random sample of games; it's the same players lumped together. Therefore, each team is biased by the identity of its players. On top of that, those players play half their games in the same park, which introduces another bias.

It's basically a joke that we even bother to run regressions like this.

As you correctly point out, runs are scored at the inning level. Why in the world anyone would aggregate 1458 innings into a single data point is beyond me.
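For readers who want to check interval claims like the one above, the standard OLS error formulas look like this. The counts, noise level, and small sample size are assumptions chosen to mimic a team-season regression.

```python
# Sketch: OLS coefficient standard errors and 95% confidence intervals.
import numpy as np

rng = np.random.default_rng(1)
n = 30                                  # few data points, like team seasons
X = rng.normal([950.0, 240.0], [70.0, 25.0], size=(n, 2))   # 1B, 2B counts
y = X @ np.array([0.47, 0.78]) + rng.normal(0.0, 25.0, size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - 2)            # residual variance, n - k d.o.f.
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
ci = [(round(b - 1.96 * s, 2), round(b + 1.96 * s, 2))
      for b, s in zip(beta, se)]
print(np.round(beta, 2), np.round(se, 3), ci)
```

The interval on the doubles coefficient is several times wider than the one on singles, simply because doubles are so much rarer; shrinking the sample further widens both.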

7. Interesting. The article you linked to doesn't give confidence levels.

For my IBL data, using a total of 1633 half-innings as separate data points and the standard spreadsheet regression function, I get what appear to be significant coefficient values for all the non-rare game events. (Keep in mind that this is a much higher run-scoring environment than MLB.)

Out: -.156 ± .010
Single: .598 ± .016
Double: .845 ± .035
Triple: 1.221 ± .127
HR: 1.435 ± .042
Walk: .482 ± .018
HBP: .520 ± .038
Reach-on-error: .522 ± .062
Error without reach: .302 ± .050
Stolen base: .092 ± .026
Caught stealing: -.281 ± .058
GIDP: -.280 ± .056

Obviously, some of those error terms are too high for comfort.

I'm somewhat reassured, though, by the internal consistency among the figures: nearly identical values for caught stealing and double plays, for example, and for hit-by-pitch and reach-on-error. The weight for triples may be dodgy, and I'm inclined to declare by fiat that a triple is worth the average of a double and a home run.

A double comes out to .247 more than a single.

Error terms for sacrifices and intentional walks are higher:

Sac fly: .404 ± .069
Sac hit: .105 ± .082
Intentional walk (adjustment to walk value): -.168 ± .114

Should I be suspicious about these numbers just because they come from an (admittedly imperfect) regression?

On the other hand, the use of constant linear weights for game events is inherently inconsistent with the true nature of run creation. How much precision can we demand from a flawed method?

8. Actually IBL...your values for the main offensive events look about right for a league with such a high run scoring rate.

9. http://www.tangotiger.net/customlwts.html

Looking at your weights, I'd be suspicious of how close the walk and single values are. Reached on error should be very close to, or higher than, a single, not an HBP.

How many runs per 27 outs does your league score? Looking at your weights, and my link above, I'll guess about 8?

10. Originally Posted by Tango Tiger
No, you are wrong.
Huh? I wasn't aware that I had actually said anything that I could be wrong about, as I was simply stating facts and the results of my experiment. But if you say so, then I probably am.

My efforts here were just an exercise in curiosity. I wanted to see how much damage was done to the accuracy of XR by substituting the more conventional values into XR. I was optimistic that the difference would be negligible, but I don't think I would call it that.

I was working within the confines of the original tests more out of convenience for comparison (or maybe laziness). I don't disagree with anything you've said here about more effective ways to test the accuracy of estimators.

Now about those 2B values...

In "Curve Ball" there is a chapter on developing event weights using linear regression. After starting with weights for just 1B, 2B, 3B, HR, BB and SB, the authors add other events into the mix. They are ultimately able to reduce the RMSE by adding SF. But it turns out that SF gets a positive weight greater than anything other than a 3B or HR, which is absolutely ludicrous. They suggest that it is most likely saying 1) something about the definition of the stat itself (i.e., the only official stat other than the HR that guarantees a run) or 2) something about the situational nature of the stat (runner on 3B, less than 2 out).

Like the Iblemetrician, I too am curious about what the regression numbers might be telling us. I don't think that either 1) or 2) from above apply to the 2B. Perhaps there's something about the nature of teams that hit very few or very many 2B's. What do you think?

11. I think there's a double-to-triple conversion going on that shows up in the regressions. Teams that hit few doubles might be more likely to have hit more triples (because what is a deep fly ball that is not a double? A HR, a 3B, or an out), meaning it might actually be slightly favorable to hit a few fewer doubles (reducing the run weight for them) and a few more triples (increasing the run weight for them).

12. Bill, I was talking about "that there was method to the madness". In my opinion, it's just madness!

***

As for the nature of teams that hit lots of doubles: repeat the study over 1919-2007 and see what you get. Here, I'll do it, using only AB-H, 1B, 2B, 3B, HR, BB against runs:
1B: +.56
2B: +.80
3B: +1.50
HR: +1.40
BB: +.37
Out: -.12

See what I mean? The 3B higher than a HR? The double value now shooting up to where it should be?

The triple can be acting as a proxy for SB or baserunning, etc.

***

As for the SF, you get a guaranteed run. It's like saying R=R. Technically speaking, you should be removing 1 run per HR and SF from the "R", since we know that's the least those guys are worth.

13. Good point about the reach-on-error value.

As for runs per 27 outs, I get 7.85. So you got it pegged.

Regarding 3B and HR, I get similar results when I do a game-by-game regression:

1B: .609 ± .038
2B: .802 ± .084
3B: 1.369 ± .343
HR: 1.222 ± .099
BB: .518 ± .040
Out: -.174 ± .020

I had attributed it to two things: the small number of triples, and the triple proxying other park effects (parks with more triples have fewer home runs). But I don't have any evidence for that.

I also notice now that the double coefficient is lower in those results.

14. A triple is a double with speed (essentially, a "stolen base" while there's no batter at the plate), plus clears the bases of course.

I find it more useful to do 2b+3b as "double", and simply force in a constant coefficient for the "stolen triple" of .30 runs. It'll save you some grief.

15. Back when I was trying to use regression to find linear weights (before I had PBP data) I had a whole series of cheats to get rid of rare events and find more stable weights for them.

I literally counted doubles and triples as extra-base hits and put the triples into the stolen-base column too. I counted sac bunts as stolen bases and outs, double plays as a caught stealing plus an out, etc. It was my way of reducing the number of event types to something the regression could support.
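Those collapsing "cheats" can be sketched as a simple column-mapping step run before the regression. The column names and sample counts here are invented; the mappings follow the post above.

```python
# Sketch: fold rare events into more common columns before regressing.

def collapse(row):
    """Map raw event counts to a reduced set the regression can support."""
    return {
        "XBH": row["2B"] + row["3B"],        # doubles + triples together
        "HR": row["HR"],
        "1B": row["1B"],
        "BB": row["BB"],
        # triples and sac bunts also credited to the stolen-base column
        "SB": row["SB"] + row["3B"] + row["SH"],
        # a double play counts as a caught stealing...
        "CS": row["CS"] + row["DP"],
        # ...plus an out; sac bunts also count as outs
        "OUT": row["OUT"] + row["SH"] + row["DP"],
    }

raw = {"1B": 9, "2B": 3, "3B": 1, "HR": 2, "BB": 4,
       "SB": 2, "CS": 1, "SH": 1, "DP": 1, "OUT": 24}
print(collapse(raw))
```

The regression then sees seven well-populated columns instead of ten, several of them rare.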

16. Originally Posted by SABR Matt
Back when I was trying to use regression to find linear weights (before I had PBP data)
How would you go about finding linear weights using play-by-play data?

17. http://www.pankin.com/markov/theory.htm

Read that first. Markov modeling to find run expectancies given a starting base/out state is where linear weights research should begin if you have detailed PBP data.

18. Thanks, Matt!

19. Okay... one of the things that I got out of all that is that XR is not the best stat I could be using :-)

How about Individual Base Runs? Can that be figured with just TB, and no 2B/3B info? The formula I found is:

A = H + W + HB - HR - CS - DP
B = .777S + 2.61D + 4.29T + 2.43HR + .03(W + HB - IW) - .747IW + 1.30SB + .13CS + 1.08SH + 1.81SF + .70DP -.04(AB-H)
C = AB - H + SH + SF

The other, simpler formulae on that page only require TB... but they don't include GIDP, which I definitely need included. Actually, I'm not even sure what I'm supposed to do once I get those A, B, and C -- is it (A+2.34PA)(B+2.58PA)/(B+C+7.98PA)+HR-.76PA ? -- but we can cross that bridge when we come to it. For now, I'd just like to know, is there a way to figure that formula with just TB?

20. Mean Dean,

Welcome back to the conversation. The proper construction of BaseRuns is...

Base Runs = A*B/(B+C) +D

Here's a version of BaseRuns from Wikipedia, which I believe was originally presented on David Smyth's BaseRuns Primer...

A = H + BB + HBP - HR - .5*IBB

B = [1.4*TB -.6*H -3*HR +.1*(BB+HBP-IBB) +.9*(SB-CS-GDP)] *1.1

C = AB - H + CS + GDP

D = HR

That being said, BaseRuns is really constructed to estimate team run scoring. The preferred method of evaluating individuals is to use BaseRuns against a team (or league) to generate a set of custom linear weights which are then applied to the stats of the individual player.

Patriot recently posted this method for deriving the custom linear weight values...

LW = ((B + C)*(A*b + B*a) - A*B*(b + c))/(B + C)^2 + d

Where A, B, C, and D are the total A, B, C, and D factors for the entity in question. So if you are doing the 1990 NL and there were 1800 home runs hit in that league, D = 1800.

a, b, c, and d are the coefficients for the event in question in the A, B, C, and D factors. For example, a home run has a = 0, b = ~2 (or whatever coefficient the HR has in your B factor), c = 0, and d = 1.
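Putting the BaseRuns construction and Patriot's linear-weights formula into code might look like this. The league totals are invented for illustration; the 1B and HR coefficients are read off the B factor quoted above ((1.4 - 0.6)*1.1 and (1.4*4 - 0.6 - 3)*1.1, and a HR nets a = 0 in the A factor).

```python
# Sketch: BaseRuns factors and Patriot's custom linear weights.

def base_runs_factors(s):
    """A, B, C, D factors from the Smyth/Wikipedia version quoted above."""
    A = s["H"] + s["BB"] + s["HBP"] - s["HR"] - 0.5 * s["IBB"]
    B = (1.4 * s["TB"] - 0.6 * s["H"] - 3 * s["HR"]
         + 0.1 * (s["BB"] + s["HBP"] - s["IBB"])
         + 0.9 * (s["SB"] - s["CS"] - s["GDP"])) * 1.1
    C = s["AB"] - s["H"] + s["CS"] + s["GDP"]
    D = s["HR"]
    return A, B, C, D

def base_runs(s):
    A, B, C, D = base_runs_factors(s)
    return A * B / (B + C) + D

def custom_lw(s, a, b, c, d):
    """Patriot's custom linear weight for one event type."""
    A, B, C, D = base_runs_factors(s)
    return ((B + C) * (A * b + B * a) - A * B * (b + c)) / (B + C) ** 2 + d

# Invented totals, roughly team-season sized, for illustration only.
league = {"AB": 5500, "H": 1450, "TB": 2300, "HR": 160, "BB": 520,
          "HBP": 60, "IBB": 40, "SB": 90, "CS": 40, "GDP": 120}

lw_1b = custom_lw(league, a=1, b=0.88, c=0, d=0)   # single
lw_hr = custom_lw(league, a=0, b=2.2, c=0, d=1)    # home run
print(round(lw_1b, 3), round(lw_hr, 3), round(base_runs(league), 1))
```

With these invented totals the single comes out around .50 and the home run around 1.44, which is the kind of environment-specific weight set the post describes applying to individual players.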

So now I'm really curious, what data set are you working with that would have GIDP for an individual but not 2B or 3B?
Last edited by weskelton; 11-26-2007 at 07:49 PM. Reason: added Patriot's custom linear weights method
