How to reproduce Linear Weights
(The thread title should probably read "How do you reproduce Linear Weights?")
I have a pedagogical question.
I understand that Linear Weights have been around for years and their values are pretty much agreed upon. I am writing an example for some Linear Regression software and thought it would be cool if I could reproduce the coefficients.
I plugged team data from both leagues, 1946-89, into a Least Squares Zero-Intercept Linear Regression model with R as the dependent variable and ABminusH, 1B, 2B, 3B, HR, BB, SO, SB, CS as the independent variables, and trained it. I got the following coefficients:
0.51 : 1B
0.68 : 2B
1.21 : 3B
1.48 : HR
0.37 : BB
0.17 : SB
-0.22 : CS
0.0007 : SO
-0.10 : Outs
On one hand, I was pleased that this sort of looks like the right answer. I did no special massaging of the data, just using the software as a black box. The near-zero value of the strikeout was fun to see. On the other hand, I notice some sizeable discrepancies in the values for 2B & 3B (I'm not so worried about SB/CS as I understand those are sometimes fudged for leverage reasons).
Does anyone know what else is done in training the Linear Weights model to obtain the known coefficients? For my software example, I think this is good enough, but now I'm curious.
Let me restate that in no way am I claiming the known values are not correct. I did almost no work in training the above model and I'm simply interested to know if there is a way I can come closer to the published coefficients.
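For anyone who wants to try the same experiment, here is a minimal sketch of a zero-intercept least-squares fit in Python with NumPy. It is not the in-house software used above. The team-season table is replaced by synthetic Poisson counts, and `MEAN_COUNTS`, `MADE_UP_WEIGHTS`, and the noise level are all invented for illustration, so the printed numbers say nothing about real baseball.

```python
import numpy as np

def fit_zero_intercept(X, y):
    """Least-squares coefficients for y ~ X @ beta, with no intercept term."""
    beta, _resid, _rank, _sv = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical stand-in for the real team-season table (NOT real data).
EVENTS = ["Outs", "1B", "2B", "3B", "HR", "BB", "SO", "SB", "CS"]
MEAN_COUNTS = np.array([4100, 1000, 250, 40, 130, 520, 900, 100, 50])
MADE_UP_WEIGHTS = np.array([-0.10, 0.50, 0.75, 1.05, 1.40, 0.33, 0.0, 0.20, -0.40])

rng = np.random.default_rng(0)
# One row per synthetic team-season, one column per event count.
X = rng.poisson(MEAN_COUNTS, size=(600, len(EVENTS))).astype(float)
# Synthetic runs: a linear function of the counts plus random noise.
runs = X @ MADE_UP_WEIGHTS + rng.normal(0.0, 25.0, size=600)

coefficients = fit_zero_intercept(X, runs)
for name, value in zip(EVENTS, coefficients):
    print(f"{value:7.3f} : {name}")
```

Swapping in the real table only means replacing `X` and `runs`; leaving out a column of ones is what makes the fit zero-intercept.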
I got very similar answers when I calculated my dynamic linear weights for those years. The values of linear weights do change with time...some seasons are different than others. I believe Palmer used 1960s/70s/mid 80s data in his first analysis...but that doesn't explain the differences.
What software are you using? I've been working with SPSS of late...trying to figure out all of its features to see if I can improve and automate my season-by-season modelling processes.
As far as I know, the linear weights people use now aren't done with multilinear regression though...they're done with Run/Out expectancy tables.
Yes, the regression model on a TEAM level will give you weird and wrong results. You have to understand that every coefficient itself has its own uncertainty level. The run value of the double is something like .66 +/- .15, 95% of the time. The triple's value is even worse. You may think that having 600 or whatever teams is a good sample size, but it hardly is.
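The point about each coefficient carrying its own uncertainty can be sketched with the classical OLS standard-error formula. This is an illustration under assumptions, not the calculation behind the .66 +/- .15 figure: the data are synthetic Poisson counts, and `MEAN_COUNTS`, `MADE_UP_WEIGHTS`, and the noise level are invented.

```python
import numpy as np

def ols_with_std_errors(X, y):
    """Zero-intercept OLS coefficients with classical standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ (X.T @ y)
    resid = y - X @ beta
    sigma2 = (resid @ resid) / (n - k)          # residual variance estimate
    std_err = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, std_err

# Hypothetical inputs (NOT real data).
NAMES = ["Outs", "1B", "2B", "3B", "HR"]
MEAN_COUNTS = np.array([4100, 1000, 250, 40, 130])
MADE_UP_WEIGHTS = np.array([-0.10, 0.50, 0.75, 1.05, 1.40])

rng = np.random.default_rng(1)
X = rng.poisson(MEAN_COUNTS, size=(600, len(NAMES))).astype(float)
y = X @ MADE_UP_WEIGHTS + rng.normal(0.0, 25.0, size=600)

beta, std_err = ols_with_std_errors(X, y)
for name, b, se in zip(NAMES, beta, std_err):
    # 1.96 standard errors gives an approximate 95% interval half-width.
    print(f"{name:>4}: {b:6.3f} +/- {1.96 * se:.3f}")
```

In this setup the rare event (3B) ends up with a much wider interval than the common one (1B), because its counts vary far less from team to team in absolute terms.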
The CORRECT way to do it is with a change in run expectancy model. I give out the whole shebang in the book.
You can also look at some results from Tom Ruane here:
He gives the LWTS run values on a year-by-year, league-by-league level. Those are based on thousands of individual plays.
I'm using some in-house software at work (work-related tools applied to stuff like baseball and poker are always a big hit). I suppose I could easily switch to using 'R'.
Thanks Matt for letting me know that my answers are reproducible given the method I used.
Thanks Tango for the link to the Tom Ruane article showing that you have to go down to individual plays to get better results.
Thanks for the quick replies, guys!
There's only one problem Tango.
How the heck do you extend linear weights to eras before PBP data?
That's no problem. The run expectancy matrix can be estimated rather easily. The LWTS numbers are determined based on the frequency of each state-to-state transitions, for each event. (If you have my book, you can probably get some good insights on these things.)
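As a rough sketch of the change-in-run-expectancy idea: for every play, take the run expectancy of the state after the play, subtract the run expectancy of the state before it, add any runs that scored, and average by event type. The `RUN_EXP` values and the five plays below are made up to keep the example short; a real calculation would use a full 24-state table and thousands of plays.

```python
from collections import defaultdict

def linear_weights(plays, run_exp):
    """Average change in run expectancy, plus runs scored, per event type.

    plays:   iterable of (event, start_state, end_state, runs_scored)
    run_exp: dict mapping a state to expected runs for the rest of the inning
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, start, end, runs in plays:
        totals[event] += run_exp[end] - run_exp[start] + runs
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}

# Made-up run expectancies keyed by (bases, outs); NOT real values.
RUN_EXP = {
    ("___", 0): 0.50, ("1__", 0): 0.90,
    ("___", 1): 0.25, ("1__", 1): 0.50,
    ("1__", 2): 0.20,
}

# Made-up plays: (event, state before, state after, runs scored).
PLAYS = [
    ("1B",  ("___", 0), ("1__", 0), 0),
    ("1B",  ("___", 1), ("1__", 1), 0),
    ("HR",  ("1__", 0), ("___", 0), 2),
    ("OUT", ("___", 0), ("___", 1), 0),
    ("OUT", ("1__", 1), ("1__", 2), 0),
]

weights = linear_weights(PLAYS, RUN_EXP)
for event, value in sorted(weights.items()):
    print(f"{event:>3}: {value:+.3f}")
```

A play that ends the inning would need an end state with a run expectancy of 0.0 added to the table.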
You can also look here:
Thanks Tango...I'm attempting to determine how to make your weighting system work to account for actual runs rather than runs different from average (your out factor is like -.25 which means the average player isn't producing any RC by your method...I want LWs to actually model Run Scoring directly)
No, that is not a true statement. -.25 means the out generates .25 runs less than an average PA (which includes hits, walks, HR, outs).
To model run scoring directly, you use BaseRuns (see my site). And from BaseRuns you can generate the custom LWTS (see earlier link). And from the custom LWTS you can generate the RE matrix (unpublished).
Uh Tango...how is what I said untrue?
You say "An out is -0.25 runs which means it is worth 0.25 runs less than the average PA"
I said "An out is like -.25 runs by the RE matrix, which means an average player (note...meaning he gets average plate appearances!) produces no runs above average by your method."
What I said was correct I'm fairly certain.
By the way Tango...are you absolutely certain that there is no change in the relationships between the events relative to each other that is not explained by the run scoring environment?
I know you hate linear regression approaches, but I found through dynamic linear weight research that the more rare a positive event (HR for example) the more it tended to be worth compared to the other events.
I found that the value of a HR in the modern game is actually at a near-all-time low and that it was worth more back in 1912 despite the lower scoring environment...and that singles were worth much much less back then (today...about 0.52 R...back then...about 0.39)...as were walks (today...about 0.38...back then about 0.26)
To me...that made sense...in the deadball era, the chances of you advancing after getting a single are remarkably less than they are today...in the deadball era...the home run was runs in the bank...on the board...put it away pally. In the modern game, you have a higher chance to score those runs without the longball.
Matt, I think your post #10 is saying what my link in post #6 is saying. Can you click that link, and see if that's the case?
You also said "which means the average player isn't producing any RC by your method". This is not a true statement. An average player DOES create runs (RC). He just does it at the same rate as... the average player.
This might simply be a confusion in definition: RC (runs created) means absolute, total runs created, while LWTS means runs created above average.
You define linear weights as an average relative method...I define them as an absolute method. That's the difference. The problem with average-relative methodology is that it's not a good assumption that league average defines that league. There are other skewing factors...talent depth and disposition (some years the pitchers are better than the hitters...some years the hitters are better than the pitchers) being a big one...
BTW I have clicked that link and was reacting in post #10 to your conclusion that singles, doubles, triples, home runs, and walks all increase in value with increasing run scoring. I'm not convinced that's correct. I think that in increasingly homer-friendly environments, singles, doubles, and possibly triples increase in value and HRs *decrease* in value...and in low-scoring environments, the HR hits a PREMIUM value because it's runs on the board with 100% certainty whereas other events become less likely to actually produce runs.
Matt, I really don't know how to respond to your first paragraph. There's like three things you are talking about there, and I don't see how it's a LWTS problem any more or less than it's an RC problem or BsR problem.
Matt: the only way to be convinced is to actually run a simulator or Markov chains and do the work. I understand why you are thinking the way you are, but until you run it through a realistic process, it's just a nice thought.
The research that I have done, some published and some not, shows that the run value of the HR starts at 1.00 (obviously), and increases to a certain point, and then decreases until it converges to 1.00 when the team OBP approaches 1.000. That tipping point is around 10-12 RPG, or around an OBP level of .500.
Those interested can go here:
The run value of the walk is pretty much a straight-line value and it tracks OBP. Pretty much, OBP = run value of a walk. (Not exactly, but that's the basic idea). The run value of the single has a higher slope, and then converges towards 1.00, etc, etc.
Until other research shows otherwise, you should consider this research to be the standard. I would be glad to publicize any other research that supplants mine as the new standard.
If I knew what a Markov chain was, perhaps I'd be more impressed.
As for running simulations...I'm not entirely convinced mathematical modelling of baseball games is capturing the interaction between events...I could be wrong, after all you've done more research in that area (by a loooong margin).
Make no mistake...I've read some of your work in this area and have been favorably impressed...it just makes intuitive sense to me that in an environment where HRs are common, each HR would have less impact on the game (relative to the other events).
I think the book lays it out pretty well, so you might be more impressed after
Your last statement is true under certain conditions. After all, this image:
shows how quickly the gap between a HR and 3B closes. It's a very long complicated process, and it's not a simple straight-line estimate. An environment where HR are common, and other events are not, or an environment where HR are common and other events are as well, will give you different gaps in run values.
At the extreme, a league where the OBP = .100, and the HR/PA is also = .100 (meaning no other forms of getting on bases exists), the run value of the HR = 1.00.
But, if the OBP = .200, and HR/PA stays at .100, the run value of the HR will go up to say 1.100, while the run value of the walk would probably jump to .150. (I don't know, just guessing). But if the OBP = .200 and HR/PA = 0, the run value of the walk would be more like .100.
I'm not sure you and I are really disagreeing about anything. You really have to lay out exactly your parameters so that we can establish exactly their impacts.
I don't think we're necessarily disagreeing either...I just feel like I'm the guy trying to speak Swahili after one week of rudimentary lessons and a lifetime of speaking English. I'm going to have to get your book and see if I can understand where you're coming from after a more thorough reading...in the end I don't think the right answer in forming linear weights is to base it solely off of league run scoring rates...I think we need to use all of the information about how common each of the offensive events is. But you probably already know that and have done that work...permit me to catch up.
Right, it is not based solely on the run scoring rates of the league. You would take an advanced metric like BaseRuns, plug in the frequency of each event, and from that you can determine the run value of each event. You can certainly have two run environments, each 3.0 RPG, and in one place the run value of the walk is .05 runs more than in another, simply because of the frequency of each event.
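One way to sketch "plug in the frequency of each event and read off the run values" is the plus-one approach: compute BaseRuns for a league, add a single extra event, and see how much the estimate moves. The formula below is one commonly cited simple version of BaseRuns (A = H + BB - HR, B = 1.02 * (1.4*TB - 0.6*H - 3*HR + 0.1*BB), C = outs, D = HR); treat the coefficients as an assumption rather than as the version used in the posts above. The `LEAGUE` totals are hypothetical.

```python
def base_runs(ev):
    """BaseRuns estimate, A * B / (B + C) + D, from a dict of event counts."""
    hits = ev["1B"] + ev["2B"] + ev["3B"] + ev["HR"]
    total_bases = ev["1B"] + 2 * ev["2B"] + 3 * ev["3B"] + 4 * ev["HR"]
    a = hits + ev["BB"] - ev["HR"]                    # baserunners
    b = 1.02 * (1.4 * total_bases - 0.6 * hits - 3.0 * ev["HR"] + 0.1 * ev["BB"])
    c = ev["OUT"]                                     # outs
    d = ev["HR"]                                      # automatic runs
    return a * b / (b + c) + d

def plus_one_weights(ev):
    """Run value of each event: BaseRuns with one extra event, minus baseline."""
    baseline = base_runs(ev)
    weights = {}
    for name in ev:
        bumped = dict(ev)
        bumped[name] += 1
        weights[name] = base_runs(bumped) - baseline
    return weights

# Hypothetical league totals (NOT real data).
LEAGUE = {"1B": 1000, "2B": 280, "3B": 30, "HR": 160, "BB": 520, "OUT": 4100}
# Extreme case from the thread: home runs are the only way to reach base.
HR_ONLY = {"1B": 0, "2B": 0, "3B": 0, "HR": 100, "BB": 0, "OUT": 900}

for label, league in (("typical", LEAGUE), ("HR only", HR_ONLY)):
    values = plus_one_weights(league)
    print(label, {name: round(v, 3) for name, v in values.items()})
```

These are absolute run values, so the out comes out slightly negative rather than around -.25. In the HR-only league the home run is worth exactly 1 run under this formula, since there is never anyone on base to drive in.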
I highly suggest reading this:
Yep...that makes a LOT more sense than the first page I read on the subject of BaseRuns.
Now I have a problem with this idea of the B/(B+C) element in the BaseRuns formula representing % of baserunners scored...(of course I recognize that no one is claiming this is absolute)...now that I understand the central logic though, I might be able to attempt some kind of "seat-of-the-pants" logical estimate of runner scoring rates.
I guess the main thing that bugs me is that it doesn't follow logically in my head that the number of bases advanced toward scoring after reaching first...over the number of outs produced plus those bases advanced...should represent what you're trying to represent.
We're specifically trying to define % of runners scoring...couldn't we do that using PBP data and then backfit that to eras prior to PBP based on similarities in the frequency of events?
Ah...I see you did that using 1974-1990...
OK...now here's a question for you, Tango.
The complaint about multiple linear regression is that it is specifically tailored to the idiosyncrasies of the dataset upon which it was calculated. It is said that you gain accuracy in terms of RMSE by doing this, but those regressed coefficients then become useless in other contexts. With this I have no quarrel...it is DEFINITELY true that static weights based on 1960-1985 data (Palmer's original LW) are not going to be as accurate when you attempt to apply them to 1900-1920.
But doesn't that complaint more or less go away if you tailor a linear regression model to each RS environment individually as I attempted to do last year (some people on this site may remember my dynamic LW research based on sliding multilinear regression analysis...sliding because I changed the center-year of the dataset range...if I was finding weights for 1916...I used a range of years centered on 1916)?
A linear regression model against team-level data won't have the sample size you need. The 1960-1985 data presented proves that, especially when compared to the supremely accurate numbers that Ruane did for the exact same time period. The run value of the double is just soooo off.
What you want is to model reality. Model run scoring. Once you have that, you can change your inputs, since the model itself is sound. Right now, your best bet for a quick model is BaseRuns. A better model is a simulator. And the best model is a highly complex Markov chain. You can google
Pankin Markov baseball
and you'll probably get more than you ever want to learn about Markov.
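For readers who have not met Markov chains, here is a deliberately tiny sketch, nowhere near the "highly complex" kind described above. It treats the 24 base/out states as a chain, solves for run expectancy, and then averages the change in run expectancy per event. Everything about it is an assumption made for illustration: every plate appearance uses the same event probabilities, runners advance exactly as many bases as the batter on hits, walks only force runners, outs never move anyone, and the `PROBS` values are invented.

```python
import numpy as np

HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}

def step(outs, bases, event):
    """Return (outs, bases, runs) after an event; bases is a 3-bit mask."""
    if event == "OUT":
        return outs + 1, bases, 0
    if event == "BB":
        for bit in (1, 2, 4):            # fill the first open base in the force chain
            if not bases & bit:
                return outs, bases | bit, 0
        return outs, bases, 1            # bases loaded: a run is forced in
    n = HIT_BASES[event]
    shifted = (bases << n) | (1 << (n - 1))   # move runners, then place the batter
    return outs, shifted & 0b111, bin(shifted >> 3).count("1")

def markov_linear_weights(probs):
    """Run expectancy for 24 states and the average run value of each event."""
    P = np.zeros((24, 24))               # transitions among non-terminal states
    r = np.zeros(24)                     # expected immediate runs per state
    for s in range(24):
        for event, p in probs.items():
            o2, b2, runs = step(s // 8, s % 8, event)
            r[s] += p * runs
            if o2 < 3:
                P[s, o2 * 8 + b2] += p
    identity = np.eye(24)
    run_exp = np.linalg.solve(identity - P, r)
    start = np.zeros(24)
    start[0] = 1.0                       # innings begin with bases empty, 0 outs
    visits = np.linalg.solve((identity - P).T, start)
    freq = visits / visits.sum()         # share of plate appearances per state
    weights = {}
    for event in probs:
        total = 0.0
        for s in range(24):
            o2, b2, runs = step(s // 8, s % 8, event)
            after = run_exp[o2 * 8 + b2] if o2 < 3 else 0.0
            total += freq[s] * (runs + after - run_exp[s])
        weights[event] = total
    return run_exp, weights

# Invented per-plate-appearance probabilities (NOT real data); they sum to 1.
PROBS = {"OUT": 0.68, "BB": 0.08, "1B": 0.16, "2B": 0.045, "3B": 0.005, "HR": 0.03}

run_exp, weights = markov_linear_weights(PROBS)
print("runs per inning:", round(run_exp[0], 3))
for event, value in weights.items():
    print(f"{event:>3}: {value:+.3f}")
```

Because the run expectancies come from the same probabilities, the probability-weighted average of these run values is zero: they are runs relative to an average plate appearance, which is why the out comes out clearly negative.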
Tango - Are you creating your own Markov chains for the research you are doing? If so are you using a specialized program or are you using Excel? I assume it uses matrix algebra but it would seem that some of the matrices would be very large.
It's specialized in that I wrote it. Yes, I get into 5-dimension arrays, and it's confusing to look at.
Yes...the linear regression model I was using tended to show a problem with small sample sizes (I got some funny values for triples, CS, and Ks, and in a few season-ranges, 2B as well)...
Is the concept of a Markov chain discussed in THE BOOK...or is that current research and not presently included?
It's explained in 1 paragraph. I use it to generate some cool charts, which you may be able to figure out from the List of Tables on the book's website.
If you want details, I suggest you go here:
*All* reviews that I have found for the book can be found here:
Anyone hesitating should check those out. If I were to come across a bad review, I'd include that too.