Originally posted by **drstrangelove**:

R = -521 + .352*BB + .609*1B + .713*2B + 1.14*3B + 1.48*HR + .129*SB - .163*CS

Linear weights: .70*BB, .90*1B, 1.25*2B, 1.60*3B, 2.00*HR, .25*SB, -.50*CS

(Counterintuitively, you can just ignore the -521.) (The linear weights come from Mark Klaassen at beyondtheboxscore.com. Google wOBA, linear weights, and a range of years, and you'll get a long list of linear weights and conversion coefficients, plus a good explanation.)
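Both formulas are just a dot product over a team's stat line. A minimal sketch applying the regression equation above to a hypothetical team season (all counts invented for illustration):

```python
# Coefficients from the regression equation quoted above
REG = {"BB": 0.352, "1B": 0.609, "2B": 0.713, "3B": 1.14,
       "HR": 1.48, "SB": 0.129, "CS": -0.163}
INTERCEPT = -521

def predicted_runs(stats):
    """Predicted team runs = intercept + sum of coefficient * count."""
    return INTERCEPT + sum(REG[k] * stats[k] for k in REG)

# Hypothetical team-season stat line (counts invented for illustration)
team = {"BB": 550, "1B": 950, "2B": 300, "3B": 30,
        "HR": 180, "SB": 90, "CS": 40}
print(round(predicted_runs(team), 1))  # lands in a plausible team-season range
```

Note that with the intercept included, the prediction for a realistic stat line comes out in the normal range for a team season; the intercept is doing real work in the point prediction even if it can be ignored when interpreting the coefficients.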

The regression's R-squared value was 0.914, meaning that 91.4% of the variation in runs could be "accounted for" or "explained by" variation in the seven predictor variables. Except for CS, all the variables were significant, and except for SB, the significant variables had p-values reported as 0.00. The residuals (differences between predicted and actual runs scored) were roughly normal in distribution, and only a few cases were flagged as anomalous.
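For reference, the R-squared described here is defined as 1 - SS_res/SS_tot, computed from the residuals. A toy computation with invented actual/predicted run totals:

```python
# Invented actual and predicted team run totals, for illustration only
actual    = [700, 750, 810, 680, 770]
predicted = [710, 740, 800, 690, 765]

mean_actual = sum(actual) / len(actual)

# Residual sum of squares: variation the model fails to explain
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
# Total sum of squares: variation around the mean
ss_tot = sum((a - mean_actual) ** 2 for a in actual)

r2 = 1 - ss_res / ss_tot
print(round(r2, 3))
```

The closer the predictions track the actual totals, the closer SS_res is to zero and R-squared to 1.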

I think the R-squared is rather low, considering how much information is included in the predictor variables, but otherwise it looked OK to me.

Why the coefficients are so different from the linear weights, I cannot explain; I can only note that the two have different interpretations.

In regression, a coefficient of, say, .713 for a double means that, holding all the other variables constant, an increase of one double corresponds to an increase of .713 runs, on average. In linear weights, a weight of 1.25 means that, given the run expectancy for situation A, the run expectancy for situation B (situation A followed by a double) will be 1.25 runs higher than that of situation A.
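The "holding all other variables constant" reading can be checked numerically: add one double to an otherwise unchanged (made-up) stat line, and the prediction moves by exactly the 2B coefficient.

```python
# Regression coefficients quoted above
REG = {"BB": 0.352, "1B": 0.609, "2B": 0.713, "3B": 1.14,
       "HR": 1.48, "SB": 0.129, "CS": -0.163}
INTERCEPT = -521

def predicted_runs(stats):
    return INTERCEPT + sum(REG[k] * stats[k] for k in REG)

# Invented stat line; only the doubles column changes between the two calls
team = {"BB": 550, "1B": 950, "2B": 300, "3B": 30,
        "HR": 180, "SB": 90, "CS": 40}
plus_one_double = dict(team, **{"2B": 301})

diff = predicted_runs(plus_one_double) - predicted_runs(team)
print(round(diff, 3))  # 0.713, the 2B coefficient
```

The linear-weights interpretation, by contrast, is a statement about base-out run expectancy states, not about a marginal change in a season total, which is one reason the two sets of numbers need not match.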

I wish I could do better. I'm working on it and appreciate any help.
