Base-Advance Average

  • Base-Advance Average

    Over at Retrosheet, in the "features" section under "research papers," there's a great piece by Gary Hardegree on a new metric that he proposes is more comprehensive and correlates better with both runs and wins than any other commonly available metric, even OPS+ and weighted batting average. In fact, the author proposes that base-advance average tracks wins at a 95% rate, while OPS does so at only an 84% rate.

    In fact, he states it correlates nearly as well with wins as runs themselves!!

    It requires complete PBP data but, ostensibly, is a much more situationally specific and accurate measure of offensive production than any of the other stats du jour, like OPS+ or RC.

    Interesting, too, was that baserunning is incorporated in the form of bases advanced. Not confined to SB and CS anymore...

    I wanted to hear what the statisticians thought of the assets and limitations of the study and the conclusions the author makes.

  • #2
    I did read that paper. My critique of it is that while he got his R-squared up a little over some of the usual metrics, it doesn't really tell us much more than "teams who get guys on base and advance them will score runs."

    It's not to say that it's an awful study. It's novel and the methodology is sound, and his conclusions are proper. It's just not the type of study that will completely revolutionize the field.
    Statistically Speaking

    The plural of anecdote is not data.



    • #3
      Originally posted by pizzacutter:
      It's not to say that it's an awful study. It's novel and the methodology is sound, and his conclusions are proper. It's just not the type of study that will completely revolutionize the field.
      So you don't believe it's a superior metric to OPS+, OWP, etc.? Why or why not? What's wrong with the correlation coefficients?

      Do you think it would behoove those interested in building offenses to look to base advance average before the other measures? Or for fans who want to find out which player contributed the most offensively overall?

      What do people see as the assets and limitations of the metric?



      • #4
        I guess one of the questions here is the good ole skill vs. production debate.

        If my interpretation of your use of the term "situationally specific" is right, then that, along with the fact that it needs full PBP, raises some questions about how it differs from a straight rate stat.

        If situational hitting isn't proven to be a discernible, repeatable skill, then are we not just rewarding players for the random variations related to the external circumstances under which their successes come?

        Player A just happened to get more hits with runners on - if that's not something that can be established as a skill, then we are tilting towards production. A simple rate stat may be a more telling measure of skill.

        I'm not saying I favor one way or the other, just throwing it out there.

        If it works the way I said, it would be worth it just to prove that A-Rod doesn't hit all his homers in the eighth inning of blowouts against the D-Rays though...
        THE REVOLUTION WILL NOT COME WITH A SCORECARD

        In the avy: AZ - Doe or Die



        • #5
          Counting bases in this manner has been around since the days of Chadwick, I believe. They even did an article in Esquire magazine 70 years ago about this trick.

          If the author takes his BAA one step farther and removes the runners left on base because of outs, he could simply divide by 4 and get exact runs, because in reality that is all he is really measuring. He is doing a roundabout way of measuring runs. Which is pretty much useless because A) we already know how many runs scored and B) it has no real predictive value.



          • #6
            Ubi is dead right about this.

            a. Runs scored = 4 bases gained
            b. runners left on 1B at end of inning = 1 base gained
            c. runners left on 2B = 2
            d. runners left on 3B = 3

            Total bases gained = a + b + c + d

            So, he's correlating "a" to "total bases gained".

            Ubi is saying why not also correlate b,c,d in addition to a to "total bases gained". You'll get r=1.00.
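The accounting in this post can be sketched in a few lines. This is only an illustration of the a + b + c + d identity as stated here, not the paper's exact definition of base-advance average; the function names and the sample numbers are invented for the example.

```python
def total_bases_gained(runs, left_on_1b, left_on_2b, left_on_3b):
    """Bases gained under this post's accounting: 4 per run scored,
    plus 1, 2, or 3 for each runner stranded on first, second, or third."""
    return 4 * runs + 1 * left_on_1b + 2 * left_on_2b + 3 * left_on_3b


def runs_from_bases(total_bases, left_on_1b, left_on_2b, left_on_3b):
    """Invert the identity: remove the stranded-runner bases, then divide by 4."""
    stranded = 1 * left_on_1b + 2 * left_on_2b + 3 * left_on_3b
    return (total_bases - stranded) // 4


# Made-up line: 5 runs, with 3, 2, and 1 runners stranded on 1B, 2B, and 3B.
bases = total_bases_gained(5, 3, 2, 1)
recovered = runs_from_bases(bases, 3, 2, 1)
print(bases, recovered)
```

Once the stranded runners are taken out, the total carries no information beyond runs scored, which is the point being made about the metric.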
            Author of THE BOOK -- Playing The Percentages In Baseball



            • #7
              I found this to be an interesting read, even if it wasn't over-enlightening. It struck me as being a strange way of measuring run contribution by looking at the change in run expectancy (and runs) before and after an event. Where this process differs is that it has improperly weighted the likelihood of scoring from a given base and as a result improperly weights the effects of the individual events.

              The metric at an individual level is also heavily weighted with situational dependence and clutch performance. (Note that I use the word performance as opposed to skill or ability.) I'm guessing you could come up with some sort of tracer stat for individuals that would start with some light weights for setup stats (e.g. bb, 1b, 2b, 3b, sb) and then get more heavily weighted with RBIs.

              The most interesting thing I got out of the paper was that the correlation of runs to wins was .958. Does this suggest that there were samples where the losing team actually outscored the winning team???



              • #8
                Originally posted by weskelton:
                The most interesting thing I got out of the paper was that the correlation of runs to wins was .958. Does this suggest that there were samples where the losing team actually outscored the winning team???

                No, the writer was simply not factoring in those runners left on base at the end of innings or games.



                • #9
                  Ubi,

                  Maybe you misunderstood me. I understand that the LOB essentially accounts for most of the difference between the BAA and runs scored. But the .958 correlation number I mentioned is what was quoted as the correlation of differential runs (runs-for minus runs-against) to wins. This is not the same as the correlation value of differential BAA, which was .956.



                  • #10
                    Looking at page 7 of that paper, it looks to me that the .958 correlation was based on the cumulative 5 years of data per team (30 data points), not on a game-by-game basis (12,000 data points). Clearly, game-by-game, runs would have a correlation of 1.00.
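A toy illustration of the aggregation point, using invented scores rather than anything from the paper: within a single game the team with more runs always wins, but once games are summed per team, a larger run differential no longer guarantees more wins.

```python
# Hypothetical (runs for, runs against) results for two invented teams.
team_a = [(10, 0), (1, 2), (1, 2)]   # one blowout win, two one-run losses
team_b = [(3, 2), (3, 2), (1, 3)]    # two one-run wins, one loss


def wins(games):
    """Count games in which the team outscored its opponent."""
    return sum(1 for runs_for, runs_against in games if runs_for > runs_against)


def run_differential(games):
    """Cumulative runs-for minus runs-against over all games."""
    return sum(runs_for - runs_against for runs_for, runs_against in games)


for name, games in (("A", team_a), ("B", team_b)):
    print(name, run_differential(games), wins(games))
```

Team A has the better cumulative differential and the worse record, which is why a correlation computed on per-team totals can fall short of perfect.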



                    • #11
                      This is pretty incredible. I proposed the very same stat just recently on my blog: http://www.offinlefffield.com/?p=26

                      I'm not sure who beat who to the stat, but I did write my post without reading his paper. Great minds, eh? Though it seems that much of the thread is less than fond of the stat, so perhaps I'm not as brilliant as I once thought.
                      --------------------
                      http://benchcoach.com/
                      --------------------
                      http://www.offinlefffield.com/
                      Sportswriter Mark Leff blabs about baseball



                      • #12
                        I've seen this stat at least a dozen times, as early as when I was a teenager doing it myself. This is all part of the path to take towards linear weights.



                        • #13
                          I have a question. I understand about base-advancing being situationally dependent, but aren't steals as well? A player has to get on base, usually first base, at a certain rate and in certain situations to be in position to steal.

                          Anyway, advancing on the bases has to be accounted for to tell how good a player is. Am I right that it can be equivalent to as much as a 15 run difference per season between the best and worst advancers? That's about +/-10 OPS+ points. It could be like adding 10 points to Mays's OPS+ or taking 10 points off of somebody like McGwire.

                          I know that situational hitting may not be repeatable, that is the debate, but I do think that players with different hitting approaches (high walk OB% versus high average OB%) will differ in their ability to produce in more important situations. This is based on the argument that not all runs are equal. I think that the higher BA guy will produce more runs in higher-leverage situations than the higher walk guy.

                          I looked, for example, at Joe Morgan in situations where a single would be much, much more valuable than a walk: runner on second and 2 outs, or runners on second and third and 2 outs. His batting average, I think, was basically the same or worse in those situations than in situations where a walk was closest in value to a single, such as leading off.

                          In other words, he did not seem to have the ability to turn into a .320 hitter rather than a walk machine when called for. I think that his overall stats dropped in those situations as well. I figured that he was basically a .300 hitter who let his average drop into the .270s over his career in exchange for 100+ walks, but it looks like he was a .270 hitter in all situations.



                          • #14
                            I haven't looked into Morgan in the way you describe, Brett. But, I have been singing your tune since my arrival here. I've grown to have great respect for the statistical work done in the field of baseball research, but the myriad variables raise an inherent problem.

                            How many times have you seen holding on a runner (or not) being the difference between a ground ball out and a run scoring double down the line? How many times have you seen a would-be routine double play ball turn into a first-to-third single because the runner broke for second and the fielder vacated his hole to cover the bag? Everything on the field is highly situational, whether it is repeatable or not.

                            I've also raised theories about the make-up of teams relating to the consistency of their offensive output, and that seems to fit here. If you take a team constructed primarily of lower average, high walking, power hitters and compare them to a team made up mostly of higher average lower slugging guys, and assume the two teams produce an equal amount of total runs - would you still be better off with one team or the other? I think the standard deviation of the slugging team's day-to-day output would be higher, feast and famine. This would mean that more of their runs would be unnecessary, as they would win a lot of blowouts, scoring way more than they needed to, but they would also have more games in which they didn't score a lot.

                            If a team scores 810 runs, exactly 5 per game, which team do you think would win more games, a team that scored exactly 5 every night, or one that scored the same runs over the course of the season in a randomly distributed way? Has anybody done any work along that line?



                            • #15
                              Originally posted by digglahhh:
                              If a team scores 810 runs, exactly 5 per game, which team do you think would win more games, a team that scored exactly 5 every night, or one that scored the same runs over the course of the season in a randomly distributed way? Has anybody done any work along that line?
                              What matters here is if the team is above average or below average. An above average team will win more with more consistent day to day production than with more erratic production. A bad team will win more with more erratic production.
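The claim above can be checked with a small exact calculation. Everything in this sketch is an assumption made for illustration: the opponent's run distribution is invented (mean 4 runs), the "erratic" team simply alternates between two scores with the same mean as the steady team, and ties count as half a win as a stand-in for extra innings.

```python
from fractions import Fraction

# Assumed opponent run distribution (invented), averaging 4 runs per game.
OPPONENT = {
    2: Fraction(1, 10),
    3: Fraction(2, 10),
    4: Fraction(4, 10),
    5: Fraction(2, 10),
    6: Fraction(1, 10),
}


def win_prob(own_scores):
    """Exact win probability when each score in own_scores is equally likely.
    Ties count as half a win (an assumption, not a rule of the game)."""
    total = Fraction(0)
    for own in own_scores:
        for opp, p in OPPONENT.items():
            if own > opp:
                total += p
            elif own == opp:
                total += p / 2
    return total / len(own_scores)


good_steady = win_prob([5])      # above-average team, exactly 5 every game
good_erratic = win_prob([3, 7])  # same mean of 5, feast or famine
bad_steady = win_prob([3])       # below-average team, exactly 3 every game
bad_erratic = win_prob([1, 5])   # same mean of 3, feast or famine

print(good_steady, good_erratic, bad_steady, bad_erratic)
```

Under these assumptions the above-average team wins more often when it is steady, and the below-average team wins more often when it is erratic, which matches the direction of the argument in this post. Different distributions would change the sizes of the gaps.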

