Yet another Baseball Statistic (yabs)

  • Yet another Baseball Statistic (yabs)

    I have often felt that the most important thing a batter can do is get a walk. Not that exciting. But it does win games. A walk is very productive. It avoids double plays, and it wears out the pitcher. A pitcher in a groove is likely to retire a dozen or more batters, until he gets tired. So, long live the walk.

    So, I've got my own baseball statistic. It's not only about walks, but they get more respect. I call my new statistic

    The productivity average.

    In short, it's a bit like OPS, but walks and moving the runners along get much more credit. In OPS, the slugging portion gives a home run 4 points. But it doesn't matter if there are runners on or not, so as far as the slugging average goes (and therefore OPS), a solo job and a grand slam count the same. To tell them apart, you need RBIs.

    Same with a walk. A two-out, bases-empty pass counts the same towards OPS as a bases-loaded walk with a bat flip. The latter is clearly more productive, and that's why my new statistic gives it more credit.

    Anyway, I've written a program to test my idea, and I've got a pdf file with all the gory details and some test runs.

    See here, https://www.dropbox.com/s/0kb2g9dbxh...llpdf.pdf?dl=0

    Any comments are appreciated, even if it turns out my idea has been mentioned zillions of times before.

  • #2
    Oh well, no takers.

    I've had a lot of fun with the BB database. I’ve updated the pdf file linked to in the first post. I ran the stats on the database, by decade, 4 decades, and the entire database. You’ll have to look to see who came out on top. There's one little barefoot surprise though.

    Consider that this stat does not need any human to decide on hit vs. error, and it’s easier to keep track of things. Nothing to remember about what’s a plate appearance, etc. It’s just like in Little League: you went up to bat, great; you get on, better. You move ’em up, still better. You hit it to the outfield and the fielders both called for the other guy to catch it, so you got 2 bases and everybody roars. You walk with the bases loaded, you’re a hero as much as the guy getting solo home runs. And if you keep getting doubled up because you can’t run, well, time to start doing those wind sprints.

    There’s a sample scoring-card at the very end based on the production scoring. It doesn’t take a pro scout to tell where the action was, who scored and who knocked em in. And it’s all some digits and some highlighter. No fancy codes. Check it out, scroll to the last page.
    Last edited by rocket888; 07-28-2019, 01:03 AM.

    • #3
      The productivity average considers that hitting with runners on base is much more valuable than when the bases are clear.
      So, unlike linear weights/WAR, it’s not context-independent. Hitters get credit for men on base, over which they have no control. It’s more like WPA (more precisely, run expectancy), except that, as I understand it, every base is equal, which is not the way WPA/RE works, or should work. E.g., in your system driving in a runner from first is worth 3x driving in a runner from third. In win probability/run expectancy, the ratio of the run values of the two situations is not only very different, but depends on the number of outs at the time.

      The regular stats such as ave, slg, obp, ops etc. don’t really consider the productive outs, or even the production value of moving runners, e.g. a single with man on first getting to 3rd.
      RE24 can be used to calculate productive outs, and BBRef has a whole section devoted to stats on moving runners over. But in the case of advancing a runner, e.g., first to third on a single, the baserunner gets credit for that, while the fielder loses credit.

      So overall, I see two problems with this. First, it’s not context independent, which is OK, there is room for context dependence in analytics, but it’s not a fair way to compare hitters. It isn’t even a measure of clutch, because you would want to divide the production achieved by the amount possible. This is what analytics does with WPA/LI, leverage index.

      Consider a simple example. A batter is up with the bases loaded. The maximum number of bases he can advance in this scenario is ten, with a GS. If he hits a single and drives in two runs, he will get a 5 or 6, depending on whether the runner from first stops at second or third (and to repeat, where he ends up reflects mostly his ability as a baserunner, and the fielder's defense, not so much anything the batter did). But say the runner goes to third. So in your system, the batter gets credit for advancing six bases. Out of ten, we could define a clutch index of 6/10 = 0.6. Very good.

      But suppose another player hits a HR with the bases empty. He advances four bases, but this is the maximum possible with the bases empty. So his clutch score is 4/4 = 1.0. This is the best possible.

      The 6 vs. 4 bases advanced is analogous to WPA in sabermetrics. While it's certainly a stat worth keeping track of, players with high WPA scores tend not only to be good, but to have PA with a lot of men on base (and in high-leverage situations). Clutch divides WPA by leverage, to get a value indicative of how well the batter did given the circumstances he faced. It would be the same with your bases-advanced metric.
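
      The bases-advanced arithmetic in the two examples above can be sketched in a few lines of Python. This is only an illustration of the 6/10 and 4/4 cases; the function names are hypothetical and are not part of the original program, and "clutch ratio" here just means bases advanced divided by the maximum possible.

```python
def max_bases_advanced(runners_on):
    """Most bases a batter can advance on one play (i.e. with a home run).

    runners_on: set of occupied bases, e.g. {1, 2, 3} for bases loaded.
    """
    # The batter covers 4 bases; a runner on base b covers the remaining 4 - b.
    return 4 + sum(4 - base for base in runners_on)


def clutch_ratio(bases_advanced, runners_on):
    """Bases actually advanced divided by the maximum possible in that situation."""
    return bases_advanced / max_bases_advanced(runners_on)


# Bases-loaded single: runners from third and second score (1 + 2),
# the runner from first reaches third (2), the batter reaches first (1).
print(clutch_ratio(6, {1, 2, 3}))  # 0.6

# Solo home run: 4 bases advanced out of a possible 4.
print(clutch_ratio(4, set()))      # 1.0
```

      Normalizing by the maximum is the simplest possible choice; it still treats every base as equal, which is the second objection raised below.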

      Second, bases advanced are not equal in value. They depend on which base and the number of outs.

      If your system is just a simple way to tabulate what hitters have done, fine. But it can't be used to compare the value of different hitters.
      Last edited by Stolensingle; 08-01-2019, 12:50 PM.

      • #4
        Originally posted by Stolensingle

        If your system is just a simple way to tabulate what hitters have done, fine. But it can't be used to compare the value of different hitters.
        Thank you for the detailed and thoughtful response. Obviously, I'm not trying to say my "production" statistic is the be-all and end-all of statistics. I just happen to value a player that moves up base runners. And since value is a subjective thing, it depends on who is doing the value measurement. And unless you exactly define what "value" is, I don't think it's completely accurate to say this statistic (or any other) can't be used to compare value.

        I call it productivity because it computes just what you say, what hitters have done. I use the term productive because so often one hears about a productive out, vs. an unproductive one. I wanted to see how much that might matter over the course of a player's career.

        When I watch a game, I often think the most valuable play is to get a walk. I grew up watching Richie Ashburn of the Phillies in the '50s. He could (purposefully, or so we believed) foul off 5-10 balls precisely behind home plate until he got a walk. I've not calculated it, but I suspect that most big innings begin with 1 or 2 walks with nobody out. And a bases-loaded walk is more exciting to me than a solo HR because of the tension involved. And it leaves the bases loaded, so the next play is equally exciting.

        But then of course, value and excitement are completely subjective; we each have our own preferences.

        WAR is a stat that is so complicated, I suspect nobody except its designers really knows how it is determined. The "prod" score is a measure of all the actions that led to runs being scored. My program can optionally total in the RBIs, but I find it doesn't change the results that much. And the prod is simple to compute. Just count the number of bases advanced, no matter what the cause, minus the outs. You don't need an official scorer to decide on the merit of a fielder. Over the long haul, I believe these things average out anyway.
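
        A minimal sketch of that counting rule in Python, for anyone who wants to try it by hand. The record layout and function names are assumptions made up for this illustration; the actual program and the PDF may handle details (the optional RBIs, at bats vs. plate appearances) differently.

```python
from dataclasses import dataclass


@dataclass
class PlateAppearance:
    bases_advanced: int  # bases gained by the batter plus every runner, whatever the cause
    outs_made: int       # outs recorded on the play (2 for a double play)


def prod_total(plate_appearances):
    """Total production: bases advanced minus outs, summed over all plays."""
    return sum(pa.bases_advanced - pa.outs_made for pa in plate_appearances)


def prod_average(plate_appearances):
    """Production per trip to the plate (0.0 for an empty list)."""
    if not plate_appearances:
        return 0.0
    return prod_total(plate_appearances) / len(plate_appearances)


sample = [
    PlateAppearance(4, 0),  # solo home run
    PlateAppearance(4, 0),  # bases-loaded walk: batter and three runners each move up one
    PlateAppearance(1, 1),  # ground out that moves a runner from second to third
    PlateAppearance(0, 2),  # double play, assuming nobody else advances
    PlateAppearance(0, 1),  # strikeout
]

print(prod_total(sample))    # 5
print(prod_average(sample))  # 1.0
```

        Note that the solo home run and the bases-loaded walk both score 4 here, which is the point of the statistic.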

        In most other statistics, if a player puts the ball in play and an error occurs, he gets the same credit as if it were a strikeout. Yet, perhaps a Pete Rose was hustling down the line and caused the shortstop to hurry and that led to the error. Is that not more "productive" than a strikeout?

        I've updated the pdf file and added a section near the end that is the total "prod" over all the players in the database. Sure, a player can be on a poor team all his career (e.g. Ernie Banks) and so not have as many opportunities to advance runners. And of course, longevity is important here. But I think this list is a pretty telling one.

        As for average productivity, i.e. production per at bat, my statistic led to the top 5 all time being the Babe, Lou Gehrig, Shoeless Joe, Ted Williams, and Rogers Hornsby. Joe didn't have many at bats in the database I used, but on average he moved 'em along with the best.

        Here's a sample of the production totals (not average) and I think it provides an interesting look at some long time players. And many do think Barry was the best all time player, but then that's of course quite subjective. It's just interesting to me that he does come out on top in this list.

        [Attached image: aaaprodt.jpg]

        • #5
          Sorry about the double posting of that table. I couldn't seem to format it, so I created a pic, then discovered it had been uploaded as an attachment as well. But perhaps the extra info about teams and years might be interesting. Besides, I don't know how to edit it out.

          In case anyone is interested, here's a link to a zip with a windows .exe and a couple other files needed to compute the data.

          https://www.dropbox.com/s/is324en35g...lprod.zip?dl=0

          When I found that Willie Mays was not at the top of my lists, I was curious, and found it was mainly due to his extra years when his production was lower. But when looking at just the years around 1954, he came out quite well. The program takes the decade files and can now also accept a year range, so you don't have to download every single year.

          Hopefully someone might find this interesting. As always, any comments are appreciated. This is after all simply a fun project.
          Last edited by rocket888; 08-05-2019, 11:14 PM.

          • #6
            Originally posted by rocket888

            Thank you for the detailed and thoughtful response. Obviously, I'm not trying to say my "production" statistic is the be-all and end-all of statistics. I just happen to value a player that moves up base runners. And since value is a subjective thing, it depends on who is doing the value measurement. And unless you exactly define what "value" is, I don't think it's completely accurate to say this statistic (or any other) can't be used to compare value.
            Sabermetrics does define value precisely. It's anything that increases the probability of winning a game. Every offensive event increases the probability that a run will score, and there is a known relationship of runs to wins. So value is defined as the increase in probability that a run will score. There is nothing subjective about that at all. The idea of any game is to win, so you base value on the extent to which it helps you do that.

            Your system will certainly correlate with winning, so I'm not surprised that it identifies some of the greatest players. A more context-independent version of your system would just be total bases, irrespective of the runners on base when the batter hit. In fact, total bases (or a similar stat that includes walks, which I call cumulative bases) has a strong correlation with runs and wins, about .90. It just isn't as good as a system based on linear weights.

            WAR is a stat that is so complicated, I suspect nobody except its designers really knows how it is determined.
            It really isn't that hard to understand. You just have to appreciate that walks, singles, doubles, triples and homers have a specific value that is determined by increase in runs. You add up all these values and divide by plate appearances to get wOBA. The difference between a player's wOBA and the league average wOBA gives you runs above average, and from that you get runs above replacement, then WAR. It's only tricky in that run values aren't linear--a double is not worth twice a single, a HR is not worth twice a double--and that what matters is not absolute run value, but run value relative to average or replacement.
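
            As a rough sketch of the calculation just described, here is the usual wOBA formula in Python. The weights, the league-average wOBA, and the scale factor below are illustrative assumptions only (the real values are recomputed every season), and the function names are hypothetical. The published formula uses AB + BB - IBB + SF + HBP as the denominator rather than raw plate appearances.

```python
# Illustrative linear weights (assumed values; real ones change each season).
WEIGHTS = {"BB": 0.69, "HBP": 0.72, "1B": 0.87, "2B": 1.22, "3B": 1.53, "HR": 1.94}


def woba(stats, weights=WEIGHTS):
    """Weighted on-base average from a dict of counting stats."""
    unintentional_bb = stats["BB"] - stats["IBB"]
    numerator = (
        weights["BB"] * unintentional_bb
        + weights["HBP"] * stats["HBP"]
        + weights["1B"] * stats["1B"]
        + weights["2B"] * stats["2B"]
        + weights["3B"] * stats["3B"]
        + weights["HR"] * stats["HR"]
    )
    # Intentional walks are removed from both numerator and denominator.
    denominator = stats["AB"] + stats["BB"] - stats["IBB"] + stats["SF"] + stats["HBP"]
    return numerator / denominator


def runs_above_average(player_woba, league_woba, plate_appearances, woba_scale=1.2):
    """Convert the wOBA gap to runs; woba_scale is an assumed, season-dependent constant."""
    return (player_woba - league_woba) / woba_scale * plate_appearances


# Made-up stat line, for illustration only.
hitter = {"AB": 500, "BB": 80, "IBB": 10, "HBP": 5, "SF": 5,
          "1B": 90, "2B": 30, "3B": 5, "HR": 30}
w = woba(hitter)
print(round(w, 3))
print(round(runs_above_average(w, league_woba=0.320, plate_appearances=590), 1))
```

            The non-linearity shows up directly in the weights: the double's weight is well under twice the single's, and the home run's is well under twice the double's.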
            Last edited by Stolensingle; 08-06-2019, 02:44 PM.

            • #7
              Thanks again. Obviously I’m just a newbie at this, and I’m only doing it for fun. You sound like you’re a professional.

              It's anything that increases the probability of winning a game. Every offensive event increases the probability that a run will score, and there is a known relationship of runs to wins. So value is defined as the increase in probability that a run will score.
              I see that you are providing a good definition of value. I like that definition. It could have been that value was determined by how many fans show up to see a player. The Babe had that sort of value in his last year away from NY.

              So, I did take a look at wOBA and a few things immediately came to mind.

              The denominator subtracts intentional walks out of the equation. As an avid Mike Trout fan, I think he scores quite a bit after an intentional walk. I make a mental note of that as it gives me (and Mike) a big smile. And that certainly is something that increases the probability of winning a game. If WAR is computed as you said, then I would suspect that his intentional walks produce quite a bit more than the average player, as they typically need to earn their walks and often strike out trying. So, it seems a bit of subtracting oranges from apples.

              Besides, does someone record when Mike is thrown 3 high and tight, since it’s often an obvious pitch-around? I’m not aware that there’s any record of that, although I guess it could be computed with all the data collected today.

              I recall Barry Bonds was walked an extraordinary number of times. Does the wOBA claim those walks didn’t produce some wins?

              And there’s nothing there about double plays. As an Angels fan, I cringe every time Pujols comes up with a man on first. He’s simply so prone to a double play. As of late, teams seem to pitch around Trout knowing that Pujols is likely to get Mike out for them if Ohtani doesn't reach. I don’t see anything in wOBA that considers the probability of that as it affects a win.

              And there’s also nothing about errors. I sometimes watch Philly games, and get really ticked when a particular runner “assumes” he’s out and doesn’t run hard. The Dodger announcers discussed that mistake for 5 minutes a few weeks back as it completely changed the complexion of a game. That player is certainly not like Pete Rose when he played for the Phillies. I believe one way to determine the effect of hustle is to count how many times a player reaches on an error and give credit for such.

              And our working value definition said, "anything" so we shouldn't be deducting from the results for bad style points, with errors and intentional walks.

              Now I’m certainly no expert and haven’t tested my statistic against actual wins and losses. But I do know a bit about percentages and probability theory. And if the weights have to be computed after the fact each year, then this says that next year could be quite different. So there are many things that can change in a year. Just consider this year’s baseball, as it’s said to be more balanced and thus flies further.

              Also, a player who gets a big signing, like a Pujols, and then moves to a city where he doesn’t have the same support players around him, often does terribly. Maybe S.L. knew his time was up. But when I run my statistic on him for the years he was in S.L. vs. his time with the Angels, it tells me he’s not the same producer he was before coming to LA. And the Angels haven't won that much with him in the lineup.

              But one thing is clear: the more runners moving around the bases, the more scoring there’s going to be. A runner’s skill is only involved in a close play. You should count all plays, even runners jogging into a base or home, or advancing on walks, ’cause those are all on the batter. And as you said, it does seem to correlate well with both present and previous players.

              Anyway, thanks so much for your input. I really do appreciate it.
              Last edited by rocket888; 08-06-2019, 11:03 PM.

              • #8
                I have suggested in the past that strikeouts may be undervalued because we don’t account for the fact that they use more than an average number of pitches. Rather I might suggest that players may deserve some sabermetric value from using up more pitches from any mechanism such as foul balls but it would have to be separated from the value of their eventual outcome.

                • #9
                  Originally posted by brett
                  I have suggested in the past that strikeouts may be undervalued because we don’t account for the fact that they use more than an average number of pitches. Rather I might suggest that players may deserve some sabermetric value from using up more pitches from any mechanism such as foul balls but it would have to be separated from the value of their eventual outcome.
                  Someone raised that question at a FG chat--whether there was value in going deep in the pitch count--and the writer said that studies that had been done so far on that hadn't shown any.

                  • #10
                    Originally posted by brett
                    I have suggested in the past that strikeouts may be undervalued because we don’t account for the fact that they use more than an average number of pitches. Rather I might suggest that players may deserve some sabermetric value from using up more pitches from any mechanism such as foul balls but it would have to be separated from the value of their eventual outcome.
                    Yes, especially in this day of pitcher counts being so important.

                    I was a bit unhappy with the change to the intentional walk. I often thought that forcing a pitcher to throw 4 wide ones could throw him off his game. How many times did we see the next batter get walked unintentionally because the pitcher had lost his rhythm? And when we see a batter stay alive for 10 or so pitches, it gets the crowd into it and frustrates the pitcher. Even if he finally strikes out, it can make a difference. I saw one recently where a pitcher was at bat for a long time. It was quite an event. He got larger cheers than anyone else that day.

                    Hmmmm, the pitches are mostly there in the database I'm using. I'll have to look at that. Not sure how I would score it though.

                    Unfortunately, the pitch data doesn't start until the 90s, so I don't think I can use that much.
                    Last edited by rocket888; 08-13-2019, 10:01 PM.
