
Trouble with Park Factor


  • Trouble with Park Factor

    Park Factors are a flawed mechanism when used to compare individual players from different teams. PF is a measurement of total runs at home compared to total runs away. Yet this number is then used to adjust individual players' batting lines: stats like OPS, AVG, home runs, runs, and RBIs, as well as doubles, triples, hits, and walks. The trouble with doing this is that it assumes that all these different events are affected by the park the same way total runs are, and that individual players all accumulate stats in basically the same way. In other words, if a park increases scoring by 10%, then it also increases home runs by 10%, as well as hits, doubles, and so on, and individual players will hit 10% more home runs at home than on the road.
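For anyone who wants to see the arithmetic, here is a minimal sketch of a home-versus-road factor. It assumes the factor is a per-game rate at home divided by the same rate on the road, counting both teams; the function name and the totals in the example are made up for illustration. The same function works for any event (runs, home runs, walks), which is how a component factor would be built.

```python
def rate_factor(home_total, home_games, road_total, road_games):
    """Home per-game rate divided by road per-game rate, as a percentage.

    Totals are assumed to count both teams in each game.
    """
    home_rate = home_total / home_games
    road_rate = road_total / road_games
    return 100.0 * home_rate / road_rate


# Hypothetical team: 880 total runs in 81 home games, 800 in 81 road games.
run_factor = rate_factor(880, 81, 800, 81)
print(round(run_factor))
```

Feeding home run totals instead of run totals into the same function gives a home run factor, and nothing forces the two numbers to agree.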

    By looking at the components we see that this isn't even close to the truth. In the chart below, the first stat listed is the traditional Park Factor, the one calculated by comparing runs at home against runs away. The next is for home runs, done the same way, then hits, doubles, triples, and walks.


    Code:
    Team        ParkFac  HRFac  HitFac  2BFac  3BFac  BBFac
    Rockies      121%    112%   112%    116%   133%   109%
    Rangers      111%    104%   105%    104%   133%   100%
    White Sox    107%    120%   105%     99%    94%   101%
    Blue Jays    106%    106%   103%    104%   110%   105%
    Cubs         106%    116%   102%    100%    96%    99%
    Red Sox      106%     99%   105%    117%    80%    99%
    Orioles      104%    103%   101%    102%    85%    98%
    Giants       103%     95%   104%    105%   141%    97%
    DBacks       103%    115%   103%    106%   115%   101%
    Twins        102%     96%   101%     98%    87%    97%
    Brewers      102%     99%    99%    108%   112%   106%
    Phillies     101%    107%    99%     93%   118%   100%
    Athletics    101%    104%    99%     99%    84%   103%
    Astros       100%    104%   100%     95%   122%    96%
    Braves       100%    106%    99%     97%    87%   101%
    Mets          99%     90%   102%     99%    73%    99%
    Angels        99%    103%   101%     93%    79%    98%
    Indians       98%     87%    98%    106%    89%   106%
    Cardinals     97%     90%   100%    102%   113%   103%
    Tigers        96%     94%   100%     91%   142%    99%
    Yankees       96%    101%    98%     94%    77%    97%
    Pirates       96%     94%    99%    104%    90%    95%
    DRays         96%    100%    97%     95%   102%   101%
    Royals        96%     85%    99%     97%   121%    99%
    Dodgers       95%    101%    98%     88%    80%    92%
    Expos         95%     93%    97%    107%    94%   100%
    Marlins       95%     99%    98%     97%   101%   103%
    Reds          92%    102%    95%     93%    77%    94%
    Padres        92%     85%    95%     95%   126%   102%
    Mariners      92%    102%    93%    102%    74%   102%

    Take the Red Sox as an example. Last year they had a PF of 106. Most people would then use that number to adjust a player's OPS and home run totals, saying something like: Player A batted .320 with an OPS of .950 and 40 home runs, but once we adjust it he is batting .302 with an OPS of .896 and 38 home runs. But by looking at the components we see that the player's HR total should not have been adjusted downward (99 HR factor), that his hits were increased not by 106 but by 105, and that his walk total was not increased by 106 but decreased by 99. Before park factoring, this player would have an OPS roughly 24% better than the league; after PF it would be 17% better than the league. But if we adjusted each individual stat by its own park factor, his OPS would be at least 20% better than the league.
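The Player A arithmetic can be reproduced in a few lines. This is only a sketch that mirrors the post's own approach of dividing a raw stat by the full factor; the helper name `neutralize` is invented here, and the 106 and 99 come from the Red Sox row of the chart.

```python
def neutralize(value, factor_pct):
    """Divide a raw stat by a park factor given as a percentage (106 -> 1.06)."""
    return value / (factor_pct / 100.0)


raw_hr, raw_avg, raw_ops = 40, 0.320, 0.950

# Everything adjusted by the overall run factor (106), as in the post.
print(round(neutralize(raw_hr, 106)))       # 38
print(round(neutralize(raw_avg, 106), 3))   # 0.302
print(round(neutralize(raw_ops, 106), 3))   # 0.896

# Home runs adjusted by the Red Sox home run factor (99) instead.
print(round(neutralize(raw_hr, 99), 1))     # 40.4
```

With the run factor the player loses two home runs; with the home run factor he keeps all forty and picks up a fraction more.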


    Of course all this ignores the other obvious flaws in PF: it ignores which side of the plate you bat from, and it assumes that the individual batter played every game, with the same ratio of home and away games as his team. For example, if you have a player who played 130 games and missed 3 games at Coors Field, 3 games at Arlington, and 6 games at the Cell, his park-factor-adjusted numbers are going to be radically different from those of his teammates who did play those games.
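One way to make the playing-time point concrete is to weight each park by the games a player actually appeared in. The sketch below is an assumption-heavy illustration, not anybody's published method: it treats each park's run factor as a simple multiplier, the schedule is invented, and the "Home" and "Other road" entries are set to 100 purely as placeholders. Only the Rockies, Rangers, and White Sox factors come from the chart.

```python
def weighted_run_environment(games_by_park, factor_by_park):
    """Games-weighted average of park run factors (percent) for one player."""
    total_games = sum(games_by_park.values())
    weighted = sum(games * factor_by_park[park]
                   for park, games in games_by_park.items())
    return weighted / total_games


factors = {"Home": 100, "Rockies": 121, "Rangers": 111,
           "White Sox": 107, "Other road": 100}

# Hypothetical teammate who played the whole schedule.
everyday = {"Home": 81, "Rockies": 3, "Rangers": 3,
            "White Sox": 6, "Other road": 69}
# Hypothetical player who sat out exactly those twelve road games.
part_time = {"Home": 81, "Rockies": 0, "Rangers": 0,
             "White Sox": 0, "Other road": 69}

print(weighted_run_environment(everyday, factors))
print(weighted_run_environment(part_time, factors))
```

The size of the gap depends entirely on which games were missed and how extreme those parks are, so the inputs matter more than the function.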

  • #2
    Park Factors

    Well,
    Personally I'd like to see deeper center fields. Why? More triples and inside-the-park home runs. Stretching doubles into triples is exciting to watch, and with today's smaller parks, as compared to the deeper ones of earlier eras (case in point, Yankee Stadium), you just don't see the three-bagger enough. A stand-up triple! Wow! A bases-clearing triple... will he go home or stay at third? This adds to the excitement of baseball.



    • #3
      Cubbie,

      So what would you rather do? Just leave it unadjusted and call a season at Coors the same as one at Fenway, and the same as one at Dodger Stadium?

      Like anything else, park factor isn't perfect. But it's much, much better than nothing.
      "Simply put, the passion, interest and tradition surrounding baseball in New York is unmatched."

      Sean McAdam, ESPN.com



      • #4
        Originally posted by ElHalo
        Cubbie,

        So what would you rather do? Just leave it unadjusted and call a season at Coors the same as one at Fenway, and the same as one at Dodger Stadium?

        Like anything else, park factor isn't perfect. But it's much, much better than nothing.


        Yes, ballpark factors are not perfect, just like any other stat.

        Without a ballpark factor, Todd Helton and Larry Walker would be ranked close to Bonds, Ruth, Gehrig. Without a ballpark factor, all time lists would be skewed and have tons of asterisks and symbols for explanations.
        Last edited by antihipster; 02-14-2005, 08:56 AM.
        unofficial Cardinals
        Playing HardballUpdated 12-06-07

        Congratulations Cardinals in 2006 World Series
        Winners in 1926, 1931, 1934, 1942, 1944, 1946, 1964, 1967, 1982, & 2006



        • #5
          Originally posted by ElHalo
          Cubbie,

          So what would you rather do? Just leave it unadjusted and call a season at Coors the same as one at Fenway, and the same as one at Dodger Stadium?

          Like anything else, park factor isn't perfect. But it's much, much better than nothing.

          Is it much better than nothing?

          Take a look at the Rockies' PF. It is 121. Yet home runs and hits are 112. So reducing SLG by 121 would be wrong. In fact it would be way off. How is a number that has an air of authenticity but is in actuality horribly wrong better than nothing? At least with nothing the viewer knows the number is not honest.

          What I expect people to do is the same thing I expect them to do whenever they want to analyze the players and the game, which is do the work. If you are going to say that Player A is better than Player B, don't just look at some ink scores and OPS+. Look at what type of player each one is, how that style of play is affected by his home park, and so on.

          Whenever something like OPS+ is debated, people always say they know about the limitations of Park Factor but it is the best out there, or something along those lines. Yet when it comes time to talk about players they always say things like "Player A has an OPS+ of 117 and Player B has an OPS+ of 112, so Player A is better." They will spout that off, and often they won't even know what the park factors were for each player. They are just repeating something they read on BRef. They don't know if the park factor is even accurate for the individual players. They ignore it all, blindly follow what is written on the page, and go by whichever number is larger. It could be that the OPS+ of 117 is unfairly getting a bonus while the OPS+ of 112 is unfairly getting a penalty. Or it could be the opposite, and the difference is really much larger than that.

          In the end Park Factor measures game runs at home versus game runs away. Why somebody would use that to adjust individual players' individual stats is beyond me. To me that is like trying to use a Canadian dollar to buy something in America. Yes, the Canadian dollar has value, but not in that environment.



          • #6
            Originally posted by cubbieinexile
            Is it much better than nothing?

            Take a look at the Rockies' PF. It is 121. Yet home runs and hits are 112. So reducing SLG by 121 would be wrong. In fact it would be way off. How is a number that has an air of authenticity but is in actuality horribly wrong better than nothing? At least with nothing the viewer knows the number is not honest.
            You're not understanding how park factors work.

            Yes, the PF is 121 for Coors. No, nobody (should) adjust SLG by 121. The PF is a measure of relative runs scored. Relative runs scored tends to be roughly proportionate to OBP * SLG, and thus you can use a PF for OPS with some degree of accuracy. If you're just adjusting SLG or OBP, then you wouldn't use the regular park factor, you'd use its square root. In this case, the square root of a 121 park factor is 110, so you'd use a 110 factor to adjust SLG or OBP.
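The square-root step is easy to check numerically. This is just a sketch of the rule of thumb described above, under the stated assumption that runs scale roughly with OBP * SLG and that the park moves both by the same proportion; `component_factor` is a name invented for the example.

```python
import math


def component_factor(run_factor_pct):
    """Square root of a run park factor, returned as a percentage.

    Assumes runs ~ OBP * SLG and that OBP and SLG are inflated equally.
    """
    return 100.0 * math.sqrt(run_factor_pct / 100.0)


print(round(component_factor(121)))  # 110
```

Applying 110 to OBP and 110 to SLG multiplies their product by 1.10 * 1.10 = 1.21, which recovers the original run factor.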


            • #7
              Originally posted by antihipster


              Yes, ballpark factors are not perfect, just like any other stat.

              Without a ballpark factor, Todd Helton and Larry Walker would be ranked close to Bonds, Ruth, Gehrig. Without a ballpark factor, all time lists would be skewed and have tons of asterisks and symbols for explanations.

              All time lists of what?

              The only list where PF is used is OPS+. The rest are not park-factored, and yet there are no asterisks and no need for explanations. People are not idiots; they understand environments. People know when they see Dante Bichette on a list that he achieved it with much help from Coors. People know when they see Ed Delahanty on a list that he achieved it because he got to play baseball right when they moved the mound back. People know when they see Bob Gibson on a list that he was helped by one of the greatest pitchers' eras since the deadball era. Same for Sandy Koufax. There is no need to put an asterisk on a stat accrued in 1930 or in 1893. If you know what is going on, you know why these numbers were achieved.


              Let's look at Barry Bonds. From 2000 to 2003 Barry Bonds actually got a bonus, and quite a substantial one, because he supposedly played in a pitcher's park. The PF was usually 91, which means his OBP and SLG were inflated by 9%. Here is his line:
              Code:
              Barry Bonds (AVG/OBP/SLG)
              Split   2003             2002             2001             2000
              Home    .369/.569/.805   .351/.564/.750   .335/.516/.915   .321/.449/.741
              Away    .313/.485/.692   .386/.596/.842   .321/.514/.817   .291/.431/.633
              Does it look like Barry Bonds' home stats were suppressed by 18% every year? Because that is what a PF of 91 is claiming: since only half his games are at home, a 9% overall adjustment implies that offensive stats are suppressed by roughly 18% at home.
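To put a number on those splits, here is a small sketch that compares home OPS to away OPS for each season. It assumes the three figures in each cell are AVG/OBP/SLG with the decimal points dropped, and it uses a plain OPS ratio as a rough yardstick, which is a choice made for this example rather than anything a park factor formally measures.

```python
# (AVG, OBP, SLG) for home and away, taken from the lines above.
splits = {
    2003: ((0.369, 0.569, 0.805), (0.313, 0.485, 0.692)),
    2002: ((0.351, 0.564, 0.750), (0.386, 0.596, 0.842)),
    2001: ((0.335, 0.516, 0.915), (0.321, 0.514, 0.817)),
    2000: ((0.321, 0.449, 0.741), (0.291, 0.431, 0.633)),
}


def home_road_ops_ratio(home, away):
    """Home OPS divided by away OPS, as a percentage (100 = no difference)."""
    _, home_obp, home_slg = home
    _, away_obp, away_slg = away
    return 100.0 * (home_obp + home_slg) / (away_obp + away_slg)


for year, (home, away) in sorted(splits.items()):
    print(year, round(home_road_ops_ratio(home, away)))
```

A ratio above 100 means he hit better at home that year. A park that really cut offense by 18% would show up as something near 82.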



              • #8
                Originally posted by ElHalo
                You're not understanding how park factors work.

                Yes, the PF is 121 for Coors. No, nobody (should) adjust SLG by 121. The PF is a measure of relative runs scored. Relative runs scored tends to be roughly proportionate to OBP * SLG, and thus you can use a PF for OPS with some degree of accuracy. If you're just adjusting SLG or OBP, then you wouldn't use the regular park factor, you'd use its square root. In this case, the square root of a 121 park factor is 110, so you'd use a 110 factor to adjust SLG or OBP.

                No you wouldn't. You are not understanding how OPS+ works. OPS+ does OBP and SLG separately, and it uses basic Park Factor scores.
                Adjusted OPS+
                This value is calculated differently from the Total Baseball PRO+ statistic. I chose OPS+ to make this difference more clear. PRO+ as best I can tell is

                PRO+ = 100 * ( OBP/lgOBP + SLG/lgSLG - 1)/BPF

                Where lgOBP and lgSLG are the slugging and on-base percentage of a league-average player, and BPF is the batting park factor. This takes into account the difference in runs scored in a team's home and road games, so it doesn't depend on how good an offense or defense a team has.

                My method is slightly more complicated, but I think it is more correct. The BPF is set up for runs and the way it is implemented in PRO+ applies it to something other than runs.

                My method
                Compute the runs created for the league with pitchers removed (basic form) RC = (H + BB + HBP)*(TB)/(AB + BB + HBP + SF)
                Adjust this by the park factor RC' = RC*BPF
                Assume that if hits increase in a park, that BB, HBP, TB increase at the same proportion.
                Assume that Outs = AB - H (more or less) do not change at all as outs are finite.
                Compute the number of H, BB, HBP, TB needed to produce RC', involves the quadratic formula. The idea for this came from the Willie Davis player comment in the Bill James New Historical Baseball Abstract. I think some others, including Clay Davenport have done some similar things.
                Using these adjusted values compute what the league average player would have hit lgOBP*, lgSLG* in a park.
                Take OPS+ = 100 * (OBP/lgOBP* + SLG/lgSLG* - 1)
                Note, in my database, I don't store lgSLG, but store lgTB and similarly for lgOBP and lg(Times on Base), this makes calculation of career OPS+ much easier.
                That was from the glossary of BRef. It credits Total Baseball park factors at the end, and it says that they (BRef) park factor the run environment and then assume that everything else happens at the same rate. Which obviously it doesn't.
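The glossary steps can be sketched in code. What follows is one reading of that description, not BRef's actual implementation: on-base events and total bases are scaled by a single multiplier `x`, outs are held fixed (taken here as AB - H + SF so the OBP denominator stays consistent), and `x` is the positive root of the resulting quadratic. The function names are invented, and `bpf` is a multiplier (1.00 is neutral) rather than a percentage.

```python
import math


def park_adjusted_league_line(ab, h, bb, hbp, sf, tb, bpf):
    """Return (x, lgOBP*, lgSLG*) for league totals with pitchers removed."""
    on_base = h + bb + hbp
    pa = ab + bb + hbp + sf
    outs = pa - on_base                  # AB - H + SF, assumed unchanged by the park
    rc = on_base * tb / pa               # basic runs created
    target = rc * bpf                    # RC' = RC * BPF

    # Scaling H, BB, HBP, TB by x gives RC(x) = x^2 * on_base * tb / (outs + x * on_base).
    # Setting RC(x) = target: on_base*tb*x^2 - target*on_base*x - target*outs = 0
    a = on_base * tb
    b = -target * on_base
    c = -target * outs
    x = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

    lg_obp = x * on_base / (outs + x * on_base)
    lg_slg = x * tb / ((ab - h) + x * h)
    return x, lg_obp, lg_slg


def ops_plus(obp, slg, lg_obp_adj, lg_slg_adj):
    """OPS+ = 100 * (OBP/lgOBP* + SLG/lgSLG* - 1)."""
    return 100 * (obp / lg_obp_adj + slg / lg_slg_adj - 1)


# Hypothetical league totals and a hypothetical batter in a park with bpf = 1.10.
x, lg_obp, lg_slg = park_adjusted_league_line(5500, 1450, 550, 60, 45, 2300, 1.10)
print(round(x, 3), round(lg_obp, 3), round(lg_slg, 3))
print(round(ops_plus(0.400, 0.550, lg_obp, lg_slg)))
```

Because outs are held fixed while everything else scales, `x` comes out smaller than the run factor itself, so the adjusted lgOBP and lgSLG each move by less than the full park factor.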



                • #9
                  Originally posted by cubbieinexile
                  No you wouldn't. You are not understanding how OPS+ works. OPS+ does OBP and SLG separately, and it uses basic Park Factor scores.


                  That was from the glossary of BRef. It credits Total Baseball park factors at the end, and it says that they (BRef) park factor the run environment and then assume that everything else happens at the same rate. Which obviously it doesn't.
                  Yes, it does assume that everything happens at the same rate, but for much simpler purposes than you're claiming. It assumes that so that a park factor for OBP and SLG can be used independently of having to do one for BB, H, TB, etc. While this isn't precisely true, it's not going to be horrendously off either.

                  And what I said was indeed true. Look at the formula for OPS+ again.

                  100 * (OBP/lgOBP + SLG/lgSLG - 1) / PF.

                  Note that it uses relative OBP + relative SLG. Roughly speaking (again, this is all rough work, no exacts used here), OBP + SLG will be distinctly proportionate to OBP * SLG. I.e., as one goes up, the other goes up, pretty constantly.

                  Now. We know that park factors will be fairly accurate at determining OBP * SLG differences for a given park (not exact, again, but fairly accurate). We know that OBP + SLG will be roughly proportionate to OBP * SLG. Therefore, we can say that PF will be fairly accurate at determining OBP + SLG.

                  However, it's not accurate at determining OBP or SLG independently of each other. Since the park factor works with runs, and runs are roughly proportional to OBP * SLG, then for PF`, the park factor of just OBP, say, we'd need to take (PF / SLG) to get an accurate measure. Since OBP and SLG will tend to be roughly equal (again, this is all in approximates, but a .350 OBP is numerically a rough equivalent of a .450 SLG), we take the square root of the PF to get PF`.

                  You follow? So yes, it assumes, for one small part of the math, that hits and walks will change at the same rate, which they won't. However, we NEVER use that 121 park factor for Coors to determine an SLG factor. Note where he talks about using the quadratic formula to compute lgOBP* and lgSLG*. This is not the same as the formula used for lgOPS*.
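The claim that the run factor works tolerably on relative OBP plus relative SLG can be checked with a quick sketch. It assumes, as the argument does, that a park inflates OBP and SLG by the same multiplier, the square root of the run factor; a league-average hitter's (OBP/lgOBP + SLG/lgSLG - 1) in that park is then 2 * sqrt(PF) - 1. The function name is invented for the example, and PF is a multiplier here (1.21, not 121).

```python
import math


def inflated_relative_sum(pf):
    """Relative OBP + relative SLG - 1 when both are inflated by sqrt(pf)."""
    f = math.sqrt(pf)
    return f + f - 1


# Run factors taken from the range of the chart (Padres, neutral, Red Sox, Rockies).
for pf in (0.92, 1.00, 1.06, 1.21):
    print(pf, round(inflated_relative_sum(pf), 4), round(pf - inflated_relative_sum(pf), 4))
```

The gap between PF and the inflated sum is (sqrt(PF) - 1) squared, so it is about 0.01 at Coors and much smaller for ordinary parks. That only holds under the equal-inflation assumption, which is exactly what the component factors call into question.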


                  • #10
                    Actually, BRef no longer claims that the other events happen at the same rate, like they used to. It appears that BRef changed its OPS+ within the last year or so. They used to go by the Total Baseball method, and they used to park factor OBP and SLG separately. I know because I took part in the discussion on Baseball Primer when he was setting up OPS+ for the first time for his site.

                    What they do now is park factor the runs created of the league average. They then use this difference to create a stat line. It cannot go up or down by the same rate because there is not a one-to-one relationship between hits and runs created. So they use a quadratic formula to find out how many offensive events must occur to make the new runs created. Once they have that, they can figure out lgOBP and lgSLG.
                    Last edited by cubbieinexile; 02-14-2005, 10:22 AM.



                    • #11
                      Originally posted by ElHalo
                      Yes, it does assume that everything happens at the same rate, but for much simpler purposes than you're claiming. It assumes that so that a park factor for OBP and SLG can be used independently of having to do one for BB, H, TB, etc. While this isn't precisely true, it's not going to be horrendously off either.
                      How so? Coors' park factor is 121. Coors' triples factor is 133. Coors' home run factor is 112.

                      The data is now out there, so why shouldn't we look at each stat separately? Why should we assume that a park affects lefties and righties equally?

                      Look at Dodger Stadium: it hurts everything except home runs. Don't you think that a power fly-ball hitter will be affected by the park differently than a contact line-drive hitter? The power hitter's slugging is not going to be affected much if all he does is walk and hit homers (McGwire type), while a player like Tony Gwynn is going to see his stats affected much more. In fact it is entirely possible for McGwire to hit more home runs at Dodger Stadium than he would have hit at Busch Stadium. McGwire and his style of play are the least likely to be affected by Dodger Stadium, yet he would get the same bonus as Gwynn.



                      • #12
                        The more I look at BRef's new OPS+ the more concerned I get. They take a park factor based on actual runs and then use it on hypothetical runs. Runs Created is generally off from actual runs by around 5% or more. It would be interesting to see what the Park Factors would be if we used runs created instead of actual runs, and how much of a difference there would be, if any.



                        • #13
                          I just crunched some numbers for Coors Field last year and found some things out. Coors Field had an OBP factor of 107, a SLG factor of 110, and a run factor of 121.

                          So, for instance, if we were to figure OPS+ using the most common method out there, which is the Total Baseball method (100 * ( OBP/lgOBP + SLG/lgSLG - 1)/BPF), then Helton's OPS+ would be 156.

                          It would be 156 because the park factor is based on runs and not the components. If we were to use the component PFs, Todd Helton's OPS+ would be 166.

                          To me that is a big difference, one that should not be ignored. By taking the easy way out, the numbers are artificially lowered by 10 points.
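Here is a rough sketch of the two calculations side by side. The Total Baseball formula is the one quoted above; the component version, which scales lgOBP and lgSLG by their own factors, is one plausible way to do it and not necessarily how the 166 was obtained. The batter and league lines are hypothetical placeholders, so the output will not reproduce Helton's numbers; only the 1.21, 1.07, and 1.10 factors come from the post.

```python
def ops_plus_total_baseball(obp, slg, lg_obp, lg_slg, bpf):
    """100 * (OBP/lgOBP + SLG/lgSLG - 1) / BPF, with BPF as a multiplier."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1) / bpf


def ops_plus_component(obp, slg, lg_obp, lg_slg, obp_factor, slg_factor):
    """Same idea, but the league OBP and SLG are each scaled by their own park factor."""
    return 100 * (obp / (lg_obp * obp_factor) + slg / (lg_slg * slg_factor) - 1)


# Hypothetical batter and league lines (not Helton's actual 2004 figures).
obp, slg, lg_obp, lg_slg = 0.450, 0.600, 0.330, 0.420

print(round(ops_plus_total_baseball(obp, slg, lg_obp, lg_slg, 1.21)))
print(round(ops_plus_component(obp, slg, lg_obp, lg_slg, 1.07, 1.10)))
```

Whenever the OBP and SLG factors are both smaller than the run factor, as they are for Coors here, the component version comes out higher for an above-average hitter.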
                          Last edited by cubbieinexile; 02-15-2005, 09:01 AM.



                          • #14
                            I took a look at what the park factor would be for Coors if we only used a basic runs created formula. The Park Factor would be 122 compared to 121. Not a big deal, though I have no idea if this is consistent for every team. So far I have only run numbers for the 2004 Rockies.

                            BRef's new way of doing OPS+ has intrigued me. I had no idea they switched around their formula. They must have done it only a few months ago. I like that they are now only comparing players' OPS to other hitters and not pitchers. I think that is a plus. For instance, Todd Helton using the old method has an OPS+ of 156. Comparing Todd only to other hitters, he has an OPS+ of 148.

                            I was skeptical of the runs created part at first, but it seems to me that it is more accurate than the Total Baseball way. Using runs created to come up with a park-adjusted OBP and SLG gets you this:
                            OBP: .368
                            SLG: .475

                            Using the real data accrued during the season gets you a park-adjusted league average of this:
                            OBP: .365
                            SLG: .481

                            Not bad. Using the runs created method gives Todd Helton an OPS+ of 158, and using real data gives him an OPS+ of (drumroll please) 158. The same, but that is of course luck. Or at least I think it is. It just so happens that one went up enough and the other went down enough to cancel each other out. Normally there is going to be a couple of points' difference between the two numbers. Possibly more: it depends on whether runs created always overvalues OBP and undervalues SLG. If it doesn't, meaning that RC can undervalue both or overvalue both at the same time, the difference can get even higher. I haven't done enough numbers yet to see what exactly happens. Anyway, if we were to do this with Vinny Castilla, his RC OPS+ would be 103 and his real-data OPS+ would be 102. Total Baseball's OPS+ would be 105.

                            Does this mean I like OPS+ now? No, it doesn't. What I am showing is that PF is highly subjective. Using Total Baseball's method we get a number that is much different than the other methods. Even if we adjust TB's method only by excluding pitchers, we get a number that is off by a few points from both methods as well. TB's OPS+ for Vinny would be 99; Helton's would be 148. One is off by 3 to 4 points, the other by 10 points. Using the newer quadratic formula method is better, but it still comes up with different numbers. Finally, park factors, even the ones I am using, still ignore the platoon factor and the playing time factor. Last year Vinny did not play in 14 games, and 57% of those games missed were road games. What about somebody who played even less, like Matt Holliday, who missed 41 games and got to play a larger share of home games than his team did? What about players who miss the first month of the season versus players who miss the last part of the season?
                            Last edited by cubbieinexile; 02-15-2005, 10:58 AM.



                            • #15
                              Originally posted by cubbieinexile
                              The data is now out there, so why shouldn't we look at each stat separately? Why should we assume that a park affects lefties and righties equally?
                              When you use an overall run factor, you are not assuming that the park affects everybody equally. OPS+ is not designed to measure what a player would do if magically moved to a neutral park. Instead, it compares what the player did to what an average player would do in his own park. It measures the value of his performance; using an overall park factor adjusts the value of the runs the player created.

                              If you are attempting to make projections for how players will do after changing ballparks, then I agree that component factors are better (especially those separating lefties and righties). But it is not inherently wrong to use overall park factors.
