Doubles Park Factors

  • Doubles Park Factors

    I need some lateral thinking. Why would single-year park factors for doubles not become more predictive as they got more recent?

    That is, I ran multi-year regressions on 1B, 2B, 3B, and HR park factors, and the only odd result was the doubles. The other three had standard weights, getting more significant year by year, but for doubles, the third year back was clearly more important than the second year (not at that computer right now, but it was something like 10/10/30/20/30.) But don't think about the numbers, just try to come up with reasons why this might happen.

  • #2
    I don't do park factors that way.

    If a park hasn't changed, its park factor shouldn't change. Wrigley Field has the same factors in 2007 as in 1954.

    Team factors are a weighted average of how many games a team played in each park in a given season. These will change from year to year.
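A minimal sketch of that weighted average, assuming a team factor is just each park's factor weighted by the games the team played there. The function name, park names, schedule, and factor values are all hypothetical, for illustration only.

```python
def team_factor(games_by_park, park_factors):
    """Games-weighted average of park factors for one team-season."""
    total_games = sum(games_by_park.values())
    if total_games == 0:
        raise ValueError("team played no games")
    weighted = sum(park_factors[park] * games
                   for park, games in games_by_park.items())
    return weighted / total_games


# Hypothetical HR factors and schedule (not real data).
hr_factors = {"Home Park": 1.20, "Road Park A": 0.90, "Road Park B": 1.00}
schedule = {"Home Park": 81, "Road Park A": 41, "Road Park B": 40}
example_team_factor = team_factor(schedule, hr_factors)
```

Under this reading the park factors stay fixed, and only the schedule (the game counts) moves the team number from year to year.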

    I ran a test of the NL from 1985-1991, seven consecutive seasons during which none of the twelve ballparks changed, there was no change in schedule, and no interleague play. If nothing changed for seven years, each of the twelve park factors should be the total of all seven seasons, with no one season being any more important than any other.

    During this seven season test, HRs had an SD of .211. Any individual season had a HR factor RMS of .149. Any two consecutive seasons had an RMS of .085, and any three consecutive seasons had an RMS of .060.
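One way to read those RMS figures is as the typical gap between an n-season factor and the full-period factor. The sketch below follows that reading and is only an assumption about the method: it averages season factors with a plain mean rather than pooling the underlying counts, and the function name and sample numbers are hypothetical.

```python
import math


def window_rms(season_factors, window):
    """RMS gap between each `window`-season average and the full-period average.

    season_factors: {park: [factor for each season, in order]}
    Assumption: a plain mean of season factors stands in for a pooled factor.
    """
    squared_gaps = []
    for factors in season_factors.values():
        full_period = sum(factors) / len(factors)
        for start in range(len(factors) - window + 1):
            chunk = factors[start:start + window]
            squared_gaps.append((sum(chunk) / window - full_period) ** 2)
    return math.sqrt(sum(squared_gaps) / len(squared_gaps))


# Hypothetical season-by-season HR factors for two parks.
sample = {
    "Park A": [1.3, 1.1, 1.2, 1.0],
    "Park B": [0.8, 1.0, 0.9, 0.9],
}
one_year_rms = window_rms(sample, 1)
two_year_rms = window_rms(sample, 2)
```

With the made-up numbers above, the two-season RMS comes out smaller than the one-season RMS, which is the same direction as the shrinkage described in the post.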

    I have a formula for calculating each park configuration's factors over the multiyear period when the configuration didn't change. Comparing each of the twelve parks' seven-year factors to their factors for the entire period of that configuration gave a HR RMS of .071. The highest was Wrigley at .133 (1.344 during the test period, 1.211 from 1954-2007), which is still well below the SD of .211 in the test period.

    I had similar results for other categories - BABIP, XBH=(DO+TR)/(H-HR), SI, DO, TR, HR, BB & SO. The differences between the seven year control period and the entire lifespan were too small to significantly alter any rankings between parks.
    Last edited by StillFlash; 03-30-2008, 11:32 PM. Reason: grammar
    Baseball Prospectus articles
    FanGraphs articles
    MVN Statistically Speaking articles
    Seam Heads articles


    • #3
      Originally posted by StillFlash View Post
      If a park hasn't changed, its park factor shouldn't change. Wrigley Field has the same factors in 2007 as in 1954.
      StillFlash, I'm not sure I buy this. If a park factor represents the effects of a park relative to a neutral park, then isn't it affected by changes in the "neutral park"? Basically, what I'm saying is that if all of the other parks have been swapped for something more hitter-friendly (I'm assuming this is mostly the case), wouldn't that lessen the relative impact of Wrigley?


      • #4
        Right, it depends. If you force the average park to be "1.00" each year, then Wrigley has to change to reflect its relative value to the parks in the league.

        But, you don't have to make the average park 1.00 each year. You can make it relative to a 1975 park if you like.

        Two other points:
        1 - Weather. It's possible that the weather pattern of 2007 in Chicago is more indicative of what will happen in 2008 than the weather of 1957 in Chicago.

        2 - Players. The kind of players playing in Wrigley in 2007 is not the same as in 1977 or 1957. So, because you don't have the same representative group, some types of players may over- or under-leverage certain aspects of the park, relative to how they approach other parks in the league.
        Author of THE BOOK -- Playing The Percentages In Baseball


        • #5
          Originally posted by Tango Tiger View Post
          Right, it depends. If you force the average park to be "1.00" each year, then Wrigley has to change to reflect its relative value to the parks in the league.

          But, you don't have to make the average park 1.00 each year. You can make it relative to a 1975 park if you like.
          That's an interesting point. I was working under the assumption that we were forcing that average park to 1.00 each year, as I suspect that is the typical approach. However, if you do take the other approach, where you make everything relative to a specific year, aren't you then at risk of measuring things that aren't necessarily the effect of the parks, e.g. changes in rules (strike zone, height of mound, etc.), changes in the ball, changes in the players (fill in your own reason here)?

          I'm wondering which of the two approaches StillFlash is taking.


          • #6
            Just about everyone takes the 1.00 approach.

            And to answer your question: no. It's just like a year-to-year aging analysis. You look for the 26 parks that are the same, and that becomes your "control". That group of 26 parks becomes "1.00". I dunno, let's say 1975-1976. (I have no idea which two years had no turnover in parks, but work with me here. This is just an illustration.) Anyway, now when you look at 1976-1977, Montreal drops off and you are comparing each stadium in 1977 to the 25 remaining. You already know what the park factor was for those 25 stadiums in 1975-76. That's your baseline. You just keep chaining along.
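Here is a rough sketch of that chaining idea, under stated assumptions. Each season is given as a hypothetical dict of raw per-park rates (say, HR per game in that park), the first season's average park is taken as the fixed 1.00 baseline, and each later season is scaled so that the parks it shares with the previous season keep the average factor they already had. The function name and the numbers are made up, and a real version would start from home/road ratios rather than raw rates.

```python
def chain_factors(rates_by_year):
    """Chain park factors year to year using the parks common to both years.

    rates_by_year: list of {park: raw rate} dicts, one per season, in order.
    The first season's average park is the fixed baseline (1.00).
    """
    first = rates_by_year[0]
    baseline = sum(first.values()) / len(first)
    chained = [{park: rate / baseline for park, rate in first.items()}]

    for rates in rates_by_year[1:]:
        previous = chained[-1]
        control = [park for park in rates if park in previous]
        if not control:
            raise ValueError("no parks in common with the previous season")
        control_rate = sum(rates[p] for p in control) / len(control)
        control_factor = sum(previous[p] for p in control) / len(control)
        # Scale so the control group keeps the average factor it already had.
        chained.append({park: rate / control_rate * control_factor
                        for park, rate in rates.items()})
    return chained


# Hypothetical rates: park C is replaced by park D, and league-wide
# scoring doubles in the second season (a rule or ball change, say).
seasons = [
    {"A": 1.0, "B": 1.2, "C": 0.8},
    {"A": 2.0, "B": 2.4, "D": 3.0},
]
factors = chain_factors(seasons)
```

In this toy case the league-wide doubling cancels out because it hits the control parks too: A and B keep their old factors, and only the new park D gets a new number relative to them.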

            • #7
              Originally posted by Tango Tiger View Post
              Just about everyone takes the 1.00 approach.
              I think part of this is a failure to distinguish that there are two numbers, the park factor, which refers to the ballpark, and imo should not change as long as that ballpark does not change, and then the factor for each team, which is a weighted average based on how many games the team played in each ballpark each season. The team factor will change whenever any one ballpark changes, or when the schedule changes.

              Fenway was the 2nd easiest park to homer in in the 1977 AL, but was the toughest in 1998. Fenway didn't change; all the other parks did. So why then insist on the mean of all parks being 1.00 each season? You could have a league full of bandboxes, or one full of Astrodomes. They shouldn't average out the same.

              So far, I've used a mythical neutral ballpark which is simply the mean of all the stats in the Retro years 1956-2007.

              • #8
                Right, that's what I'm saying. You can just choose a mythical park, and compare everything to that. Or, simply compare everything to Fenway, 1975. Or Coors, 1995. It doesn't matter, as long as you choose one constant.

                • #9
                  So at the end of the day do you still have a different set of park factors for 1975 Fenway vs 2005 Fenway?

                  This approach definitely seems like it would be much more difficult to calculate. Is there a resource somewhere that accurately documents all of the changes that have been applied to parks over the years?


                  • #10
                    Originally posted by weskelton View Post
                    So at the end of the day do you still have a different set of park factors for 1975 Fenway vs 2005 Fenway?
                    The Red Sox factors vary depending on their schedule and whether the road ballparks have changed, but Fenway factors are the same for all years.

                    What I have done so far is to select a "benchmark" ballpark for each league, being Wrigley and Fenway, as they haven't changed during the study period. For the lifetime of each ballpark configuration, the home/road ratios are compared to how the benchmark park scored during the same period, and then scaled against the all-time value.

                    For example, Fulton Co Stadium V02 (1977-1982) had a raw HR factor of 1.791, Wrigley from 1977-1982 was 1.296, but 1.211 all-time, so Fulton Co's adjusted rating is 1.673.
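The arithmetic in that example appears consistent with dividing the raw factor by the benchmark's same-period factor and multiplying by the benchmark's all-time factor. A small sketch under that assumption (the function and variable names are hypothetical; the three inputs are the figures quoted above):

```python
def benchmark_adjusted(raw_factor, benchmark_same_period, benchmark_all_time):
    """Rescale a raw factor by the benchmark's all-time vs. same-period ratio."""
    return raw_factor / benchmark_same_period * benchmark_all_time


# Fulton Co V02 raw HR factor, Wrigley 1977-1982, Wrigley all-time.
fulton_v02 = benchmark_adjusted(1.791, 1.296, 1.211)
```

With the rounded inputs this lands at roughly 1.6735, which agrees with the quoted 1.673 to within rounding of the three-decimal inputs.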

                    This seems to work very well with more than two or three seasons' worth of data. I'm working on revising the formulation to get a better handle on one-year configurations, looking at head-to-head comparisons of ballparks, and then getting a weighted mean of the resulting ratios, instead of relying on just one benchmark, as the benchmark can have its own fluctuations in the short run. Variances are greatly reduced by increasing the "reads" you are comparing to. Wrigley has shown itself to be one of the "flakier" ballparks, probably due to weather. Shea has been exceptionally stable, and has only had one configuration, but came in a few years later, and is going away soon.

                    Originally posted by weskelton View Post
                    This approach definitely seems like it would be much more difficult to calculate. Is there a resource somewhere that accurately documents all of the changes that have been applied to parks over the years?
                    I'd at least say it takes a little more imagination. I'm now converting from using season summaries as input to Retro event files. I've also done AAA since 1998, and am almost finished with AA.

                    Here's what I have right now. I need to add rh/lh for 2006-07, the AA ballparks need to be leveled between leagues, and some older minor ballparks need to be rated.
                    http://spreadsheets.google.com/pub?key=pLg_vfW0QCD-unIb3umdfYw

                    KJOK has compiled a configuration table as part of his database. I can't guarantee that it's 100% complete or accurate, but it's still a very good resource. I've imported his ballpark and configuration tables and Lahman's player table.
                    Last edited by StillFlash; 04-02-2008, 12:54 AM. Reason: add link to spreadsheet

                    • #11
                      Right, KJOK's Parks database is the best around. Join the yahoo group KJOKbaseball.

                      And Fenway HAS changed. And Wrigley HAS changed.

                      Fenway added structures that change the wind pattern.

                      And Chicago wind patterns are hardly constant each year.

                      A "park" is the combination of its dimensions, the structures, and the climate.

                      Read this:
                      http://www.tangotiger.net/parks.html

                      Related to what we are discussing is this:
                      http://www.tangotiger.net/parks2.html

                      If you are looking for something static, your best bet is a domed stadium, where the fences aren't moved around.

                      • #12
                        Other things that can affect "park": the length of the infield grass (teams cut it long or short to maximize the perceived home team strengths), seats up on the Green Monster, and the number of home games in April vs. August.


                        • #13
                          From my link:

                          2 - dynamic conditions: the weather, the cut of the grass, the wetness of the field, the wind, etc, etc. You should use multi-year if these things are predictable-dynamic, but single-year if they are unpredictable-dynamic. If the groundskeeper is the same guy, and cuts the grass kinda the same way, then use multi-years. The wind patterns probably change drastically, so you should use single-year.

                          • #14
                            The point here is to try to get a fairly accurate number with which to normalize a player's stats. I've run my numbers to 2 or 3 decimals, but realistically 1 might be sufficient. All of these points on what can cause variations are valid, but to what extent do they actually influence the numbers?

                            I ran the numbers for each year for each park, then compared them to the configuration table in the KJOK database. A handful of times KJOK had a change listed, but I couldn't see a change in the stats (Dolphins Stadium 1993-2000 is virtually identical to 2001-2007). Another handful of times there was no change listed in KJOK, but I see one in the stats (I have a recollection that Comerica was shortened in 2000, and it shows in the stats, but there's no change listed in KJOK). After that analysis I assigned version numbers to the different ballpark configurations.

                            Do you have a date for the structures in Fenway?

                            Wind can be a factor - after getting that query dilemma figured out, I've been playing with some event data. One of the queries is batting data, by ballpark, by direction of wind, for 2006-2007 so far. Most parks have a 20-30% change in HR rates, but predictably, Wrigley's rates more than double when the wind is blowing out to cf. I'll add wind speed and see what the numbers look like then, plus adding more years.
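A minimal sketch of that kind of grouping, assuming each plate appearance has already been reduced to a (park, wind direction, home run or not) record. The tuple layout, the wind labels, and the sample rows are hypothetical and are not the Retrosheet event file schema.

```python
from collections import defaultdict


def hr_rate_by_wind(plate_appearances):
    """HR per plate appearance, grouped by (park, wind direction).

    plate_appearances: iterable of (park, wind_direction, is_home_run) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [home runs, plate appearances]
    for park, wind, is_home_run in plate_appearances:
        tally = counts[(park, wind)]
        tally[0] += int(is_home_run)
        tally[1] += 1
    return {key: hr / pa for key, (hr, pa) in counts.items()}


# Made-up rows, for illustration only.
sample = [
    ("Wrigley", "out to cf", True),
    ("Wrigley", "out to cf", False),
    ("Wrigley", "in from cf", False),
    ("Wrigley", "in from cf", False),
]
rates = hr_rate_by_wind(sample)
```

Adding wind speed would just mean widening the grouping key, for example to (park, direction, speed bucket), at the cost of thinner samples in each cell.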

                            Recently I added XBH, the pct of base hits that go for extra bases. The highest number was Kauffman Stadium, 1973-1979, at 1.230. The lowest number was Dodger Stadium 1973-1982 at 0.744, and they have never been higher than 0.857. I would guess high grass in the outfield to go along with far fences that allow the outfielders to play deeper, thus more easily cutting off balls in the gap and down the lines. I plan on grouping the turf stadiums and grass stadiums together to get a standard XBH rate for each, and then see how much each park varied from the norm for their type of surface.
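For reference, a small sketch of the XBH rate as defined earlier in the thread, XBH = (DO+TR)/(H-HR), turned into a park factor. Treating the factor as the home-game rate divided by the road-game rate is an assumption here, and the function names and totals are hypothetical.

```python
def xbh_rate(hits, doubles, triples, home_runs):
    """Share of non-HR hits that go for extra bases: (DO + TR) / (H - HR)."""
    non_hr_hits = hits - home_runs
    if non_hr_hits <= 0:
        raise ValueError("need at least one non-HR hit")
    return (doubles + triples) / non_hr_hits


def xbh_factor(home_totals, road_totals):
    """Ratio of the XBH rate in home games to the rate in road games."""
    return xbh_rate(*home_totals) / xbh_rate(*road_totals)


# Made-up (H, DO, TR, HR) totals for both teams in home and road games.
home = (700, 120, 30, 100)
road = (700, 110, 10, 100)
example_xbh_factor = xbh_factor(home, road)
```

The same two functions could be run on pooled turf parks and pooled grass parks to get the per-surface norm mentioned above.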

                            • #15
                              I'll also suggest a couple of articles at Hardball Times that looked at climate. You can get the links from my blog here:
                              http://www.insidethebook.com/ee/inde...ategory/Parks/

                              Mar 2008
                              Jan 2007
                              Oct 2006
                              Jun 2006 (Coors humidor)

                              I don't know when Fenway was changed, but I'm sure there's plenty of Sox fans around that can help you.