Introducing The Fiato-Souders Intrinsic Analysis Matrix

  • Introducing The Fiato-Souders Intrinsic Analysis Matrix


    One of the first missions of almost every sabermetrician is to determine a preferred strategy for rating the performance of baseball teams and players while keeping in mind the many complicating factors that distort statistics like wins and losses and run differentials.

    There is a host of available data today that makes analysis of teams possible, but some understanding of the dynamic way in which those statistics combine to produce wins and losses is required, and this is not a simple matter. Empirical analysis has for years centered on the idea that averages tell enough of the story to be used as the backbone of any system designed to adjust raw statistics to account for the context in which they occurred. This document will explore the problems with empirical sabermetrics and introduce a new tool designed to bridge the gap between the intrinsic skill of the players and the real world statistics that define them.

    REVIEW OF EMPIRICAL METHODS (empirical or traditional analysis includes my own work in the field...proto-PCA for example)

    Up until this moment, all documented analyses of player and team value have proceeded in a straightforward, logical fashion, going from point A to point B to point C in order.

    A) Rate the offensive context of the league.

    This has commonly been done through some variant of looking directly at the league average run scoring rate. If 10,000 runs scored in 2500 games, then the assumption was made that it was a 4 R/G league...that the other factors would essentially cancel out and that the average scoring rate would be fully explanatory.

    B) Rate the offensive context of the park as it relates to the league.

    Even in the most sophisticated of modern traditional park adjustments, this boils down to a direct comparison of the scoring rate in a given park (the home team and its opponents combined) and the scoring rates of that team and its opponents on the road. The best of methods is iterative...adjusting and readjusting to account for the effect each park has on the net park effect of "the road"...but these are not commonly used or published. What is typically available at any source for baseball statistics is a simple ratio between the run scoring rate of the home park and the run scoring rate of everyone else.

    C) Combine the league and park contexts to come up with an average expectation to produce runs for each player and team.

    Because of the way traditional park factors are calculated (a ratio...the most natural thing to do with two sets of data that the statistician is trying to compare) the normal method for blending league and park statistics into one number (a R/G or R/O or R/PA type statistic) is to multiply the league scoring rate by the park adjustment for each team and use that as a basis for comparison.


    1) Missing Elements

    There has always been an assumption in empirical sabermetrics that the variation in run scoring at the league level can be entirely explained by the league. If the league scores five runs a game compared to an average league that scores 4.75 runs a game, there is an implicit assertion made that the league and only the league is responsible for that change. When you put the league context with the park adjustment, it is assumed that those two things combine to fully explain what we should expect from average talent in the same conditions.

    There's a serious problem with that claim, however. It is fairly evident by just taking a quick glance at the rosters of the teams as the years pass that the balance of talent changes. Some years the pitching is a little better than others. Some years the hitting is a little better. It should be pretty clear looking at the rosters from 1999 that the hitting was better than the pitching.

    Top Hitters from 1999 in no particular order

    Alex Rodriguez
    Sammy Sosa
    Mark McGwire
    Barry Bonds
    Ken Griffey Jr
    Edgar Martinez
    Jason Giambi
    Manny Ramirez
    Vlad Guerrero
    Mike Piazza...etc etc

    Top Pitchers from 1999 in no particular order

    Pedro Martinez
    Roger Clemens
    Greg Maddux
    Randy Johnson
    Curt Schilling
    Kevin Brown
    Mike Mussina
    ... ... ... ... uh ...

    Obviously I'm leaving some names off and some of you reading this can fill in both lists with more detail, but it seems clear to me that there was a greater depth of hitting talent in 1999 than there was pitching talent.

    To assume that 1968 was a great defensive year only because the league made it easier is to rip off the incredible depth of pitching and fielding talent and give too much credit to a mediocre crop of hitters by major league standards.

    When park factors are calculated, there are two elements that are commonly forgotten and ignored.

    A) The opponents a team faces are not necessarily neutral competition. The late 90s Cleveland Indians did not face a league average offense overall when they played their games at Jacobs Field. Those Indians were a cut above the rest with the bat, which means the rest were a cut below normal by definition. Traditional park factors make no effort to account for this.

    B) Players make adjustments to the parks in which they play. Some players do this better than others, but the personnel that play in any given park have a direct impact on how that park APPEARS to play (how offensively friendly it is). Some front offices do a great job acquiring players that maximize their potential because they are good matches for the home park. The 1998 Yankees had a lot of left handed hitters up and down the line-up turning a normally neutral park into a hitter's haven...for the Yankees. The 2001 Mariners filled their outfield with defensively gifted players and loaded up on flyball pitchers to take advantage of the dead air in center.

    2) Runs are cumulative, not multiplicative.

    Traditional analysis as we have covered above includes a step where the league context is multiplied by a park adjustment to come up with a new expectation to score runs. But contexts shouldn't be multiplied like that. If the park makes the offensive environment more conducive to run scoring, it does not do so by multiplying the scoring rate...it does so by ADDING runs to the scoreboard. Additive adjustments are less prone to the vagaries of small sample sizes, generally more stable, and more intuitive. They also translate more logically to player level analysis. If the park is adding a run per game (27 outs) one can easily see how it affects the players that play there. A multiplicative park factor will affect higher run scoring contexts more severely than lower scoring periods. If the league average R/G is 4 and the park factor is 120, then we are claiming it increases run scoring by 20 percent (0.8 runs). If the run scoring environment is changed to 5 R/G, the park didn't change at all, but one of two things is true...either the multiplicative factor remains 120 (and the park therefore adds 1 R/G)...or the amount the park adds to run scoring doesn't change (and the multiplicative factor drops to 116).
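
    The arithmetic in the 4 R/G versus 5 R/G example can be checked with a few lines of Python. This is only a sketch of the comparison described above; the function names are made up for illustration and are not part of the FSIA itself.

```python
def multiplicative_expectation(league_rpg, park_factor):
    """Expected R/G when the park is a ratio factor (100 = neutral)."""
    return league_rpg * park_factor / 100.0


def additive_expectation(league_rpg, park_runs_added):
    """Expected R/G when the park adds a fixed number of runs per game."""
    return league_rpg + park_runs_added


def implied_factor(league_rpg, park_runs_added):
    """Ratio factor that a fixed additive park effect would look like."""
    return 100.0 * (league_rpg + park_runs_added) / league_rpg


if __name__ == "__main__":
    for league_rpg in (4.0, 5.0):
        mult = multiplicative_expectation(league_rpg, 120)
        add = additive_expectation(league_rpg, 0.8)
        factor = implied_factor(league_rpg, 0.8)
        print(f"league {league_rpg:.1f} R/G: factor 120 gives {mult:.2f}, "
              f"+0.8 runs gives {add:.2f} (implied factor {factor:.0f})")
```

    The same park looks like a 120 in the 4 R/G league and roughly a 116 in the 5 R/G league if its real effect is a fixed 0.8 runs.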

    It seems evident to us that the park effect should not depend on the offensive environment...the park has the impact it has...whether it's the deadball era or the rabbitball of 1930, if the park adds a run a game, it does it either way (unless of course it's doing that run adding by being homer friendly in which case it's not likely to help deadball hitters much!).

    3) The Denominator is Wrong

    Traditional contextual analysis includes adjustments that take the form of a series of fractions of the form (Runs per Run). Park factors are formed by a ratio of run scoring rate at home over run scoring rate on the road. League contexts are set in essence by the ratio of the league's run scoring rate to the all time average scoring rate. The contexts themselves are not attached to anything...they're unitless multipliers that blow up with small sample sizes. In reality, any context has an increasing impact on a player or team the longer they spend in that context. And as we all know, time in the game of baseball is measured in outs, or when outs are not available, games. The denominator of any contextual adjustment should take the form of R/Out or R/G. As soon as you change the denominator to that form, it becomes very easy to see that contexts add together to explain run scoring changes.

    4) The elements that come together to explain runs are completely inter-related.

    Traditional sabermetric analysis proceeds from A to B to C, without stopping to fully appreciate how dependent each step of the analysis is on the steps that come before and after it. In order to know how the league impacted scoring, we need to know how the parks, the teams, and the players impacted scoring. In order to know how the parks impacted scoring, we need information about the league, the teams, and the players...etc. What is needed is some sort of system of equations where each variable is considered as it relates to the others.


    As noted earlier, sabermetricians fight a constant battle with small sample sizes. Even a full major league season includes match-ups that only recur 6-12 times between pairs of teams. Getting information from these match-ups requires a more useful method than simply taking the statistics at face value. There is a wing of statistical analysis known as Bayesian probability. The general idea behind Bayesian probability is that we cannot assume we have seen an entire distribution simply because we have all of the available data. Just because two teams face each other 10 times and one of the teams wins all ten doesn't mean there is a 100% probability that the successful team will win the next game or that if those ten games were replayed under identical conditions the results would be the same.

    The Bayesian model starts with the assumption that every team, every park, every league is average and forces the statistics to prove or disprove this assumption, one run at a time. This methodology is the driving force in our analysis and the idea came to us (myself and Randy Fiato...a programmer of great skill and tenacity and a budding sabermetrician in his own right) by way of a Dr. Colley of Princeton University, who used Bayesian probability to mathematically explain the success of college football teams and rank them (his matrix is still used today as a part of the BCS ranking system). His system is somewhat simpler, because all football fields are the same dimension, he doesn't have to deal with a changing timeline, and his method deals only with ordinally ranking football teams so a certain level of precision is not necessary to achieve the desired accuracy in the rankings. But beyond cosmetic differences, our approach relies on the same central theory - the law of succession.

    Rather than assume that without any data present, no conclusion can be drawn, we assume that in the absence of data, one conclusion MUST be drawn...that being that future events will occur at the average pace until proven otherwise.
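
    For readers who have not met the law of succession before, here is a minimal sketch of its classical form (Laplace's rule for win/loss outcomes). It only illustrates the "average until proven otherwise" idea using the ten-game example from above; how the FSIA carries that idea over to runs is not spelled out in this post, and the function name is just for illustration.

```python
from fractions import Fraction


def rule_of_succession(successes, trials):
    """Laplace's estimate of the chance of success on the next trial.

    One imaginary success and one imaginary failure are added to the
    record, so an empty record yields exactly 1/2 (the "average" guess).
    """
    return Fraction(successes + 1, trials + 2)


if __name__ == "__main__":
    print(rule_of_succession(0, 0))    # no games played yet
    print(rule_of_succession(10, 10))  # ten wins in ten games
```

    With no data the estimate is 1/2, and even a 10-0 record only moves it to 11/12 rather than to a certainty.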


    The unifying idea behind the Fiato-Souders Intrinsic Analysis Matrix can be summed up in one equation.

    For any team: (ARSPG + OIRAAPG - DIRAAPG + LIRAAPG + PIRAAPG + OPR - DPR) = Actual Runs Scored - Actual Runs Allowed (both of which could be accurately predicted using only the components that apply)

    Holy Acronyms, Batman! I think we need a decoder ring!

    ARSPG -> Alltime Runs Scored per Game (per side)...this turns out to be approximately 4.76 Runs/Game/Side excluding 1871-1875 which were not even slightly major league caliber baseball and would unnecessarily throw off the alltime scoring average. All additional contextual adjustments are relative to an "average" league.

    OIRAAPG -> Offensive Intrinsic Runs Above Average Per Game...this term represents how many runs per game above the alltime scoring average this team's offense could be expected to score in an average league against average competition in a neutral park.

    DIRAAPG -> Defensive Intrinsic Runs Above Average Per Game...this term represents how many runs per game above the alltime scoring average this team's defense could be expected to allow in an average league against average competition in a neutral park.

    LIRAAPG -> League Intrinsic Runs Above Average Per Game...this term represents how many runs per game above the alltime scoring average this league would result in given neutral parks and average players and teams.

    PIRAAPG -> Park Intrinsic Runs Above Average Per Game...this term represents how many runs per game above the alltime scoring average would score in this park given average teams and players, and an average league.

    OPR -> Offensive Park Reactions...this term represents how the offensive players on each team did relative to what would be expected of them given the intrinsic strengths of the parks in which they played. This is a little harder to put in words, but to put it as simply as possible...if the park favors pitchers, and your team has found a way to score at an above average clip, you're doing something that is not statistically expected of you and that needs to be accounted for separately.

    DPR -> Defensive Park Reactions...same as above only with defensive players (pitchers and fielders).

    That sounds like a lot...but here's how it goes together. Each individual variable in the matrix is reported (these variables include the team's unique reactions to each and every park in which they played, one at a time...each park...the league as a whole...and the offenses and defenses of each team in the majors in a specific year and league) and placed in a linear equation where every other variable upon which it depends is set to an all time average (the Law of Succession...the average assertion). This is done for every variable in the history of the game...and those variables number greater than 1,000 in each year of the modern era.

    Those equations are placed on the left hand side of a linear system of equations. They're set equal to the real world results on the right hand side (in the league row, the runs scored in that league would be recorded; in the intrinsic offense row, that team's runs scored would be recorded...etc). This system of equations can be solved using matrix algebra and what comes out the other end of that process is a set of results explaining each variable.
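
    To make the "build a linear system, then solve it" step concrete, here is a small sketch of the published Colley football matrix that inspired the approach. This is NOT the FSIA matrix (which is far larger and works in runs rather than wins); it is Colley's simpler win/loss system, where each team's diagonal entry is 2 plus its games played, each off-diagonal entry is minus the number of games between the two teams, and the right hand side is 1 plus half of (wins minus losses). The function names, the three-team schedule, and the plain Gaussian elimination solver are all illustrative choices.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def colley_ratings(teams, games):
    """Colley ratings from a list of (winner, loser) results."""
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    # Every team starts as "average": diagonal 2, right hand side 1.
    matrix = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    rhs = [1.0] * n
    for winner, loser in games:
        w, l = index[winner], index[loser]
        matrix[w][w] += 1.0
        matrix[l][l] += 1.0
        matrix[w][l] -= 1.0
        matrix[l][w] -= 1.0
        rhs[w] += 0.5
        rhs[l] -= 0.5
    return dict(zip(teams, solve_linear_system(matrix, rhs)))


if __name__ == "__main__":
    # Hypothetical round robin: A beats B and C, B beats C.
    print(colley_ratings(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")]))
```

    With no games entered every team rates exactly 0.5, which is the "average until proven otherwise" starting point; each result then pulls the ratings apart while accounting for who the opponent was.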


    1) What do the results look like?

    Top Fifty Teams since 1900
    (In terms of Intrinsic Run Differential per Game)
    Year	Team	InRD/G
    1939	NYA	2.265
    1927	NYA	2.197
    1902	PIT	2.000
    1936	NYA	1.945
    1931	NYA	1.869
    2001	SEA	1.790
    1998	NYA	1.772
    1906	CHN	1.723
    1937	NYA	1.702
    1929	PHA	1.644
    1905	NY1	1.592
    1932	NYA	1.586
    1942	NYA	1.567
    1944	SLN	1.530
    1942	SLN	1.525
    1904	NY1	1.504
    1935	CHN	1.492
    1953	BRO	1.492
    1931	PHA	1.490
    1969	BAL	1.489
    1901	PIT	1.475
    1903	BOS	1.452
    1911	PHA	1.448
    1998	HOU	1.447
    1934	DET	1.441
    1948	CLE	1.439
    2001	OAK	1.410
    1975	CIN	1.401
    1921	NYA	1.369
    2002	ANA	1.365
    1935	DET	1.359
    1995	CLE	1.351
    1938	NYA	1.339
    1974	LAN	1.336
    1912	BOS	1.336
    1912	NY1	1.330
    1949	BRO	1.330
    1998	ATL	1.328
    1910	PHA	1.325
    1942	BRO	1.321
    1932	PHA	1.315
    1909	PIT	1.298
    1999	ARI	1.295
    1922	SLA	1.289
    1905	CHN	1.288
    1955	BRO	1.286
    1950	NYA	1.268
    1909	PHA	1.266
    1953	NYA	1.262
    1901	CHA	1.261
    Bottom Fifty Teams since 1900
    Year	Team	InRD/G
    1909	WS1	-1.480
    1904	BSN	-1.486
    1963	NYN	-1.514
    1940	PHI	-1.522
    1906	BSN	-1.534
    1951	SLA	-1.536
    1955	KC1	-1.540
    1935	BSN	-1.548
    1974	SDN	-1.559
    1953	DET	-1.565
    1920	PHA	-1.583
    1901	CIN	-1.586
    1948	CHA	-1.590
    1910	SLA	-1.604
    1923	PHI	-1.612
    1908	SLN	-1.618
    1979	OAK	-1.619
    1937	SLA	-1.628
    1952	PIT	-1.635
    1924	BSN	-1.636
    1926	BOS	-1.636
    1909	BSN	-1.640
    1925	BOS	-1.653
    1956	WS1	-1.676
    1969	SDN	-1.682
    1941	PHI	-1.701
    1942	PHI	-1.702
    1905	BRO	-1.719
    1928	PHI	-1.726
    1939	PHI	-1.749
    1919	PHA	-1.764
    1904	WS1	-1.767
    1921	PHI	-1.773
    1945	PHI	-1.775
    1954	PIT	-1.778
    1962	NYN	-1.783
    2002	DET	-1.784
    1936	PHA	-1.812
    1916	PHA	-1.834
    1939	SLA	-1.849
    1938	PHI	-1.873
    1911	BSN	-1.887
    1903	SLN	-1.901
    1996	DET	-1.925
    1932	BOS	-1.930
    2004	ARI	-1.940
    1954	PHA	-1.974
    1939	PHA	-2.009
    1915	PHA	-2.024
    2003	DET	-2.112
    Prior to 1900, the intrinsic run differentials start to take off in magnitude owing largely to the wildly uneven distribution of talent, unstable franchises, shorter schedules, and higher run scoring environments that make up the 19th century game, but the 1899 Cleveland Spiders...widely recognized as the worst baseball team ever to complete a season...finish dead last among teams to play at least 80 games, with an abysmal -4.048 InRD/G (that's 624 runs they allowed more than they scored INTRINSICALLY...they were that bad all on their own!).

    It should be noted that these intrinsic calculations included the intrinsic offenses and defenses of each team as well as the team's unique park reactions (because park reactions are a skill that should be accounted for when rating the merits of teams).

    2) Benefits of the FSIA

    A) This represents the first ever system that has made an attempt to credit the players at least in part for helping to create the changes in the run scoring environment.

    Typically, the credit awarded to the offenses and defenses (one way or the other depending on the conditions in the league) is on the order of 50-400 Runs over the course of an entire season for an entire league, so the credit is relatively small, but certain extreme seasons like 1999 in the National League, or 1968 in the NL, or 1987 in the AL swing further (1999 for instance gives almost as much credit to the hitters as the leagues themselves for the huge spike in offensive production).

    B) Park adjustments are significantly more conservative and stable over time compared to the ratio factors currently available. When you apply a ratio factor of 120 (the Coors Field effect) to a player season, you get a rather extreme result...when you apply a cumulative adjustment of one additional run expected every 27 batting outs to the same season (the park added roughly 160-180 runs each year to the scoring from both sides combined), the park's pull on the hitter's value will be somewhat muted (though still very real). FSIA park factors are significantly less prone to wild fluctuations from season to season and reflect our belief that most parks have a very minor effect on scoring and that it's only a few extreme parks at either end of the spectrum that can really be counted on from year to year to have a certain impact. Stable park factors were made possible by switching to cumulative math, and by factoring out the unexpected fluctuations in the reactions of players to the parks (and thus neutralizing the home-team bias problem mentioned earlier).

    C) This represents the first complete effort to separate the intrinsic abilities of teams from their contexts, while being able to reproduce real-world statistics with a high degree of accuracy. One of the problems with Baseball Prospectus's EqR statistic is that while it is a fairly aggressive attempt to put all players on a level playing field, it does not in any way model actual run scoring (it's not intended to...it's a conceptualized ideal league environment based on the average EqA being .260), so it's not particularly useful for doing any kind of top-down win analysis (you can't use EqR to predict how many runs a team will score and allow). The FSIA not only places players on a level playing field...it models the real world too.


    Using run differential data totalled up for each league and season, we were able to determine a series of encouraging error-statistics that we hope will make it clear that the FSIA is a highly accurate intrinsic analysis tool for use in real-world modelling.

    First we tested its ability to accurately reproduce league run scoring results from the components. The largest discrepancy we found when comparing real-world run scoring totals to the FSIA generated RS was 68 runs. The error range was -68 to +49. To put this in clearer terms, on a per game basis, the error range was -0.030 to +0.027 R/G. In the worst case scenario, we're talking about maybe a 1% error (more likely closer to half a percent). The root-mean-square-error (standard deviation of the error) was a mere 8.7 runs...in an average league which scores something like 8,000-11,000 runs!!

    Next we tested its ability to accurately predict runs scored and allowed by teams. We expected a larger error here, because the fewer games you have in a sample, the more the Law of Succession will play a part in pulling that sample variable toward the mean. This model will tend to underestimate the spread of run differentials in the case of extreme teams, partially because it is a proper statistical question whether we have seen the entire distribution of outcomes when the sample is reduced in size to 162 or 154 games (in most cases), and partially because in the case of extreme teams, we begin to run into a new error source which we are working toward correcting and which will be discussed in our future research plans below.

    In any event, we did get a larger error here, but it was far smaller than even I had expected. The error range was -47 runs to +52 runs...or in terms of runs per game...-0.315 to +0.326 runs/game. In the worst case scenario we're looking at something like a 6-8% error, but this wasn't all that common.

    The RMSE for team offenses was 14.7 R and it was 20.4 R for team defenses. Given that the average team scores and allows about 770 runs over the course of major league history, the "typical" error is something more like 2-3%.

    That error shouldn't really even fully be called error, since, particularly in the case of teams with shorter schedules or extreme teams, all laws of probability suggest that a center-pull is wise (there is an increased probability that what we've seen out of a team with a shorter schedule or an extreme team is just a part of the distribution and that if those games were replayed under identical conditions, a somewhat less severe result would occur).


    Aside from random chance and the center-pull inherent to Bayesian probability, the primary problem with the FSIA is that there is one somewhat incorrect assumption required to make it work. The FSIA is a system of LINEAR equations. But we already know from research done by Bill James that teams do not combine LINEARLY to produce wins and losses...and they probably don't combine linearly to produce runs either. Teams and the contexts in which they play combine very NEARLY linearly when winning percentages of those contexts fall inside a range near .500 (.400 to .600 is considered the acceptable range of the linear assumption). The FSIA works very well for most of the variables it evaluates...but two kinds of variables are sometimes vulnerable to error: park reactions, which are very small samples prone to random fluctuations that make them appear extreme and therefore force them outside the range where the linear assumption holds, and extremely good and poor teams.


    Randy and I have already planned out the concepts for the final advancement of our intrinsic analysis and are beginning work on a non-linear solver for systems of equations following a form pioneered by Bill James called "log5". More details on the log5 system when we are ready with new results, but as you have seen, the FSIA is already very accurate in just about every case, and ready for application to player evaluation models like PCA.
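
    For reference, the basic log5 formula as Bill James published it for head-to-head winning percentages can be sketched as below. How it will be adapted inside a non-linear FSIA solver is not described in this post, so this only shows the textbook formula next to a naive linear stand-in (0.5 plus the difference in winning percentages, which is an assumption made here purely for comparison). The two agree closely inside the .400 to .600 range and diverge badly outside it.

```python
def log5(team_pct, opponent_pct):
    """Bill James's log5 estimate of the chance a team beats an opponent."""
    numerator = team_pct - team_pct * opponent_pct
    denominator = team_pct + opponent_pct - 2.0 * team_pct * opponent_pct
    return numerator / denominator


def linear_estimate(team_pct, opponent_pct):
    """Naive linear stand-in (illustrative assumption, not from the FSIA)."""
    return 0.5 + team_pct - opponent_pct


if __name__ == "__main__":
    for team, opp in ((0.55, 0.45), (0.60, 0.40), (0.80, 0.20)):
        print(f"{team:.2f} vs {opp:.2f}: log5 {log5(team, opp):.3f}, "
              f"linear {linear_estimate(team, opp):.3f}")
```

    At .800 versus .200 the linear version produces a "probability" above 1, which is exactly the sort of breakdown at the extremes described above.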

    Adding to our work on log5, we are beginning to strategize on how to improve the accuracy of dynamic linear weights...more details on that at a later time.

    It should also be noted that the FSIA makes no attempt to correct for the strength of a league...that's another project entirely. We're working on ways to try to quantify the competitiveness and depth of a league as well, but that'll take some time.

    I think I've written quite enough for one day...anyone still reading this...I salute you for taking the LOOONG time necessary to digest it all and I thank you for reading.

    Thoughts? Quibbles? General wonderings?

  • #2
    --I did have a major quibble until I got to the very end of your piece, where you say that league strength is not factored into your calculations. Exhibit A being your #3 team, the 1902 Pirates. The National League had been decimated by raids from the AL with the exception of Pittsburgh, which returned its pennant winning roster from the year before virtually untouched.
    --I would say your system (at first glance anyway, I haven't yet digested the whole concept) is as good as any at telling us how successful a team was. Whether it tells us how good it was is another story. Those are not always the same thing, even without considering league quality. The 2001 Mariners, for example, were wildly successful, but that success was fueled by some flukishly good seasons. I would not pick that roster as one of the best in history by any stretch of the imagination.


    • #3
      A perhaps more revealing look at "goodness"...at least relative to the league...(again...this is still without strength of league...which we're working on) comes through the use of intrinsic strengths only.

      Using that measure and eliminating the "unique reactions to parks" the '01 Mariners not only drop out of the top ten...they drop behind the '01 ATHLETICS for tops in the 2001 AL.

      We feel that while it is imperative that players be given credit for reacting well to the parks in which they play in the rating and ranking process...intrinsic strengths will prove to be more predictive of team performance in the future. Totally divorced from context...the As were a better team than the Mariners...in fact Seattle drops from an RD of 1.77 to one closer to 1.4...and Oakland retains much of its 1.5-ish success.

      That having been said...although it is true that Bret Boone had a fluke season in '01...he is the ONLY 2001 Mariner that I can think of that performed far afield from his career line in 2001. The main things that made those Mariners a success were enormous depth and a stellar team defense...things that are hard to see upon a visual inspection of a roster looking for "star power".


      • #4
        Great hitter's and pitcher's parks by the FSIA

        Top 50 hitter's Parks in terms of NET Park Adjustment (the weighted and combined park factors for all parks played in by the team whose home park is listed here...this is done so you can see what kind of mathematical adjustment will actually be applied to team and player contexts)...1900 and beyond
        Team	Year	NetPkA
        COL	1996	0.794
        COL	2000	0.702
        COL	1995	0.692
        COL	1999	0.665
        KCA	2002	0.649
        PHI	1925	0.601
        TEX	2002	0.591
        COL	1993	0.589
        PHI	1923	0.570
        KCA	2001	0.540
        TEX	1998	0.519
        PHI	1929	0.514
        PHI	1930	0.513
        COL	1998	0.472
        PHI	1933	0.470
        MIN	2000	0.461
        CLE	1998	0.460
        PHI	1935	0.458
        COL	2004	0.442
        PHI	1922	0.440
        BOS	1950	0.430
        BOS	1955	0.426
        PHI	1936	0.409
        CHA	2000	0.401
        SEA	1999	0.401
        COL	2001	0.400
        BSN	1911	0.399
        PHA	1932	0.397
        CHA	2004	0.397
        BOS	1977	0.394
        COL	1997	0.391
        PHI	1932	0.391
        KCA	1998	0.390
        TEX	2000	0.388
        CHN	1970	0.384
        OAK	2002	0.381
        KCA	1997	0.379
        DET	1937	0.374
        COL	1994	0.371
        MIN	1999	0.367
        CIN	1903	0.367
        PHA	1902	0.366
        SLA	1930	0.358
        TEX	2004	0.356
        PIT	1951	0.356
        TOR	2004	0.353
        KCA	2000	0.352
        TEX	1999	0.351
        ATL	1977	0.351
        CLE	2002	0.344
        Fifty greatest pitcher's parks since 1900 by the FSIA
        Team	Year	NetPkA
        SDN	1998	-0.526
        SFN	1999	-0.523
        PHI	2002	-0.495
        CHA	1903	-0.491
        SLA	1903	-0.491
        FLO	1999	-0.440
        CLE	1903	-0.420
        MON	1998	-0.418
        SDN	2002	-0.417
        DET	1903	-0.416
        FLO	2002	-0.405
        SFN	2001	-0.401
        HOU	1995	-0.388
        OAK	1973	-0.386
        LAN	1964	-0.386
        HOU	1999	-0.385
        LAN	2001	-0.384
        LAN	1970	-0.372
        BSN	1938	-0.372
        SDN	2001	-0.371
        BSN	1934	-0.368
        LAA	1964	-0.367
        PHA	1903	-0.364
        ATL	1999	-0.362
        LAN	2002	-0.359
        HOU	1976	-0.358
        NYA	1903	-0.354
        LAN	1998	-0.345
        CHN	2000	-0.344
        CIN	2004	-0.343
        NYN	2002	-0.341
        NYN	2000	-0.341
        SDN	1972	-0.333
        NYA	1951	-0.332
        ML1	1958	-0.331
        NYA	1939	-0.330
        BSN	1950	-0.328
        CLE	1952	-0.328
        SDN	1999	-0.324
        BAL	1962	-0.323
        BSN	1926	-0.322
        ARI	1999	-0.318
        HOU	1981	-0.314
        LAN	1967	-0.314
        SFN	2002	-0.313
        NYN	2001	-0.312
        CHA	1932	-0.312
        CHA	1965	-0.311
        CAL	1972	-0.310
        LAN	1997	-0.309
        Parks that make appearances on either list tend to do so more than once most of the time...there are a lot of year-families (a single park will appear multiple times in a number of adjoining years...which is what we'd expect)...and the parks appearing on these lists are not at all unexpected as far as I can tell.

        Reminder...these figures represent how many Runs per Game (per side) a park adds to scoring. When the 1996 Rockies played out their entire schedule, the net amalgam of parks in which they played (weighted by games obviously)...including Coors Field for 81 games...added 0.794 runs per game to their own scoring and to the scoring of their opponents. That league was about a 4.5 R/G league so that's something like the equivalent of claiming they had a weighted park adjustment of 117 (which is the equivalent of saying they had a park factor of about 134).

        0.794 R/G is about 129 runs on a whole season for the whole team. For the average line-up spot...that's about 14 runs (1/9th the team total)...
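
        The conversions in the last two paragraphs can be reproduced with the short sketch below. The helper names are made up for illustration, a 162-game schedule is assumed, and the step from a net (whole-schedule) factor to a home-park factor assumes half the games are at home and the road parks net out to neutral, which is what doubling the 17-point deviation to reach "about 134" implies.

```python
def season_runs_added(runs_per_game, games=162):
    """Runs a net park adjustment adds to one side over a season."""
    return runs_per_game * games


def equivalent_ratio_factor(league_rpg, runs_added_per_game):
    """Ratio-style factor (100 = neutral) matching an additive adjustment."""
    return 100.0 * (league_rpg + runs_added_per_game) / league_rpg


def implied_home_factor(net_factor, home_share=0.5):
    """Home-park factor implied by a net factor, assuming neutral road parks."""
    return 100.0 + (net_factor - 100.0) / home_share


if __name__ == "__main__":
    added = season_runs_added(0.794)
    print(f"runs added over the season: {added:.1f}")
    print(f"per line-up spot: {added / 9:.1f}")
    print(f"equivalent net factor: {equivalent_ratio_factor(4.5, 0.794):.1f}")
    print(f"home factor from a net 117: {implied_home_factor(117.0):.0f}")
```

        Using exactly 4.5 R/G the net factor comes out between 117 and 118, consistent with the "117-ish" figure quoted for a league that was only "about" 4.5 R/G.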

        Your commonly available park adjustment for the Rockies in 1996 is 131...mine is 117-ish. When I calculated park factors using three year normally weighted averaging a la James I got a number about twice as aggressive as the number I'll be using now.

        Just to give you an example of the more conservative nature of FSIA park factors.

        And I was just looking at extreme teams...there are many more teams hovering far closer to neutrality in the FSIA model than there are using standard park factors.


        • #5
          I haven't quite gotten a chance to read the whole thing yet, but obviously the major problem is the lack of an LQ adjustment. Matt, earlier you introduced a way of quantifying league quality. Couldn't you somehow incorporate that into your system?

          Looking at the results, they don't seem horrible. The 1939 Yanks as the best team ever is a conclusion that has been reached by a lot of statisticians. You have also reached the same conclusion as others that the Yankees of the early 50s just weren't all that great either, despite their 5 World Series titles. The Brooklynites may suffer a group stroke seeing their '55 team that low (though their '53 team is very high).

          About the FSIA results, what the hell are the Great American Ballpark and Wrigley Field doing so high up there on the pitcher's parks? And Kauffman Stadium? I've always thought of that as a neutral park; you have it as an extreme hitter's park.

          I don't know if that helps, but those are just a few strange things I found in the results. I'll try to read through the system's details when I get time.


          • #6
            My first attempt at league quality was pretty good, but I was not convinced that it was really seeing league quality changes and league quality changes ONLY...nor was I convinced I had the right method for converting that league quality estimate into a percentage. I want to exhaust all possibilities for how to measure league quality, including some rather difficult-to-calculate ideas like measuring interquartile ranges of PCA Wins Created ratings and attempting to quantify the idea that weak leagues cause dramatic shifts in the rating patterns of players (Zwilling goes from bench player to All-Star when he hits the Federal League...Ace Adams goes from great reliever to crappy last man when WWII ends...etc.)...I believe looking at changes from season to season in rating patterns can reveal something about league quality.

            That '55 WS team was not the best team Brooklyn produced...though being low in the top 50 isn't exactly a BAD thing (if you finish in the top 50...that's pretty impressive considering there've been 2100 teams since 1900).

            Wrigley isn't a hitter's park. Not anymore. That myth needs to be put to rest. It got its reputation as a great hitter's park back in the 70s when it represented one of the smallest parks in baseball. It's playing as at best a near-neutral park in most seasons and at worst a pitcher's park.

            As for Great American...in its opening season it did play strongly as a pitcher's park...I'm certainly not the only guy to reach that conclusion. One odd thing I've observed is that a lot of new parks play extremely in their first season or two...

            Minute Maid played as an extreme hitter's park in 2000...since then it's been very mild as hitter's parks go. GAB played extreme in 2004...less so in 2005. There are a number of examples like that...it seems that a lot of new parks play extreme and then the entire league adjusts (something that wouldn't be seen in unique reactions to parks, because if the whole league is doing it, it becomes expected).


            • #7
              Over 100 years of baseball, over 2000 team seasons, and two of the 6 WS-winning '50s Yankee teams show up in the top 50, or I should say two of the teams show up in the top 2% of history. Not bad for a great team.

              Though I don't think anybody in a million years would have picked the 1935 Cubs as the 17th greatest team of all time.


              • #8
                The 1935 Cubs were VERY successful...they dominated a very weak National League...that would be why they ended up where they did.

                And I agree...the 1950s Yankees were good...not "all-time awesome" but certainly a solid dynasty.


                • #9
                  They only won 100 games against that very weak league. Yet the 1954 Indians win 111 games and they don't show up.


                  • #10
                    The FSIA is based on runs though...not wins...the 1954 Indians WAAY out-won what you'd expect from a team with their runs scored and allowed...and while we've had debates about this before, I continue to believe that wins are too prone to random chance to use as a measure of statistical success.

                    Obviously there remains the possibility that some teams overperform their Pythagorean for a reason (or in this case their intrinsic run differential)...perhaps performance in the late innings...but I am of the increasing belief that this is not likely to be a major factor...I couldn't find any obvious pattern in the group of teams who outperformed their RS/RA...it seemed more like random chance than anything one could/should credit the players for.


                    • #11
                      The 1954 Indians were expected to win 104 games. They won 111. The 1935 Cubs were expected to win 101 games; they won 100. According to the Indians' run differential they should have won more games than what the Cubs' run differential tells us the Cubs should have won. Yet it's the Cubs that get ranked 17th all time and the Indians on the outside looking in.
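
For anyone who wants to check expected-win figures like these, here is a rough sketch of the classic Pythagorean expectation, using the run totals posted in this thread. The exponent of 2 and the 154-decision schedule are assumptions made for the example; the figures quoted above may come from a different exponent, so the output will not match them exactly.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=154, exponent=2.0):
    """Expected wins from runs scored and allowed (Pythagorean expectation).

    exponent=2.0 is the classic form; other exponents are in common use and
    would shift the result by a win or two. games=154 is an assumption.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Actual run totals as posted in this thread.
indians_1954 = pythagorean_wins(746, 504)
cubs_1935 = pythagorean_wins(847, 597)

print(f"1954 Indians expected wins: {indians_1954:.1f}")
print(f"1935 Cubs expected wins:    {cubs_1935:.1f}")
```

Whatever exponent is used, the raw run totals put the Indians a few expected wins ahead of the Cubs, which is the point being made here.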


                      • #12
                        Complete breakdown of the two teams.

                        1954 Indians
                        Intrinsic Offense -> 0.215 RAA/G
                        Intrinsic Defense -> 0.767 RAA/G
                        1954 AL -> -0.197 RAA/G
                        Net Park Adjustment -> -0.037 RAA/G
                        Offensive Park Reactions -> 0.072 RAA/G
                        Defensive Park Reactions -> 0.104 RAA/G
                        All-time scoring average -> 4.76 R/G

                        Actual RS -> 746
                        FSIA projected RS -> 714.6

                        Actual RA -> 504
                        FSIA projected RA -> 533.8

                        What conclusion can we draw here? They both scored significantly more than the FSIA thinks is likely from them if the games were repeated and allowed significantly fewer than the FSIA thinks is likely if the games were repeated. There are two possible explanations...either their park reactions were flukey (and therefore prone to the linear error mentioned in the initial post here) or they beat the crap out of bad teams and took part in some lopsided series that the FSIA doesn't see as being likely to repeat if the games were replayed under identical conditions.

                        By comparison, the 1935 Cubs looked like this.

                        Intrinsic Offense -> 0.761
                        Intrinsic Defense -> 0.487
                        1935 NL -> 0.059
                        Net Park Adjustment -> -0.002
                        Offensive Park Reactions -> 0.099
                        Defensive Park Reactions -> 0.144

                        Actual RS -> 847
                        FSIA projected RS -> 838.5

                        Actual RA -> 597
                        FSIA projected RA -> 608.6

                        There is a center pull here too, but not nearly as severe as in the case of the '54 Indians.
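
To put numbers on the "center pull" for both teams, the actual and projected run differentials can be compared directly. This is just bookkeeping on the figures posted above, not a reconstruction of how the FSIA arrives at its projections; the class and method names (including `center_pull`) are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class TeamSeason:
    name: str
    actual_rs: float
    actual_ra: float
    projected_rs: float
    projected_ra: float

    def actual_diff(self) -> float:
        """Real-world run differential."""
        return self.actual_rs - self.actual_ra

    def projected_diff(self) -> float:
        """Run differential implied by the FSIA projections."""
        return self.projected_rs - self.projected_ra

    def center_pull(self) -> float:
        """Runs of differential the projection takes away (hypothetical label)."""
        return self.actual_diff() - self.projected_diff()


# Figures copied from the two breakdowns above.
indians = TeamSeason("1954 Indians", 746, 504, 714.6, 533.8)
cubs = TeamSeason("1935 Cubs", 847, 597, 838.5, 608.6)

for team in (indians, cubs):
    print(
        f"{team.name}: actual {team.actual_diff():+.1f}, "
        f"projected {team.projected_diff():+.1f}, "
        f"pull {team.center_pull():.1f}"
    )
```

On these numbers the Indians give back roughly 61 runs of differential and the Cubs roughly 20, which leaves the Cubs ahead on projected differential (about 230 to 181) even though the actual differentials are close (250 to 242).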

                        Why the difference between the two teams? Perhaps the FSIA feels more confident that the rest of the 1935 NL was significantly weaker and therefore the Cubs' strength of schedule was bad enough that probabilities increased for better run differentials...whereas Cleveland had direct competition (the 1954 Yankees rated as a better team than the Indians...for instance...) and so the FSIA was less confident about the potential for a repeat?


                        • #13
                          Some breakdowns for you...

                          The 1935 Cubs were the model of sabermetric consistency.

                          They played each opponent almost exactly like you'd expect them to have played...the .248 W% Braves they beat by 75 runs in 22 games. The .411 Phillies they beat by 52 runs and the .441 Reds they beat by 31 runs...

                          In fact they had no sabermetric trouble beating anyone in the NL except the second place Cardinals...who they played a little behind with -18 runs...(and that was a good team).

                          The 1954 Indians, on the other hand, were outscored by the third-place White Sox by ten runs (half a run per game), played even with the Yankees, and against the .411 Senators (same W% as the second-worst team in the 1935 NL) they managed only a +32 margin...

                          These were all negative drags on the FSIA's probabilistic modelling of their performance. It looks like the Indians had some trouble playing with the teams right behind them in the standings and didn't beat up on the dead weights the way you'd expect a 111 win team to beat up on them.

                          That's probably why the FSIA gives that Cubs team just about every run they earned in the real world but the Indians have a heavy center pull.

                          It's worth mentioning that the 1955 Indians collapsed rather badly...there may have been warning signs in the play of the '54 team that the FSIA sniffs out...they may have been a lot better in real-world record than they actually were in personnel.
                          Last edited by SABR Matt; 01-21-2006, 12:25 PM.


                          • #14
                            One other point worth making...

                            The FSIA by its very nature heavily discounts blowouts. If you beat someone 22-0, it sees that as more probably representing 17-4 or even...in terms of what the score would be if that game were played again. So if one of these two teams had some blowouts throwing off certain match-ups, it could play a large (and justified) role in altering their final rank.

                            A run against a .600 team has a LOT more value than a run against a .300 team in this system...it goes game by game by game counting the intrinsic runs (based on Bayesian probability)...

                            The 1935 Cubs played +31 against .500+ teams in 66 games, whereas the 1954 Indians played -6 against the two .500+ teams...that's the story here...

                            The '54 Yankees played the other good teams WAY better than the '54 Indians.
                            Last edited by SABR Matt; 01-21-2006, 12:31 PM.


                            • #15
                              --Matt, I'd agree that only Bret Boone had a season which exceeded his norm by a wide margin. What led to that great record was EVERYBODY playing above their norms by at least a little. At least half the roster had a season that was arguably their career best and most of the rest were above their career averages.

