Pitching Pythagorean W-L% (?)

  • Pitching Pythagorean W-L% (?)

    I'm just sort of thinking aloud here. I really enjoy Pythagorean W-L% because it tends to be surprisingly accurate. Furthermore, you can use the same equation in 1889 or 1922 or 1975 or 2010, unlike the nonsense that accompanies linear weights.

    Is there something similar out there (or possible) that could be computed for a pitcher given his run support and runs allowed? You could go a step further and adjust for team defense, but I just want to know if there's an empirical way to calculate the base % using only runs.
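
As a starting point, the team formula can simply be fed a pitcher's run support and runs allowed. The following is a minimal sketch, assuming the classic exponent of 2; the function name, the `exponent` parameter, and the sample run totals are illustrative and not taken from any published pitcher metric.

```python
def pythagorean_wpct(runs_scored: float, runs_allowed: float, exponent: float = 2.0) -> float:
    """Pythagorean winning percentage: RS^x / (RS^x + RA^x)."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("runs must be non-negative")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        raise ValueError("at least one run total must be positive")
    return rs / (rs + ra)


# Equal runs for and against always give .500, whatever the totals.
print(pythagorean_wpct(100, 100))  # 0.5

# Hypothetical pitcher: 120 runs of support, 90 runs allowed.
print(round(pythagorean_wpct(120, 90), 3))  # 0.64
```

Whether this tracks a pitcher's actual decisions is exactly the open question, since a pitcher's wins and losses are not awarded on every game he appears in.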
    "Allen Sutton Sothoron pitched his initials off today."--1920s article

  • #2
    You mean to project a pitcher's W/L record?

    The difference, obviously, is that pitchers typically appear in a number of games that 'don't count' toward their personal records--a team's RS and RA are always in 'decisions.'

    Still, what happens if one just does the math anyway? Presumably the correlation is reduced in the later years of shorter starts and specialty pitching.

    • #3
      Originally posted by Tyrus4189Cobb
      I'm just sort of thinking aloud here. I really enjoy Pythagorean W-L% because it tends to be surprisingly accurate. Furthermore you can use the same equation in 1889 or 1922 or 1975 or 2010 unlike the nonsense that accompanies linear weights.

      Is there something similar out there (or possible) that could be computed for a pitcher given his run support and runs allowed? You could go a step further and adjust for team defense, but I just want to know if there's an empirical way to calculate the base % using only runs
      Tom Tango has done something like that over at The Book.

      • #4
        Originally posted by Tyrus4189Cobb
        I'm just sort of thinking aloud here. I really enjoy Pythagorean W-L% because it tends to be surprisingly accurate. Furthermore you can use the same equation in 1889 or 1922 or 1975 or 2010 unlike the nonsense that accompanies linear weights.

        Is there something similar out there (or possible) that could be computed for a pitcher given his run support and runs allowed? You could go a step further and adjust for team defense, but I just want to know if there's an empirical way to calculate the base % using only runs
        I've done stuff like that:

        1) take a pitcher's season for runs allowed
        2) adjust runs allowed and project expected run support on a new team (e.g., take Feller from 1946 and put him on the 1998 Yankees, adjusting runs allowed for team defense, park and league factor, etc.)
        3) project a w-l record using the theorem

        I expected it would not actually work well in application. Then I took the pitcher's game logs, randomly applied the changes for runs allowed, slapped the revised game log on the other team's actual game schedule. The actual W-L came out really close to the projected one. I did this for multiple seasons (13) and I was off in total by .3%. Individual seasons were off by a larger % (7.5%.)

        Overall, based on that unscientific approach, I think pitcher-specific luck can be as much as +/- 7% in an individual season, but averages out over time.
        Last edited by drstrangelove; 11-18-2012, 11:17 PM.
        "It's better to look good, than be good."


        • #5
          Or you can use ERA+ and 100* as the inputs, get the WL%, and multiply it times decisions to get the number of wins the pitcher "should have had." It's not, of course, but it does sort of level the playing field:

          W% = (ERA+)^2/(100+ERA+)

          *Edit: Sorry, I was thinking of ERA+ as a percentage.
          Last edited by Jackaroo Dave; 11-19-2012, 02:13 AM.
          Indeed the first step toward finding out is to acknowledge you do not satisfactorily know already; so that no blight can so surely arrest all intellectual growth as the blight of cocksureness.--CS Peirce


          • #6
            Originally posted by drstrangelove
            1) take a pitcher's season for runs allowed
            2) adjust runs allowed and project expected run support on a new team (e.g., take Feller from 1946 and put him on the 1998 Yankees, adjusting runs allowed for team defense, park and league factor, etc.)
            3) project a w-l record using the theorem.
            I'm following you but I don't know the math to match your description. Could you give me an example?

            Originally posted by Jackaroo Dave
            Or you can use ERA+ and 100* as the inputs, get the WL%, and multiply it times decisions to get the number of wins the pitcher "should have had." It's not, of course, but it does sort of level the playing field:

            W% = (ERA+)^2/(100+ERA+)

            *Edit: Sorry, I was thinking of ERA+ as a percentage.
            Computing for 2011 John Lackey, a 6.41 ERA (67 ERA+) who managed a 12-12 record due to Boston's explosive offense behind him

            PythW-L%= 67^2/(100+67)=26.88

            Based on what you're saying, his W-L% "should" have been 0.269?

            • #7
              BBRef player value lists WAA WL% which is the W/L percentage a team would hypothetically have in games in which the player played. I am not sure if it is situational. If you take a player's innings divided by 9 and then give him half that total plus his WAA it might work. For example Gooden in 1985 pitched 276 2/3 innings or 30.74 games worth of innings. Half of that is 15.37, plus he was 9.8 WAA which if added to 15.37 would give him approx 25.2 wins and 5.5 losses.
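
The arithmetic in that example can be sketched in a few lines. This assumes the reading given in the post (a .500 record over innings/9 "games", shifted by WAA); the function name is illustrative.

```python
def waa_record(innings_pitched: float, waa: float) -> tuple[float, float]:
    """Split innings/9 'games' into wins and losses: half of them plus WAA are wins."""
    games = innings_pitched / 9.0
    wins = games / 2.0 + waa
    losses = games - wins
    return wins, losses


# Gooden 1985, using the figures quoted in the post: 276 2/3 IP, 9.8 WAA.
wins, losses = waa_record(276 + 2 / 3, 9.8)
print(f"{wins:.2f}-{losses:.2f}")  # 25.17-5.57
```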


              • #8
                Originally posted by brett
                BBRef player value lists WAA WL% which is the W/L percentage a team would hypothetically have in games in which the player played. I am not sure if it is situational. If you take a player's innings divided by 9 and then give him half that total plus his WAA it might work. For example Gooden in 1985 pitched 276 2/3 innings or 30.74 games worth of innings. Half of that is 15.37, plus he was 9.8 WAA which if added to 15.37 would give him approx 25.2 wins and 5.5 losses.
                That's a hypothetical team, though. PythW-L deals with something he "should" have given his actual team. Plus it's probably easier because there are only so many runs to go around which is what makes regular Pyth so reasonably accurate.

                • #9
                  Does anyone know how to get the run support for a pitcher in only the innings he pitched? Could run support per innings (RS/IP) work? It is the "runs scored per 27 outs while the pitcher was in the game as the pitcher."
                  Last edited by Tyrus4189Cobb; 11-19-2012, 01:44 PM.

                  • #10
                    Originally posted by Tyrus4189Cobb
                    Computing for 2011 John Lackey, a 6.41 ERA (67 ERA+) who managed a 12-12 record due to Boston's explosive offense behind him

                    PythW-L%= 67^2/(100+67)=26.88

                    Based on what you're saying, his W-L% "should" have been 0.269?
                    Actually, I messed up again. Simplest route: ERA+ is the league ERA divided by the individual ERA expressed as a percentage. So the league ERA is .67 of Lackey's. So by the pythagorean formula, a team scoring .67 as many runs as the opposing team (e.g. Bozox vs league) would be expected to win .67^2/(1 + .67^2) = .3098, not what I told you before.

                    I'm really sorry for messing it up, because now it seems complicated, but it's really back of the envelope stuff and provides a limited insight about run support and league run environment.

                    Let's call ERA+ as a decimal fraction ERA%. Then just square ERA%, add 1, and divide into the original square.
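
A short sketch of that corrected recipe, for anyone who wants to check the numbers. The function name is illustrative, and the calculation implicitly assumes league-average run support, which is why it levels the field rather than describing the pitcher's actual team.

```python
def era_plus_wpct(era_plus: float) -> float:
    """Pythagorean W% from ERA+ alone: square ERA+/100, then divide by (1 + that square)."""
    ratio = era_plus / 100.0      # "ERA%": ERA+ as a decimal fraction
    squared = ratio ** 2
    return squared / (1.0 + squared)


# Lackey 2011, 67 ERA+, as worked in the post above.
print(round(era_plus_wpct(67), 4))       # 0.3098

# Times his 24 decisions (12-12) for the wins he "should have had".
print(round(era_plus_wpct(67) * 24, 1))  # 7.4

# A league-average pitcher comes out at exactly .500.
print(era_plus_wpct(100))                # 0.5
```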

                    • #11
                      Originally posted by Tyrus4189Cobb
                      I'm following you but I don't know the math to match your description. Could you give me an example?
                      I'll use the example I made up. (Also, doing this from memory, so yell if you see a mistake.)

                      1) take Feller's actual record for 1946.

                      2) calculate what his runs allowed would have been had he played for the Yankees in 1998. (This is not meant to be perfect, just modeling how well the pyth works in real world usage. As you can see later, there is no adjustment for actual teams played or actual parks played, since that is not the purpose.) To do this step, use Feller's 1946 ERA+, recalculate to a 1998 ERA using the AL league ERA, adjust for current PF, multiply by innings to get earned runs, then adjust for the Yankees actual unearned run % in 1998 to gross up to runs allowed.

                      3) take Feller's actual 1946 game log and adjust runs allowed up or down as needed to force the recalculated runs allowed in step 2 above. You now have an 'adjusted' game log.

                      4) Take the Yankees actual team game log and compare to Feller's and determine wins / losses on a game by game basis. This has to be done inning by inning. That is the adjusted 'real' W-L record.

                      5) Take the calculated runs from step 2 (noting the innings pitched), divide and create a RPG allowed and a calculated number of games (e.g., 372 innings = 41.333 games.)

                      6) Take the actual total runs scored by NYY in 1998, divide by the number of games played and create RPG for.

                      7) Multiply both RPG for and RPG against by the calculated number of games in step 5. You now have runs for, runs against and games (e.g., 41.333) to calculate a Pythagorean W-L.

                      I have found that the calculation in #7 is, on average, extremely close to the actual figure derived in #4. That is, a pitcher's expected wins closely track his actual wins.
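
The purely calculated side of this (steps 2 and 5 through 7) can be sketched as below. This is one reading of the description, not a confirmed implementation: the function names, the way the park factor and unearned-run share are applied, and every input number in the example are assumptions made up for illustration. The game-log overlay in steps 3 and 4 needs inning-by-inning data and is not attempted here.

```python
def translated_runs_allowed(era_plus: float, target_league_era: float, park_factor: float,
                            innings: float, unearned_share: float) -> float:
    """Step 2 (assumed form): turn an ERA+ into runs allowed in the target season.

    park_factor: 1.0 is neutral (assumption: applied as a simple multiplier).
    unearned_share: fraction of the target team's runs allowed that were unearned.
    """
    era = target_league_era * 100.0 / era_plus * park_factor
    earned_runs = era * innings / 9.0
    return earned_runs / (1.0 - unearned_share)   # gross up earned runs to all runs


def expected_record(runs_allowed: float, innings: float,
                    team_runs: float, team_games: int) -> tuple[float, float]:
    """Steps 5-7: Pythagorean W-L over the pitcher's 'calculated games' (innings / 9)."""
    games = innings / 9.0                   # e.g. 372 innings = 41.333 games
    rpg_against = runs_allowed / games      # step 5
    rpg_for = team_runs / team_games        # step 6
    runs_for = rpg_for * games              # step 7
    runs_against = rpg_against * games
    wpct = runs_for ** 2 / (runs_for ** 2 + runs_against ** 2)
    return wpct * games, (1.0 - wpct) * games


# Made-up inputs: 150 ERA+, 4.50 league ERA, neutral park, 10% unearned runs,
# 372 innings, on a team that scored 900 runs in 162 games.
ra = translated_runs_allowed(150, 4.50, 1.0, 372, 0.10)
wins, losses = expected_record(ra, 372, 900, 162)
print(f"RA {ra:.1f}, expected record {wins:.1f}-{losses:.1f}")
```

The expected wins and losses always sum to innings/9, so the result is a record over calculated games rather than over actual decisions.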

                      My sample is only a few hundred games, so I'm not certain it's accurate, but I think it might be.


                      1) if the sample was sufficient, it implies that given a random pattern of starting pitcher game performances and a given average RPG for the pitcher's team, it should be possible to estimate accurately the pitcher's W-L record over a sufficient number of seasons.

                      2) it implies that events such as, leaving a game with a lead, then not winning, getting knocked out early but not losing, leaving behind in the 7th but winning, etc., are simply a series of random events with a net sum in W-L of around zero.

                      3) it implies that the pyth theorem is more robust than some may think. It doesn't work just on completed games. It works on calculated games, if you will, simulated games as in parts of xx number of games equaling a calculated number of 'whole' games.

                      Obviously, the smaller the sample, the more likely you'll see a divergence from the expectation. There is luck involved, but not nearly as much from my small sample as I expected.
                      Last edited by drstrangelove; 11-19-2012, 06:15 PM.

                      • #12
                        This is more complicated than I thought it would be, but I appreciate the feedback.

                        I've been toying with a different approach using logarithms based solely on runs allowed and run support during the pitcher's stay (derived from RA/IP, unless there's something better out there). Using just these matches the traditional ones. I'm only looking for what the pitcher's win-loss% should have been given his runs allowed and run support. His actual performance has nothing to do with the number in my approach except for his ability to prevent runs. Correct me if I'm wrong, but drstrangelove's method (did you invent that?) looks more like a neutralized stat instead of something that "should" be based on what occurred.

                        Less than two hours have passed on me working on this. In this short time the best number I've derived to compute W-L% solely on RA/RS is (logRA-logRS)^-0.85. In 2008, Ben Sheets allowed 76 runs with a support of 108 runs. His actual W-L% was .591; mine gives .548. That's pretty close.

                        The fatal flaw with my equation is how it plays out for extremes. Pitchers who perform really well or poorly still end up with a W-L% in the .500s because the exponent is so close to one. The formula only makes sense for pitchers hovering around a certain skill (like 95-135 ERA+).

                        I'm wondering if I should include league or team runs since Pythagorean W-L% is able to use runs, of which there are so many to go around. A team that scores 100 runs and allows 100 runs (400 and 400, 232 and 232, whatever) has a theoretical W-L% of .500.

                        • #13
                          Originally posted by Tyrus4189Cobb
                          This is more complicated than I thought it would be, but I appreciate the feedback.

                          I've been toying with a different approach using logarithms based solely on runs allowed and run support during the pitcher's stay (derived from RA/IP, unless there's something better out there). Using just these matches the traditional ones. I'm only looking for what the pitcher's win-loss% should have been given his runs allowed and run support. His actual performance has nothing to do with the number in my approach except for his ability to prevent runs. Correct me if I'm wrong, but drstrangelove's method (did you invent that?) looks more like a neutralized stat instead of something that "should" be based on what occurred.
                          Yes, technically I did invent this method, fwiw. I've been building models of seasons, game logs etc for quite a while. I think it just occurred to me one day that if you could convert a season from one era to another, you could convert each game within the season. The logical step was to see what happens when you overlay the conversion onto a real season.

                          The pythagorean theorem application was something I did before I started re-doing logs, but was only converting seasons. Once I started doing logs, comparing the two methods was a logical step.

                          I think I am doing a couple different things, but for this thread, I'd focus on the application step of the pitcher's log to the team log. What I think it means is that, e.g., if you take Koufax's 1965 season, as is, and apply it game for game, inning for inning, to the 1962 Mets, the 1927 Yankees, the 1966 Senators, the 1917 Red Sox, and let's say 12 other random teams, that you can get an overall W-L record that will closely match a calculated W-L record based solely upon his total runs allowed and a calculated runs scored for his team.
                          Last edited by drstrangelove; 11-19-2012, 08:35 PM.

                          • #14
                            Originally posted by Tyrus4189Cobb
                            This is more complicated than I thought it would be, but I appreciate the feedback.

                            Less than two hours have passed on me working on this. In this short time the best number I've derived to compute w-L% solely on RA/RS is (logRA-logRS)^-0.85. In 2008, Ben Sheets allowed 76 runs with a support of 108 runs. His actual w-l was .591 and .548 based on mine. That's pretty close.

                            The fatal flaw with my equation is how it plays out for extremes. Pitchers who perform really well or poorly still end up with a w-l in the .500s because the exponent is so close to one. The formula only makes sense for pitchers hovering around a certain skill (like 95-135 ERA+).
                            Any further progress, Tyrus?

                            I have a question, maybe helpful, maybe not. If you are working on RA/RS by taking the log and getting (logRA-logRS), shouldn't you also log the exponent and use (-0.85)*(logRA-logRS) to get the equivalent of (RA/RS)^(-0.85)? (Of course that may not be what you're trying to do at all, in which case I apologise for once again adding some dumb-ass.)

                            Frankly, I've never seen the form (logX-logY)^Z and don't know what it would work out to. But there are a lot of things I've never seen and don't understand.
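
For what it's worth, the difference between the two forms is easy to check numerically. The sketch below uses Sheets' 2008 totals and the exponent quoted above; natural logs are an assumption (the thread does not say which base was used), though the identity in the first half holds for any base as long as the same base is used to exponentiate back.

```python
import math

ra, rs, z = 76.0, 108.0, -0.85   # runs allowed, run support, exponent from the thread

log_diff = math.log(ra) - math.log(rs)   # equals log(RA/RS); negative whenever RA < RS

# Multiplying the log difference by z and exponentiating recovers (RA/RS)^z.
via_logs = math.exp(z * log_diff)
direct = (ra / rs) ** z
print(round(via_logs, 3), round(direct, 3))   # 1.348 1.348

# Raising the log difference itself to the power z is a different quantity.
# With RA < RS the base is negative, so Python returns a complex number here.
odd_form = log_diff ** z
print(type(odd_form).__name__)   # complex
```

Note that (RA/RS)^z comes out as a ratio above 1 for a pitcher with more support than runs allowed, so it is not a winning percentage on its own.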

                            As far as it not working for extremes goes, didn't Tom Tiger Tango observe that the pythagorean theorem works only in the narrow range you speak of?

                            • #15
                              Originally posted by Jackaroo Dave
                              Any further progress, Tyrus?

                              I have a question, maybe helpful, maybe not. If you are working on RA/RS by taking the log and getting (logRA-logRS), shouldn't you also log the exponent and use (-0.85)*(logRA-logRS) to get the equivalent of (RA/RS)^(-0.85)? (Of course that may not be what you're trying to do at all, in which case I apologise for once again adding some dumb-ass.)

                              Frankly, I've never seen the form (logX-logY)^Z and don't know what it would work out to. But there are a lot of things I've never seen and don't understand.

                              As far as it not working for extremes goes, didn't Tom Tiger Tango observe that the pythagorean theorem works only in the narrow range you speak of?
                              Not much progress since, but I'm still dabbling with it. The bolded part made me laugh.

                              I'm not sure what Tango said about the extremes. I've only read his stuff about WAR and wOBA. From what I can tell, the extremes are impossible to compensate for if we use only RS and RA. The exponent, being so close to one, neutralizes the numbers to somewhere in the .500s. On one hand, pitchers who do well by allowing fewer runs are hurt by the bigger difference created in the subtraction of logs. On the other hand, pitchers who sucked and/or had a lot of run support (Milt Pappas 1966, John Lackey 2011) benefit from the increased gap in log of Runs Allowed - log of Runs Scored. If Tango mentioned the extremes, I definitely see why.

                              I'm going to tinker with your suggestion of "logging" the exponent and whatnot. Believe me, I'm such an amateur statistician that there is no way you can sound like a dumb-ass to me unless you bring up touchdowns or bogeys.