
Thread: Using R With Baseball Analysis

  1. #1
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146

    Using R With Baseball Analysis

    Hey everyone, I'm 1phillies fan382. How many of you use R for your baseball computations? If you do, post your code/graphs here (SAS, SPSS, and others count as well).

    On another forum I post on, someone was saying that Cameron Rupp will hit 20 home runs next year (and that he and Andrew Knapp will combine for 30 home runs). I showed him how ridiculous and unlikely this was by looking at all players from age 20-27 who had a .163 ISO or greater with 732 ABs (PAs were too hard to come by in my data set), and then looking at how many of those players hit 20 home runs the next season. It turns out the percentage was ~22% since 1876. I thought about looking up how likely it would be for Rupp to hit 20 home runs AND Knapp to hit 10 home runs, but I just settled on the fact that only 5 teams in the last 3 years have received 30 home runs from their entire catching position. Here is my code. The second chart is the 43 players (of 197 possible players) who hit 20 home runs or more after having an ISO of .163 in any of their age 20-27 seasons. I told this dude, look at the list: do any of the players sound like freaking Cameron Rupp to you?

    [Attachments: Screen Shot 2016-12-11 at 3.09.03 AM.png, Screen Shot 2016-12-11 at 3.14.09 AM.png]
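Since the code itself only survives as screenshots, here is a rough sketch of the cohort filter described above. It is written in Python rather than R so it runs with no packages, and it is not the poster's actual code. The rows are made up, and the reading of the cuts (career AB and ISO through age 27, then home runs at age 28) is an assumption.

```python
# Hypothetical rows: (player, career AB through age 27,
#                     career ISO through age 27, HR in the age-28 season)
players = [
    ("Player A", 900, 0.180, 24),
    ("Player B", 800, 0.165, 12),
    ("Player C", 750, 0.150, 21),   # fails the ISO cut
    ("Player D", 400, 0.200, 30),   # fails the AB cut
    ("Player E", 1200, 0.170, 8),
]

def cohort_rate(rows, iso_cut=0.163, min_ab=732, hr_target=20):
    """Return (cohort size, players reaching hr_target, share of the cohort)."""
    # Keep only players who clear both the playing-time and the ISO cut.
    cohort = [r for r in rows if r[1] >= min_ab and r[2] >= iso_cut]
    # Of those, count who reached the home run target the following season.
    hits = [r for r in cohort if r[3] >= hr_target]
    rate = len(hits) / len(cohort) if cohort else 0.0
    return len(cohort), len(hits), rate

size, reached, share = cohort_rate(players)
print(size, reached, round(share, 3))
```

With real data, `players` would come from whatever sheet you exported; only the two comparisons inside `cohort_rate` carry the logic.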

  2. #2
    Join Date
    Aug 2005
    Location
    Atlanta
    Posts
    3,742
    Nah, none of them look like Cameron Rupp to me.... And that's why I don't like using historical comps to predict the future performance of a player. Each player is unique and mutually exclusive.
    Rest in Peace Jose Fernandez (1992-2016)

  3. #3
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146
    Quote Originally Posted by Francoeurstein View Post
    Nah, none of them look like Cameron Rupp to me.... And that's why I don't like using historical comps to predict the future performance of a player. Each player is unique and mutually exclusive.
    ...............

    ferrel.jpg

  4. #4
    Join Date
    Jul 2006
    Location
    Boston, MA
    Posts
    4,083
    Which set are you using?

    I always go with Lahman, but am always looking for better.

  5. #5
    Join Date
    Aug 2005
    Location
    Atlanta
    Posts
    3,742
    Quote Originally Posted by 1phillies fan382 View Post
    ...............

    ferrel.jpg
    What doesn't make sense about what I said? You're extrapolating data from other players in past seasons in order to try and make a prediction for another player's future season.

  6. #6
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146
    Quote Originally Posted by sturg1dj View Post
    Which set are you using?

    I always go with Lahman, but am always looking for better.
    I toyed around with the Lahman package, but decided not to use it on this one, and just used FanGraphs' sheets. I don't think Lahman contains ISO.
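Even if a data set has no ISO column, it can be computed from counting stats, since ISO is slugging minus batting average, which reduces to (2B + 2*3B + 3*HR) / AB. A minimal sketch in Python; the function names and the stat line are made up for illustration:

```python
def iso(ab, doubles, triples, hr):
    """Isolated power: extra bases per at-bat."""
    if ab == 0:
        return None  # undefined without any at-bats
    return (doubles + 2 * triples + 3 * hr) / ab

def iso_from_rates(ab, hits, doubles, triples, hr):
    """Same quantity computed the long way, as SLG minus AVG."""
    singles = hits - doubles - triples - hr
    slg = (singles + 2 * doubles + 3 * triples + 4 * hr) / ab
    avg = hits / ab
    return slg - avg

# Made-up stat line: 500 AB, 140 H, 25 2B, 2 3B, 16 HR
print(iso(500, 25, 2, 16))
print(iso_from_rates(500, 140, 25, 2, 16))
```

Both functions agree up to floating-point rounding, so the shorter formula is enough once you have doubles, triples, home runs, and at-bats.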

  7. #7
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146
    Quote Originally Posted by Francoeurstein View Post
    What doesn't make sense about what I said? You're extrapolating data from other players in past seasons in order to try and make a prediction for another player's future season.

    Exactly. You don't notice a trend in certain types of players? Like how there aren't many 34-year-old catchers or first basemen? Or that Prince Fielder and Ryan Howard were going to end up the same as Mo Vaughn and Greg Vaughn? Or do you just use those vaunted intuitions and "gut feelings" to predict how a player will do?

  8. #8
    Join Date
    Aug 2005
    Location
    Atlanta
    Posts
    3,742
    Quote Originally Posted by 1phillies fan382 View Post
    Exactly. You don't notice a trend in certain types of players? Like how there aren't many 34-year-old catchers or first basemen? Or that Prince Fielder and Ryan Howard were going to end up the same as Mo Vaughn and Greg Vaughn? Or do you just use those vaunted intuitions and "gut feelings" to predict how a player will do?
    I'd consider myself a pretty stat-savvy guy. I agree that there is indeed a trend that exists in certain players. However, these models don't account for the individual adjustments that exist on a player-to-player basis.

  9. #9
    Join Date
    Jul 2006
    Location
    Boston, MA
    Posts
    4,083
    Quote Originally Posted by 1phillies fan382 View Post
    I toyed around with the Lahman package, but decided not to use it on this one, and just used FanGraphs' sheets. I don't think Lahman contains ISO.

    ahh.

    I get it. I usually like to calculate my own stats, but I also usually do it in SAS. When I do things in R it usually becomes a huge (or more huge) pain in the butt to do the calculations.

  10. #10
    Join Date
    Jul 2006
    Location
    Boston, MA
    Posts
    4,083
    wait, I guess I am confused.

    you are looking at those with an ISO of .163 or greater? Why not .163 or less?

    What is your argument here?

  11. #11
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146
    Quote Originally Posted by sturg1dj View Post
    wait, I guess I am confused.

    you are looking at those with an ISO of .163 or greater? Why not .163 or less?

    What is your argument here?
    Oh ****, you're right. I guess that doesn't make much sense. I did it last night at 4 AM haha. I guess all mine shows is whether players are capable of an ISO that high or not.

  12. #12
    Join Date
    Jul 2006
    Location
    Boston, MA
    Posts
    4,083
    yeah, I think if you do the flip then you will find more garbage players. At least one journeyman catcher is my guess.

  13. #13
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146
    Quote Originally Posted by sturg1dj View Post
    yeah, I think if you do the flip then you will find more garbage players. At least one journeyman catcher is my guess.
    So I reran it with ISO <= .163, and it turns out 742 players qualify, with 139 of those hitting 20 home runs in their age 28 season. Thanks!

    Here are the players. Again, do any look to have similar career trajectories to Rupp?

    [Attachments: Screen Shot 2016-12-11 at 1.27.40 PM.png, Screen Shot 2016-12-11 at 1.27.54 PM.png, Screen Shot 2016-12-11 at 1.28.11 PM.png]
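For reference, the two hit rates quoted in the thread work out as follows. This is just the arithmetic on the counts posted above (43 of 197 for the original ISO >= .163 run, 139 of 742 for the flipped run), sketched in Python; the variable names are mine.

```python
def rate(hits, total):
    """Share of a cohort that reached the 20 HR mark."""
    return hits / total

first_pass = rate(43, 197)   # ISO >= .163 cohort from the first post
flipped = rate(139, 742)     # ISO <= .163 cohort from this post

print(f"{first_pass:.1%}")   # 21.8%
print(f"{flipped:.1%}")      # 18.7%
```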

  14. #14
    Join Date
    Jul 2006
    Location
    Boston, MA
    Posts
    4,083
    Quote Originally Posted by 1phillies fan382 View Post
    So I reran it with ISO <= .163, and it turns out 742 players qualify, with 139 of those hitting 20 home runs in their age 28 season. Thanks!

    Here are the players. Again, do any look to have similar career trajectories to Rupp?

    [Attachments: Screen Shot 2016-12-11 at 1.27.40 PM.png, Screen Shot 2016-12-11 at 1.27.54 PM.png, Screen Shot 2016-12-11 at 1.28.11 PM.png]


    I am not sure looking at it this way gets you where you want to go. We have a player whose track record is a bit too short to compare to anything.

    If we only care about ISO to predict HRs, then who are we to say he cannot follow the same age-28-and-later trajectory as the players above (because we know better, of course)?

    I looked at it slightly differently.

    I took players who, through their 27th birthday, had zero (0) 20 HR seasons and a career ISO through that point of .163 or under, and looked at which of them had at least one 20 HR season after their 27th birthday.

    There were 145 players:
    FIRST LAST 20+ HR seasons post 27
    Roberto Alomar 3
    Sandy Alomar 1
    George Altman 2
    Brady Anderson 3
    Cap Anson 1
    Rich Aurilia 4
    Bob Bailey 3
    Clint Barmes 1
    Earl Battey 1
    Jose Bautista 4
    Buddy Bell 1
    Jay Bell 3
    Craig Biggio 8
    Wade Boggs 1
    Aaron Boone 1
    Bret Boone 6
    Clete Boyer 1
    Milton Bradley 1
    Eddie Bressoud 1
    Lou Brock 1
    Hubie Brooks 2
    Smoky Burgess 1
    Marlon Byrd 2
    Ken Caminiti 4
    Bert Campaneris 1
    Leo Cardenas 1
    Chris Chambliss 2
    Mickey Cochrane 1
    Coco Crisp 1
    Joe Cronin 1
    Johnny Damon 3
    Al Dark 2
    Darren Daulton 2
    Doug DeCinces 5
    Mike Devereaux 1
    Bill Dickey 4
    Vince DiMaggio 1
    Brian Downing 6
    Ray Durham 2
    Damion Easley 3
    Bob Elliott 3
    Jacoby Ellsbury 1
    Kevin Elster 1
    Carl Everett 3
    Hoot Evers 1
    Chico Fernandez 1
    Steve Finley 7
    Darrin Fletcher 1
    Jack Fournier 3
    Julio Franco 1
    Jim Fregosi 1
    Carl Furillo 3
    Charlie Gehringer 1
    Bernard Gilkey 1
    Carlos Gomez 1
    Alex Gonzalez 1
    Luis Gonzalez 7
    Bobby Grich 2
    Tom Grieve 1
    Marquis Grissom 5
    Kelly Gruber 2
    Carlos Guillen 2
    Tommy Harper 1
    Charlie Hayes 1
    Von Hayes 2
    Chase Headley 1
    Harry Heilmann 1
    Ken Henderson 1
    Jose Hernandez 3
    Larry Herndon 2
    Jim Hickman 2
    Tommy Holmes 1
    Brandon Inge 2
    Randy Jackson 1
    Davey Johnson 1
    Eddie Joost 2
    Corey Koskie 2
    Joe Kuhel 1
    Mike Lansing 1
    Jeffrey Leonard 3
    Sherm Lollar 2
    Ernie Lombardi 1
    John Lowenstein 1
    Mike Macfarlane 1
    Frank Malzone 1
    Felix Mantilla 1
    Edgar Martinez 8
    Russell Martin 1
    Frank McCormick 1
    Brian McRae 1
    Irish Meusel 1
    Minnie Minoso 4
    Bengie Molina 1
    Yadier Molina 1
    Paul Molitor 1
    Rick Monday 3
    Don Money 1
    Joe Morgan 4
    Xavier Nady 1
    Phil Nevin 4
    Ben Oglivie 4
    Miguel Olivo 1
    Lyle Overbay 2
    Dustin Pedroia 1
    Terry Pendleton 2
    Tony Phillips 1
    A. J. Pierzynski 1
    Bill Robinson 4
    Jimmy Rollins 4
    Joe Rudi 2
    Chris Sabo 3
    Benito Santiago 1
    Nate Schierholtz 1
    Frank Schulte 1
    David Segui 1
    Andy Seminick 2
    John Shelby 1
    Norm Siebern 1
    Roy Sievers 9
    Harry Simpson 1
    Duke Sims 1
    Bob Skinner 1
    Lonnie Smith 1
    Eric Soderholm 2
    Jim Spencer 1
    Ed Sprague 2
    Mike Stanley 3
    Leroy Stanton 1
    Terry Steinbach 1
    B. J. Surhoff 3
    Ed Taubensee 1
    Alan Trammell 2
    Mickey Vernon 1
    Tillie Walker 2
    Bob Watson 1
    Lou Whitaker 4
    Frank White 2
    Ty Wigginton 3
    Bernie Williams 7
    Glenn Wright 1
    Eddie Yost 1
    Kevin Youkilis 2
    Kevin Young 3
    Michael Young 4
    Todd Zeile 4
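The selection behind this list can be sketched roughly as below. This is a Python illustration on made-up season rows, not the SAS or R code actually used, and it treats "through their 27th birthday" as "seasons at age 27 or younger", which is an assumption.

```python
from collections import defaultdict

# Hypothetical season rows: (player, age, AB, 2B, 3B, HR)
seasons = [
    ("Late Bloomer", 25, 500, 25, 3, 8),
    ("Late Bloomer", 27, 550, 30, 2, 12),
    ("Late Bloomer", 29, 560, 28, 1, 22),
    ("Late Bloomer", 30, 540, 26, 2, 25),
    ("Early Slugger", 24, 520, 30, 1, 30),   # already had a 20 HR season
    ("Early Slugger", 29, 500, 25, 0, 28),
    ("Never Did", 26, 480, 20, 4, 5),
    ("Never Did", 28, 450, 18, 2, 9),        # no 20 HR season later either
]

def late_bloomers(rows, age_cut=27, iso_cut=0.163, hr_target=20):
    """Map player -> number of 20+ HR seasons after age_cut, for players
    with no such season and a career ISO <= iso_cut through age_cut."""
    by_player = defaultdict(list)
    for player, age, ab, doubles, triples, hr in rows:
        by_player[player].append((age, ab, doubles, triples, hr))

    result = {}
    for player, rs in by_player.items():
        early = [r for r in rs if r[0] <= age_cut]
        late = [r for r in rs if r[0] > age_cut]
        ab = sum(r[1] for r in early)
        if ab == 0:
            continue  # no track record through the cutoff age
        # Career ISO through the cutoff: total extra bases over total AB.
        extra_bases = sum(r[2] + 2 * r[3] + 3 * r[4] for r in early)
        had_big_year = any(r[4] >= hr_target for r in early)
        if had_big_year or extra_bases / ab > iso_cut:
            continue
        late_big_years = sum(1 for r in late if r[4] >= hr_target)
        if late_big_years >= 1:
            result[player] = late_big_years
    return result

print(late_bloomers(seasons))  # {'Late Bloomer': 2}
```

The output has the same shape as the list above: one entry per qualifying player with the count of 20+ HR seasons after the cutoff.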

  15. #15
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146
    Nice, man! Your methodology is indeed much better than mine. I was just trying to show how rare it is for a player with Rupp's profile (judged on ISO alone, which is of course flawed) to hit 20 home runs at age 28 (also flawed, because they could hit 20 home runs after age 28). It was rare, at 19%. I mainly had beef with the statement that Rupp and a rookie catcher will combine for 30 home runs next season.

  16. #16
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146
    Wow, didn't realize just how much of a late bloomer Biggio was in the power department

  17. #17
    Join Date
    Dec 2016
    Location
    Newark, DE
    Posts
    146
    One other thing that's important to my analysis: of the 143 players who did indeed hit 20 or more home runs after not doing so previously, 94 had a wRC+ of 120 or better in the next season, 78 were at 130 or better, and 54 were at 140 or better. The median wRC+ of the 143 players was 134. Cameron Rupp has a career wRC+ of 88.
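A small sketch of how those threshold counts and the median could be tallied, plus the shares implied by the counts quoted above. It is Python, the `sample` values are invented (the real 143 values are not in the thread), and `summarize` is a hypothetical helper name.

```python
from statistics import median

def summarize(wrc_plus, thresholds=(120, 130, 140)):
    """Count values at or above each threshold and return the median."""
    counts = {t: sum(1 for v in wrc_plus if v >= t) for t in thresholds}
    return counts, median(wrc_plus)

# Made-up wRC+ values, only to show the mechanics
sample = [95, 118, 121, 127, 134, 138, 142, 155]
counts, med = summarize(sample)
print(counts, med)  # {120: 6, 130: 4, 140: 2} 130.5

# Shares implied by the counts in the post (out of 143 players)
shares = [f"{k / 143:.1%}" for k in (94, 78, 54)]
print(shares)  # ['65.7%', '54.5%', '37.8%']
```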
