How is it formulated? I saw a sheet last year that had predictions. Anyone know a similar formula?
I assume you're talking about PECOTA, which is Nate Silver's proprietary performance prediction system that he publishes at Baseball Prospectus every year.
He hasn't made the system public other than to talk about some of its characteristics. It's based around finding comparable players/seasons in baseball history and using those to project the performance of players going forward.
That's a BP trend, though. Much like major league teams have farm systems for players, they have a farm system for execs. A lot of teams get FO types from Prospectus (and rightfully so; for the most part those guys are insanely brilliant), so they don't like making their stuff public. However, for the most part I find PECOTA to be as good as any other system out there.
"he probably used some performance enhancing drugs so he could do a better job on his report...i hear they make you gain weight" - Dr. Zizmor
"I thought it was interesting and yes a conversation piece. Next time I post a similar story I will close with the question "So, do you think either of them have used steroids?" so that I can make the topic truly relevant to discussions about today's game." - Eric Davis
http://www.youtube.com/watch?v=gqul1GyK7-g
The accuracy of PECOTA is what makes me trust in things like FRAR and BRAR without fully understanding them.
Hey, this is my public apology for suddenly disappearing and missing out on any projects I may have neglected.
PECOTA is more accurate than other projection techniques, but we're talking about 20% odds of it getting within a reasonable distance of projecting the performance of pitchers and 45-50% odds for batting performance...and it doesn't project fielding at all so...
We have a long way to go before PECOTA should be considered any sort of gold standard.
Matt, I find your remarks intriguing. Can you tell us how you came up with the 20% and 45-50% numbers? Not that I think you're wrong or anything, but is this based on analysis that you or someone else has done?
Has anyone undertaken the effort to do a comparative analysis of the various prognostications that are out there on an annual basis?
If PECOTA is the best we've got, doesn't that make it the gold standard by default?
I guess a better question is... How likely is it that we'll eventually do significantly better? For all the time, effort and research that BP's spent on PECOTA, how much better is it than something like Marcels? I'm not suggesting that we shouldn't try, I'm just pointing out that we are trying to predict human performance here.
Why not take it straight from Nate Silver's mouth?
Here's my blog entry that points to Nate's article:
http://www.insidethebook.com/ee/inde...t_evaluations/
Nate also chimes in on my blog. If you look at post 26, I asked Nate to do a head-to-head between PECOTA and the other systems, including Marcel. And out of 295 hitters forecasted by both systems (Marcel forecasts anyone with at least 1 career PA), PECOTA won 50.8% of the battles.
And for pitchers (post 22)? Out of the 288 forecasted pitchers, PECOTA won 50.3% of those battles.
Basically, there's not much difference between the most basic possible forecasting system and the most convoluted one.
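For anyone who wants to see how close to a coin flip those head-to-head numbers are, here is a rough sketch. It assumes each "battle" is an independent 50/50 trial if the two systems were equally good, which is my simplification and not something Nate's comparison states. The function name head_to_head_z is made up for illustration.

```python
import math

def head_to_head_z(n_players, win_share):
    """Wins implied by the win share, and how many standard deviations
    that is from an even split (assumes independent 50/50 battles)."""
    wins = round(n_players * win_share)
    expected = n_players * 0.5
    sd = math.sqrt(n_players * 0.25)  # binomial SD with p = 0.5
    return wins, (wins - expected) / sd

# Figures quoted above: 295 hitters at 50.8%, 288 pitchers at 50.3%
hit_wins, hit_z = head_to_head_z(295, 0.508)
pit_wins, pit_z = head_to_head_z(288, 0.503)

print(f"hitters:  {hit_wins} wins, z = {hit_z:.2f}")
print(f"pitchers: {pit_wins} wins, z = {pit_z:.2f}")
```

Both z values come out well under 1, so under that coin-flip assumption neither split is distinguishable from an even one.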
How about saying "The arguments by noted researchers who study these things and deride the basis for FRAR make me distrust FRAR without fully understanding the reason"?
And, since it's been shown how little accuracy gain there is in PECOTA, how can you possibly give it such sweeping powers?
I tried once to get Nate to talk about PECOTA's accuracy in terms of what it is trying to do (project player performance) as opposed to which system seems to be more accurate. No luck.
Buck O'Neil: The Monarch of Baseball
He was fairly forthcoming in my blog. What specific question do you have? Why don't you post it on my blog, and we'll see what we can do about it.
Wasn't there a study posted some time ago (last year?) showing that the ceiling for correlation of these projection systems was around 0.7?
vr, Xei
Author of Fantasy Baseball Mock Draft Software.
http://www.fantasyinfocentral.com/ml...ware/index.php
Author of DodgerSims Blog
http://DodgerSims.blogspot.com/
As much work as BP put into PECOTA...it uses no sabermetric underpinnings (not really) in its forecasting methods...for example, it's certainly not DIPS compliant for pitchers since none of their metrics are truly DIPS compliant pitching metrics. As far as I know, PECOTA uses league relative standard statistics and phenotypic attributes (height, weight, handedness etc), and for pitchers it uses things like GB/FB, K/BB etc...but there is no top level analysis to properly factor out things like league quality, defensive contexts, park splits and the like (there are attempts but I'm not a fan of BP's contextual analysis...it's logically flawed from what I've read about it).
I think assuming that we can't do any better than Marcel predictions because the first few such attempts failed to do better is...well...silly.
There was a study on this, yes...I have not read the study, but I have a tough time trusting anyone claiming there is a hard ceiling on the accuracy of prediction...there were folks claiming there was a hard ceiling on the accuracy of weather prediction forty years ago (saying predicting 48 hours out would at best be a 50/50 proposition) and forecasts have steadily improved since those claims were made.
There is a hard ceiling, and it is due completely to the sampling issue.
(I had said it was close to .7, but it's actually .84, because I forgot to take the square root.)
You can try it yourself, if you don't believe me. Assume you know, for certain, the true talent OBP of all players in the league. Presume your population standard deviation is .030 (or whatever number you want). So, figure a certain number with an OBP at .330, others at .320, .310, .300, .290, .340, .350, .360.... you know, just figure a normal distribution, such that 1 SD = .030.
Now, assume you have 200 players at 600 PA each. Randomly assign an OBP around the mean you selected for each player. Run a correlation.
Do you know what you'll get?
var(observed) = var(true) + var(luck)
Since we've established that var(true) = .030^2 and we can figure that var(luck) = (.33*.67/600)=.019^2, then you will see that the var(observed) of the randomly generated OBP from above will equal .0355^2.
And r (observed to observed) = 1-(.019/.0355)^2 = 0.71, so r (true to observed) will be the square root of that, or .84.
And remember, this is the absolute best case, where you know for sure the true talent level of all players, and that each of these players has 600 PA.
Go ahead. Very simple to test, even in Excel.
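For those who would rather not use Excel, the same experiment can be sketched in Python. This is only an illustration of the setup described above, not anyone's actual spreadsheet: the function names, the seed, and the choice of 2000 players are my own assumptions (I bumped the player count up so the correlation doesn't swing much from run to run; with only a handful of players it will bounce around a lot).

```python
import math
import random

def expected_r(sd_true, mean_obp, pa):
    """Analytic r(true, observed), from var(observed) = var(true) + var(luck)."""
    var_true = sd_true ** 2
    var_luck = mean_obp * (1 - mean_obp) / pa  # binomial luck at the league mean
    return math.sqrt(var_true / (var_true + var_luck))

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def simulate_r(n_players, pa, mean_obp=0.330, sd_true=0.030, seed=0):
    """Draw a true OBP per player, play out `pa` plate appearances each,
    and correlate true OBP with observed OBP."""
    rng = random.Random(seed)
    true_obp = [rng.gauss(mean_obp, sd_true) for _ in range(n_players)]
    observed = [
        sum(rng.random() < p for _ in range(pa)) / pa  # times on base / PA
        for p in true_obp
    ]
    return pearson(true_obp, observed)

analytic = expected_r(0.030, 0.330, 600)
simulated = simulate_r(n_players=2000, pa=600)
print(f"analytic r(true, observed):  {analytic:.2f}")
print(f"simulated r(true, observed): {simulated:.2f}")
```

Changing pa or sd_true shows how the ceiling moves: fewer plate appearances or a tighter talent spread both pull the correlation down.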
I just ran one right now... took me all of 2 minutes in Excel. I used just 13 players, with 600 PA each, with 1 true SD = .030. The correlation between the observed and the true was .84.
"there were folks claiming there was a hard ceiling on the accuracy of weather prediction forty years ago (saying predicting 48 hours out would at best be a 50/50 proposition) and forecasts have steadily improved since those claims were made."
Come on, when you have to resort to comparing to advances in weather forecasting, maybe it is time to give up.
During the 2000 Olympics, those of us in the NY/NJ/CT area had to endure multi-minute commercials for NBC's new Doppler 4000. You would have thought they had split the atom.
And yet, here's a typical week of winter forecasts in NJ...
Day -3 : Hear about the worst blizzard in 10 years, headed your way. See pictures of snow in Denver. (forecast 18-24 inches)
Day -2 : Exclusive Storm Track 200x coverage. See snow in Chicago. (forecast 12-16 inches)
Day -1 : Tune in at 11 to see how the weather will affect your morning commute. (forecast 6-9 inches)
Actual result : 3 inches in northern counties, rain by the shore.
I've always felt that forecasts should get progressively narrower and more accurate. You shouldn't get to make your highs higher or your lows lower. If on Tuesday, the real potential snowfall for Friday is somewhere between 0-36 inches, just be up front and tell me you really have no idea.
Oh, and if there's a fantasy baseball league with a bunch of weather men in it, that's playing for money, let me know where I can sign up.
<End of Rant>
I hate it when people who have no knowledge of how weather forecasting is done talk about weather forecasting as though they know what they're talking about. Very irritating.
BTW weskelton...you're in one of the most difficult to forecast areas of the country for winter weather....the rain/snow line for nor'easters generally SHARPLY defines who will get 2 feet and who will get slush or rain and an error of 20 miles results in a forecast error of 20 inches. And oh BTW...no meteorologist would ever forecast 24 inches of snow 3 days in advance of the storm...that just isn't done unless the guy is playing for ratings.
You might be interested to know that I did my senior thesis on the topic of forecast model bias in the 60-hour forecasts for major winter storms and found that our current regional 12 (or 40) km grid spaced numerical model has a consistent eastward bias in forecasting the position of the surface low in nor'easters...and I got interested in that topic because I recorded 12 years of 48 hour snowfall forecasts and found a bias (the worst snow usually occurs further west and north than the forecasts indicate 2 days in advance).
From someone who is actually in a position to tell you anything about the science of weather forecasting...we've come a LONG way despite the tendency for the biggest events to come with the largest errors (because big events mean tight gradients of pressure, wind, air masses and just about anything else you associate with weather prediction - therefore small errors in position produce big errors in results).
Frankly, this attitude most people have about weather forecasting aggravates me a great deal...it's a miracle more meteorologists aren't hanging themselves from the rafters with the amount of work they put into serving a public that thinks they're a joke.
That having been said...thanks Tom for explaining the reason there should be a cap on accuracy and for correcting the location of that cap. R^2 of .7 sounds a lot more reasonable than R of 0.7...and there's a BIIIIGGG gap in accuracy between R^2 of 0.49 and R^2 of 0.6, let alone 0.7.
Matt,
Don't get me wrong. I wasn't venting at you, although you may have interpreted it that way. Weather forecasting just happens to be one of my pet peeves. Also, I never said that I knew very much about weather forecasting. You obviously know much more about this than anyone else I've ever met. What I do know is what I see and hear on the news and what actually transpires. And while I may have exaggerated some of the numbers for effect, the reality probably isn't far off.
"And oh BTW...no meteorologist would ever forecast 24 inches of snow 3 days in advance of the storm...that just isn't done unless the guy is playing for ratings."
Like I said, maybe 24 inches is an exaggeration, but I think ratings are a HUGE factor. The weather often becomes the primary means of promoting the 11 o'clock news during primetime programming. I think if they were to give us the straight scoop, then the actual snowfall amounts would be just as likely to fall in the high end of the forecasted range as the low end and I'm fairly certain that isn't the case (at least around here). Please correct me if you know otherwise.
I also stick to my comment that forecasting should become more precise as time progresses, i.e. projected lows shouldn't get lower. The problem is that if they tried to make forecasts like... "We project that there is an 80% chance that Friday's snowfall will be somewhere between x and y inches"... then the actual numbers to make this statement true on Tuesday would probably just seem silly. Thus they just project high, as it's more interesting.
Sorry if you're offended by the opinions of the less informed, but there is a reason why the vast majority of the population thinks that weather forecasts are a bit of a joke.
Actually...ratings impact weather forecasting far less than they impact the headlines. If you want to be concerned about yellow journalism, I suggest you worry more about the propaganda being sold as news.
Also...in my experience, local forecasters tend to rip their forecasts from the national weather service, which has absolutely no motivation for ratings grabbing...so...not sure I'd worry much about exaggerated weather forecasts other than perhaps the adjectives the local weather man uses to describe things. The actual numbers they give you are usually from NWS text forecasts with a little bit of personal input based on their history of forecasting for that local region (if they're good ones anyway, they will incorporate experience).
And I can cite about six examples from the Washington DC area of winter snowfall forecasts that went the exact opposite way you described.
Prior to the (generally well forecasted) Blizzard of 1996, for example, the snowfall forecasts coming from the major local networks for DC went:
3 days out: 3-6 inches at least with the possibility for more.
2 days out: 5-10 inches in the southern suburbs, 4-8 inches north.
1 day out: 6-12 inches north, 12-18 inches south.
12 hours out: 12-24 inches city wide, lesser amounts far NW and far SE where sleet might mix in.
Morning of onset: 18-30 inches city wide, including far NW counties, 12-18 in SE Maryland with heavy sleet.
Final results: 17 inches right on the Potomac river, 27 inches in my backyard (15 miles SW of DC in Prince William County), 48 inches in the Appalachians west of the region, 11 inches in Annapolis with heavy sleet and even some freezing rain.
Blizzard of January 2000:
2 days out: Cloudy with a chance of flurries.
1 day out: Cloudy with flurries likely, light snow with 2-4 inch amounts far SE counties.
12 hours out: Light snow city wide, heavy snow SE counties (1-3 in town, 8-12 SE)
6 hours out: 3-5 inches in town, 12+ SE.
Result: 15 inches in my backyard, 10 inches in DC right on the river, 17 inches in Annapolis.
The same eastward bias in nor'easter prediction in the 2-3 day range can cause severe UNDERforecasting of a winter storm as well as severe OVERforecasting. It's more common for the overforecasting to occur because it's more common for a storm to hug the coast and be predicted to stay a little offshore than it is for one to be predicted well south and wind up just off shore.
Oh BTW...for you New Yorkers, do you perhaps remember the President's Day snowstorm of 2003? 3 day forecasts were calling for SUNNY SKIES AND COLD TEMPS in New York City...Even 2 days out, light snow was all that was expected. By 1 day out, they had caught on that it was coming north but still only predicted a foot or so. New York got 2+ feet city wide and Boston got 30 inches.
Matt,
This is all good information. You obviously come armed with much more info than I do. However, I would point out that you are citing blizzards here, where the actual snowfall is large, thus increasing the likelihood of underforecasting. I would venture a guess that while these instances certainly do exist, they are probably the exception to the rule. My gut feeling is this (keyword gut)... if we were to look at the forecast data of local forecasts for cases where snowfall was forecast 3 days in advance, then we would likely see that there is a trend in overforecasting. Any idea if this kind of data is freely available anywhere?
"If you want to be concerned about yellow journalism, I suggest you worry more about the propaganda being sold as news."
I hear you there.
Also, since I'm sure that most of the readers of this forum are more interested in your views on PECOTA, than east coast weather patterns, this will be my last weather-related post in this thread. Can I hear an "Amen"?
LOL
Fair enough. I don't believe the NWS makes their forecast archives available...they periodically publish reports on their accuracy but they're all in generalities and standard deviations and skill scores and you don't get specifics on specific situations. I took an interest in forecasting patterns (especially in the winter months) after the great ice storm of 1994 and have been recording forecasts for a few dozen forecast offices every time there's a significant event on the way or else I wouldn't even know about their error pattern.
I believe that there is a consistent forecast bias in winter weather prediction on the large events (that being an eastward bias in heavy snowfall forecasts) and I believe that on the small snow events in places highly impacted by lighter snows (the megalopolis mainly), there is a bias toward predicting frozen precip three days in advance to let people know about the possible hazard when more often than not, the cities get rain or minor snow to rain type events. I don't think there's much in the way of intentional ratings-grabbing among strong media outlets...I think the ratings grabbing happens in network news stations that are struggling to gain ground (the forecasts and advertisements for weather coverage on our NBC and CBS affiliates in DC are always more overblown than the ones coming from our ABC affiliate...the group that's led in news ratings in DC since 1997).