March 2, 2005
Translating Cuban Performance
The Surprising Result
by Clay Davenport
When we were having discussions about how to build our Top 50 Prospects list, the question of imported players came up. Most seasons since the 1994 strike have seen at least one notable player, usually either from Japan (Ichiro Suzuki, Kazuhiro Sasaki, both Matsuis) or Cuba (Jose Contreras, Orlando Hernandez), making his major-league debut. This year, the only player who looked good enough to consider (Tadahito Iguchi hadn't signed when we had this discussion, or he certainly would have been in) was Kendry Morales of the whatever-city-they're-calling-themselves-from-this-week Angels.
What did we have to say about him? Scouting reports said he had power to all fields, and he hit .391 in Cuba one year. Pretty much everything we knew about him was in the Baseball America article announcing his signing.
As much as I love reading BA, though, that was a pretty unsatisfying answer. We're performance analysts, dangit, and we didn't have a performance record to analyze, because Cuban baseball has always been this gaping black hole. Players came out every once in a while, the scouting reports raved over them, and George Steinbrenner or some other sap wrote out the big checks for them, but no one really knew how they would perform. While the Brothers Hernandez did fine, it seemed that the greatest talent of Cuban players was to be little Barnums, making suckers of the U.S. baseball establishment. Fidel may have been upset at losing the players, but the sight of so many capitalists losing so much money to Cubans had to bring him at least a small chuckle.
So we tried to rectify the situation, and see if we couldn't bring a little more light to Senor Morales. Lo and behold, I discovered something that either didn't exist or that I'd missed the last half-dozen times I went looking for it: a Web site for Cuba's Serie Nacional, their highest league. It gave me complete statistics for the last four years, once I was able to translate the categories. (Change the "40" in the website's name to 41, 42, 43, or 44 to get other years.)
Now we not only had a complete line for Morales, but we also had some context. Yes, Morales did indeed hit .391 one season, albeit in just 202 at-bats, and had a three-year career average of .350. Impressive. It also told us that the Cuban league, as a whole, had stats that looked like this:
Year  avg    OBP    slg    R/27 outs
2001  0.295  0.372  0.441  5.99
2002  0.293  0.368  0.425  5.63
2003  0.297  0.379  0.439  6.19
2004  0.288  0.361  0.415  5.32
The Cubans, it turns out, play in a rocket-fueled offensive environment. The highest batting average for any North American league over the last four years was the .287 put up by the Pioneer League last year. The Mexican League is a high-offense league, and they peaked at .286. The Pacific Coast League maxes out at .284 over this time frame, while the majors top out at .270. The Cuban league sweeps the batting average category.
The worst OBP from Cuba over that time is .361; the 2004 Pioneer had an OBP of .379, and the 2003 Mexican League reached .362. No other league operating out of the U.S. these past four years can match those numbers, so the Cuban leagues have that category pretty well covered. Besides the high batting averages, the Cubans have insane hit by pitch rates; the 2003 Cuban league is the only recent league in my database to break .02 HBP per plate appearance. The HBP rates for 2002 and 2004 are also higher than any other league I have; after that you get a couple of Pioneer League seasons and the 2001 Cuban league. That's the whole top three and four of the top six, if you're ranking them. Painful.
The power and run scoring numbers aren't quite as dramatic; they tend to run towards the high end of the leagues, but they don't dominate the top of the chart like the average and OBP numbers do. Still, from an offensive standpoint, the amount of air contained in a Cuban ballplayer's statistics is comparable, overall, to that of the Pioneer League. Even without considering the difficulty level of the league, the numbers are substantially less impressive than they appear at first glance.
Of course, the difficulty level of the league is a pretty important thing to consider; it is something we certainly have to know before we can move forward to producing any sort of translation. The problem is that there simply aren't that many players who have played in Cuba during the last four years AND played in some other league whose quality level is known. With Morales, recent Mariner signee Yuniesky Betancourt, and maybe Yankee Yobal Duenas, we should have a lot more comps after this year. For now, though, the best guess at setting the league level is going to have to come from the following players…and I've had to stretch on these, as you'll see.
Jose Contreras played for Pinar del Rio in Cuba in 2001 and 2002 before coming to the U.S. for the 2003 and 2004 seasons. He alone represents a little more than half of the total number of common plate appearances I've been able to establish between Cuba and known leagues, which is a good and valid reason to be uncomfortable with the results I'm going to present.
It is also troubling that it is a pitcher on whom I'm relying, since I'm a lot more comfortable running the process from the batting side. Batter translations, for setting a league rate, are literally as easy as looking at how many runs he produced in environment #1, looking at how many runs he produced in environment #2, and taking the ratio; the components of run scoring, for hitters, all behave in near enough the same way. For pitchers, the difficulty changes primarily affect strikeout and walk rates. Hit rates depend a lot (without getting involved in a discussion of whether "a lot" means 50% or 100% or something else) on the fielders behind him. As a pitcher moves up the ladder, he tends to pitch in front of better defenses, which tends to offset the fact that he's pitching against better hitters. Bottom line, to find the difficulty rating for pitchers, you can't simply look at the runs allowed values and know--you have to manipulate the difficulty rating until the results match up.
For Contreras, the difficulty rating that gave the best matchup was .470. Don't worry about the specific value just yet; I'll talk about that later. Here are the translated figures, using a .47 difficulty for his Cuban translation:
Place  IP     H    R    HR  BB   SO   H9   HR/9  BB9  SO9  NERA  PERA
Cuba   285.3  259  123  26  101  260  8.2  0.8   3.2  8.2  4.57  3.85
US     274.7  241  120  35  114  250  7.9  1.1   3.7  8.2  4.55  4.06
That is a pretty faithful translation. Of course, since it was designed specifically to fit him, it certainly ought to be pretty good. Who else do we have?
The Cuban government has allowed several players to go to Japan, but only one played in the Japanese major leagues. That was the great Omar Linares, although you couldn't tell he was once great by his performance for the Chunichi Dragons. (Being disappointed in Cuban players is not simply a Western hemisphere phenomenon.) In two-and-a-half years he managed fewer than 400 plate appearances, putting up a translated line of just .243/.331/.343 and a woeful .241 EqA. Compare that to his last two seasons in Cuba, where his untranslated line was .388/.529/.633; that is a .396 EqA. If we were to rely only on Linares, we would estimate the difficulty of the Cuban league at .289. Adding his numbers to Contreras', our total estimate is .437.
Then there's Duenas. He's a second baseman, already 33, and like Contreras and Linares he played for Pinar del Rio. He's supposed to be a buddy of Contreras, and may have been signed by the Yankees as a favor to him; now that the right-hander is no longer a Yankee, I wonder how long Duenas will be. Duenas played in three games last year for the Yankees' Gulf Coast League team, and then got a longer look--12 games!--in the Arizona Fall League. Putting those together for all of 49 plate appearances, he had a .230 EqA in the U.S., compared to an untranslated .301 in Cuba. That suggests a difficulty rating of .510, and nudges the total rating up to .440.
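The Linares and Duenas ratings can be reproduced from the EqA figures quoted above, under one assumption the text does not spell out: that run production per out scales with EqA raised to the 2.5 power, which is how Equivalent Runs are usually derived from EqA. The function name is ours, for illustration only; treat this as a sketch of the arithmetic rather than the exact procedure used.

```python
def difficulty_from_eqa(eqa_known_league: float, eqa_cuba: float,
                        exponent: float = 2.5) -> float:
    """Ratio of a hitter's run production in a known league to his
    run production in Cuba.

    ASSUMPTION: runs per out are proportional to EqA ** exponent.
    The article does not state this formula explicitly.
    """
    return (eqa_known_league / eqa_cuba) ** exponent


# EqA figures quoted in the text: (known league, Cuba)
linares = difficulty_from_eqa(0.241, 0.396)
duenas = difficulty_from_eqa(0.230, 0.301)

print(f"Linares: {linares:.3f}")
print(f"Duenas:  {duenas:.3f}")
```

Under that assumption the two ratios come out at about .289 and .510, in line with the ratings given for each player.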
That would be it for Cuban players in known leagues, except for one thing. There were enough players from American leagues in Athens last year to get a pretty good read on the level of play in the Olympics, which gives us a stepping stone to a few more players. In that article, I showed that Cuba's hitters hadn't performed as well as their pitchers had, and that the difference between the two is almost perfectly matched by the difficulty analysis. The difference between the hitters' performance in the Olympics and their performance in the 2004 Cuban season sets a difficulty rating of .400. The pitchers' collective performance worked out to a difficulty rating of .575.
The grand total, figuring in the Olympians, comes to .456. The fact that it comes nearly the same as the rating using Contreras alone (.470) is comforting; the actual difficulty for the Cuban league should be in that vicinity. So what does that .456 mean?
The difficulty rating is the ratio between what a run is worth in this league and what it is worth in the major leagues. A player who produced 100 runs in Cuba, even after allowing for the offensive level of the league, would only be expected to produce 45.6 runs in the majors. The closest American league to that level of play is the New York-Penn League, which over the last four years has averaged a .436 rating. The next one above it would be the Midwest League, which has averaged .484 over the last four years.
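As a minimal sketch of what that ratio does in practice, the snippet below scales a run total by the rating and finds the nearest of the two leagues named above. The function and variable names are illustrative, and only the two league ratings quoted in the text are included.

```python
CUBA_RATING = 0.456

# Four-year average ratings quoted in the text.
LEAGUE_RATINGS = {
    "New York-Penn League": 0.436,
    "Midwest League": 0.484,
}


def translate_runs(runs: float, rating: float) -> float:
    """Major-league-equivalent runs for a run total that has already been
    adjusted for the league's offensive level."""
    return runs * rating


def closest_league(rating: float, leagues: dict) -> str:
    """Name of the league whose rating is nearest to the given rating."""
    return min(leagues, key=lambda name: abs(leagues[name] - rating))


print(round(translate_runs(100, CUBA_RATING), 1))
print(closest_league(CUBA_RATING, LEAGUE_RATINGS))
```

The gap to the New York-Penn League is .020 and the gap to the Midwest League is .028, so the former is the closer match.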
So yes, I am saying that the top Cuban league is about equal, skill-wise, to the New York-Penn League. I'm sure that will come as a shock to many, seeing as how highly everyone regards the players on that particular island, but it does go a long way towards explaining why so many Cuban players have performed so poorly--our expectations were too high.
On the other hand, it shouldn't be so surprising. However baseball-mad the populace may be, the island is only home to 11.5 million people, about the same as Ohio, and they are supplying a league of 16 teams. Compare that to the similarly baseball-crazy Dominican Republic, whose winter league grades at Double-A level or maybe even a touch higher. The Dominicans have a population a little over nine million themselves, staffing a league of only six teams, and they allow non-Dominicans to play. Leaving out the non-Dominicans, the D.R. has twice as many people per team as Cuba; count them, and the ratio goes up even further.
Add in the small detail that the Cuban league has had to replace more than 100 players in the last ten years due to defections, about half a player per team per season, and it is a wonder that they manage to be even that strong.
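The population and defection arithmetic above is easy to check. The sketch below uses the rounded figures from the text (nine million is the lower bound of "a little over nine million," and 100 is the lower bound of "more than 100"), so the results are approximate.

```python
def people_per_team(population: int, teams: int) -> float:
    """Population available to staff each team in a league."""
    return population / teams


cuba = people_per_team(11_500_000, 16)
dominican = people_per_team(9_000_000, 6)   # lower bound from the text
ratio = dominican / cuba

# Defections: at least 100 players over 10 years across 16 teams.
defections_per_team_season = 100 / (10 * 16)

print(f"Cuba: {cuba:,.0f} people per team")
print(f"D.R.: {dominican:,.0f} people per team")
print(f"Ratio: {ratio:.2f}")
print(f"Defections per team per season: {defections_per_team_season:.3f}")
```

That works out to roughly 719,000 people per Cuban team against 1.5 million per Dominican team, a ratio a little over two, and somewhat more than half a defector per team per season.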
What does this mean for Kendry Morales? It means that my current best-guess translation for his Cuban play looks something like this:
Year   age  AB   H    DB  TP  HR  BB  SO   avg    OBA    slg    EqA
2002   19   335  83   15  1   15  27  82   0.248  0.307  0.433  0.252
2003   20   188  59   10  1   6   28  34   0.314  0.409  0.473  0.306
2004   21   118  38   13  0   1   7   19   0.322  0.362  0.458  0.281
Total  ---  641  180  38  2   22  62  135  0.281  0.348  0.449  0.274
That is probably too optimistic, because it isn't allowing for the suddenness of jumping from low-A competition to a major-league level of play all at once. (The translation system assumes ordinary, one-league-at-a-time progression, and players do tend to fall short when they jump leagues.) Still, these numbers, granted that they have even more uncertainty about them than the usual translation, make him appear to be one of the best hitting prospects in the game.
I want to emphasize the word "prospect;" even if he does manage to play this well, he wouldn't be an immediate impact player. He is, after all, just 22. We think. Cuban imports have built up a bad history, in terms of being older than advertised--Orlando Hernandez, Rey Ordonez, Andy Morales, Adrian Hernandez and Jorge Toca, to name a few. From his performance it wouldn't be a surprise to find out that Morales is older; against that, he has a fairly long track record of playing in international under-16 and under-18 competitions, and I can't find any record of a Cuban team ever being caught using overage players in such competitions. Taking that as his true age, he's a good bet to have a peak season EqA around .300, something like a .300/.380/.520.
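For readers who want to check the Morales translation table above, the avg and slg columns follow directly from the counting stats using the standard definitions. OBA is left out of this sketch because it depends on hit-by-pitch and sacrifice-fly totals that the table does not list. The names in the code are illustrative.

```python
def batting_average(ab: int, hits: int) -> float:
    """Hits per at-bat."""
    return hits / ab


def slugging(ab: int, hits: int, doubles: int, triples: int, hr: int) -> float:
    """Total bases per at-bat: each hit counts once, plus one extra base
    per double, two per triple, and three per home run."""
    total_bases = hits + doubles + 2 * triples + 3 * hr
    return total_bases / ab


# (AB, H, DB, TP, HR) from the translated Morales lines
morales = {
    "2002": (335, 83, 15, 1, 15),
    "2003": (188, 59, 10, 1, 6),
    "2004": (118, 38, 13, 0, 1),
    "Total": (641, 180, 38, 2, 22),
}

for label, (ab, h, db, tp, hr) in morales.items():
    avg = batting_average(ab, h)
    slg = slugging(ab, h, db, tp, hr)
    print(f"{label:<6} avg {avg:.3f}  slg {slg:.3f}")
```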
As for Yuniesky Betancourt, he wasn't nearly as good a hitter in Cuba as Morales was, and doesn't project as a star player. His translation comes to:
Year   age  AB   H    DB  TP  HR  BB  SO  avg    OBA    slg    EqA
2002   20   286  69   10  6   3   8   43  0.241  0.262  0.350  0.213
2003   21   311  80   15  4   5   18  36  0.257  0.298  0.379  0.236
2004   22   18   2    0   0   0   0   3   0.111  0.111  0.111  0.085
Total  ---  615  151  25  10  8   26  82  0.246  0.276  0.358  0.222
Plus 17 stolen bases. It certainly looks like he's got good wheels and drives the ball with some authority, but the future scheme doesn't think he'll hit enough to overcome his frighteningly low walk rate; his estimated peak only comes to .275/.310/.416, a .251 EqA.
Maels Rodriguez, who is supposed to be close to signing with the Orioles, was reported to have been able to throw 100 mph when he was in Cuba. However, when he threw for major-league teams last spring, he was only throwing about 88.
Year    G   GS  IP     H    ER   HR  BB   K    W   L   H9   HR/9  BB9  K9    NERA  PERA  STUF
2001    24  23  165.0  116  53   4   89   191  13  5   6.3  0.2   4.9  10.4  3.60  2.89  51
2002    24  20  147.0  98   58   18  102  175  10  6   6.0  1.1   6.2  10.7  3.98  3.55  33
2003    19  16  101.2  78   55   13  95   94   5   6   6.9  1.2   8.4  8.3   4.60  4.87  5
Totals  67  59  413.2  292  166  35  286  460  28  17  6.4  0.8   6.2  10.0  3.98  3.61  34
Looking at the results from Cuba, I'd say that was not a case of overhype; his numbers from 2001 and 2002 are entirely consistent with that kind of velocity. Something clearly went wrong in 2003, even before he left Cuba; his strikeouts dropped and his control, never good, totally disappeared. You might say he was the Cuban version of Nick Neugebauer, in all likelihood complete with some kind of injury. What kind it was, and how well it has healed, I don't know.
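The per-nine-inning columns in the Rodriguez table can be recomputed from the counting stats. This sketch assumes the IP column uses the customary notation in which ".1" and ".2" mean one-third and two-thirds of an inning, so "101.2" is 101 2/3 innings; that reading is an assumption, though it is the one under which the 2003 strikeout rate matches the table. Function names are illustrative.

```python
def innings_from_notation(ip: str) -> float:
    """Convert box-score innings ('101.2' = 101 and 2/3) to a decimal.

    ASSUMPTION: the digit after the point counts thirds of an inning.
    """
    whole, _, thirds = ip.partition(".")
    return int(whole) + int(thirds or 0) / 3


def per_nine(count: int, ip: str) -> float:
    """Rate of an event per nine innings pitched."""
    return 9 * count / innings_from_notation(ip)


# (IP, H, HR, BB, K) from the table
rodriguez = {
    "2001": ("165.0", 116, 4, 89, 191),
    "2002": ("147.0", 98, 18, 102, 175),
    "2003": ("101.2", 78, 13, 95, 94),
    "Totals": ("413.2", 292, 35, 286, 460),
}

for label, (ip, h, hr, bb, k) in rodriguez.items():
    h9, hr9, bb9, k9 = (round(per_nine(x, ip), 1) for x in (h, hr, bb, k))
    print(f"{label:<6} H9 {h9}  HR/9 {hr9}  BB9 {bb9}  K9 {k9}")
```

The 2003 line shows the collapse described above: the walk rate climbs to about 8.4 per nine innings while the strikeout rate falls to about 8.3.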