Seth Mnookin

I reprint, you decide: the Times on Manny, pt. 2

March 21st, 2007 → 11:00 am @ Seth Mnookin

On Sunday, I took issue with a New York Times “Keeping Score” article on the extent to which Manny Ramirez’s defensive deficiencies detracted from his overall value. Specifically, I called the piece “boneheaded,” “stupid,” and “embarrassing.”

Dan Rosenheck, the author of said piece, took issue with my criticisms, and in at least two situations, he’s totally right. First, it was stupid and boneheaded for me to call his piece stupid and boneheaded. (Ed: But isn’t the Interweb supposed to be about baseless accusations and unwarranted vitriol? Yes, but if the Interweb was jumping off a bridge, does that mean I should do it too?) (Ed: Aren’t you blatantly ripping the whole “Ed:” gimmick from Kausfiles, Mickey Kaus’s mostly political blog? Yes, except he usually has the editor assume the voice of reason. Which, you know, makes more sense.) Another good point Dan makes: from my post, it sounds as if I’m arguing that the problems with defensive metrics mean they’d (potentially) take a +30 player and make him a -30 player and that therefore PBP* stats are totally useless. I’m not arguing that…but you should read the back and forths below, starting with Dan’s initial e-mail, my response, and his response to my response. And then weigh in with your comments. There are certainly holes to be poked in my arguments. Although I’m still right…

(At this point, it probably makes sense to read Dan’s piece and my post, in that order. Or else this is all going to seem a bit obtuse.)

On using defensive metrics to evaluate Manny’s worth:

DR: I just have to ask: Are you actually familiar with the output of PBP systems? If the raw data they are based on were as haphazard as you suggest, the results would be inscrutable: you’d have guys going from +30 to -30 from one year to the next, guys who are clearly superlative defenders coming out poorly, and guys who are clearly horrific coming out well. Instead, the PBP systems pass every conceivable “smell test.” They show year-to-year consistency, with a clearly distinguishable aging pattern: players’ defense tends to improve until they are about 24, decline slowly until around 30, and then fall off a cliff. By and large, they square with anecdotal evidence: Gold Glove winner Orlando Hudson is indeed a great second baseman; the universally panned Alfonso Soriano was indeed a terrible one. And, perhaps most tellingly, they line up with each other. There are two different companies that each send observers to games, STATS Inc. and Baseball Info Solutions (BIS). Despite being based on entirely different data sets compiled by entirely different groups of observers, the PBP metrics based on STATS’ data show an exceedingly high correlation to those based on BIS’ results. None of this would be the case if the systems were half as unreliable as you suggest they are.

SM: The year-to-year consistency only shows consistency on the part of the individual scorers in each park. I can predict a counter-argument: but what about when players are traded, etc? Well, one crucial part of the equation that I left out of my post — probably stupidly — is that the hired-gun defensive scorers are actually fined (or docked pay) when their assessments vary too much from other assessments. That, combined with the fact that the training of the people who compile the PBP stats is so haphazard and varied means you get a huge amount of self-perpetuation. Finally, I know you know that I’m not arguing that a player would go from +30 to -30; I’m arguing a player could really be a -10 and, because of a combination of factors (including calcified notions of that person’s fielding), he’d consistently get rated a -20.

DR: That still doesn’t account for the agreement between BIS’ and STATS’ data sets, which are compiled independently by entirely different people and seems to me to be the biggest feather in the PBP systems’ cap. I certainly did *not* know that you were not arguing that a player would be from +30 to -30; I had nothing to go on but your blog post, which made it sound like PBP systems were completely worthless.

That said, I think you’re only slightly underrating the systems’ accuracy in your email. The consensus I was given in my interviews is that the 95% confidence interval is probably about 7 runs, meaning that if somebody is rated a -20 there’s a 95% chance he’s “truly” between a -13 and a -27. So it seems to me highly (less than 5%) unlikely that “a player could really be a -10 and, because of a combination of factors (including calcified notions of that person’s fielding), he’d consistently get rated a -20.” But if you changed your estimate to “could really be a -13,” or that a player who was rated -10 could consistently get rated a -17, I’d buy that. PBP metrics are not yet a fine-toothed comb: the main factors we are missing are positioning (extremely important!) and lots of sample size. But it’s more than precise enough to make the kind of estimates I did in my story (if the best guess we have is that Ramírez is about a -18, he’ll probably be about as valuable as Nick Swisher). I don’t think I overstated the case for the PBP stats’ reliability, either, since I made clear that the disagreement between systems was as big as 19 runs on Manny in 2006!
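For readers who want to check the interval arithmetic in Dan’s reply, here is a minimal sketch. It assumes the “about 7 runs” figure is a symmetric half-width around the published rating; the function names are made up for illustration and do not come from any PBP system.

```python
# Sketch only: treats the quoted "about 7 runs" as a symmetric half-width
# around a rating. The names below are hypothetical, not from STATS or BIS.

HALF_WIDTH_RUNS = 7.0  # assumption taken from the email's "about 7 runs"


def interval(rating, half_width=HALF_WIDTH_RUNS):
    """Return the (low, high) range of plausible 'true' values for a rating."""
    return (rating - half_width, rating + half_width)


def is_plausible(rating, true_value, half_width=HALF_WIDTH_RUNS):
    """True if true_value falls inside the interval around the rating."""
    low, high = interval(rating, half_width)
    return low <= true_value <= high


if __name__ == "__main__":
    print(interval(-20))           # range around a -20 rating
    print(is_plausible(-20, -10))  # a true -10 rated -20
    print(is_plausible(-20, -13))  # a true -13 rated -20
    print(is_plausible(-17, -10))  # a true -10 rated -17
```

Under that reading, a true -10 sits outside the range around a -20 rating, while the -13/-20 and -10/-17 pairings Dan says he would buy sit right at the edge of it.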

On moving Ortiz to first and Manny to DH:

DR: As for my proposal that the Red Sox move David Ortiz to first: I’m sure he is “more comfortable as a full-time DH.” But I find it hard to believe that from the team’s perspective, his “comfort” level is really worth 15 runs/1.5 wins/$4 million per season. I’m not sure what “actual evidence” you’re referring to when you suggest that “part of Ortiz’s prodigious offense results from the time he spends in the clubhouse between at-bats, when he studies previous at-bats against the opposing pitcher and reviews what might lead to success.” Have you done a controlled study in which you deny Ortiz his precious clubhouse time for 3,000 at-bats and give it to him for another 3,000, and then compare the results? If not, I’m not exactly sure how you’d go about supporting that hypothesis. I acknowledge myself that he might be more vulnerable to injury if he played the field. The question is, how much more? Is it 5% more likely, 10%, 15%? And how much time would he miss if he were hurt? The team needs to come up with its best estimate for a dollar cost of the added injury risk to Ortiz, and compare that to the dollar cost of leaving Ramírez in left field. If the former is greater, then the Sox’ current alignment is the correct one (although they would be well-advised to explore a trade, since Ramírez would be more valuable to a team without a DH than he is to them). But if it isn’t, then they really should consider making a switch.
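The comparison Dan describes can be sketched roughly as follows. The only figures taken from his email are the 15 runs/1.5 wins/$4 million equivalence; the injury probability, share of season missed, and Ortiz’s value in wins are hypothetical placeholders, not estimates anyone in this exchange has made.

```python
# Sketch only. The 15 runs = 1.5 wins = $4 million equivalence is from the
# email; every other input below is a hypothetical placeholder.

RUNS_PER_WIN = 15 / 1.5            # implied by the email: 10 runs per win
DOLLARS_PER_WIN = 4_000_000 / 1.5  # implied by the email: about $2.67M per win


def runs_to_dollars(runs):
    """Convert a run total to dollars using the email's implied rates."""
    return runs / RUNS_PER_WIN * DOLLARS_PER_WIN


def expected_injury_cost(added_injury_prob, share_of_season_missed, wins_per_season):
    """Expected dollar cost of the extra injury risk from playing the field."""
    expected_wins_lost = added_injury_prob * share_of_season_missed * wins_per_season
    return expected_wins_lost * DOLLARS_PER_WIN


def should_switch(injury_cost, defensive_cost):
    """Switch only if the injury risk costs less than the defense being given up."""
    return injury_cost < defensive_cost


if __name__ == "__main__":
    defense = runs_to_dollars(15)                # cost of Manny in left, per the email
    injury = expected_injury_cost(0.10, 0.5, 6)  # hypothetical inputs
    print(round(defense), round(injury), should_switch(injury, defense))
```

The output depends entirely on the placeholder inputs, which is more or less the point of the disagreement that follows.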

SM: As for Ortiz, I have lots of evidence to support that. I’ve spoken with him. I’ve looked at his stats from when he was in Minnesota (and playing first base more regularly), both in terms of time on the DL and offensive numbers. I’ve spoken with the people who work with him on the team’s baseball operations crew. When Ortiz strikes out, goes back into the clubhouse, studies the pitcher, and comes back and hits a walk-off home run, a couple of things could have been happening, and one of them certainly could be that he’s convinced himself this clubhouse time helps his actual batting skill more than it does…but the psychological component of the game is enormous. And this is a situation where it clearly makes no sense to try and determine whether Ortiz is 10% more likely to get injured, determine a dollar amount to correlate to that figure, and determine a dollar amount to leaving Manny in left. What you’d actually need to do is figure out the extent to which Ortiz, who is one of the people who helps keep a stressed out and often disgruntled clubhouse relatively loose (and is also a remarkably underpaid player on a team full of overpaid prima donnas), would be bothered by a move to first. Then you’d need to put a dollar amount on that. Then you’d need to put a dollar amount on the impact of his unhappiness on the other players on the team. Then you’d need to put a dollar amount on Ortiz’s impact on fans. Then you’d need to put a dollar amount on the impact of a disgruntled fan base on a team playing in the most over-oxygenated city in the country. Etc.

DR: Ortiz’s stats in Minnesota suggest he hits *better*, not worse, as a 1B. From 1998-2002, he had a 120 OPS+ (a combined on-base and slugging percentage 20% higher than the park-adjusted league average) as a 1B, and just a 109 OPS+ as a DH. That’s even more striking given that players usually hit worse when they DH, since they are often playing with a minor nagging injury that prevents them from playing the field. I don’t know how you could use his Minnesota splits to argue *against* his playing first base.

Do you have any actual evidence that his being “bothered” by a move to first would actually make him hit worse? If not, then the dollar cost of the move is 0. Look, Craig Biggio didn’t want to move to 2B, and his hitting didn’t suffer with the switch. You’d have to have a pretty low opinion of Ortiz’s character to think he would intentionally sabotage his hitting just to protest a position swap. Do you have any actual evidence that his being unhappy would make the *other* players on the team play worse? That’s an even less credible, and completely unsupported, assumption. These guys may not be robots, but they are professionals, and they know that their paychecks are tied to their performance. Plenty of successful teams have fought with each other; the 77-78 Yankees leap to mind. Winning creates chemistry, and winning attracts fans, not the other way around. I don’t see any more reason to take it on faith that Boston would actually win fewer games as a result of Ortiz’s displeasure at being moved to first base than I do to believe that Derek Jeter raises his game in clutch situations, or that non-knuckleball pitchers “induce” weak contact to a large degree, or that only a special club of psychologically superior relievers can pitch in the 9th inning of close games. I’m willing to be convinced of any and all these things, but you’ll have to show me something more concrete than vague comments about a “comfort level.” And you know what? If Ortiz’s OPS suddenly dropped 150 points while playing first, they could always just switch him back. Seems to me like it’s worth a try, no?

***

At this point, it seemed clear Dan and I were talking past each other, although at least we were doing it civilly. Since this is my blog, I’ll sum up my points in the next post…and Dan, goshdarnit, if you still wanna rumble — and lord knows I’m fully capable of going overboard when I decide to get into it — let’s do it in the comments section.
* There are some relatively esoteric terms in here (for example: PBP, which stands for play-by-play, and connotes precisely that: the effort to examine every defensive play and then find a way to systematically examine that information. Baseball Prospectus has a decent glossary of sabermetric terms, although you won’t find PBP in there.)

Post Categories: Dan Rosenheck & Keeping Score & Manny Ramirez & New York Times & Statistics

In which I get the final word: Keeping score of Manny, pt. 3

March 21st, 2007 → 10:57 am @ Seth Mnookin

Right. So: here’s part 3 of this whole imbroglio. I’ll sum up my position as follows: stats are enormously useful, and the intelligent use and analysis of statistical information has revolutionized the appreciation of and understanding of baseball, just as it’s revolutionized the way smart front offices put together their on-field teams. As Murray Chass knows, I think people who argue to the contrary are, to put it bluntly, Neanderthals.

But we should be as careful in our use of (and as wary of our over-reliance on) statistics as we should be about old-fashioned scouting. The statistical analysis of baseball is most effective when it’s one part of an integrated arsenal, and that arsenal includes scouting, player relations, and a healthy dose of skepticism. Ergo: when the very people compiling the raw data for defensive stats tell me there are situations in which it’s all but useless, I’ll be skeptical.

And we should also take care to consider the human element of the game. There are plenty of times when a player’s comfort level/happiness can affect his play and/or the general mood around the team (see: Garciaparra, Nomar). Ortiz has been vocal enough about what he sees as the advantages of DHing and the disadvantages of his playing in the field that I think that option is pretty much a non-starter.

Does that mean, as I likely implied, it hasn’t been considered? Absolutely not. But I think it’s been discarded as a serious option. And if it hasn’t, I think it should be.

Post Categories: Dan Rosenheck & Keeping Score & Manny Ramirez & New York Times & Statistics

Keeping Score: The Times weighs in with another boneheaded Sox story

March 18th, 2007 → 9:30 am @ Seth Mnookin

Only the most dyed-in-the-wool fanboys would ever claim that Manny Ramirez is a good outfielder; those folks that claim that he outwits opponents by goading them into running on his (at best) average arm are as bad as folks that argue that Bush actually has a plan for getting out of Iraq. But his outfield play is not responsible for the Red Sox’s woes over the past six years, as some would argue.

Like, for instance, today’s “Keeping Score” column in the Times, which, in its own way, is just as dumb as Murray Chass’s “I refuse to learn anything about statistics because I’m a lazy toad, er, it would ruin my enjoyment of the game” gem. In today’s piece, Dan Rosenheck tries to smokescreen the reader with lots of impressive sounding, supposed truisms to argue that Manny’s defense is so bad it basically brings him down to the level of a mid-level All-Star. His central argument is this: “Accurate numerical evaluations of defense only became possible in 1987, when Stats Inc. began sending observers to every game to record the location and speed of every batted ball. This play-by-play (P.B.P.) information made it possible to measure fielding ability much more precisely, by comparing the rates that players at the same position fielded various types of balls…”

I’ve spent a fair amount of time speaking with those Stats Inc. “observers.” They are, for the most part, college kids who are given little training and are paid poorly to sit in the stands and carve up the field into zones belonging to each defensive position. The problem is, those zones are about as reliable as Mel Gibson once he’s gotten a few drinks in him. (Right, Leary?) To give Stats Inc’s P-B-P info this much weight is as dumb as, say, giving Derek Jeter the Gold Glove because you think he looks good in the field. Smart observers, and smart teams, make every effort to create their own defensive metrics, and those same observers have made cogent arguments as to why their work should not, on the whole, be considered overly reliable.

Rosenheck solidifies his Chassness with the following, completely asinine suggestion:

“The other solution would be to move Ramírez to designated hitter. That would require switching the incumbent D.H., David Ortiz, to first base. Ortiz is even less mobile than Ramírez, and given his corpulence, the demands of playing the field may substantially increase his risk of injury.”

To which I can only say: Wow. Ortiz has said clearly he’s more comfortable as a full-time DH; there’s also plenty of evidence (anecdotal and actual) that at least part of Ortiz’s prodigious offense results from the time he spends in the clubhouse between at-bats, when he studies previous at-bats against the opposing pitcher and reviews what might lead to success. What’s more — what’s more important, in fact — is the evidence that Ortiz’s well-chronicled injury history resulted from the pounding he took in the field. And bad knees plus first base is a bad combo. Right, Buckner?
Finally, “corpulence”? That’s a fancy way of saying someone’s fat. “Given his size,” maybe. “Given his history of knee injuries and attendant immobility,” maybe. But fat? David Wells is fat. I’ve seen David Ortiz with his shirt off. He’s a big man. But he’s not fat. And I bet Rosenheck is glad he’s not ever going to risk saying that to Papi’s face.

“Keeping Score” is often one of the Times’s most interesting sports columns, especially when David Leonhardt is weighing in. Today’s is embarrassing. At the end of the day, Manny’s play in the field undoubtedly hurts the Sox. It’d be interesting to find out just what the cumulative effect of this is. We’re not going to learn that from the Times.

Post Categories: Dan Rosenheck & Keeping Score & Manny Ramirez & Murray Chass & New York Times & Statistics
