March 21st, 2007 → 11:00 am @ Seth Mnookin // No Comments
On Sunday, I took issue with a New York Times “Keeping Score” article on the extent to which Manny Ramirez’s defensive deficiencies detracted from his overall value. Specifically, I called the piece “boneheaded,” “stupid,” and “embarrassing.”
Dan Rosenheck, the author of said piece, took issue with my criticisms, and in at least two situations, he’s totally right. First, it was stupid and boneheaded for me to call his piece stupid and boneheaded. (Ed: But isn’t the Interweb supposed to be about baseless accusations and unwarranted vitriol? Yes, but if the Interweb was jumping off a bridge, does that mean I should do it too?) (Ed: Aren’t you blatantly ripping the whole “Ed:” gimmick from Kausfiles, Mickey Kaus’s mostly political blog? Yes, except he usually has the editor assume the voice of reason. Which, you know, makes more sense.) Another good point Dan makes: from my post, it sounds as if I’m arguing that the problems with defensive metrics mean they’d (potentially) take a +30 player and make him a -30 player and that therefore PBP* stats are totally useless. I’m not arguing that…but you should read the back and forths below, starting with Dan’s initial e-mail, my response, and his response to my response. And then weigh in with your comments. There are certainly holes to be poked in my arguments. Although I’m still right…
(At this point, it probably makes sense to read Dan’s piece and my post, in that order. Or else this is all going to seem a bit obtuse.)
On using defensive metrics to evaluate Manny’s worth:
DR: I just have to ask: Are you actually familiar with the output of PBP systems? If the raw data they are based on were as haphazard as you suggest, the results would be inscrutable—you’d have guys going from +30 to -30 from one year to the next, guys who are clearly superlative defenders coming out poorly, and guys who are clearly horrific coming out well. Instead, the PBP systems pass every conceivable “smell test.” They show year-to-year consistency, with a clearly distinguishable aging pattern: players’ defense tends to improve until they are about 24, decline slowly until around 30, and then fall off a cliff. By and large, they square with anecdotal evidence: Gold Glove winner Orlando Hudson is indeed a great second baseman; the universally panned Alfonso Soriano was indeed a terrible one. And, perhaps most tellingly, they line up with each other. There are two different companies that each send observers to games, STATS Inc. and Baseball Info Solutions (BIS). Despite being based on entirely different data sets compiled by entirely different groups of observers, the PBP metrics based on STATS’ data show an exceedingly high correlation to those based on BIS’ results. None of this would be the case if the systems were half as unreliable as you suggest they are.
SM: The year-to-year consistency only shows consistency on the part of the individual scorers in each park. I can predict a counter-argument: but what about when players are traded, etc? Well, one crucial part of the equation that I left out of my post — probably stupidly — is that the hired-gun defensive scorers are actually fined (or docked pay) when their assessments vary too much from other assessments. That, combined with the fact that the training of the people who compile the PBP stats is so haphazard and varied, means you get a huge amount of self-perpetuation. Finally, I know you know that I’m not arguing that a player would go from +30 to -30; I’m arguing a player could really be a -10 and, because of a combination of factors (including calcified notions of that person’s fielding), he’d consistently get rated a -20.
DR: That still doesn’t account for the agreement between BIS’ and STATS’ data sets, which are compiled independently by entirely different people; that agreement seems to me to be the biggest feather in the PBP systems’ cap. I certainly did *not* know that you were not arguing that a player would go from +30 to -30—I had nothing to go on but your blog post, which made it sound like PBP systems were completely worthless.
That said, I think you’re only slightly underrating the systems’ accuracy in your email. The consensus I was given in my interviews is that the 95% confidence interval is probably plus or minus about 7 runs, meaning that if somebody is rated a -20 there’s a 95% chance he’s “truly” between a -13 and a -27. So it seems to me highly (less than 5%) unlikely that “a player could really be a -10 and, because of a combination of factors (including calcified notions of that person’s fielding), he’d consistently get rated a -20.” But if you changed your estimate to “could really be a -13,” or said that a player who was really a -10 could consistently get rated a -17, I’d buy that. PBP metrics are not yet a fine-toothed comb—the main factors we are missing are positioning (extremely important!) and lots of sample size—but they’re more than precise enough to make the kind of estimates I did in my story (if the best guess we have is that Ramírez is about a -18, he’ll probably be about as valuable as Nick Swisher). I don’t think I overstated the case for the PBP stats’ reliability, either, since I made clear that the disagreement between systems was as big as 19 runs on Manny in 2006!
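For anyone who wants to check the arithmetic in that last e-mail, here is a minimal sketch. It assumes the "about 7 runs" figure is a symmetric margin on either side of the published rating, which is how Dan's -13 to -27 example reads; the function names are made up for illustration and do not come from any PBP system.

```python
def interval(rating, margin=7.0):
    """Return the (low, high) range implied by a rating and a symmetric margin.

    The 7-run default is the rough consensus figure quoted in the e-mail,
    treated here as an assumption rather than a measured value.
    """
    return rating - margin, rating + margin


def is_plausible(true_value, rating, margin=7.0):
    """True if a player's 'true' value falls inside the range around his rating."""
    low, high = interval(rating, margin)
    return low <= true_value <= high


# A player rated -20 is probably somewhere between -27 and -13.
print(interval(-20))            # (-27.0, -13.0)

# A true -10 who is consistently rated -20 falls outside that range...
print(is_plausible(-10, -20))   # False

# ...but a true -13 rated -20, or a true -10 rated -17, sits right at the edge.
print(is_plausible(-13, -20))   # True
print(is_plausible(-10, -17))   # True
```

None of this says the 7-run margin is right. It only shows that the two positions in the exchange differ by about three runs once that margin is granted.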
On moving Ortiz to first and Manny to DH:
DR: As for my proposal that the Red Sox move David Ortiz to first: I’m sure he is “more comfortable as a full-time DH.” But I find it hard to believe that from the team’s perspective, his “comfort” level is really worth 15 runs/1.5 wins/ $4 million per season. I’m not sure what “actual evidence” you’re referring to when you suggest that “part of Ortiz’s prodigious offense results from the time he spends in the clubhouse between at-bats, when he studies previous at-bats against the opposing pitcher and reviews what might lead to success.” Have you done a controlled study in which you deny Ortiz his precious clubhouse time for 3,000 at-bats and give it to him for another 3,000, and then compare the results? If not, I’m not exactly sure how you’d go about supporting that hypothesis. I acknowledge myself that he might be more vulnerable to injury if he played the field. The question is, how much more? Is it 5% more likely, 10%, 15%? And how much time would he miss if he were hurt? The team needs to come up with its best estimate for a dollar cost of the added injury risk to Ortiz, and compare that to the dollar cost of leaving Ramírez in left field. If the former is greater, then the Sox’ current alignment is the correct one (although they would be well-advised to explore a trade, since Ramírez would be more valuable to a team without a DH than he is to them). But if it isn’t, then they really should consider making a switch.
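The comparison Dan describes can be sketched in a few lines. Only the $4 million per season figure for leaving Ramírez in left field comes from the e-mail; the injury probabilities echo his "5%, 10%, 15%" question, and the share of a season missed and the dollar value of a full Ortiz season are placeholder assumptions, not real estimates. The function names are hypothetical.

```python
# From the e-mail: 15 runs / 1.5 wins / $4 million per season.
DEFENSE_COST_PER_SEASON = 4_000_000


def injury_cost(added_injury_prob, share_of_season_missed, value_per_season):
    """Expected dollar cost per season of the extra injury risk.

    added_injury_prob:      extra chance of injury from playing the field
    share_of_season_missed: fraction of the season lost if he is hurt
    value_per_season:       dollar value of a full season from the player
    """
    return added_injury_prob * share_of_season_missed * value_per_season


def switch_is_worth_it(added_injury_prob, share_of_season_missed, value_per_season):
    """True if the expected injury cost is below the defensive cost."""
    cost = injury_cost(added_injury_prob, share_of_season_missed, value_per_season)
    return cost < DEFENSE_COST_PER_SEASON


# Placeholder inputs, chosen only to show how the comparison would run.
ASSUMED_VALUE = 20_000_000   # hypothetical value of a full Ortiz season
ASSUMED_SHARE = 0.5          # hypothetical: half a season lost if injured

for prob in (0.05, 0.10, 0.15):
    cost = injury_cost(prob, ASSUMED_SHARE, ASSUMED_VALUE)
    verdict = switch_is_worth_it(prob, ASSUMED_SHARE, ASSUMED_VALUE)
    print(f"added risk {prob:.0%}: expected cost ${cost:,.0f}, switch? {verdict}")
```

Whatever the loop prints depends entirely on the placeholder inputs. The sketch also leaves out the clubhouse and fan effects raised in the reply below, which is exactly where the two sides disagree.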
SM: As for Ortiz, I have lots of evidence to support that. I’ve spoken with him. I’ve looked at his stats from when he was in Minnesota (and playing first base more regularly), both in terms of time on the DL and offensive numbers. I’ve spoken with the people who work with him on the team’s baseball operations crew. When Ortiz strikes out, goes back into the clubhouse, studies the pitcher, and comes back and hits a walk-off home run, a couple of things could have been happening, and one of them certainly could be that he’s convinced himself this clubhouse time helps his actual batting skill more than it does…but the psychological component of the game is enormous. And this is a situation where it clearly makes no sense to try and determine whether Ortiz is 10% more likely to get injured, determine a dollar amount to correlate to that figure, and determine a dollar amount to leaving Manny in left. What you’d actually need to do is figure out the extent to which Ortiz, who is one of the people who helps keep a stressed out and often disgruntled clubhouse relatively loose (and is also a remarkably underpaid player on a team full of overpaid prima donnas), would be bothered by a move to first. Then you’d need to put a dollar amount on that. Then you’d need to put a dollar amount on the impact of his unhappiness on the other players on the team. Then you’d need to put a dollar amount on Ortiz’s impact on fans. Then you’d need to put a dollar amount on the impact of a disgruntled fan base on a team playing in the most over-oxygenated city in the country. Etc.
DR: Ortiz’s stats in Minnesota suggest he hits *better*, not worse, as a 1B. From 1998-2002, he had a 120 OPS+ (a combined on-base and slugging percentage 20% higher than the park-adjusted league average) as a 1B, and just a 109 OPS+ as a DH. That’s even more striking given that players usually hit worse when they DH, since they are often playing with a minor nagging injury that prevents them from playing the field. I don’t know how you could use his Minnesota splits to argue *against* his playing first base.
Do you have any actual evidence that his being “bothered” by a move to first would actually make him hit worse? If not, then the dollar cost of the move is 0. Look, Craig Biggio didn’t want to move to 2B, and his hitting didn’t suffer with the switch. You’d have to have a pretty low opinion of Ortiz’s character to think he would intentionally sabotage his hitting just to protest a position swap. Do you have any actual evidence that his being unhappy would make the *other* players on the team play worse? That’s an even less credible, and completely unsupported, assumption. These guys may not be robots, but they are professionals, and they know that their paychecks are tied to their performance. Plenty of successful teams have fought with each other—the 77-78 Yankees leap to mind. Winning creates chemistry, and winning attracts fans—not the other way around. I don’t see any more reason to take it on faith that Boston would actually win fewer games as a result of Ortiz’s displeasure at being moved to first base than I do to believe that Derek Jeter raises his game in clutch situations, or that non-knuckleball pitchers “induce” weak contact to a large degree, or that only a special club of psychologically superior relievers can pitch in the 9th inning of close games. I’m willing to be convinced of any and all these things, but you’ll have to show me something more concrete than vague comments about a “comfort level.” And you know what? If Ortiz’s OPS suddenly dropped 150 points while playing first, they could always just switch him back. Seems to me like it’s worth a try, no?
***
At this point, it seemed clear Dan and I were talking past each other, although at least we were doing it civilly. Since this is my blog, I’ll sum up my points in the next post…and Dan, goshdarnit, if you still wanna rumble — and lord knows I’m fully capable of going overboard when I decide to get into it — let’s do it in the comments section.
* There are some relatively esoteric terms in here. For example, PBP stands for play-by-play and connotes precisely that: the effort to examine every defensive play and then find a way to systematically examine that information. Baseball Prospectus has a decent glossary of sabermetric terms, although you won’t find PBP in there.
Post Categories: Dan Rosenheck & Keeping Score & Manny Ramirez & New York Times & Statistics
yerfatma
17 years ago
Blog fights always turn into an NBA-style brawl and it’s damned disappointing to the plebeians sitting comment-side. Somebody put a foot in somebody else. These things always stay civil because implicit in the argument is the fact the other party is reading your blog.
At the end of the day, the stats say use of the word “corpulence” for comedic effect gives me a 95% confidence someone is a tool, within a couple standard deviations of Screech from Saved by the Bell. Or Dustin Diamond. Take your pick.
DanRosenheck
17 years ago
I just realized I got one point backwards. Since players tend to hit worse as DH’s because they often have nagging injuries that prevent them from playing the field, that would partially *explain* why he hit better as a 1B in Minnesota, rather than counteract it. Which somewhat weakens my argument, although the point that he *did* hit better as a 1B than as a DH still stands. Just for the record.
yerfatma
17 years ago
Thank God you clarified that. The lion sleeps soundly tonight.