Health Care, Health Care Reform

Do not use the Surgeon Scorecard

(I have updated this post with a link below to my essay on Medscape.)

No data is better than bad data.

The ProPublica Surgeon Scorecard is not ready for prime time.

It was a good effort. I support investigative journalism and ProPublica. We need more transparency in Medicine. I despise the utter lack of meritocracy in healthcare.

But that doesn’t mean we should embrace flawed data.

My friend, electrophysiologist Dr. Jay Schloss, has written a detailed review of the criticism of the ProPublica project on his blog, Left to My Own Devices.

Jay is one of the sanest, smartest physician voices on the Internet. Here is an excerpt.

So is Surgeon Scorecard bad data? Strong words, but I say yes. This analysis was a great idea, but it fails to deliver on its goals. The data and methodology both have significant flaws. I say that from the perspective of a working clinician and clinical researcher with over 20 years experience. This project is as much science as it is journalism. I therefore see no reason this work should not be peer-reviewed and discussed as would any scientific outcomes study. As I suggested to ProPublica, we need to kick the tires.

Please read Jay’s piece. Then think a bit about data and methods. For instance, if you like your conclusion, that surgeons vary in skill, it’s easy to see past flaws in the experiment.

You can read my take on the Surgeon Scorecard on Medscape: Failing Grade for ProPublica’s Surgeon Scorecard


9 replies on “Do not use the Surgeon Scorecard”

The link you have for “Jay’s piece” goes to the ProPublica article on the data.
I wish they had posted some data from cardiac operations/procedures so I could assess how it compares with my firsthand experience of the surgeons' skills and outcomes.

The medical community can learn from the Real Estate world on this one.

Real Estate professionals loathe Zillow and the inaccurate individual property estimates they provide directly to the public.

Consumers, on the other hand, can’t get enough of the data Zillow offers. The thirst for data is so strong that people are clearly deciding that some information is better than none.

For example:

In the absence of accurate data, most people will go with whatever they can find. Remember anchoring bias.

The only way to improve the situation is to present accurate data in an open and easily accessible format.

Does anyone think ProPublica would have taken on this project if accurate data were already available?


That’s a really interesting observation. There is no doubt that the medical profession has done a poor job of getting information to our patients. This goes well beyond physician performance, of course. There is opacity all over medicine, including medical records, prescription pricing, hospital billing, etc. Most doctors, myself included, want this opacity to go away. We have limited control, but I agree we need to do better.

I’d like to extend your real estate analogy a bit. To me, the Zillow data on home value seems analogous to the Medicare administrative data on billing. Both are interesting markers, but have significant methodologic limitations. They are poor estimates for reality.

Imagine if a major news organization created a Real Estate Broker Scorecard. They would look at individual brokers and compare their actual selling prices to the Zillow estimate for the same house. The journalists would then construct a large database sorting each agent into red, yellow, and green categories based on the agent’s ability to exceed the “Zestimate” valuation for the house. This would serve as a Scorecard for potential home buyers to use when they select their agent. Complex statistical methodology would be published in a supplementary white paper. Due to the relatively low number of house sales per agent, the confidence intervals on the data would be very broad.
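The point about broad confidence intervals can be made concrete. Here is a minimal sketch using a 95% Wilson score interval; the 2-events-in-40-cases numbers are hypothetical, chosen only to illustrate the per-agent (or per-surgeon) case counts at issue:

```python
import math

def wilson_interval(events, n, z=1.96):
    """95% Wilson score confidence interval for an event rate of events/n."""
    if n == 0:
        return (0.0, 1.0)
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, center - half), min(1.0, center + half))

# Hypothetical agent (or surgeon) with 2 adverse events in 40 cases:
lo, hi = wilson_interval(2, 40)
print(f"observed rate {2/40:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

With only 40 cases, an observed 5% rate is statistically compatible with anything from roughly 1% to 17%, which is why ranking individuals on small case counts is so fragile.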

How do you suppose the broker community would react when they are publicly scored this way?


I’m not an RE broker (I’m not involved in either medicine or real estate), so I can only guess at the reaction to individual broker ratings. My guess is that the reactions would range from horrified outrage to complete indifference.

A few years ago the real estate profession was the focus of an unfriendly analysis in chapter 2 of the book Freakonomics. The analysis focuses on RE, but speaks generally about a group of professions in which the practitioner knows substantially more about the situation than the customer (or patient) and their incentives are not aligned.

An excerpt:
“As the world has grown more specialized, countless such experts have made themselves similarly indispensable. Doctors, lawyers, contractors, stockbrokers, auto mechanics, mortgage brokers, financial planners: they all enjoy a gigantic informational advantage. And they [RE AGENTS] use that advantage to help you, the person who hired them, get exactly what you want for the best price. Right? It would be lovely to think so. But experts are human, and humans respond to incentives. How any given expert treats you, therefore, will depend on how that expert’s incentives are set up. Sometimes his incentives may work in your favor. For instance: a study of California auto mechanics found they often passed up a small repair bill by letting failing cars pass emissions inspections—the reason being that lenient mechanics are rewarded with repeat business. But in a different case, an expert’s incentives may work against you. In a medical study, it turned out that obstetricians in areas with declining birth rates are much more likely to perform cesarean-section deliveries than obstetricians in growing areas—suggesting that, when business is tough, doctors try to ring up more expensive procedures. It is one thing to muse about experts’ abusing their position and another to prove it. The best way to do so would be to measure how an expert treats you versus how he performs the same service for himself. Unfortunately a surgeon doesn’t operate on himself. Nor is his medical file a matter of public record; neither is an auto mechanic’s repair log for his own car. Real-estate sales, however, are a matter of public record.”

The upshot: agents handled their clients' houses differently than their own.

RE industry reaction was mostly explaining and arm-waving, with little to no actual data or thoughtful rebuttal. They did exactly the wrong thing, IMO.

It’s a complicated problem, but I still believe that openly publishing the best data available is the best solution. We should stop asking if the data are perfect and start asking if they’re better than what we already have. Secrecy won’t work anymore.

2 extra concerns:
(1) Observer effect – measuring something changes whatever is being observed. If low complication rates become the ultimate scorecard, many will avoid complicated cases.

(2) Giving people the ability/safety to take chances and make mistakes has big benefits. Practicing from a position of fear or a desire to manage the scorecard won’t get the absolute best work out of people.

It’s a tough problem.

As Ed Livingston pointed out in the article, the skill level of the surgeon is frequently inversely related to outcomes, owing to referral patterns. For example, the best cardiac surgeon often gets the most difficult cases referred to him or her precisely because of those advanced skills. These are often patients whom no other doctor would touch. Very complex cases, with high comorbidity burdens, often have worse outcomes regardless of technical skill.
For elective surgeries, many PCPs refer to the surgeons they trust as a result of years of interaction and observation, not on the basis of flawed data.

Randy, I agree with all these points. We’ve been using baseball analogies a lot for this story.

To illustrate your point, imagine that in April you are wagering on who will win the 2015 National League batting title. You can get your advice from a seasoned baseball scout, or you can look at the batting averages from the first week of the season and pick your winner based on those stats. The two methods will likely give you two different players.

The baseball scout is your PCP. The week one batting average is Surgeon Scorecard.

OK, place your bet.
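The small-sample intuition behind this analogy can be sketched with a quick simulation. The .300/.260 true averages, at-bat counts, and trial count below are made-up illustrative numbers, not real 2015 statistics:

```python
import random

random.seed(0)

def hits(true_avg, at_bats):
    """Simulated hit count over `at_bats` trials with hit probability `true_avg`."""
    return sum(random.random() < true_avg for _ in range(at_bats))

def pct_correct(at_bats, trials=2_000, better=0.300, worse=0.260):
    """Fraction of simulated comparisons in which the truly better hitter
    also shows the better batting average (ties count as a miss)."""
    wins = sum(hits(better, at_bats) > hits(worse, at_bats) for _ in range(trials))
    return wins / trials

print(f"one week (~25 AB):     {pct_correct(25):.0%} pick the right player")
print(f"full season (~550 AB): {pct_correct(550):.0%} pick the right player")
```

In this toy setup, a week of at-bats identifies the truly better hitter only modestly more often than a coin flip, while a full season gets it right roughly nine times in ten. Low per-surgeon case counts put the Scorecard much closer to the week-one column.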


Very interesting discussion here. I think Dr. Schloss makes an excellent point: there has to be a better way to do this than the Surgeon Scorecard. Thanks for sharing!
