ProPublica: So why can’t the government analyze the data? And what about commercial insurance plans? What questions should we ask the data?

There was a recent set of articles quoting ProPublica’s analysis of Medicare Part D data. ProPublica acquired the data through the Freedom of Information Act and integrated it with drug information and provider data. As a side note, CMS also recently published pricing data using the Socrata dataset publishing model (an API factory). You can plug into the data navigator at CMS.

You can view various trends and aggregations of the data to compare a provider against others, and navigate the data to gain insight into the script-writing behavior of Medicare Part D providers. If you review the methodology used to create the data, you’ll realize that there are many caveats. Just reading through some of the analysis, you realize that a simple evaluation of the data is insufficient to identify actionable responses to insights. You have to dig deep to see if a trend is really significant or an artifact of incomplete analysis. Yes, there is danger in not understanding the data well enough.

But the ProPublica analysis is a great example of analysis by external groups. It is an example of simple provider profiling that helps detect variations in standards of care as well as outright fraud. The medical community continues to improve standards of care but it is a challenging problem with few incentives and governance structures.

The question we may ask ourselves is, “Why does the government not perform more analysis?”

The short answer is that they do. The government performs extensive analysis in a variety of ways. What the ProPublica publications really show us is that there is a lot more analysis that could be performed and that could be useful in managing healthcare. Some of it is quite complex, and given the cost we should not expect the government to perform every type of analysis one could imagine: funding is not infinite, and the scope is set in law by Congress. Everyone admits that there is more there, and I am sure we all have opinions about top priorities that conflict with those of others.

And the government does act as a publisher of data. The Chronic Condition Warehouse (CCW) is a good example of data publication. CMS also has plans to do more in the information management space that should make sharing easier. I am concerned about pricing though. Based on a very small sampling, it appears that extracts are still quite expensive and cumbersome, on the order of $80K for several extracts covering just 2 years. This needs to flatline to $0 per extract, since we already pay for CMS and our funds should be wisely used to enable this service from the start. Both anonymized and identified datasets are available. Comprehensive, anonymized datasets should be available for free.

This publication of the data, including the pricing data, is a great example of “democratizing” data. Many companies use this language to describe giving any analyst (with sufficient HIPAA safeguards) access to datasets so they can gain insight and do their jobs better through information-driven guidance. We can see from these examples that just publishing the raw data is not enough. You must have the information management technology to join it together with other datasets. This is what makes analysis so expensive and is the main barrier to data democratization.
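To make the "join it together" point concrete, here is a minimal sketch of the kind of join I have in mind. Everything in it is an assumption for illustration: the field names (`npi`, `claim_count`, `total_cost`, `specialty`), the provider identifiers, and the numbers are made up and do not come from the ProPublica or CMS files.

```python
from collections import defaultdict

# Hypothetical claim rows: one row per (prescriber, drug). Values are invented.
claims = [
    {"npi": "1000000001", "drug": "DRUG_A", "claim_count": 120, "total_cost": 9600.0},
    {"npi": "1000000001", "drug": "DRUG_B", "claim_count": 30, "total_cost": 600.0},
    {"npi": "1000000002", "drug": "DRUG_A", "claim_count": 15, "total_cost": 1200.0},
    {"npi": "1000000003", "drug": "DRUG_C", "claim_count": 40, "total_cost": 2000.0},
]

# Hypothetical provider reference data keyed by the same identifier.
providers = {
    "1000000001": {"specialty": "Internal Medicine", "state": "MD"},
    "1000000002": {"specialty": "Cardiology", "state": "VA"},
}


def join_claims_to_providers(claims, providers):
    """Left-join claims to providers; unmatched claims are kept and flagged."""
    joined = []
    for row in claims:
        info = providers.get(row["npi"])
        joined.append({
            **row,
            "specialty": info["specialty"] if info else None,
            "state": info["state"] if info else None,
            "matched": info is not None,
        })
    return joined


def cost_by_specialty(joined):
    """Sum total_cost per specialty; unmatched rows land in 'UNKNOWN'."""
    totals = defaultdict(float)
    for row in joined:
        totals[row["specialty"] or "UNKNOWN"] += row["total_cost"]
    return dict(totals)


joined = join_claims_to_providers(claims, providers)
print(cost_by_specialty(joined))
print("unmatched rows:", sum(not r["matched"] for r in joined))
```

Note that the join keeps claims whose provider is missing from the reference data instead of silently dropping them. In my experience that unmatched bucket is exactly where the integration cost hides.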

So why can’t commercial health plans publish their data? There is really no benefit to them for publishing. One could argue that state subsidies such as non-profit status, which amount to a state entitlement that residents pay for, should give states the leverage to force data publishing, but the plans themselves have nothing to gain. Commercial plans do analyze Provider data and create Pay for Performance (P4P) programs used to manage their networks. P4P often ranks Providers and offers incentives to provide more “value.”

Of course, the free agency theory applies here and P4P can really only ever be marginally helpful. Sometimes marginally helpful is good, of course, so I am not dismissing it. However, the same issues around the ProPublica analysis apply to health plans’ data.

First, the information technology of many plans is fairly immature despite the billions these plans handle. This is because they focus on claims processing versus analytics.
Second, they have the same data integration issues that everyone else has, and it’s hard work to get it right after 30 years of extremely bad IT implementations and a lack of management talent.

Things are changing now, but I predict that even with better “data democratization” information management technology there will still not be enough coverage of analytics to be transformational. If the government really wants to get serious about managing the costs of healthcare and gaining insights from data to help drive transformational cost changes, it may need to have all the plans publish their data together.

Again, you run into competitiveness issues fairly quickly, since the “network” for a plan and the prices it pays are a big part of a plan’s competitive advantage.

But I like to blue-sky it here on my blog.

As a truly blue-sky thought, if the U.S. is really, albeit slowly, moving towards single payer (the government is already the largest payer anyway), then as a compromise to hold off true single payer, perhaps the government can force publishing of claim data for anyone to analyze (following HIPAA of course). This could stave off the march towards a single-payer model and introduce consistency in the networks. This would shift the competitive focus the plans have and force them to compete in other healthcare areas that need more focus, such as sales & marketing, member/patient education outreach, etc.

Of course, there is another blue-sky thought: that Providers will start their own plans (a micro-patchwork of 1,000s of plans), publish their own data according to government standards and democratize the data to help the health of our citizenry. There are already examples of this model: the ACO model provided by the new parts of the Patient Protection and Affordable Care Act, as well as Medicaid programs where MCOs have sprung up attached to hospital systems to serve the Medicaid population profitably.

As a final note, what surprised me the most about Part D prescriptions is that 3% of the script writers wrote more than half of all prescriptions. This could mean that these Providers are concentrated around those who need the most help. Perhaps some government focus on these “super-scripters” could help manage their costs down.
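If you want to check a concentration figure like this against your own extract, the arithmetic is simple. Below is a minimal sketch, assuming you have one claim count per prescriber. The function name `top_share` and the sample counts are hypothetical and made up for illustration; they are not the Part D numbers.

```python
def top_share(claim_counts, top_percent):
    """Share of all claims written by the top `top_percent` percent of prescribers.

    `top_percent` is an integer from 1 to 100. The number of prescribers in the
    top group is rounded up, so at least one prescriber is always included.
    """
    if not claim_counts:
        raise ValueError("claim_counts must not be empty")
    if not 1 <= top_percent <= 100:
        raise ValueError("top_percent must be between 1 and 100")
    ranked = sorted(claim_counts, reverse=True)
    # Integer ceiling of len * top_percent / 100, avoiding float rounding.
    k = max(1, (len(ranked) * top_percent + 99) // 100)
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Made-up example: 100 prescribers, 3 heavy writers and 97 light ones.
counts = [2000, 1500, 1500] + [50] * 97
share = top_share(counts, 3)
print(f"Top 3% of prescribers wrote {share:.1%} of claims")
```

A share like this only tells you that volume is concentrated, not why. You would still need specialty, patient mix and practice setting before calling anyone an outlier.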

There are some other thought-provoking bits of information as well, just in the topline numbers. Based on the report, the ratio of providers to beneficiaries is 1 to 15. This seems like a really high concentration of providers to beneficiaries, in the sense that each physician who wrote Part D scripts saw 15 beneficiaries on average. In the Michael Porter world where specialists focus more on their specialty and become highly efficient at it (better outcomes, lower costs), I would think that more beneficiaries per provider would reflect focus and perhaps the opportunity for innovation. Perhaps not.

Also, what’s astounding is that the total cost was roughly $77 billion. This is for prescriptions including the costs of the visits. This helps prop up the pharmaceutical industry. Many of the top drugs are still branded drugs versus generics. But regardless of the industry it helps, that’s a lot of money. Drugs are wonderful healthcare productivity boosters (they truly make a difference in people’s quality and duration of life) but we need to continually attack these cost bumps to shrink them.

It would also be instructive to bounce the Medicare Part D data against provider quality scores, say at an aggregate level.
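As a rough sketch of what "bouncing" the two datasets against each other could look like, here is a plain Pearson correlation between drug cost per beneficiary and a quality score, computed over aggregated groups. The groups, the cost figures and the quality scores are all invented for illustration, and the choice of Pearson correlation is my assumption, not something the ProPublica analysis did.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical aggregates: one row per group (e.g. a region or specialty).
aggregates = [
    {"group": "A", "cost_per_beneficiary": 2100.0, "quality_score": 3.9},
    {"group": "B", "cost_per_beneficiary": 2600.0, "quality_score": 4.1},
    {"group": "C", "cost_per_beneficiary": 3300.0, "quality_score": 3.6},
    {"group": "D", "cost_per_beneficiary": 1800.0, "quality_score": 4.0},
]

costs = [row["cost_per_beneficiary"] for row in aggregates]
scores = [row["quality_score"] for row in aggregates]
r = pearson(costs, scores)
print(f"correlation between cost and quality: {r:.2f}")
```

A correlation at this level is a starting point for questions, not an answer. Aggregation can hide or even reverse what is happening at the individual provider level.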

We could then answer the ultimate question, which centers on value. That is, for the dollars we invest in drugs under Part D, are we getting good outcomes? Is the Effort-Return equation producing the best numbers for us? That’s the question we really need to answer.
