Analyst Report: The challenge of understanding our health care data


There is a huge need for data in the health care sector, given the reforms underway and a renewed focus on providing better care. Broadly speaking, the more data we have, the more intelligence we can derive from it to raise the quality and lower the cost of care in the U.S. But vast quantities of data alone are not helpful. We need insight into that data, and that’s where the real challenges lie.

The data dearth

In today’s health care system there is a huge lack of quality data — that is, clean, structured, standardized, and codified data available for analysis.

There are efforts underway at the state level to set up Health Insurance Exchanges (HIEs) that aggregate data from federal and state agencies as well as from health plans, consumers, and employers. Central to an HIE is the All-Payer Claims Database (APCD), which U.S. states use to collect pricing data from insurance companies in order to provide a more transparent look into the cost of procedures.

A state would be able to see that procedure X is reimbursed at a rate of Y. Any insurer charging more would raise a flag in the APCD system. It sounds good, right? Unfortunately, the insurance companies don’t like this, for a couple of reasons: It costs them money to program their data for APCDs, and each state asks for the data to be presented in a different way, which takes time and causes errors that have to be corrected. Insurers are also reluctant to hand over their pricing data, which, after all, is their secret sauce. Such a level of transparency has never existed before, and it’s one that few insurance companies, if any, want.
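
The flagging rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how any state's APCD is built: the procedure names, the reference rates, and the `flag_overcharges` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical reference rates: the "rate of Y" for each "procedure X".
REFERENCE_RATES = {"procedure_x": 1000.00, "procedure_z": 250.00}


@dataclass
class Claim:
    insurer: str
    procedure: str
    charged: float


def flag_overcharges(claims, reference_rates):
    """Return the claims whose charge exceeds the reference rate."""
    flagged = []
    for claim in claims:
        rate = reference_rates.get(claim.procedure)
        # Procedures without a reference rate are skipped, not guessed at.
        if rate is not None and claim.charged > rate:
            flagged.append(claim)
    return flagged


# Made-up sample claims for illustration only.
claims = [
    Claim("Insurer A", "procedure_x", 950.00),
    Claim("Insurer B", "procedure_x", 1400.00),
    Claim("Insurer C", "procedure_z", 250.00),
]
flagged = flag_overcharges(claims, REFERENCE_RATES)
for claim in flagged:
    print(f"FLAG: {claim.insurer} charged {claim.charged:.2f} for {claim.procedure}")
```

The rule itself is trivial. The hard part, as the insurers' complaints suggest, is getting every payer's data into one agreed shape before a comparison like this can run at all.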

Large and small insurance plans alike describe themselves as data factories, coding and shipping data out the door to the state and federal government on a near-constant basis. One place data goes is the National Committee for Quality Assurance (NCQA), which audits health plans on the care they provide their members. The NCQA uses the Healthcare Effectiveness Data and Information Set (HEDIS) to measure the performance of health plans. Diabetes, for example, has certain codes and procedures associated with it (e.g., blood sugar tests). Lab results also have codes. The HEDIS tool looks for certain codes to make sure a case of diabetes is being managed properly, and if the codes are not there a flag goes up in the system. It sounds reasonable until you look more closely and discover that there is a lot of variability in medical coding, which forces health care providers to redo their codes to be in line with HEDIS.
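
A rough sketch of this kind of code check, and of why coding variability trips it up, might look like the following. The codes, the `REQUIRED_CODES` table, and the `missing_codes` function are hypothetical placeholders chosen for illustration; they are not real HEDIS measures or real medical codes.

```python
# Hypothetical placeholder codes, not real HEDIS or billing codes.
REQUIRED_CODES = {"diabetes": {"LAB_BLOOD_SUGAR"}}


def missing_codes(condition, recorded_codes, required=REQUIRED_CODES):
    """Return the required codes that are absent from a patient's record."""
    expected = required.get(condition, set())
    return sorted(expected - set(recorded_codes))


# Made-up records. The second provider logged the same test under a
# local variant of the code, so an exact-match check does not see it.
records = {
    "patient_1": ["DX_DIABETES", "LAB_BLOOD_SUGAR"],
    "patient_2": ["DX_DIABETES", "lab-blood-sugar"],
}

gaps = {pid: missing_codes("diabetes", codes) for pid, codes in records.items()}
for pid, missing in gaps.items():
    status = "FLAG, missing " + ", ".join(missing) if missing else "ok"
    print(f"{pid}: {status}")
```

In this sketch the second patient is flagged even though the test was performed, which is the kind of mismatch that sends providers back to recode their data.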

Medical billing is another area rife with coding errors. ABC News and others have reported that the error rate on medical bills may be as high as 80 percent. A survey conducted by Consumer Reports, however, found that only 5 percent of respondents had found an error on their bill. In other words, most bills that contain errors are simply paid, by insurance companies and consumers alike, and many of those errors come from incorrect medical codes being assigned to the patient’s bill.

Health data: a land of silos

Then there’s the silo problem. Most of the data about our health sits in private silos at hospitals and insurance companies. Many patients fill out the same form every time they see a doctor, even when that doctor is their primary care provider.

We should be able to authorize and share one version of our electronic medical records (EMRs) with whomever we choose and whenever we choose, and not have to schlep copies of lab results, X-rays, or whatever else on a CD from provider to provider. But EMR systems do not interoperate, and health care systems in general are inflexible and proprietary. The Health Level 7 (HL7) standard was meant to enable one health care system to speak to another. But according to Stephen Hau, CEO of Shareable Ink, a startup that’s digitizing paper medical records, there has been so much customization by EMR vendors it’s almost impossible to share data between systems. At best, information can be shared between hospitals within a health plan but not with hospitals outside that plan. And that’s if you’re lucky.

There is also a problem with the documentation itself, much of which is not electronic to begin with. According to the Healthcare Information and Management Systems Society (HIMSS), fewer than a quarter of hospitals in the U.S. document electronically, which means most of the data being collected is opaque to the industry. Many doctors still use digital voice recording and transcription services that convert their speech to text, but this data is not structured, which means it can’t be read by another system.

And with all the new mobile devices and apps that record our calories in and out, schedule our exercise routines, and record our weight and other health metrics, there are more companies collecting our data than ever before. But that data is kept mostly in proprietary stores, compounding the silos and the fragmentation problem. If you’re building an iPhone app targeting the health care market without providing a way to responsibly anonymize and share the data you are collecting, you are part of the problem, not the solution.

Privacy considerations: What not to do with health data

Which brings us to the thorny topic of privacy. There are obvious risks once we are able to create data liquidity in health care. We do not, presumably, want our health care data used by providers so they can assess the potential risks associated with each of us and cherry-pick individuals that will generate the most profit.

It’s questionable whether these young startups creating apps for mobile devices collecting health data have legal departments or know how to stick to the existing HIPAA rules around data privacy. But such ethical considerations can’t be ignored. Consider the Facebook privacy debacle and then imagine a company doing something questionable with your health care data and apologizing for it after the fact.

Some companies today are committed to responsibly collecting data, although they are small startups and hardly make up the majority of industry players. For example, 23andMe, a database for personal genetic information, goes to great lengths to ensure the privacy of its participants’ data. The company pools and strips data of any identifying information before making it available to researchers and other third parties to study.
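
The general idea of stripping identifiers and pooling records before sharing them can be sketched as follows. This is a minimal illustration and not 23andMe's actual process: the field names, the choice of which fields count as identifying, and both functions are assumptions made for the example.

```python
from collections import Counter

# Assumed set of directly identifying fields (hypothetical field names).
IDENTIFYING_FIELDS = {"name", "email", "date_of_birth"}


def strip_identifiers(record):
    """Return a copy of the record without directly identifying fields."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


def pool_by(records, field):
    """Aggregate stripped records into counts per value of one field."""
    return Counter(strip_identifiers(r)[field] for r in records)


# Made-up participants for illustration only.
participants = [
    {"name": "Alice", "email": "a@example.com",
     "date_of_birth": "1980-01-01", "marker": "variant_1"},
    {"name": "Bob", "email": "b@example.com",
     "date_of_birth": "1975-06-15", "marker": "variant_1"},
    {"name": "Carol", "email": "c@example.com",
     "date_of_birth": "1990-03-30", "marker": "variant_2"},
]

shareable = [strip_identifiers(p) for p in participants]
pooled = pool_by(participants, "marker")
print(shareable)
print(pooled)
```

Removing obvious identifiers is only a first step. The remaining fields can sometimes be combined to re-identify a person, which is one reason pooled counts are safer to share than individual records.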

The future is still a long way off

In order to get to a future where we can understand health patterns across an entire population and potentially revolutionize medicine, data must be shared seamlessly between consumers, providers, and payers.

To think that consumers will drive this change is naïve. If Steve Jobs taught us anything it was that consumers do what they are told to do. Very few people are willing to wade through health treatment plans or medical bills to figure out if they are getting the best price for their care. It will be government (through new regulations) incentivizing large insurance companies and doctors to adopt new health IT tools and best practices for data management that will drive change. This is already underway, but as with all major technology shifts, it’s at least a decade in the making, and given the sensitivity of the data we are talking about, probably much longer.
