Slowly but surely, health care is becoming a killer app for big data. Whether it’s Hadoop, machine learning, natural-language processing or some other technique, folks in the worlds of medicine and hospital administration understand that new types of data analysis are the key to helping them take their fields to the next level.

Here are some of the interesting use cases we’ve written about over the past year or so, and a few others I’ve just come across recently. If you have a cool one — or a suggestion for a new use of big data within the healthcare space — share it in the comments:

  • BI for doctors. Doctors and staff at Seattle Children’s Hospital are using Tableau to analyze and visualize terabytes of data dispersed across the institution’s servers and databases. Visualizing the data not only helps reduce medical errors and helps the hospital plan trials; as of this time last year, the hospital’s focus on data had also saved it $3 million in supply chain costs.
  • Semantic search. Imagine you’re a doctor trying to learn about a new patient or figure out who among your patients might benefit from a new technique. But patient records are scattered across departments, vary in format and, perhaps worst of all, use the ontologies of the departments that created them. A startup called Apixio is trying to fix this by centralizing records in the cloud and applying semantic analysis to uncover everything doctors need, regardless of who wrote it.
  • Hadoop for everything. Cloudera is partnering with the Mount Sinai School of Medicine to help it develop new methods and systems for analyzing biological data. But that’s just the latest of Cloudera’s medical efforts, which also include working with the Food and Drug Administration to detect unsuspected adverse side effects of multi-drug combinations, and with Emory University to help pathologists analyze medical images more accurately. One of Cloudera’s customers, Explorys, built a business around aggregating and analyzing medical records, and Intel and NextBio are teaming up to tune Hadoop for processing genomic datasets.
  • Watson. IBM has dozens of irons in the healthcare fire, but its coolest might well be a partnership with WellPoint to put the Jeopardy! champion question-answering system in doctors’ offices. Watson could help doctors answer questions posed in natural language by analyzing them against mountains of medical research data that no individual doctor could possibly read and digest.
  • Getting ahead of disease. It’s always good if you figure out how to diagnose diseases early without expensive tests, and that’s just what Seton Healthcare was able to do thanks to its big data efforts. Trying to find better ways to detect congestive heart failure early in order to save the exorbitant costs of treatment as the disease progresses, a team found that a distended jugular vein — something that can be spotted during any routine physical exam — is a particularly high risk factor.
  • Data scientist in residence. Here’s a new title for a healthcare organization — chief data scientist. Yet, that’s exactly the position Alliance Health Networks just added in May. The company, which provides social networks focused on specific medical conditions, acquired medical research database Medify and decided it needed someone to lead the effort of analyzing all that data and providing valuable feedback to network users.
  • Crowdsourced science. In a field where controlled experiments can be expensive and sometimes ineffective, it’s turning out there may be no substitute for real-world data. Probably the best-known company in this space is PatientsLikeMe, a social network designed to let individuals share their medical conditions so they can learn from others like themselves which treatments might work best in their particular circumstances. As a side effect, the company is able to conduct observational trials based on data users willingly volunteer.
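To make the semantic-search idea above concrete, here is a minimal sketch: before records from different departments can be searched together, their local terms have to be mapped onto one shared ontology. Everything below — the ontology table, the record format, the diagnosis terms — is invented for illustration; a real system like Apixio’s would use far more sophisticated semantic analysis than a lookup table.

```python
# Departments often record the same concept under different local terms.
# Normalizing those terms to one shared concept is the first step toward
# semantic search over heterogeneous records. (All data here is made up.)
ONTOLOGY = {
    "mi": "myocardial infarction",
    "heart attack": "myocardial infarction",
    "acute mi": "myocardial infarction",
    "htn": "hypertension",
    "high blood pressure": "hypertension",
}

def normalize(term):
    """Map a department-local diagnosis term to its shared concept."""
    cleaned = term.strip().lower()
    return ONTOLOGY.get(cleaned, cleaned)

def search(records, concept):
    """Return records whose diagnosis normalizes to the requested concept."""
    return [r for r in records if normalize(r["diagnosis"]) == concept]

records = [
    {"patient": "A", "diagnosis": "MI"},
    {"patient": "B", "diagnosis": "heart attack"},
    {"patient": "C", "diagnosis": "HTN"},
]

print([r["patient"] for r in search(records, "myocardial infarction")])  # ['A', 'B']
```

Note that patients A and B match the same query even though their charts never use the same words — which is the whole point of normalizing before searching.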
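And the kind of screening analysis behind the Seton example — checking whether a cheap exam finding predicts an expensive disease — can be illustrated with a relative-risk calculation. The records below are fabricated and far too few to mean anything clinically; a real effort would pull thousands of de-identified records from an EHR system.

```python
# Fabricated (exam_sign, developed_disease) pairs standing in for EHR data.
records = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]

def relative_risk(records):
    """Risk of disease among patients with the exam sign vs. without it."""
    with_sign = [disease for sign, disease in records if sign]
    without_sign = [disease for sign, disease in records if not sign]
    risk_with = sum(with_sign) / len(with_sign)        # True counts as 1
    risk_without = sum(without_sign) / len(without_sign)
    return risk_with / risk_without

print(relative_risk(records))  # 3.0: the sign carries 3x the baseline risk
```

A relative risk well above 1.0 is what flags a finding like a distended jugular vein as worth watching for during routine exams.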

Feature image courtesy of Shutterstock user lenetstan; Tableau graph courtesy of Perceptual Edge; exam image courtesy of Shutterstock user Blaj Gabriel

  1. Steve Ardire Sunday, July 15, 2012

    Agree, but when you said “some other technique” you should have been more explicit by saying something like sophisticated semantic analysis of linked data, which trumps Hadoop, machine learning, natural-language processing… Cheers!

  2. Reblogged this on txwikinger's blog.

  3. Surprised there was no reference to RateADrug or other public user-database systems.

  4. Derrick Harris Sunday, July 15, 2012

    Does RateADrug do any analysis of the database? User reviews are great, but they’re better when combined with correlations between drugs, effects, demographics, etc. That’s where I think the real value of consumer data lies.

  5. Dave Mackey Sunday, July 15, 2012

    I’d like to see meta-analysis of nationwide (worldwide?) data. I’ll give permission to use my medical records, then correlate my health with that of others who grew up in the same locale, ate the same foods, underwent the same vaccinations, worked the same jobs, and so on and look for common patterns in disease. I think this could turn up interesting stuff. Folks spread across the US suffering from diverse symptoms might discover they were exposed to x substance in x location at x time…and this would only become apparent when the entire third grade class from x shows up with similar symptoms – even though spread across the nation.

    1. I love that idea, actually. I would imagine you could actually get a lot of local support from small town residents on that.

      1. The adoption of electronic health records is starting to reach a critical mass that will enable the type of analysis you describe. I’m a data scientist at Practice Fusion, a free web-based electronic health record company that covers over 150,000 healthcare professionals and 40 million patients. In effect what you’re describing in the last few sentences is a form of syndromic surveillance. We’ve started to do some work on this at research.practicefusion.com and there’s also great work being done on this by the Primary Care Information Project in New York. Huge potential to eventually replace ad-hoc case reporting for outbreak detection with real-time reports aggregated from EHR data.
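The syndromic-surveillance idea in this thread — watching aggregated EHR counts for anomalies instead of waiting for ad-hoc case reports — reduces, in its simplest form, to an anomaly check against historical baselines. The weekly counts and the two-standard-deviation threshold below are invented for illustration; production systems use considerably more sophisticated statistical models.

```python
import statistics

# Invented weekly counts of one symptom, aggregated across many practices.
baseline_weeks = [12, 9, 11, 10, 13, 8, 12, 11]

def is_outbreak(baseline, count, sd_threshold=2.0):
    """Flag a weekly count more than sd_threshold standard deviations
    above the historical mean."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return count > mean + sd_threshold * sd

print(is_outbreak(baseline_weeks, 24))  # True: well above the usual range
print(is_outbreak(baseline_weeks, 12))  # False: within normal variation
```

Run in real time over records from tens of millions of patients, even a crude check like this could surface the “entire third grade class from x” pattern described above long before individual case reports would.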

  6. Actian’s Vectorwise is helping many companies; it just started a relationship with companies that provide database-management support for hospitals. http://www.actian.com/newsroom/press/medical-data-vision

  7. Alineh Haidery Monday, July 16, 2012

    How about making sense of large amounts of data to understand the root causes of patient flow barriers causing delays in healthcare and inhibiting patient safety?

  8. Jo Prichard Monday, July 16, 2012

    Hi Derrick

    At LexisNexis Risk Solutions we are actively engaged in using the open source HPCC Systems data-intensive compute platform, along with the massive LexisNexis Public Data Social Graph, to tackle everything from fraud, waste and abuse, drug-seeking behavior and provider collusion to disease management and community healthcare interventions.

    We have invested in analytics that help map the social context (not social media) of events through trusted relationships to create better understanding of the big picture that surrounds each healthcare event, patient, provider, business, assets and more.

    If you have time to touch base for a more detailed overview, please feel free to give me a shout!

  9. So where does a young grad student go to get involved in making sense of all this data that will be pouring in as more and more folks adopt EHR systems? I’m excited about companies such as Apixio and Allscripts. It seems it’ll be quite a bit of work to structure this data and apply it in a meaningful way. Perhaps you can infiltrate one of the teams that is doing this and offer up some further insight?

  10. speyersroadnews Monday, September 17, 2012

    At the Institute for Health Metrics and Evaluation, we are currently finalizing analysis on the Global Burden of Disease. We are estimating the burden from morbidity and premature mortality for 240 causes in 187 countries by age and sex, along with 60+ related risk factors. We are using tens of thousands of data sets and data points from censuses, surveys, vital statistics, hospital records, disease registries, literature reviews, and many other sources. Not big data in terms of terabytes, but certainly big data in terms of number of data sources with different taxonomies and ontologies, and issues around quality and documentation.
