Although it currently uses a NoSQL CouchDB database back end, the ASCO is considering Cassandra with Hadoop for the full build. That database is expected to be completed in 12 to 18 months.
Beyond helping an individual patient, big data will allow the healthcare community as a whole to detect harmful drug interactions quickly. "So this gives us the ability to look at that [common cancer] population and figure out the best dosages and cycles of treatment," Hauser said.
While the ASCO is among the largest cancer research organizations, it is by no means alone in its use of big data in determining best practices.
Cleveland Clinic - a 4,500-bed healthcare system - uses an EHR from Epic Systems and a SQL transactional database for retrospective data analysis of its EHRs to improve patient treatment.
"We think first about outcomes: what data can we collect and make available to clinicians so they know how well they're doing in treating their patient," said Dr. C. Martin Harris, CIO of Cleveland Clinic.
Cleveland Clinic is also starting to use Hadoop, but it's still a small part of the research because data is internally confined.
"It may appear if we only analyze Cleveland Clinic data that we're doing well with regard to a patient, but in fact if the patient went to someone else's emergency room 10 times, we didn't know that," Harris said.
Cleveland Clinic is working with other state health plans to collect a broader swath of patient data. Along with University Hospitals, Ohio's other largest healthcare provider, Cleveland Clinic is preparing to share data across Ohio's statewide electronic medical records exchange, CliniSync.
Once on the exchange, Cleveland Clinic will be electronically linked to 21 other hospitals already using the system.
One chronic disease targeted by the Clinic's data analytics engine is diabetes. The analytics engine searches EHRs for the results of A1C tests, which provide a long-term measurement of glucose in red blood cells. Knowing a person's average, long-term glucose level can help predict their likelihood of suffering other conditions such as kidney failure or stroke.
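As a rough illustration of that kind of search, the following minimal Python sketch filters a handful of lab records for A1C results and lists patients whose most recent reading is at or above a cutoff. It is not Cleveland Clinic's engine: the `LabResult` record layout, the function names, the sample data and the 6.5 percent cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LabResult:
    """Hypothetical, simplified lab record; real EHR schemas differ."""
    patient_id: str
    test_name: str
    value: float      # for A1C, expressed as a percentage
    taken_on: date


def latest_a1c(results):
    """Return {patient_id: most recent A1C LabResult}, ignoring other tests."""
    latest = {}
    for r in results:
        if r.test_name.upper() != "A1C":
            continue
        current = latest.get(r.patient_id)
        if current is None or r.taken_on > current.taken_on:
            latest[r.patient_id] = r
    return latest


def flag_elevated(results, threshold=6.5):
    """List patient IDs whose latest A1C is at or above the threshold.

    The 6.5 default is an illustrative cutoff chosen for this sketch.
    """
    return sorted(
        pid for pid, r in latest_a1c(results).items() if r.value >= threshold
    )


# Made-up sample records, not real patient data.
records = [
    LabResult("p1", "A1C", 7.2, date(2012, 1, 10)),
    LabResult("p1", "A1C", 6.1, date(2012, 7, 15)),
    LabResult("p2", "A1C", 8.0, date(2012, 3, 2)),
    LabResult("p2", "LDL", 130.0, date(2012, 9, 1)),
    LabResult("p3", "A1C", 5.4, date(2012, 5, 20)),
]

flagged = flag_elevated(records)
print(flagged)  # ['p2']
```

In this sample, patient p1 drops off the list because the later reading is below the cutoff, which is the sort of long-term trend the article describes physicians needing.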
Cleveland Clinic knows the problem is multi-faceted. Patients must follow treatment regimens and choose healthy lifestyles, and physicians must have long-term data to tailor treatment. But, as Harris notes, if patients don't know how they're doing at a macro level, it's more difficult for them to change their behavior.
"...That information is used to not only send alerts to the physician but also [to] the patient," Harris said. "They can become stewards of [their] own healthcare at some level."
To engage patients more directly, Cleveland Clinic allows them to enter their own data from glucose readers into a personal health record (PHR), either manually or automatically via a mobile device. Cleveland Clinic uses Microsoft's free HealthVault cloud service as its PHR. The HealthVault application can then transfer that data to the clinic's EHR for physician and data analytics use.