Shah works in biomedical informatics, a field focused on making sense of the information in clinical data warehouses.
Sequencing a human genome yields a massive amount of data, and storing one person's genetic code can require up to 1TB of storage, Shah said.
The human genome contains about 3.2 billion base pairs, which means that finding a flaw in that code requires sophisticated computer algorithms and massive, clustered server farms. Adding to the complexity, disease is often the result of multiple mutations, according to Shah.
While diseases such as Huntington's or Alzheimer's disease are caused by common genetic mutations, and are more easily spotted, most illnesses are caused by rare mutations. Diabetes, for example, is thought to be caused by a number of genetic mutations, which on their own confer a small amount of risk, but in combination can be more serious.
"If you genome type someone, and out of the 50 [mutations associated with diabetes] you have 10 of them, it's very hard to say what's going to happen to you," Shah said. "Part of the problem is that we just need to do more research and collect more data, and some of it we just need better methods."
But tremendous progress has been made. Scientists now know the genetic causes of about 5,000 rare diseases.
One of the most promising areas of genetic research is pharmacogenomics, which uses a person's genetic makeup to determine how they'll respond to drugs, tailoring treatments to specific mutations -- even mutations found in cancer tumors.
For example, the drug Zelboraf was developed by New York University's Cancer Institute a couple of years ago through genetic tests to target melanoma skin cancer tumors that express a gene mutation called BRAF V600E. Researchers found patients taking Zelboraf were 64% less likely to die from the advanced form of skin cancer than patients who received only standard chemotherapy.
"Looking at your genome does help in saying, 'For you, we should give half the dose of this drug, but for this other person we'll give you a double dose of that drug,'" Shah said.
Linking EHRs with genomes
Several projects are underway to link EHRs and human genomic data. Among the most promising is the Electronic Medical Records and Genomics (eMERGE) Network.
Funded by the National Human Genome Research Institute, the eMERGE Network joins researchers from nine healthcare research organizations and hospitals with a wide range of expertise in genomics, statistics, ethics, informatics and clinical medicine. Up to 10,000 patients will have 83 specific genes sequenced, and another 50,000 to 80,000 patients will receive more general genotyping.