Office of the Vice-Principal (Research)

Big Data: Transforming Medicine


Today, some of the most significant advances in medicine stem from digital information – ranging from detailed genetic data to high-level administrative data – captured during a patient’s various encounters with the medical system. By capturing and comparing these diverse data holdings, medical researchers are learning more about disease and developing more effective drugs and treatment protocols.

Obviously, data collection and analysis in medicine is hardly new. What is relatively novel is the sheer scale of it. The amount of data being collected is almost inconceivable, and only the combined skills of medical professionals, signal processing engineers, computer scientists, experts in genomics and bioinformatics and others can make sense of it all.

A number of medical researchers at Queen’s are at the forefront of this multidisciplinary “big data” work in Canada. We feature two of them here.

Richard Birtwhistle

Richard Birtwhistle is a professor in the Queen’s Department of Family Medicine and Public Health Sciences, the director of the university’s Centre for Studies in Primary Care, and the chair and principal investigator of the Canadian Primary Care Sentinel Surveillance Network (CPCSSN). The network collects patient information stored in electronic medical records (EMR) of primary care practitioners across Canada. Using complex algorithms, CPCSSN brings the data from these different EMR systems together into a consistent format. This enables researchers to use those data to answer questions about the incidence and treatment of diabetes, hypertension, depression, chronic obstructive lung disease, osteoarthritis and other chronic diseases that Canadian family physicians commonly deal with.

Launched in 2008, with funding from the Public Health Agency of Canada, CPCSSN now consists of more than 800 primary care practitioners – or “sentinels” – in seven provinces and one territory and the de-identified records of almost one million patients across Canada. Each doctor uses an EMR to record their clinical care of patients by inputting information such as body weight, blood pressure, body mass index, health conditions, referrals, risk factors for disease, lab investigations and any prescribed medications. Before any of this information is uploaded to CPCSSN, each patient is assigned a unique CPCSSN number. The key that links this number to the patient’s personal information does not leave the practice, so any data actually used for research remain anonymous.
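To make the de-identification step concrete, here is a minimal sketch in Python of how a practice might separate research data from identifying details. It is an illustration only, not CPCSSN’s actual software: the function name `deidentify`, the field names (`patient_id`, `name`, `date_of_birth`) and the use of random study numbers are all assumptions made for the example.

```python
import secrets


def deidentify(records, id_field="patient_id",
               drop_fields=("name", "date_of_birth")):
    """Split records into research-safe rows and a practice-only linkage table.

    All field names here are hypothetical; a real EMR export will differ.
    """
    linkage = {}   # study number -> local patient ID; meant to stay in the practice
    assigned = {}  # local patient ID -> study number, so repeat visits match up
    research_rows = []

    for record in records:
        local_id = record[id_field]
        if local_id not in assigned:
            # Random number that reveals nothing about the patient.
            study_number = secrets.token_hex(8)
            while study_number in linkage:  # regenerate on the (unlikely) clash
                study_number = secrets.token_hex(8)
            assigned[local_id] = study_number
            linkage[study_number] = local_id

        # Keep clinical fields only; drop the local ID and direct identifiers.
        row = {key: value for key, value in record.items()
               if key != id_field and key not in drop_fields}
        row["study_number"] = assigned[local_id]
        research_rows.append(row)

    return research_rows, linkage
```

In this sketch only `research_rows` would be uploaded, while `linkage` would remain with the practice. Removing names and birth dates is a simplification; real de-identification typically has to deal with other potentially identifying details as well.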

The type of information collected in EMRs is difficult to get from other data sources (such as the Canada Health Survey), which is why a centralized repository holds such great potential for researchers and makers of health policy. The data are also useful to the network’s family doctors. Remarkably, although EMRs contain loads of information about individual patients, most systems don’t provide physicians with reports that shed light on all their patients as a group. The CPCSSN database provides this capability, thus allowing the doctors to track their patients better and provide better, more personalized care. This, by itself, is enormously useful.

“We have a system where doctors can find out how many people with out-of-control diabetes haven’t been seen in the last six months, then go back and link the CPCSSN numbers with the patients’ IDs and then contact them and get them into the clinic,” says Birtwhistle. “From a quality improvement point of view, it’s actually pretty important.”
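The kind of report Birtwhistle describes can be sketched in a few lines of Python. This is a hypothetical illustration rather than the network’s real query: the field names (`study_number`, `hba1c`, `last_visit`), the HbA1c threshold of 8.0 per cent used to stand in for “out-of-control” diabetes, and the 183-day window are assumptions chosen for the example, not clinical guidance.

```python
from datetime import date, timedelta


def overdue_patients(rows, today, hba1c_threshold=8.0, max_gap_days=183):
    """Return study numbers of patients above the threshold and not seen recently.

    The threshold and the window are illustrative assumptions.
    """
    cutoff = today - timedelta(days=max_gap_days)
    return [row["study_number"] for row in rows
            if row["hba1c"] > hba1c_threshold and row["last_visit"] < cutoff]


def relink(study_numbers, linkage):
    """Map study numbers back to local patient IDs, inside the practice only."""
    return [linkage[number] for number in study_numbers]


# Made-up de-identified rows and a made-up practice-held linkage table.
rows = [
    {"study_number": "s1", "hba1c": 9.1, "last_visit": date(2023, 1, 10)},
    {"study_number": "s2", "hba1c": 9.5, "last_visit": date(2023, 11, 1)},
    {"study_number": "s3", "hba1c": 6.5, "last_visit": date(2023, 1, 1)},
]
linkage = {"s1": "A1", "s2": "B2", "s3": "C3"}

flagged = overdue_patients(rows, today=date(2024, 1, 1))
to_contact = relink(flagged, linkage)
```

The design point is that the first step works entirely on de-identified rows, and only the second step, run within the practice, turns study numbers back into patients who can be contacted.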

Birtwhistle says the EMR data that CPCSSN has collected are a gold mine for researchers seeking to learn more about chronic disease in primary care in Canada. Much of the data remain untapped. But CPCSSN’s greatest value may ultimately stem from enabling the data to be linked with other types of medical data, he says. “Linking patients’ primary care data to genomic data, for example, could open up tremendous potential for understanding not only chronic diseases, but other diseases as well.”

David Maslove

The intensive care unit (ICU) in every hospital contains a bewildering array of sophisticated devices that track patients’ bodily functions. The bleeping and blipping monitors let doctors and nurses know what’s happening in the bodies of very sick patients whose conditions can turn on a dime. Each of these devices constantly generates data, and some readings, such as blood pressure and heart rate, are recorded in the patient’s EMR. Traditionally, however, most of these data have been discarded.

Thanks to the work of researchers like David Maslove, this situation is starting to change. Maslove is a clinician scientist in the Queen’s Department of Medicine and Critical Care Program and a critical care physician at Kingston General Hospital. His work involves capturing and analyzing massive volumes of detailed electronic data derived from patients in the hospital’s ICU to understand more about the nature and progression of acute illnesses.

Many of these data come from blood samples, which Maslove has been collecting for whole-genome transcription profiling for about a year. What sets this project apart from other genomics projects is the sheer number of samples collected over the course of a patient’s stay in the ICU. Starting this year, high-frequency waveforms from bedside monitors and ventilators will be added to the mix, generating gigabyte-scale data that must be assembled and analyzed using novel computational approaches.

The data-intensive work in Maslove’s field has the potential to transform critical care medicine. For decades, diagnoses in emergency and ICU medicine have been syndromic – that is, patients are assessed according to a set of criteria, and if their conditions meet those criteria, the doctors and nurses follow a certain treatment course. As with all treatments used in hospitals, the utility of that treatment will have been previously verified via a highly regulated randomized clinical trial. In other words, the treatment used in the ICU is based on aggregated similarities exhibited by the group of patients in the trial. The physiologic individuality of each patient is largely lost.

In contrast, by analyzing detailed genomic data from the ICU, Maslove hopes to identify the differences, rather than the similarities, between critically ill patients who may have been categorized as having the same condition (such as sepsis, which is Maslove’s particular interest).

The Holy Grail for Maslove and others is a database of genomic and other ICU patient data that can be used by physicians and nurses at the bedside to make better treatment decisions in real time.

“From a scientific standpoint, it’s a very exciting cross-disciplinary endeavour that involves bringing together expertise from clinical medicine, computer science, signal processing, epidemiology, genomics,” says Maslove. “We’re trying to find a way to bring all those data under the same roof so that they can be made available to clinicians at the bedside who are treating patients with rapidly evolving illnesses.”