Feature | Information Technology | September 09, 2015

Understanding How Big Data Will Change Healthcare

New ways to mine data analytics will enable new avenues of research, identifying new patients prior to acute episodes and improving efficiency.

Big Data from Twitter showing stressed language posts compared to CDC reported heart attacks in study by Johannes Eichstaedt.

Figure 1. Big data, showing correlation between a CDC study on cardiovascular disease and a study conducted based on hostility in Twitter tweets. This demonstrates how big data from social media might be used in new ways to evaluate population health.


Figure 2. A population health map created with big data analytics showing the incidence of low birth weight babies based on zip codes in South Carolina. This type of data might be used to determine locations for new women’s health clinics or for the allocation of funds for public health outreach programs.

The buzz term “big data” has made a rapid entry onto the healthcare scene in the past couple of years with promises of improving healthcare, but many are still trying to figure out exactly how it will accomplish this. Efforts were made to explain big data and its application to healthcare at the American College of Cardiology (ACC) and Healthcare Information and Management Systems Society (HIMSS) meetings earlier this year.

This uncertainty is mainly due to the current state of health IT, which leaves a lot to be desired because of interoperability problems and bottlenecks in sharing data between disparate software systems. However, as enterprise imaging and information systems become the new normal in healthcare, enabling the free flow of structured data across healthcare systems and health information exchanges, big data will play a major role. At the department level, analytical software can now identify workflow or standard-of-care issues so they can be corrected or made more efficient. This includes identifying and quantifying workflow improvements of new versus old equipment to justify return on investment (ROI) for capital replacement costs (such as CT, CR, angiography systems and MRI). The analytics can also monitor data and set alerts for everything from radiation dose and patient throughput at clinics on specific machines using specific exam protocols, to procedure times in the cath lab broken down by lab, physician and patient type.

These sorts of comparisons will become very important in the coming years for showing how one center or department performs relative to others in the health system, with the goal of improving ROI, efficiency and outcomes. As Medicare moves from fee-for-service to a value-based and bundled payment system, this data can be used to determine the best strategy for diagnosing or treating specific types of patients while reducing costs.

“We are falling short of achieving meaningful use of health information technology,” said James Tcheng, M.D., FACC, FSCAI, Duke University, North Carolina, during a presentation on big data at ACC 2015. “We are focused on administrative click-off boxes, but data is now being used to help identify which patients we need to engage, and this is a paradigm shift. I think big data will become a large part of what cardiovascular care will look like in the future,” he explained. “Big data is really a vision of an interoperable health data infrastructure, and it will be a big driver in the future for research and how we provide care in the future.”  


Looking at Healthcare Data Differently

With the conversion to electronic medical records (EMR), Tcheng said the amount of data produced is vast and contains information that can now be easily accessed with electronic data mining, which was impossible with paper records. He said a typical tertiary care hospital generates about 100 terabytes (TB) of data per year. By comparison, he said the Library of Congress is estimated to contain only about 10 TB of text data. “Your healthcare institutions each generate more data per month than the entire Library of Congress,” he said.

Tcheng said the Library of Congress audio and video file collections add an additional 20 petabytes (PB) of data to the overall count. By comparison, he said Google uses about 24 PB of data per day and the Duke Heart Center, where he works, processes about 30 TB of clinical data per year, including reports, images and waveforms.

This volume of data offers new sources of healthcare insights that previously were very difficult and time-consuming to tabulate manually. Today, electronic data mining allows fast access to data points that are currently buried in mountains of documents. Take, for example, the ability to find heart failure (HF) population trends based on region, state, county or individual hospital. This might include pulling vital sign trends from yearly exams or family history from EMRs to predict which patients should be screened for HF years before they present with HF symptoms. Analytics showing low-income areas with a high incidence of HF acute care episodes in emergency departments might help target prevention outreach programs or help earmark federal funds. This large amount of data might also help identify new ways to curb HF readmission rates or better engage patients to manage their condition.

Tcheng said one roadblock to processing existing data is that 95 percent of it in the world today is unstructured. This leads to issues when different doctors or institutions use varying terminology for the same thing. This variability makes meaningful data mining very difficult, and is why structured reporting with standardized taxonomy and definitions is so critical moving into the future.
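The terminology problem described above can be illustrated with a minimal sketch in Python. It assumes a small, hypothetical synonym table (the entries are illustrative and do not come from any real taxonomy) and shows how mapping varied free-text entries onto one standard term makes them countable.

```python
from typing import Dict, List, Optional

# Hypothetical synonym table: local phrasings mapped to one standard term.
# The entries are illustrative only and are not taken from a real taxonomy.
SYNONYMS: Dict[str, str] = {
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "chf": "heart failure",
    "congestive heart failure": "heart failure",
    "heart failure": "heart failure",
}


def normalize_term(raw: str) -> Optional[str]:
    """Return the standard term for a free-text entry, or None if it is unknown."""
    key = " ".join(raw.lower().split())  # lowercase and collapse whitespace
    return SYNONYMS.get(key)


def count_by_standard_term(entries: List[str]) -> Dict[str, int]:
    """Tally entries under their standard term; unknown entries count as 'unmapped'."""
    counts: Dict[str, int] = {}
    for entry in entries:
        term = normalize_term(entry) or "unmapped"
        counts[term] = counts.get(term, 0) + 1
    return counts
```

Without the mapping step, "MI" and "heart attack" would be tallied as two different findings. A lookup table like this does not scale to real clinical language, which is one reason structured reporting at the point of entry matters.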

“Unfortunately, most healthcare data is also unstructured, so how do we sort all that data?” Tcheng said. “It’s not about what the doctor collects in the office anymore, it’s about how we take all this data from various sources and aggregate it.”


New Sources of Data and Research Models

Tcheng said big data from outside healthcare may offer new avenues of research. He pointed to a study (1) that used mass data mining of Twitter tweets in the northeastern United States. The study looked at measures of stress based on the use of hostile language, profanity and the expression of negative feelings, identified by key words or terms. This was used to create a color-coded map by county to predict the incidence of heart disease. Tcheng showed the resulting map side by side with a similar map created by the Centers for Disease Control and Prevention (CDC) based on labor-intensive clinical data analysis from hospitals. (See Figure 1.) The two maps nearly matched, offering a compelling example of how mainstream big data sources like Google, Facebook and other popular electronic media can be leveraged for serious healthcare research.

“Social media might be a better predictor than all the studies we do to better identify areas where more care is needed for cardiovascular disease,” Tcheng explained.
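As a rough illustration of keyword-based scoring by county, the Python sketch below computes the share of words in each county's posts that appear in a word list. It is a simplified stand-in, not the method used in the cited study; the word list, function name and input format are all assumptions made for the example.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical lexicon of hostile terms; a real study would use a validated word list.
HOSTILE_WORDS = {"hate", "angry", "furious", "stupid"}


def hostility_rate_by_county(posts: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Map each county to the share of its words that appear in HOSTILE_WORDS.

    `posts` yields (county, text) pairs. Counties with no words are omitted.
    """
    hostile = defaultdict(int)
    total = defaultdict(int)
    for county, text in posts:
        words = re.findall(r"[a-z']+", text.lower())
        total[county] += len(words)
        hostile[county] += sum(1 for w in words if w in HOSTILE_WORDS)
    return {c: hostile[c] / total[c] for c in total if total[c] > 0}
```

The per-county rates could then be binned into color bands for a map and compared against mortality data for the same counties.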


Population Health 

A subset of healthcare big data is “population health,” where data from a hospital or the whole healthcare system is mined for numerous data points to determine patients who may need additional care. (See Figure 2.) This can enable a hospital to boost revenue by offering additional services to its existing patient base, while helping to reduce healthcare costs downstream by preventing acute care episodes. This may include reviewing all cardiovascular patient data to determine which patients have symptoms of peripheral artery disease and bringing them in for a screening, which will likely lead to any additional lab, exam and procedural workups that are needed.

On a higher level, population health data for thousands of patients can be monitored by software to identify outbreaks of disease, or disease hot spots. This includes identifying outbreaks of flu or other epidemics in real time, or cancer hot spots of patients who all live or work in the same location.

Big data will help identify patients who currently fall through the cracks and help clinicians follow up with them. This includes patients who have not had a doctor’s visit in years, those with disease symptoms who never came back for treatment or follow-up, or those who were treated for an acute condition (such as a heart attack) but have not seen a doctor in the two to three years since and require follow-up care to prevent another one. These things may seem small, but they will go a long way toward keeping patients out of the emergency room, which reduces overall healthcare costs in the long run. It also helps boost income for health systems by keeping patients coming back for checkups.
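The follow-up case above amounts to a simple rule over patient records. The Python sketch below assumes a hypothetical, simplified record with two dates and a two-year (730-day) threshold; real EMR schemas and clinical follow-up intervals will differ.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class PatientRecord(NamedTuple):
    # Hypothetical, simplified record; real EMR schemas differ.
    patient_id: str
    last_acute_event: Optional[date]  # e.g., date of a heart attack, if any
    last_visit: Optional[date]        # most recent visit of any kind, if any


def needs_follow_up(rec: PatientRecord, today: date, max_gap_days: int = 730) -> bool:
    """True if the patient had an acute event and has not been seen within max_gap_days."""
    if rec.last_acute_event is None:
        return False
    if rec.last_visit is None:
        return True
    return (today - rec.last_visit).days > max_gap_days


def follow_up_list(records: List[PatientRecord], today: date) -> List[str]:
    """Return the IDs of patients flagged for outreach, in input order."""
    return [r.patient_id for r in records if needs_follow_up(r, today)]
```

The threshold is a parameter so that a health system could tune it per condition rather than hard-coding one interval for every patient.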


Decision Support For the Best Outcomes

Clinical decision support (CDS) software can help decide appropriate medications, detect potential interactions and determine the most appropriate tests, imaging exams and procedures for patients with specific symptoms or disease states. Doctors have long argued CDS might take away their decision-making power to manage patients. However, as healthcare continues the trend of following evidence-based medicine, some physicians still do not follow appropriate use criteria (AUC) set by their own specialty societies, leading to poorer patient outcomes. From this perspective, CDS looks like a more attractive option to hospital administrators from a liability and reimbursement standpoint. This is being further reinforced by Stage 3 Meaningful Use requirements for EMRs and the latest amendments to Medicare, which now require CDS records to support reimbursement claims by 2018. In the future, CDS justification documentation might be required for full Medicare reimbursement.

The major issue with creating a single software source for CDS information is that guidelines constantly change and there is a constant flow of new clinical data that can rapidly change the standard of care. 

“We know when you comply with medical guidelines you have better outcomes,” Tcheng said. However, he said one obstacle to CDS use has been the software’s ability to keep track of all the new information in real time.

He said software vendors will increasingly use big data search engines to aggregate data from the latest clinical trials, studies and AUC guidelines from various societies into applications that monitor the orders physicians enter electronically. This type of CDS software will flag any orders that do not meet AUC and help clinicians make more appropriate decisions on everything from which medications to which imaging tests are best suited for a specific patient. It is hoped that following AUC based on constantly updated patient outcomes data will help reduce the number of tests ordered and better guide clinicians toward the therapy choices known to offer the best outcomes. Medicare expects this will help reduce healthcare costs in the future, both by reducing the number of tests performed and by favoring treatments with the best record of success.
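The order-checking step described above can be sketched in a few lines of Python. The order fields and the appropriateness table are hypothetical placeholders, not actual society criteria; in practice the table would be generated from continuously updated guideline data rather than written by hand.

```python
from typing import Dict, List, NamedTuple, Set


class Order(NamedTuple):
    # Hypothetical electronic order: what was ordered and the documented indication.
    order_id: str
    test: str
    indication: str


# Hypothetical appropriateness table: for each test, the indications treated as appropriate.
# The names are placeholders and do not represent any real appropriate use criteria.
APPROPRIATE_INDICATIONS: Dict[str, Set[str]] = {
    "imaging_test_a": {"indication_x", "indication_y"},
    "imaging_test_b": {"indication_y"},
}


def flag_orders(orders: List[Order]) -> List[str]:
    """Return IDs of orders whose indication is not listed as appropriate for the test.

    Tests missing from the table are also flagged so that they get reviewed.
    """
    flagged = []
    for order in orders:
        allowed = APPROPRIATE_INDICATIONS.get(order.test, set())
        if order.indication not in allowed:
            flagged.append(order.order_id)
    return flagged
```

Keeping the criteria in a data table, separate from the checking logic, is what would let the rules be refreshed as guidelines change without rewriting the software.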


Read the article "Understanding Population Health and its Future Applications."



1. Johannes C. Eichstaedt, Hansen A. Schwartz, Margaret L. Kern, et al. “Psychological Language on Twitter Predicts County-Level Heart Disease Mortality.” Psychological Science. Feb. 2015; 26: 159-169. doi:10.1177/0956797614557867
