Big Data and Artificial Intelligence in Healthcare

Just Starting to Learn More About Big Data and Artificial Intelligence in Healthcare?

by Gregory Tidanian, Haig Barrett Associate

Big data, artificial intelligence, machine learning…these topics seem to be everywhere in both the business and consumer news media. But these terms can be confusing, misused, and misunderstood. Regardless, the use of big data to drive artificial intelligence technologies is revolutionizing the healthcare sector, including pharmaceutical and therapeutics developers and medical service providers.

What is Big Data, Really?

To start this discussion, we first have to ask: what really is big data?

Big data is large amounts of information that can be processed to find patterns, trends, and associations. Most of us have increasingly experienced the effects of big-data management strategies, perhaps without even realizing it. For some time now, internet pioneers such as Facebook and Amazon have been refining the sophistication with which they mine and manage the vast amounts of data they collect each day. Through analysis of this data, Amazon predicts which products and services are likely to be of most interest to you, while Facebook’s algorithms mine user data, generated in large part by likes and posts, in an attempt to determine which ads you will be most likely to interact with. They do this on macro and individual levels to provide a service tailored to each user’s preferences and likely actions.

In addition to our individual contributions to the ever-expanding quantities of data generated, massive amounts of machine-generated data are constantly being produced as well. Smart home devices, industrial machinery, industrial communication devices, transaction devices at retail locations, and so much more are all creating a huge volume of data.

That said, a lot of data is not really all that interesting. Data is just data, but actionable data is highly valuable. Although we have been generating a lot of data for a good while now, it is only recently that technologies have advanced enough to allow us to mine this data and extract insights from it. Additionally, while historical insights into what has happened are often quite valuable, intelligence that predicts what is likely to happen, and so helps us make better decisions, is more valuable still.

So, to answer the question at hand: what is big data? Technically, big data refers to data sets that are so large and complicated that new data-processing technologies are required because traditional data-processing technologies and approaches lack the needed robustness. However, in thinking about big data from a business value standpoint, the following definition is perhaps a bit more relevant:

Big data refers to data sets that are so large and complicated that new data-processing technologies are required to extract historical and predictive insights that can be used to benefit individuals, organizations, industries, and society at large.

Big Data and Artificial Intelligence in Healthcare

Healthcare is one of the industries that has been carefully studying technology leaders to determine how the sector can best incorporate big-data management approaches to better serve patients, increase efficiencies, and improve quality of care. A revolution has been developing in how data is managed and utilized within medical service provider organizations, pharmaceutical and other therapeutics developers, and many other links within the healthcare value chain. Specifically, big data is facilitating the use of artificial intelligence technologies within many aspects of the healthcare ecosystem.

Artificial intelligence…wait, what’s that? True, there is no lack of confusing and much-debated terms. While there is debate as to what the scope of artificial intelligence is, for the purposes of this discussion, we will take a fairly pragmatic approach to the term.

Artificial intelligence is any device, software program, or other technological construct that processes big data to learn from past events or outcomes, then takes actions or draws conclusions that maximize the chances of achieving a defined goal.

AI and Big Data Applied—Streamlining Medical Discoveries

As just one example, AI technologies can scan large databases to help make breakthroughs in healthcare. One of the benefits of these systems is the ability to analyze massive volumes of research data far more efficiently than humans can. Additionally, AI technologies can identify intricate correlations that a human being would either be unable to identify or would take much longer to identify. This is particularly beneficial for research into rare diseases, where teams of researchers are working to identify patterns that predict the most effective treatments, to identify genome alterations that might treat a given disease state, and to pursue many other objectives of this nature.

One example of big data fueling artificial intelligence is a study published in Cancer Discovery in 2013 by Stanford researchers Atul Butte, MD, PhD, and Julien Sage, PhD. In their paper, Butte and Sage outlined research involving an algorithm that uses big data to find a new way of treating small-cell lung cancer. This research involved running information on hundreds of thousands of gene-expression profiles from different types of cells, both diseased and normal, and analyzing trends and patterns. This process found that imipramine, an FDA-approved drug currently being used as an antidepressant, can also effectively treat small-cell lung cancer, a particularly deadly form of cancer.

This was a profound finding as imipramine has already been approved by the Food and Drug Administration for use as an antidepressant. As such, imipramine could move into human trials faster and with less expense than if it were a newly developed therapeutic.

“We are cutting down the decade or more and the $1 billion it can typically take to translate a laboratory finding into a successful drug treatment to about one to two years and are spending about $100,000,” Butte said, discussing their findings.

This finding alone illustrates the promise of big-data analysis in healthcare. Strategic data analysis and the use of artificial intelligence streamlined a medical breakthrough that has the potential to save many lives, cutting several years and tens of millions of dollars from the process.
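The article does not describe the Stanford algorithm itself, so the following is only a toy sketch of one general idea behind this kind of pattern matching: scoring each candidate drug by how strongly its effect on gene expression runs opposite to the disease's effect. It is not the researchers' method. The function names, drug names, and all numbers are hypothetical and chosen purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_candidates(disease_signature, drug_signatures):
    """Sort drugs by correlation with the disease signature, lowest first.

    A strongly negative score means the drug's expression changes run
    opposite to the disease's, which is the pattern of interest here.
    """
    scores = {
        name: pearson(disease_signature, signature)
        for name, signature in drug_signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical data: change in expression for five made-up genes.
disease_signature = [2.0, 1.5, -1.0, -2.0, 0.5]
drug_signatures = {
    "drug_a": [-1.8, -1.2, 0.9, 2.1, -0.4],  # roughly opposite to the disease
    "drug_b": [1.9, 1.4, -1.1, -1.8, 0.6],   # roughly the same as the disease
    "drug_c": [0.2, -0.2, 0.1, -0.1, 0.0],   # only weakly related
}

ranking = rank_candidates(disease_signature, drug_signatures)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

In this made-up data set, the drug at the top of the ranking is the one whose profile most nearly reverses the disease profile. A real analysis would involve far more genes, many more profiles, and careful statistics before any candidate reached a laboratory or clinical test.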

The opportunity this technology brings to save money on unnecessary testing is further illustrated by large pharmaceutical companies such as GlaxoSmithKline, Merck & Co, Johnson & Johnson, and Sanofi, which are turning to AI systems to streamline their research processes. In 2017, GSK began a research collaboration with Dundee, Scotland-based Exscientia, whose AI platform is used to analyze newly discovered molecules and their suitability as medicines. Exscientia Chief Executive Andrew Hopkins believes big data will change the way early-phase projects are carried out, allowing for improved target selection.

“Delivering efficiencies to drug discovery has the potential to revolutionize the way early projects are executed, enabling more dynamic target selections from the burgeoning set of opportunities,” Hopkins said.

Improved Diagnoses and Prognoses

It is well known that inaccurate diagnoses and prognoses can lead to ineffective treatment, which results in substandard patient care and additional costs. Although medical professionals are rightly trusted with making the correct decisions, AI systems can help them make better, faster, and more accurate decisions.

For example, this is being put into practice with AI systems such as IBM’s Watson, which is currently being trained by the Memorial Sloan Kettering Cancer Center in New York. Such computer systems give clinicians recommendations based on information obtained by scanning large biological databases. This information should support more accurate decisions, as a clinician’s decisions are currently based only on their experience and the limited information to which they have access. With a system such as Watson, the clinician suddenly has access to, and is able to quickly analyze, a far greater volume of research.

Potential Big Pitfalls of Big Data

Like all technological breakthroughs, big data in healthcare has some drawbacks. These can be divided into three categories:

1. Data Security

With more and more information being stored on cloud-based systems, patients’ healthcare information can be left vulnerable if it is not safeguarded effectively. These risks come from hackers, who can use patients’ private information for a range of purposes, and the rise of big data will no doubt result in a greater risk of damaging cyberattacks. Jean-Frederic Karcher, head of security for Maintel, a leading communications provider, sees healthcare-related hacking as a much larger threat than bank information theft.

“Medical information can be worth ten times more than credit card numbers on the deep web,” Karcher said. “Fraudsters can use this data to create fake IDs to buy medical equipment or drugs, or combine a patient number with a false provider number and file fictional claims with insurers.”

In addition to individuals being susceptible to the risk of hacking, organizations are also at great risk of cyberattacks. These risks will only grow as the use and accumulation of big data grows. As recently as June 2017, pharma giant Merck announced it had been hit by a cyberattack, stating on Twitter, “We confirm our company’s computer network was compromised today as part of global hack. Other organizations have also been affected.”

2. Misplacement of Information

Another risk big data brings to healthcare is that information will be stored inaccurately or in the wrong place due to human error. As more and more information from patients is uploaded to cloud-based storage, more clinics and doctors will be reliant on this information. Data may be entered under the wrong patient, or incorrect values may be entered by accident, resulting in incorrect prescriptions, diagnoses, and prognoses and creating health risks for patients.

3. Imperfect Information

Yet another problem that comes with an increase in big data is how information will be interpreted. For example, an online patient profile can show the amount of prescribed medication someone was expected to have taken, but whether the patient actually took it would not be seen on the profile. Although this problem exists throughout healthcare today, the risks associated with imperfect data could grow as reliance on patient data increases.
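As a small illustration of how the data-entry risk described under “Misplacement of Information” might be reduced, the sketch below flags values that are missing or fall outside a plausible range before they are saved. It is a minimal example under stated assumptions, not a description of any real clinical system: the field names and limits are hypothetical and are not clinical guidance.

```python
def find_implausible_values(record, plausible_ranges):
    """Return the names of fields that are missing or outside their range."""
    problems = []
    for field, (low, high) in plausible_ranges.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            problems.append(field)
    return problems


# Hypothetical limits chosen only for illustration, not clinical guidance.
PLAUSIBLE_RANGES = {
    "heart_rate_bpm": (20, 250),
    "body_temp_celsius": (30.0, 45.0),
}

# A heart rate of 72 mistyped as 720.
record = {"heart_rate_bpm": 720, "body_temp_celsius": 37.1}
flagged = find_implausible_values(record, PLAUSIBLE_RANGES)
print(flagged)
```

A check like this catches only obvious slips. It cannot tell whether a plausible-looking value was entered for the wrong patient, which is why human review remains important.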


There are risks associated with the rise of any new technology paradigm, and big data and artificial intelligence are no exception. However, in this case, the rewards outweigh the risks. AI systems, utilizing big data, are becoming increasingly powerful contributors to more effective decision-making and to the discovery of patterns and correlations that might never have been uncovered by a human team.

Human intelligence is excellent at interpreting nuance and drawing insights a computer cannot. However, computers far surpass humans’ ability to quickly find information and patterns within enormous sets of data that are next to impossible for the human mind to process.

By getting increasingly better at pairing human and computer intelligence, we will be able to find solutions, treat disease, and save lives at an ever more rapid pace. Those of us watching the advancements within the space should prepare to be continually awestruck as we find solutions and fight diseases in ways we previously could only dream of.


  1. Krista Conger. (2013). Big data = big finds: Clinical trial for deadly lung cancer launched by Stanford study. Available: Last accessed 11th September 2017.
  2. Kitty Knowles. (2016). 5 amazing ways IBM Watson is transforming healthcare. Available: Last accessed 12th September 2017.
  3. Ben Hirschler. (2017). Big pharma turns to AI to speed drug discovery, GSK signs deal. Available: Last accessed 29th September 2017.
  4. Amirah Al Idrus. (2017). GlaxoSmithKline, Exscientia ink AI-based drug discovery deal worth up to $42M. Available: Last accessed 29th September 2017.
  5. Linda Massarella. (2017). Europe cyberattack also breaches Merck headquarters in US. Available: Last accessed 30th September 2017.

