Greg Freiherr has reported on developments in radiology since 1983. He runs the consulting service, The Freiherr Group.

Blog | Artificial Intelligence | February 03, 2020

Why AI Is More Human Than You Might Believe

It is not that smart algorithms will one day become too smart, as some fear; not that smart machines will one day overshadow human intellect. Rather, the danger is that people view artificial intelligence (AI) machines as more impartial than they are, and their decisions as more objective than those of people. They are not.

I have heard wise people speak of AI with reverence, almost as if it were superhuman. They are wrong to do so. This is not to say that AI should be trivialized. AI can offer important clinical insights. And smart algorithms can save time.

Some pundits predict that AI will be fundamentally necessary for the next generation of physicians. While smart algorithms may not replace physicians, those who use them may replace those who don’t. If this statement is true, it is all the more important that the limitations of AI be appreciated.

Seeing AI Clearly

Put simply, AI has the same vulnerabilities as people do. This applies especially to machine learning (ML), the most modern form of artificial intelligence. In ML, algorithms dive deep into data sets. Their development may be weakly supervised by people. Or it may not be supervised at all.

This laissez-faire approach has led some to believe that the decisions of ML algorithms are free from human failings. But they are wrong. Here’s why.

First, even among self-taught deep learning algorithms, the parameters of their learning are established by people. Second, the data used to train these algorithms is gathered by people. Either instance can lead to the incorporation of human biases and prejudices in algorithms.

This has already happened, with negative results, in other fields of work. For example, algorithms intended as sentencing aids for judges have shown “an unnerving propensity for racial discrimination,” wrote David Magnus, Ph.D., director of the Stanford University Center for Biomedical Ethics, and his Stanford colleagues in a March 2018 issue of the New England Journal of Medicine.¹ Healthcare delivery already varies by race. “Racial biases could inadvertently be built into healthcare algorithms,” wrote Magnus and colleagues. And there is strong potential for purposeful bias.

A third-party vendor, hoping to sell an algorithm to a healthcare system, could design it to align with that system’s priorities, which may be very different from those of patients or physicians. Alignment of product and buyer is an accepted tenet of commerce.

One high priority of health systems might be the ability of the patient — either personally or through insurance — to pay for medical services. It is hard to believe that for-profit developers of algorithms would not consider this. And institutional priorities may not even be knowingly expressed.

Magnus and colleagues wrote in the NEJM article that ethical challenges to AI “need to be guarded against.” Unfortunately, such challenges could arise even when algorithms are not supervised by people. When algorithms do deep dives into data sets to discover “truths” on their own, the data might not have included some patient populations.

This could happen due to the influence of the “health-wealth” gradient, Magnus said last fall during his presidential keynote delivered at the annual meeting of the American Society for Radiation Oncology (ASTRO). This gradient can occur when patient data is only included if patients had the ability or insurance to pay for care.

And this is what could happen inadvertently. What if algorithm developers give in to greed and corruption? “Given the growing importance of quality indicators for public evaluations and determining reimbursement rates, there may be a temptation to teach machine learning systems to guide users toward clinical actions that would improve quality metrics, but not necessarily reflect better care,” the Stanford authors wrote in the NEJM. “Clinical decision support systems could also be programmed in ways that would generate increased profits for their designers or purchasers without clinical users being aware of it.”

Factoring in Quality of Life

Even if precautions are taken, and the developers of ML algorithms are more disciplined than software engineers elsewhere, there is still plenty of reason to be wary of AI. It bears noting again that the data on which ML algorithms are trained and/or do their analyses are gathered by people. As a result, this data may reflect the biases and prejudices of these people.

Additionally, results could be skewed if data on certain patient groups, such as the elderly or the very young, is not included. It should be noted that most clinical testing is done on adults. Yet that doesn’t keep the makers of OTC drugs from extrapolating dosages for children.

But algorithms trained on or just analyzing incomplete data sets would not generate results applicable to the very young or very old. Notably, I was told by one mega-vendor that its AI algorithm had not been cleared by the FDA for the analysis of pediatric cases. Its workaround for emergency departments? Report the results and state that the age of the patient could not be identified, leaving the final decision up to the attending physician.

As this algorithm is intended to identify suspicious cases, it seems reasonable to do so. But would the same apply if the algorithm is designed to help radiologists balance the risk and benefit of exposing patients to ionizing radiation? If cancer is suspected, doing so makes sense. But what about routine screening for the recurrence of cancer? What if the patient is very young? Or very old? These are just some of the myriad concerns that underscore the main point — that smart algorithms may not be so smart. At the very least, they are vulnerable to the same biases and prejudices as people are, if not in their actual design then in their analysis of clinical data. Recognizing these shortcomings is all the more important when radiologists are brought in to help manage patient care.

Seeing AI for What It Is

In summary, then, AI is not the answer to human shortcomings. Believing it is will at best lead to disappointment. The deep learning algorithms that dive into data sets hundreds, thousands or even millions of times will be only as good as the data into which they dive. The Stanford authors wrote that it may be difficult to prevent algorithms from learning and, consequently, incorporating bias. If gathered by people, the data almost assuredly will reflect the shortcomings of those who gathered it.

So, while ML algorithms may discover patterns that would otherwise escape people, their conclusions will likely be tainted. The risk presented by AI, therefore, is not that its algorithms are inhuman — but that they are, in fact, too human.

Greg Freiherr is consulting editor for ITN and has reported on developments in radiology since 1983. He runs the consulting service, The Freiherr Group.

Reference:

1. Char DS, Shah NH, Magnus D. Implementing Machine Learning in Health Care - Addressing Ethical Challenges. NEJM. 2018 Mar 15; 378(11): 981–983. doi:10.1056/NEJMp1714229