Sunday, May 26, 2024

Monkey’s Paw: How AI in Healthcare Can Go Wrong

UTSW's Dr. Christoph Lehmann discusses the unintended consequences of AI in healthcare and what the industry can do to remain ethical.

In W. W. Jacobs’ 1902 horror story The Monkey’s Paw, a man’s friend gives him a mummified monkey’s paw that will grant three wishes with terrible consequences. The man wishes for £200, and the next day, his son is killed by a machine at work. The company makes a goodwill payment to the family of £200.

Most innovations are some version of a modern-day monkey’s paw, with great power and terrible unintended consequences. Artificial intelligence, in both large language and predictive forms, is the most recent and perhaps greatest example of a monkey’s paw.

While most healthcare technology is built with the best intentions, without careful monitoring and planning, these tools can have unintended consequences that reveal our biases. A company may design artificial intelligence to read resumes for job openings and make decisions based on past hires, but because of historical biases, the AI may prefer men over women. If a credit company uses past data to determine how much credit a client gets, a husband may get more credit than his wife because of patriarchal finance history. Black people may see fewer ads about obtaining a mortgage because they have been historically denied home loans at higher rates.

D CEO Healthcare spoke with Dr. Christoph Lehmann about some of the pitfalls of artificial intelligence in the healthcare industry and how it can limit the impact of bias and preserve equity. He is a professor of pediatrics, population and data sciences, and bioinformatics at UT Southwestern, where he directs the Clinical Informatics Center. He has spoken and written about the power of artificial intelligence in healthcare and recently published an article in the Journal of the American Medical Informatics Association in which he and his co-authors lay out the association’s principles for artificial intelligence.

The Belmont Principles

Lehmann and his co-authors decided that a well-known standard for medical research called the Belmont Principles can also be applied to AI. First is beneficence, which means that all AI should be designed to help people. Lehmann pointed to Isaac Asimov’s 1942 “Three Laws of Robotics” to guide this principle. They state that “A robot may not injure a human being or, through inaction, allow a human being to come to harm. A robot must obey orders given to it by human beings except where such orders would conflict with the First Law. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.”

The next Belmont Principle is autonomy. Rather than ensuring that AI operates autonomously from humans, this principle aims to protect people’s autonomy and ensure AI isn’t used to do things to people against their will.

Third, Lehmann says AI shouldn’t be used to harm people. “It would be unethical, in my opinion, to allow AI to maximize what people have to pay for their health care,” he says.

Lastly, Lehmann prizes the principle of justice, which means that AI should be applied equally to everyone so that all might benefit. Patients should also know they are dealing with artificial intelligence. Finally, it should support social justice in our society.

Recognize Bias 

AI built by people with bias and based on data that results from a biased society will inevitably reflect those values. For example, a provider could build or use an AI tool that denies services to a population because that population is expected to default on payments based on historical data. A tool trained on data from past patients and asked to predict who will and won’t pay service fees will discriminate against populations statistically less likely to pay for the service. With the prevalence of medical bankruptcy, an organization could use that to weed out patients it doesn’t want based on demographics like race, ZIP code, or other factors.

“There may be an AI model for healthcare expenditures that is used to predict what patients would need and what healthcare interventions, while only people who could afford these healthcare interventions actually got them,” Lehmann says. “The model is based on flawed data and disadvantages people who are poor. That’s not what we should be doing in healthcare.” AI must be designed to account for and prevent these biases to provide equitable care.

Algorithmic Drift

Predictive AI creates a model trained on a particular data set, but that data set might not accurately reflect current conditions over time. If there are changes in the patient population or a new medication makes the data set obsolete, the algorithm guiding the AI will perform more poorly as conditions change.

A model built to guide the care given to diabetic or obese patients before the widespread use and prescription of Ozempic or Wegovy likely needs to be given a new set of data to reflect current conditions. Similarly, if a state expands Medicaid and a population that previously didn’t have health insurance can now get regular and preventative care, that will impact the data on which the AI model is based.

“As you apply AI, you are responsible for making sure that the model continues to perform the way you anticipated before, and you have to retrain it or modify it if it stops performing as well,” Lehmann says. “That’s a big ethical obligation as well.”


AI doesn’t have a moral compass and will do what we tell it. Humans, on the other hand, are capable of empathy and can regulate their behavior based on societal norms or laws. We wouldn’t want our worst desires broadcast to the world, so we temper ourselves and have built-in guardrails on our actions. AI doesn’t have those same guardrails unless the people writing the code find ways to build them in.

“People have used AI for crime, and it doesn’t perceive that it is criminal behavior,” Lehmann says. “AI needs to have oversight, and it has to have ways for people to complain if they think they have been harmed, and people have to be accountable for the output of AI. AI let loose by itself is a terrible thing.”

What We Don’t Know

Right now, we are in the Wild West era of AI. Just as snake oil salesmen toured from town to town taking advantage of others while the steam engine extended the reach of industrialism in the Old West, AI is both expanding our capabilities and being used nefariously today. Jobs will change, and people may get hurt, but it will also allow society to do things more effectively, safely, and quickly. “Right now, we are in the hype cycle, and everybody is out there doing their thing, and nobody thinks much about making sure it works properly,” Lehmann says.

Today, much of predictive AI is a black box. Even those who created the technology aren’t sure which bit of information is responsible for specific outcomes. In medicine, physicians will often “show their work” to nurses or pharmacists to explain how they came up with a certain dose of medicine. It is called “decision support.” We are still learning how to do the same thing with predictive AI.

Much like the monkey’s paw, using a tool we don’t fully understand can have terrible results. To maximize benefit and reduce harm, society needs to understand the why behind the results in addition to benefitting from them. “In AI, it’s hard to show your work because we often don’t understand the model,” Lehmann says. “There are AI models where we don’t know if a factor you identify as important has a negative or positive effect, and we have a natural distrust of things we don’t understand.”


Will Maddox

Will is the senior writer for D CEO magazine and the editor of D CEO Healthcare. He's written about healthcare…