Monday, July 8, 2024

How Good AIs Make Tough Choices

Rushworth Kidder, the ethicist, died 12 years ago. I never met him, but his book "How Good People Make Tough Choices" left a mark. It was required reading in many of my classes, and I still think it is the best book available on the application of philosophy to the moral problems of today.  

Why?  For a start, it is well-organized and easy to read.  Most importantly, though, it doesn't get lost in the back-and-forth that plagues some philosophical discussions.  Instead, it tries to provide a modicum of useful structure to help normal people make hard decisions.  In the tradition of some of the earliest philosophers, it is about the application of philosophical thinking to everyday life, not about abstract theorizing.

Don't get me wrong.  I am not against abstract theorizing.  I'm a futurist.  Speculation masquerading as analysis is what I do for a living, after all.  It is just that, at some point, we are all faced with tough decisions, and we can either let the wisdom of hundreds of philosophers over thousands of years inform that thinking or we can go on instinct.  William Irvine put the consequences even more directly: 

"Why is it important to have such a philosophy? Because without one, there is a danger that you will mislive—that despite all your activity, despite all the pleasant diversions you might have enjoyed while alive, you will end up living a bad life. There is, in other words, a danger that when you are on your deathbed, you will look back and realize that you wasted your one chance at living."

One of the most common questions I get asked these days sits at the intersection of these "tough choices" Kidder was talking about and artificial intelligence.  There is a lot of (justifiable) hand-wringing over what we can, and should, turn over to AIs on the one hand, and what the consequences are of not turning over enough to the AIs on the other.

For me, these questions begin with another:  What can AIs do already?  In other words, where can AIs clearly outperform humans today?  Fortunately, Stanford collates exactly these kinds of results in an annual AI Index (Note:  They don't just collate them, they also put them in plain English with clear charts--well done Stanford!).  The results are summarized in the table below:

Items in dark red are where AIs have already surpassed humans.  The light red is where there is evidence that AIs will surpass humans soon.  This table was put together with help from Claude 3, the AI I think does the best job of reading papers.  I spot-checked a number of the results and they were accurate, but your mileage may vary.  The estimated time to surpass humans is all Claude, but the time frames seem reasonable to me as well.  If you want the full details, you should check out the Stanford AI Index, which you should do even if you don't want the full details.

The most interesting row (for this post, at least) is the "Moral Reasoning" row.  Here there is a new benchmark, the MoCa benchmark for moral reasoning.  The index highlighted the emergence of harder benchmarks over the last year, stating, "AI models have reached performance saturation on established benchmarks such as ImageNet, SQuAD, and SuperGLUE, prompting researchers to develop more challenging ones."  In other words, AIs were getting so good, so fast that researchers had to come up with a whole slew of new tests for them to take, including the MoCa benchmark.

MoCa is a clever little benchmark that uses moral and causal challenges from existing cognitive science papers where humans tended to agree on factors and outcomes.  The authors of the paper then present these same challenges to a wide variety of AIs and score the AIs based on something called "discrete agreement" with human judges.  Discrete agreement appears, by the way, to be the scientific name for just plain "agreement"--go figure.  The chart below is from the AI Index, not the original paper, but it summarizes the results:

From the Stanford AI Index.  Scores are from 0-100 with higher scores equaling higher agreement with human judgement.  
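
For readers who like to see the arithmetic, here is a minimal sketch of how a 0-100 agreement score of this kind could be computed.  It assumes "discrete agreement" simply means the share of scenarios where the AI's answer matches the answer most human judges gave; the MoCa paper's exact definition may differ.  The function name and the five yes/no answers are made up for illustration and are not taken from the benchmark.

```python
def discrete_agreement(model_answers, human_majority):
    """Percent of items where the model's answer equals the human majority answer.

    Assumption: this is a plain match rate scaled to 0-100, not necessarily
    the exact metric used in the MoCa paper.
    """
    if not model_answers or len(model_answers) != len(human_majority):
        raise ValueError("need two non-empty answer lists of equal length")
    matches = sum(m == h for m, h in zip(model_answers, human_majority))
    return 100.0 * matches / len(model_answers)


# Hypothetical data: five moral scenarios with yes/no judgments.
human_majority = ["yes", "no", "yes", "yes", "no"]
model_answers = ["yes", "no", "no", "yes", "no"]

score = discrete_agreement(model_answers, human_majority)
print(f"Agreement with human judges: {score:.1f} out of 100")
```

Note that a score like this only measures whether the AI lands where most people land.  It says nothing about whether most people are right.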

If you are scoring things at home, this chart makes AIs look pretty good until you realize that the y axis doesn't include the full range of possible values (A little data-viz sleight of hand there...).  This sort of professorial nit-picking might not matter, though.  This was a study published in late 2023 and there is already a 2024 study out of the University of North Carolina and the Allen Institute that shows significant improvement--albeit on a different benchmark and with a new LLM.  Specifically, the researchers found, "that advice from GPT-4o is rated as more moral, trustworthy, thoughtful, and correct than that of the popular The New York Times advice column, The Ethicist."  See the full chart from the paper below:

Taken from "Large Language Models as Moral Experts? GPT-4o Outperforms Expert Ethicist in Providing Moral Guidance" in pre-print here:  https://europepmc.org/article/PPR/PPR859558 

While these results suggest improvement as models get larger and more sophisticated, I don't think I would be ready, anytime soon, to turn over to the AIs moral authority for the kinds of complex, time-sensitive, and often deadly decisions that military professionals routinely have to make.

OK.  

Stop reading now.

Take a breath.

(I am trying to keep you from jumping to a conclusion.)  

As you read the paragraph above (the one that begins, "While these results..."), you probably thought one of two things.  Some of you may have thought, "Yeah, the AIs aren't ready now, but they will be and soon.  It's inevitable."  Others of you may have thought, "Never.  It will never happen.  AIs simply cannot replace humans for these kinds of complex moral decisions."  Both positions have good arguments in favor of them.  Both positions also suffer from some major weaknesses.  In classic Kidder-ian fashion, I want to offer you a third way--a more nuanced way--out of this dilemma.

Kidder called this "third way forward, a middle ground between two seemingly implacable alternatives" a trilemma.  He felt that taking the time to try to re-frame problems as trilemmas was an enormously useful way to help solve them. It was about stepping back long enough to imagine a new way forward.  The role of the process, he said, "is not always to determine which of two courses to take. It is sometimes to let the mind work long enough to uncover a third."

What is this third way? Once again, Kidder comes in handy.  He outlined three broad approaches to moral questions:

  • Rules-based thinking (e.g. Kant and the deontologists, etc.)
  • Ends-based thinking (e.g. Bentham and the utilitarians, etc.)
  • Care-based thinking (e.g. The Golden Rule and virtually every religion in the world)

Each of these ways of looking at moral dilemmas intersects with AIs and humans in different ways.

AI is already extremely good at rules-based thinking, for example.  We see this in instances as trivial as programs that play Chess and Go, and we see it in military systems as sophisticated as Patriot and Phalanx.  If we can define a comprehensive rule set (a big “if”) that reliably generates fair and good outcomes, then machines likely can and should be allowed to operate independently.

Ends-based thinking, on the other hand, requires machines to be able to reliably forecast outcomes derived from actions, including second-, third-, fourth-, and higher-order consequences.  Complexity Theory (specifically the concept of sensitive dependence on initial conditions) suggests that perfect forecasting is a mathematical impossibility, at least in complex scenarios.  Beyond the math, practical experience indicates that perfection in forecasting is an unrealistic standard.  All this, in turn, suggests that the standard for a machine cannot be perfection.  Rather, it should be “Can it do the job better than a human?”
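
If sensitive dependence on initial conditions sounds abstract, a few lines of Python make it concrete.  The sketch below uses the logistic map, a standard textbook example of a chaotic system that I am borrowing as a stand-in; it is not a model of any real decision.  The starting values, the size of the nudge, and the number of steps are arbitrary choices for illustration.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map; r=4.0 is a standard chaotic setting."""
    return r * x * (1.0 - x)


def trajectory(x0, steps):
    """Return the starting value followed by `steps` iterations of the map."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic_map(xs[-1]))
    return xs


steps = 60
a = trajectory(0.2, steps)          # the "true" starting point
b = trajectory(0.2 + 1e-10, steps)  # the same point, measured a hair wrong

# Gap between the two runs at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]
early_gap = gaps[5]
largest_gap = max(gaps)

print(f"gap after 5 steps:       {early_gap:.2e}")
print(f"largest gap in 60 steps: {largest_gap:.2f}")
```

The two runs start out indistinguishable and end up having nothing to do with each other.  A forecaster with a tiny measurement error is fine for a few steps and then effectively guessing, which is why "better than a human" is a more sensible bar than "perfect."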

The “Can the machine do the job better than a human?” question is actually composed of at least three different sub-questions:
  • Can the machine do the job better than all humans?  An appropriate standard for zero-defect environments.
  • Can the machine do the job better than the best humans?  An appropriate standard for environments where there is irreducible uncertainty.
  • Can the machine do the job better than most humans?  A standard that is appropriate where solutions need to be implemented at scale.

If "the job" we are talking about is forecasting, it turns out that the answer, currently, is: Not so much. Philipp Schoenegger, from the London School of Economics, and Peter Park from MIT recently posted a paper to arXiv where they showed the results of entering GPT-4 into a series of forecasting challenges on Metaculus. For those unfamiliar with Metaculus, it is a public forecasting platform that looks to crowdsource answers to questions such as "Will the People's Republic of China control at least half of Taiwan before 2050?" or "Will there be Human-machine intelligence parity before 2040?"

The results of the study? Here, I'll let them tell you:
"Our findings from entering GPT-4 into a real-world forecasting tournament on the Metaculus platform suggest that even this state-of-the-art LLM has unimpressive forecasting capabilities. Despite being prompted with established superforecasting techniques and best-practice prompting approaches, GPT-4 was heavily outperformed by the forecasts of the human crowd, and did not even outperform a no-information baseline of predicting 50% on every question."

Ouch.
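
To see why losing to "50% on every question" stings, it helps to look at how forecasts are typically scored.  The sketch below assumes a Brier score (the average squared gap between the forecast probability and what actually happened, where lower is better), which is a common choice in forecasting tournaments; the quote above does not say which measure was used.  The outcomes and the "overconfident" forecasts are invented for illustration and are not data from the study.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is perfect and 1.0 is confidently wrong every time.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need two non-empty lists of equal length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical results for five yes/no questions (1 = it happened).
outcomes = [1, 0, 0, 1, 0]

# The no-information baseline: say 50% to everything.
baseline = [0.5] * len(outcomes)

# A hypothetical forecaster who is confident and mostly wrong.
overconfident = [0.1, 0.9, 0.2, 0.3, 0.8]

baseline_score = brier_score(baseline, outcomes)
overconfident_score = brier_score(overconfident, outcomes)

print(f"baseline (always 50%): {baseline_score:.3f}")
print(f"overconfident:         {overconfident_score:.3f}")
```

Always answering 50% earns a Brier score of exactly 0.25 no matter what happens.  Anything that scores worse than that is, on balance, subtracting information rather than adding it.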

Ends-based thinking is very much a part of most military decisions. If AIs don't forecast well and ends-based thinking requires good forecasting skills, then it might be tempting to write AIs off, at least for now. The trilemma approach helps us out in this situation as well, however. Powerful stories are starting to appear of hybrid human/machine teams accomplishing more than either machines or humans alone. As more and more of these stories accumulate, it should be possible to detect the "golden threads," the key factors that allow the human and machine to optimally integrate.

Finally, Kidder defined care-based thinking as “putting love for others first.”  It is here that machines are at their weakest against humans.  There are no benchmarks (yet) for concepts such as “care” and “love.”  Furthermore, no one seems to expect these kinds of true feelings from an AI anytime soon.  Likewise, care-based thinking requires a deep and intuitive understanding of the multitude of networks in which all humans find themselves embedded.  

While the machines have no true ability to demonstrate love or compassion, they can simulate these emotions quite readily.  Whether it is because of anthropomorphic bias, the loneliness epidemic, or other factors, humans can and do fall in love with AIs regularly.  This tendency turns the AIs' weakness into a strength in the hands of a bad faith actor.  AIs optimized to elicit sensitive information from unsuspecting people are likely already available or will be soon.

Beyond the three ways of thinking about moral problems, Kidder went on to define four scenarios that are particularly difficult for humans and are likely to be equally challenging for AIs. Kidder refers to these as “right vs right” scenarios, “genuine dilemmas precisely because each side is firmly rooted in one of our basic, core values.” They include:
  • Truth vs. loyalty
  • Individual vs. community
  • Short-term vs. long-term
  • Justice vs. mercy

Resolving these kinds of dilemmas involves more than just intelligence. These kinds of problems seem to require a different characteristic--wisdom--and wisdom, like intelligence, can, theoretically at least, be artificial.

Artificial Wisdom is a relatively new field (almost 75% of the articles in Google Scholar that mention Artificial Wisdom have been written since 2020). The impetus behind this research seems to be a genuine concern that intelligence is not sufficient for the challenges that face humanity. As Jeste et al. put it, “The term ‘intelligence’ does not best represent the technological needs of advancing society, because it is ‘wisdom’, rather than intelligence, that is associated with greater well-being, happiness, health, and perhaps even longevity of the individual and the society.”

I have written about artificial wisdom elsewhere and I still think it is a useful way to think about the problem of morality and AIs. For leaders, "wisdom" is a useful shorthand for communicating many of the concerns they have about turning operations, particularly strategic operations, over to AIs. I think it is equally useful for software developers, however. Wisdom, conceptually, is very different from intelligence but no less desirable. Using the deep literature about wisdom to help reframe problems will likely lead to novel and useful solutions.