Monday, July 24, 2023

Generative AI Is Like A ...

This will make sense in a minute...
Don't worry!  I'm going to fill in the blank, but before I do, have you played around with generative AI yet?  

If not, let's solve that problem first.

Go to Perplexity.ai--right now, before you read any further--and ask it a question.  Don't ask it a question it can't know the answer to (like, "What did I have for lunch?"), but do ask it a hard question that you do know the answer to (or for which you are at least able to recognize a patently bad answer).  Then, ask Perplexity some follow-up questions.  One or two should be enough.

Come back when you are finished.

Now rate the answers you got on a scale from 1-10.  One or two is a dangerous answer, one that could get someone hurt or cause real problems.  Give a nine or ten to an actionable answer, one that you could use right now, as is.

I have had the opportunity to run this exercise with a large number of people at a variety of conferences and training events over the last six months.  First, I consistently find that only about a third of the crowd has ever used any generative AIs (like Perplexity or ChatGPT), though that number seems to be going up (as you would expect) over time.

I have rarely heard anyone give an answer a one or two, and I have always had at least a couple of people rate the answer they received a nine or ten.  Other members of each audience typically give scores that range across the spectrum, of course, but the average seems to be about a six.

Yesterday, I gave this same exercise to about 30 people, and there were no 1's or 2's, while three people (10%) gave their answer a 9 or 10.  No one rated the answer they received lower than a 5.  No one.

While anecdotal, this captures a trend that has been thoroughly documented across a number of different domains:  Generative AI isn't hitting like a freight train.  It's hitting like one of those high-speed Japanese bullet trains, vaporizing traditional paradigms so quickly that they still don't know they are already dead (For example...).

Or is it?

Thanks to some forward-thinking policy guidance from the leadership here at the Army War College, I, along with my colleagues Dr. Kathleen Moore and LTC Matt Rasmussen, were able to teach a class for most of last year with the generative AI switch set to "on."  

The class is called the Futures Seminar and is explicitly designed to explore futures relevant to the Army, so it was perfectly appropriate for an exploration of AI.  It is also an all-year elective course, so we were able to start using these tools when they first hit the street in November 2022 and continue to use them until the school year ended in June.  Finally, Futures Seminar students work on research questions posed by Army senior leaders, so lessons learned from this experience ought to apply to the real world as well.

We used generative AIs for everything.  We used them for brainstorming.  We used them to critique our analysis.  We used them to red-team.  We created our own bots, like DigitalXi, which was designed to take the perspective of Xi Jinping and answer our questions as he would.  We visualized using Midjourney and DALL-E 2 (see the picture above, made with Midjourney).  We cloned people's voices and created custom videos.  We tapped into AI aggregation sites like Futurepedia and There's An AI For That to find tools to help create everything from custom soundtracks to spreadsheets.

We got lots of feedback from the students and faculty, of course, both formal and informal.  We saw two big trends.  The first is that people either start at the "AI is going to save the earth" end of the spectrum or the "AI is going to destroy the earth" end.  For people who haven't tried it yet, there seems to be little middle ground.  

The second thing we saw is that, over time and much as you would expect, people develop a more nuanced view of AI the more they use it.

In the end, if I had to boil down all of the comments and feedback, it would be this:  Generative AI is like a blazingly fast, incredibly average staff officer.

Let me break that down a bit.  Generative AI is incredibly fast at generating an answer.  I think this fools people, though.  It makes it seem better than it actually is.  On real-world problems, with second- and third-order causes and consequences that have to be considered, the AIs (and we tried many) were never able to just nail it.  They were particularly bad at seeing and managing the relationships between the moving pieces of complex problems and particularly good at doing administrivia (I got it to write a great safety SOP).  In the end, the products were average, sometimes better, sometimes worse, but, overall, average.  That said, the best work tended to come not from an AI alone or a student alone, but from the human and machine working together.

I think this is a good place for USAWC students to be right now.  The students here are 25-year military professionals who have all been successful staff officers and commanders.  They know what good, great, average, and bad staff work looks like.  They also know that, no matter what the staff recommends, if the commander accepts it, the work becomes the commander's.  In other words, if a commander signs off on a recommendation, it doesn't matter if it came from two tired majors or a shiny new AI.  That commander now owns it.  Finally, our students are comfortable working with a staff.  Seeing the AI as a staff officer instead of as an answer machine is not only a good place for them to be mentally, but also likely to be the place where the best work is generated.

Finally, everyone--students and faculty alike--noted that this is where AI currently is.  Everyone expects it to get better over time, for all those 1's and 2's from the exercise above to disappear and for the 9's and 10's to grow in number.  No one knows what that truly means, but I will share my thoughts on this in the next post. 

While all this evidence is anecdotal, we also took some time to run some more formal studies and more controlled tests.  Much of that is still being written or shopped around to various journals, but two bits of evidence jumped out at me from a survey conducted by Dr. Moore.

First, she found that our students, who had worked with AI all year, perceived it as likely to be 20% more useful to the Army than the rest of the student body did (and 31% more useful than the faculty did).  Second, she also found that 74% of Futures Seminar students walked away from the experience thinking that the benefits of developing AI outweigh the risks, with only 26% unsure.  General population students were much more risk averse, with only 8% convinced the benefits outweigh the risks, a whopping 55% unsure, and 37% saying the risks outweigh the benefits.

This last finding highlights something of which I am now virtually certain:  The only real way to learn about generative AI is to use it.  No amount of lectures, discussions, PowerPoints, or what have you will replace just sitting down at a computer and using these tools.  What you will find is that your own view will become much more informed, much more quickly, and in much greater detail than with any other approach you might take to understand this new technology.

Gaining this understanding is critical.  Generative AI is currently moving at a lightning pace.  While there is already some talk that the current approach will reach a point of diminishing returns in the future due to data quality, data availability, and cost of training, I don't think we will reach this point anytime soon.  Widely applicable, low-cost AI solutions are no longer theoretical.  Strategic decisionmakers have to start integrating their impact into their plans now.

2 comments:

D Mitchell said...

Kris, did you test the students' beliefs about AI before they started working with it to get a baseline? Your conclusion rests on the idea that those who work with the AI tools gain a greater understanding of them and as a result become believers. Did your students start with an implicit bias towards AI adoption, as might be the case for individuals who signed up to be in the Futures Seminar? Also, to what extent did you see the 'facts-adjacent' problem impact the quality of the AI's performance in the research studies? This could have a grave impact on the results of command decisions regardless of whether people felt like the results were not a 1 or 2 - I hardly need point out that a commander making a good decision on poor information combined with probabilistic impacts can get people killed.

Kristan J. Wheaton said...

D Mitchell, we did not. The Dean wanted a survey of students' current attitudes, and we realized that the Futures Seminar cohort would likely throw off the averages, so we separated them out. All of your concerns are legit, and this should be viewed as preliminary analysis/results. I do have to say that I kind of doubt the size of the difference will turn out to be insignificant, but I am happy to wait until Kathleen runs all the numbers. I also don't find the difference that surprising. There is almost nothing that people don't understand better once they get a chance to do it versus just talk about it. I would also say that "believer" is the wrong word. They become "better informed consumers." No one jumped on the AI train without reservations. Everyone still had concerns, but they were able to see possibilities as well. Have you tried out generative AIs? What has been your experience?