Tuesday, March 4, 2008

Part 5 -- A Surprise Ending (What Do Words Of Estimative Probability Mean?)

Part 1 -- Introduction
Part 2 -- To Kent And Beyond
Part 3 -- The Exercise And Its Learning Objectives
Part 4 -- Teaching Points

So far in this series, I have discussed the issues surrounding the use of Words Of Estimative Probability as a way of communicating the results of intelligence analysis to real-world decisionmakers. I have tried to devise an exercise that can demonstrate to intelligence studies students that, while a consistent and limited series of so-called "good" WEPs (like the ones the National Intelligence Council (NIC) has adopted for use in its recent National Intelligence Estimates (NIEs)) constitutes the current "best practice" in communicating the results of analysis, it is far from a perfect system. Studies both within the intelligence community and from fields such as medicine, finance and meteorology have all demonstrated that people assign only roughly consistent meanings to WEPs -- that one person's "likely" is another person's "virtually certain".

As I looked at the data from my recent round of this classroom exercise, though, I began to notice something interesting. There seemed to be a level of consistency in the data that I had not noticed before. Was it there previously and I just missed it? I don't know. I don't typically keep the data from these exercises, and the only reason I had this batch was that it was buried in one of the many piles of paper I have in my office (I believe in that ancient organizational system -- mounding).

I decided to take a closer look at the data. I was surprised by what I saw. While some individuals were throwing the full range out of whack (and keeping the teaching points in the exercise relevant), these were clearly statistical outliers. The bulk of the students were congregating quite nicely around an approximately ideal trendline. To be sure, the results were still off in places, but they were much closer to optimal than I expected.

I have reproduced the aggregate results in a chart below. I have used what financial analysts call a high-low-close chart that marks the average high score, the average low score and the average point value for each WEP. I have also included the idealized trendline and have connected the high and low averages so you can see how the range fluctuates as the probabilities associated with each WEP increase.

If you want to see the raw data, I have included it in the chart below:

(Notes on the chart: The "High" column represents the average high score while the "Low" column represents the average low score for each WEP. The "Odds" column represents the average point value given for each WEP. The "High-Low" column represents the range (the difference between the high and low scores) for each WEP. The "Odds-odds" column represents the difference between the average point value of one WEP and that of the next. N=18)
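For anyone who wants to run the same arithmetic on their own classroom data, here is a minimal sketch in Python of how the columns described in the notes could be computed. The WEP labels and the numbers are made up for illustration and are not the data from this exercise; the function name `summarize` and the layout of one (low, high, point) answer per student are assumptions as well. I have also assumed that the "Odds-odds" column is each WEP's average point value minus that of the WEP before it.

```python
from statistics import mean

# HYPOTHETICAL responses, not the data from the exercise.
# Each tuple is one student's (low, high, point) estimate in percent,
# and the WEPs are listed from least to most probable.
responses = {
    "unlikely": [(10, 30, 20), (20, 40, 30)],
    "likely": [(60, 80, 70), (50, 90, 70)],
    "virtually certain": [(85, 99, 95), (90, 100, 95)],
}

def summarize(responses):
    """Build one summary row per WEP, in the order the WEPs are given."""
    rows = []
    previous_odds = None
    for wep, answers in responses.items():
        low = mean(a[0] for a in answers)    # "Low": average low score
        high = mean(a[1] for a in answers)   # "High": average high score
        odds = mean(a[2] for a in answers)   # "Odds": average point value
        rows.append({
            "wep": wep,
            "high": high,
            "low": low,
            "odds": odds,
            "high_low": high - low,          # "High-Low": the range
            # "Odds-odds": step up from the previous WEP (none for the first)
            "odds_odds": None if previous_odds is None else odds - previous_odds,
        })
        previous_odds = odds
    return rows

for row in summarize(responses):
    print(row)
```

A narrow "High-Low" value means the students broadly agree on what a word means, and roughly equal "Odds-odds" steps mean the words are spread evenly along the probability scale.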

While I know there are statistical nuances that I have not accounted for in the way I have calculated and displayed the data, the overall pattern seems to suggest to me that there may be something interesting going on here. We can be pretty adamant about the use of good WEPs here at Mercyhurst. The students in this exercise have been exposed to that thinking and it seems to have calibrated their use of WEPs to a certain degree.

There is, in fact, precedent for this kind of calibration. According to Rachel Kesselman's early results, the medical profession, with outside pressure from the insurance industry, has adopted a more or less "accepted" meaning for a number of WEPs (used primarily in prognostic statements to patients and their families). The same thing might well be happening here. (Note: My colleague, Steve Marrin, has written a number of papers on the more general aspects of the medical analogy to the intelligence profession. All are worth checking out.)

The key, in all these cases, seems to be outside pressure. In the case of our students, the pressure comes from the professors. In the case of the medical profession, the pressure comes from the insurance companies. I have already argued that the potential for public exposure of the results of NIEs is one of the primary drivers behind a more consistent and rigorous approach to the communication of estimates in general. It may well be that this potential for public exposure will force the meanings of WEPs to collapse around certain estimative ranges as well.


ctwardy said...

I'm intrigued by the consistency chart. Have you kept data in subsequent years?

Kristan J. Wheaton said...

I have, without any particularly good reason, failed to do so. Probably because I do not teach that particular class as much anymore. Sorry!