
Tuesday, March 25, 2014

Reduce Bias In Analysis: Why Should We Care? (Or: The Effects Of Evidence Weighting On Cognitive Bias And Forecasting Accuracy)

We have done much work in the past on mitigating the effects of cognitive biases in intelligence analysis, as have others. 

(For some of our work, see Biases With Friends, Strawman, Reduce Bias In Analysis By Using A Second Language or Your New Favorite Analytic Methodology: Structured Role Playing.)
(For the work of others, see (as if this weren't obvious) The Psychology of Intelligence Analysis or Expert Political Judgment or IARPA's SIRIUS program.)

This post, however, is indicative of where we think cognitive bias research should go (and in our case, is going) in the future. 

Bottom line: Reducing bias in intelligence analysis is not enough and may not be important at all.

What analysts should focus on is forecasting accuracy. In fact, our current research suggests that a less biased forecast is not necessarily a more accurate forecast.  More importantly, if indeed bias does not correlate with forecasting accuracy, why should we care about mitigating its effects?

In a recent experiment with 115 intel students, I investigated a mechanism that I think operates at the root of the cognitive bias debate: evidence weighting.

As I surveyed the cognitive bias literature, key phrases began to stand out, such as:
A positive-test strategy (Ed. Note: we are talking about confirmation bias here) is "the tendency to give greater weight to information that is supportive of existing beliefs" (Nickerson 1998). In this way, confirmation bias not only appears in the process of searching for evidence, but in the weighting and diagnosticity we assign to that evidence once located.
The research of Cheikes et al. (2004) and Tolcott et al. (1989) indicates that confirmation bias "was manifested principally as a weighting and not as a distortion bias." Further, the Cheikes article notes that "ACH had no impact on the weight effect," having tested both elicitations of the bias (in evidence selection and in evidence weighting).
Emily Pronin (2007), the leading authority on Bias Blind Spot, presents a similar conclusion: "Participants not only weighted introspective information more in the case of self than others, but they concurrently weighted behavioral information more in the case of others than self."
Robert Jervis, professor of International Affairs at Columbia University, speaks about evidence-weighting issues in the context of the Fundamental Attribution Error in his 1989 work Strategic Intelligence and Effective Policy.
What if the impact of bias in analysis is less about deciding which pieces of evidence to use and more about deciding how much influence to allocate towards each specific piece?  This would mean that to mitigate the effects of cognitive bias and to improve forecasting accuracy, training programs should focus on teaching analysts how to weight and assess critical pieces of evidence.

With that question in mind, I designed a simple experiment with four distinctly testable groups to assess the effects of evidence weighting on a) cognitive bias and b) forecasting accuracy. 

Each of the four groups was required to spend approximately one hour conducting research on the then-upcoming Honduran presidential election to determine a) who was most likely to win and b) how likely they were to win (in the form of a numerical probability estimate, e.g. "X is 60 percent likely to win"). Each group, however, used varying degrees of Analysis of Competing Hypotheses (ACH), allowing me to manipulate how much or how little the participants could weight the evidence. A description of each of the four groups is below (a rough sketch of how ACH turns ratings and weights into hypothesis scores follows the list):
  • Control group (Cont, N=28). The control group was not permitted to use ACH at all. They had one hour to conduct research independently with no decision support tools. 
  • ACH no weighting (ACH-NW, N=30). This group used ACH. Participants used the PARC 2.0.5 ACH software without the ability to use the II (highly inconsistent) or CC (highly consistent) ratings, nor were they allowed to use the credibility or relevance functions.
  • ACH with weighting (ACH-W, N=30). This group used ACH as they had been instructed, including II, CC and relevance, but not credibility.
  • ACH with training (ACH-T, N=27). This was the group of primary interest for the experiment. Participants in this group, which used ACH with full functionality (excluding credibility), first underwent a 20-minute instructional session on evidence weighting and source reliability employing the Dax Norman Source Evaluation Scale and other instructional material. In other words, these participants were taught how to weight evidence properly.
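To make the weighting manipulation concrete, here is a minimal sketch of how an ACH-style tool might turn consistency ratings into hypothesis scores. The numeric values for I and II, the relevance multipliers, and the evidence matrix are all illustrative assumptions, not the actual PARC 2.0.5 implementation; the point is simply that the relevance weights, and not just the ratings, can change which hypothesis looks least inconsistent.

```python
# Illustrative, assumed scoring scheme (NOT the actual PARC 2.0.5 logic):
# II (highly inconsistent) = -2, I (inconsistent) = -1; neutral and
# consistent ratings contribute nothing to the inconsistency score.
RATING_VALUES = {"II": -2.0, "I": -1.0, "N": 0.0, "C": 0.0, "CC": 0.0}

# Assumed relevance multipliers, one per evidence item.
RELEVANCE = {"low": 0.5, "medium": 1.0, "high": 1.5}

def inconsistency_score(ratings, relevance):
    """Sum the inconsistency values, each scaled by its item's relevance."""
    return sum(RATING_VALUES[r] * RELEVANCE[w] for r, w in zip(ratings, relevance))

# Hypothetical 4-item evidence matrix for a two-candidate race.
relevance = ["high", "medium", "low", "high"]
scores = {
    "Candidate A": inconsistency_score(["C", "I", "II", "C"], relevance),
    "Candidate B": inconsistency_score(["I", "C", "C", "I"], relevance),
}
print(scores)  # {'Candidate A': -2.0, 'Candidate B': -3.0}

# In ACH the least inconsistent hypothesis (score closest to zero)
# survives: here Candidate A. Strip out the weights (treat every item as
# relevance 1.0 and every II as a plain I, roughly the ACH-NW condition)
# and both candidates score -2.0: the weighting, not the raw ratings,
# breaks the tie.
```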
While the election prediction served as the metric for assessing forecasting accuracy (the experiment was conducted two weeks before the election), five separate instruments were administered as a post-test in order to elicit bias: three corresponded to confirmation bias, one addressed the framing effect and one addressed representativeness.

The results were intriguing:

The group with the most accurate forecasts (79 percent) was the control group, or the group that did not use ACH at all (See Figure 1). The next most accurate group (65 percent) was the ACH-T group, or the ACH "with training." 


Figure 1. The Effects of Evidence Weighting Across Four Groups on Forecasting Accuracy and Cognitive Bias.
Note: The percentage for each bias represents the percentage of unbiased responses obtained in that group.
Due to the small sample sizes, these differences did not turn out to be statistically significant, which, in turn, suggests the first major point: training in cognitive bias mitigation and some structured analytic techniques might not be as useful as originally thought.
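For readers who want to run this kind of check themselves, a standard way to compare two groups' accuracy rates is a chi-square test of independence on the correct/incorrect counts (or Fisher's exact test, which is more conservative at these sample sizes). The counts below are hypothetical stand-ins roughly matching the reported percentages, not the experiment's raw data.

```python
# Comparing two groups' forecast accuracy with a chi-square test.
# Counts are hypothetical stand-ins roughly matching the reported rates
# (Control ~79% of N=28, ACH-T ~65% of N=27), NOT the raw data.
from scipy.stats import chi2_contingency, fisher_exact

table = [
    [22, 6],   # Control: 22 correct, 6 incorrect (~79 percent)
    [18, 9],   # ACH-T:   18 correct, 9 incorrect (~67 percent)
]

chi2, p_chi2, dof, _ = chi2_contingency(table)
odds_ratio, p_fisher = fisher_exact(table)

# With N this small, a double-digit gap in accuracy can easily fail to
# reach significance at the conventional p < 0.05 threshold.
print(f"chi-square p = {p_chi2:.3f}, Fisher exact p = {p_fisher:.3f}")
```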

If this were the first time these kinds of results had been found, it might be possible to chalk it up to some sort of sampling error. But Drew Brasfield found much the same thing when he ran a similar experiment back in 2009 (the relevant charts and text are on pages 38-39). In Brasfield's case, participants were statistically significantly less biased when they used ACH, but forecasting accuracy remained statistically indistinguishable (though, in that experiment, the ACH group technically outperformed the control).

These results also suggest that the more accurate estimates came from analysts who either a) intuitively weighted evidence without the help of a decision tool or b) were instructed how to use the decision tool with special focus on diagnosticity and evidence weighting. This could mean that analysts, when given the opportunity to weight evidence without knowing how much weighting and diagnosticity impact results, weight incorrectly, whether out of a perceived obligation to use the feature or out of simple misunderstanding.

Finally, the next lowest forecasting accuracy was obtained by the ACH-NW group (53 percent), in which the analysts were not allowed to weight evidence at all (no IIs or CCs). The lowest accuracy (only 45 percent) was obtained by the group that was permitted to weight evidence with the ACH decision tool but was neither instructed how to do so nor informed how this weighting might influence the final inconsistency scores of their hypotheses. This final difference from the control was statistically significant, suggesting that a failure to train analysts in how to weight evidence appropriately actually lowers forecasting accuracy.

If that weren't enough, let's take one more interesting look at the data...

In terms of analytic accuracy, the hierarchy is as follows (from most to least accurate): Control, ACH-T, ACH-NW, ACH-W.

Now, in terms of most biased, the hierarchy looks something like this (from least to most biased):
  • Framing: ACH-T, ACH-W, ACH-NW, Control
  • Confirmation: ACH-W, ACH-T, Control, ACH-NW
  • Representativeness: ACH-W, ACH-NW, Control, ACH-T
What this shows is an (albeit imperfect) inverse relationship to analytic accuracy. In other words, the more accurate groups were also the more biased ones; while ACH generally helped mitigate bias, it did not improve forecasting accuracy (in fact, it may have done the opposite). If this experiment achieved its goal and effectively measured evidence weighting as an underlying mechanism of both forecasting accuracy and cognitive bias, it supports the claim made by Cheikes et al. above that "ACH had no impact on the weight effect" (again, with respect to confirmation bias) and, as mentioned, replicates the results found by Brasfield.
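One quick way to see how "imperfect" that inverse is: convert each ordering above into ranks and compute a Spearman rank correlation between the accuracy ranking and each bias ranking. This is purely illustrative arithmetic on the orderings already listed, not an additional analysis from the experiment.

```python
# Rank correlation between the accuracy ordering and each bias ordering
# reported above (1 = most accurate / least biased). Illustrative only.
from scipy.stats import spearmanr

groups = ["Cont", "ACH-T", "ACH-NW", "ACH-W"]
accuracy = {"Cont": 1, "ACH-T": 2, "ACH-NW": 3, "ACH-W": 4}

bias_rankings = {
    "framing":            {"ACH-T": 1, "ACH-W": 2, "ACH-NW": 3, "Cont": 4},
    "confirmation":       {"ACH-W": 1, "ACH-T": 2, "Cont": 3, "ACH-NW": 4},
    "representativeness": {"ACH-W": 1, "ACH-NW": 2, "Cont": 3, "ACH-T": 4},
}

for name, ranks in bias_rankings.items():
    rho, _ = spearmanr([accuracy[g] for g in groups],
                       [ranks[g] for g in groups])
    # Negative rho: groups that ranked higher on accuracy tended to rank
    # lower on "least biased", i.e. more accurate ~ more biased.
    print(f"{name:>18}: rho = {rho:+.1f}")
# framing and confirmation come out at rho = -0.4; representativeness at -0.8
```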

While the evidence weighting hypothesis is obviously in need of further investigation, this preliminary experiment provided initial results with some intriguing implications, the most impactful of which is that, while the use of ACH reduces the effects of cognitive bias, it may not improve forecasting accuracy. A less biased forecast is not necessarily a more accurate forecast. 



***
As a side note, I wanted to include this self-reported data showing the components that the 115 analysts in this experiment indicated were most influential in their final analytic estimates. Note that source reliability and availability of information appear to be the top two (see Figure 2).

Figure 2. Self-Reported Survey Data of 115 Analysts Indicating Factors That Most Influence Their Analytic Process
Scale = 1 - 4

REFERENCES

Cheikes, B. A., Brown, M. J., Lehner, P. E., & Adelman, L. (2004). Confirmation bias in complex analyses (51MSR114-A4). Bedford, MA: MITRE, Center for Integrated Intelligence Systems.

Jervis, R. (1989). Strategic intelligence and effective policy. In Intelligence and security perspectives for the 1990s. London, UK: Frank Cass.

Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2(2), 175-220.

Pronin, E., Gilovich, T., & Ross, L. (2002). The bias blind spot: Perceptions of bias in the self versus others. Personality and Social Psychology Bulletin, 28(3), 369-381.

Pronin, E., & Kugler, M. B. (2007). Valuing thoughts, ignoring behavior: The introspection illusion as a source of the bias blind spot. Journal of Experimental Social Psychology, 43(4), 565-578.

Tolcott, M. A., Marvin, F. F., & Lehner, P. E. (1989). Expert decisionmaking in evolving situations. IEEE Transactions on Systems, Man, and Cybernetics, 19(3), 606-615.

Tuesday, March 4, 2014

The Mind's Lie AKA "Biases With Friends" (Free Beta App)

BLUF: The Mind's Lie is a free Android gaming app, available now on the Google Play Store. It is similar to games such as Words With Friends in that you play against real people, not against the machine.

The game is designed to implicitly teach you and the other players (up to six players per game) to recognize confirmation bias, anchoring bias, stereotyping/representativeness bias, projection/mirror imaging bias, bias blind spot and the fundamental attribution error in more or less realistic situations. It is based on a successful tabletop game I designed.

Background:  A few years ago, I was inspired by IARPA's SIRIUS program (which seeks to develop a video game which will teach analysts to recognize and mitigate the effect of the six specific cognitive biases listed above) to try to come up with my own game that would do at least some of the same things.

I don't know how to design video games, though, so I did what I could do - design a tabletop game.  Called The Mind's Lie, it uses an argumentation mechanic to implicitly teach players how to recognize variants of the same six biases that IARPA is testing in the SIRIUS Project.  

Eventually, through some good fortune, I did get to be involved in SIRIUS as a part of a team that Boeing put together.  Mel Richey, who worked with me on that Boeing team, eventually tested The Mind's Lie using people from all over the US and showed that it seems to work - the more you play it, the better you get at identifying the presence of the six biases in more-or-less realistic scenarios.  


Since then, we have been using The Mind's Lie in a series of workshops and in class.  I have been encouraged by the fact that it seems to work best with people who understand that bias is a persistent risk in their day-to-day work - people like lawyers, soldiers and, yes, intelligence analysts.

At about the same time, I was asked to submit an idea for a senior project to the software engineers at Penn State (the Behrend Campus). I have done this in the past, and we had explored the possibilities of a balloon-based surveillance system and a Bayesian calculator for analysts (among other ideas) together.

What I wanted this time, though, was to turn The Mind's Lie into a Words With Friends-type game. I wanted people to be able to re-create the experience of playing The Mind's Lie around a table while on the go. The engineers, Steve Chalker, Joe Grise and Kit Torelli, along with their professor, Dr. Matt White, decided to turn my game into an Android app.

Nearly a year later, the app is here.  It is not perfect - it's a beta version (at best), but it is out there and free to download and play.  Hope you like it!

Thursday, May 27, 2010

The Effects Of Labels On Analysis (Thesis Months)

(Note: At the risk of making this an all-Jeff-Welgan blog, I thought this week I would cover Jeff's thesis work on the effects of labels on analysis right on the heels of last week's discussion of his work embedded in the new book, Hyperformance).

Does a name matter? Shakespeare says, "No, a rose by any other name would smell as sweet," but most psychologists would disagree. The well-known "framing effect" shows that the way a question is asked can determine how people will answer it. Likewise, psychological campaigns aimed at dehumanizing an enemy often accompany wars.

Jeff Welgan, in his thesis, The Effects Of Labels On Analysis, tests these ideas in the realm of intelligence analysis. Some of you may remember taking Jeff's survey last year. In it, he presented a fictitious scenario set in the Horn of Africa. Each participant was asked to read an identical report of an activity. The only thing that changed was the word used to describe the group conducting the activity. Specifically, Jeff tested the words "group", "insurgent", "rebel", "militia" and "terrorist". He hypothesized that the specific word used would affect the analytic conclusions that participants would draw.
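To picture the manipulation: every participant saw the same report text with a single token swapped in, one label per condition. The scenario sentence and assignment scheme below are invented for illustration; they are not Jeff's actual survey instrument.

```python
# Toy illustration of the between-subjects label manipulation: the report
# text is held constant and only the group label varies by condition.
# The sentence below is invented for illustration, not the actual survey.
import random

LABELS = ["group", "insurgent", "rebel", "militia", "terrorist"]

TEMPLATE = ("Yesterday, {actor} operating in the Horn of Africa seized a "
            "shipment of medical supplies bound for the regional capital.")

def with_article(noun):
    """Prefix 'a' or 'an' so every label reads naturally in the template."""
    return ("an " if noun[0] in "aeiou" else "a ") + noun

def assign_label(participant_id, seed=0):
    """Deterministically assign one label condition per participant."""
    return random.Random(f"{seed}-{participant_id}").choice(LABELS)

for pid in range(3):
    label = assign_label(pid)
    print(f"P{pid} ({label}): {TEMPLATE.format(actor=with_article(label))}")
```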

Jeff did not aim his study at a random sample of the general population, however. He took pains to engage analysts in the national security realm, in law enforcement or in business. The participant data are self-reported (the inevitable cost of a web-based survey...), but he was fairly careful in his approach to recruiting participants. In all, some 233 of you participated in the experiment (many thanks!).

Despite his hypotheses, it was unclear what he would actually find. These psychological biases are deep-seated and robust; on the other hand, there is good research to suggest that credible evidence helps overcome framing issues, and intel analysts are typically trained to be on the lookout for sources of bias. As Jeff stated, "My thesis will examine to what extent the quality of analysis is at risk, if it is indeed at risk, as the differing connotations of these labels would suggest."

In the end, the labels wound up making little difference for trained intel analysts. As Jeff bluntly stated, "My hypothesis that these particular labels have significant meaning, and many individuals have a preconceived idea, or cognitive biases, regarding the kinds of actions each of these particular groups conduct must be rejected at this time due to an overall lack in statistical significance across the labels."

This is clearly good news for the intel community at large. It certainly suggests that at least some of the training to defeat at least some of the cognitive biases is working.

The full text of the thesis is below or can be downloaded from Scribd.com.

The Effect of Labels on Analysis


Tuesday, March 25, 2008

COOLINT Part Deux: "The World Through The Eyes Of Editors In Chief" (L'Observatoire Des Medias via Boing Boing)

Check out the application below from L'Observatoire Des Medias (I first saw it on Boing Boing). It shows how different major world papers have elected to cover international news. Click on the red dots and countries get bigger or smaller depending on how they are covered by the various news outlets. For a hi-res version of the map, click on the link to the right of each paper. A few more news sources are available at the L'Observatoire site linked above.