Monday, November 18, 2019

Chapter 2: In Which The Brilliant Hypothesis Is Confounded By Damnable Data

"Stop it, Barsdale!  You're introducing confounds into my experiment!"
A little over a month ago, I wrote a post that asked whether the form of an estimative statement matters when it comes to communicating analytic confidence.  Specifically, I asked people to determine which of the following was "more clear" in response to the question, "Do you think the Patriots will win this week?":
  • "It's a low confidence estimate, but the Patriots are very likely to win this week."
  • "The Patriots are very likely to win this week.  This is a low confidence estimate, however."
I posted this as an informal survey and 72 people kindly took the time to take it.  Here are the results:



At first glance, the results appear to be less than robust.  The difference measured here is unlikely to be statistically significant.  Even if it is, the effect size does not appear to be that large.  The one thing that seems clear is that there is no clear preference.

Or is there?


Just like every PhD candidate who ever got disappointing results from an experiment, I have spent the last several weeks trying to rationalize the results away--to find some damn lipstick and get it on this pig!


I think I finally found something which soothes my aching ego a bit.  The fundamental assumption of these kinds of survey questions is that, in theory, both answers are equally likely.  Indeed, this sort of A/B testing is done precisely because the asker does not know which one the client/customer/etc. will prefer.

This assumption might not hold in this case.  Statements of analytic confidence are, in my experience, rare in any kind of estimative work (although they have become a bit more common in recent years).  When they are included, however, they are almost always included at the end of the estimate.  Indeed, one of those who took the survey (and preferred the first statement above) commented that putting the statement of analytic confidence at the end, "is actually how it would be presented in most IC agencies, but whipsaws the reader."

How might the comfort of this familiarity change the results?  On the one hand, I have no knowledge of who took my survey (though most of my readers seem to be at least acquainted in passing with intelligence and estimates).  On the other hand, there is some pretty good evidence (and some common sense thinking) that documents the power of the familiarity heuristic, or our preference for the familiar over the unfamiliar.  In experiments, the kind of thing that can throw your results off is known as a confound.

More important than familiarity with where the statement of analytic confidence traditionally goes in an estimate, however, might be another rule of estimative writing and another confound:  BLUF.

Bottom Line Up Front (or BLUF) style writing is a staple of virtually every course on estimative or analytic writing.  "Answer the question and answer it in the first sentence" is something that is drummed into most analysts' heads from birth (or shortly thereafter).  Indeed, the single most common type of comment from those who preferred the version with the statement of analytic confidence at the end was, as this one survey taker said, "You asked about the Patriots winning - the...response mentions the Patriots - the topic - within the first few words."
Note:  Ellipses seem important these days and the ones in the sentence above mark where I took out the word "first."  I randomized the two statements in the survey so that they did not always come up in the same order.  Thus, this particular responder saw the second statement above (the one with the statement of analytic confidence at the end) first.
If the base rate of the two answers is not 50-50 but rather 40-60 or worse, in favor of the more familiar, more BLUFy answer, then these results could easily become very significant.  It would be like winning a football game you were expected to lose by 35 points!
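
For anyone who wants to see how much the assumed base rate matters, here is a rough sketch in Python using an exact binomial tail.  The 72 respondents come from the survey; the count of 38 people preferring the confidence-first statement is purely a hypothetical placeholder (not the actual result), and the function name is my own invention.

```python
from math import comb


def upper_tail(n: int, k: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_RESPONDENTS = 72       # number of survey takers, from the post
HYPOTHETICAL_FIRST = 38  # ASSUMPTION: illustrative count, not the real survey result

# Compare the usual 50-50 assumption with a 40-60 base rate, where only 40%
# would be expected to pick the less familiar, confidence-first statement.
for base_rate in (0.5, 0.4):
    p_value = upper_tail(N_RESPONDENTS, HYPOTHETICAL_FIRST, base_rate)
    print(f"base rate {base_rate:.0%}: P(X >= {HYPOTHETICAL_FIRST}) = {p_value:.4f}")
```

The same near-even split looks unremarkable against a 50-50 expectation but much more surprising against a 40-60 one.  Of course, the 40-60 figure is itself a guess, which is why the topic needs more study.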

Thus, like all good dissertations, the only real conclusion I have come to is that the "topic needs more study."

Joking aside, it is an important topic.  As you likely know, it is not enough to just make an estimate.  It is also important to include a statement of analytic confidence.  To do anything less in formal estimates is to be intellectually dishonest to whoever is making real decisions based on your analysis.  I don't think that anyone would disagree that form can have a significant impact on how the content is received.  The real questions are how form impacts content and to what degree.  Getting at those questions in the all-important area of formal estimative writing is truly something well worth additional study.

2 comments:

Unknown said...

Off-the-wall thought here: because the estimative confidence is a meta-statement, might it make sense to treat it in a manner similar to how classification levels are flagged in documents? So, instead of (U) at the beginning of your statement about the Patriots, it might start with [L]. Obviously, there are lots of issues - training people as to what the markings mean, and finding a way to communicate in non-written media, but maybe there's a seed of an approach there?

This could have a secondary value in indicating which statements are estimative - only those that are estimative would be marked (or perhaps there's a special mark for non-estimative statements).

Kristan J. Wheaton said...

I really like this idea! Let me think about it for a bit... (Thank you!)