Tuesday, September 30, 2008

Push Vs. Pull Sources And Killer Intelligence Apps (Indiscriminate Intelligence)

I had a nice talk with the good people at Laancor this morning (they have an interesting plain language search product called QUADRA that seems worth a look), and the discussion made me reflect (in the random, almost indiscriminate, way I think about this stuff) on the nature of intelligence sources and how to describe them.

Of course, there are many ways to describe a source (not the least of which is along the lines of a traditional INT). One of the most useful ways to think about collection generally and sources specifically, however, is whether the source pushes information to the intelligence professional or whether the intelligence professional has to pull information from the source. A magazine subscription and an email-based list are both good examples of push sources. In both cases, the information source pushes the information to the intelligence professional. The job of the intelligence professional then becomes one of filtering the information to find the bits and pieces that are relevant.

One of the most powerful features of the internet is its ability to push mounds of potentially relevant data to intelligence professionals. This has become particularly true since the advent of Really Simple Syndication (RSS) feeds. This feature of virtually all modern internet sites with dynamic content allows users to subscribe to particular feeds and have them sent directly to their email inboxes or to an RSS feed reader, such as Bloglines.com or Google Reader, in order to manage the influx of information. Modern services such as these allow users to tap into a variety of information sources including traditional news outlets but also social networks and personal and professional blogs.
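To make the push idea concrete, here is a minimal sketch of the filtering job that a push source leaves to the reader: parse an RSS 2.0 feed and keep only the items that mention topics of interest. The sample feed, its URLs, the keywords and the function name `filter_feed` are all hypothetical, made up for illustration; a real feed would be fetched over the network and may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for what a real RSS 2.0 source pushes out.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>Telecom firm buys new switching equipment</title>
      <link>http://example.com/telecom</link>
      <description>A national carrier announced a purchase.</description>
    </item>
    <item>
      <title>Local team wins championship</title>
      <link>http://example.com/sports</link>
      <description>Fans celebrated downtown.</description>
    </item>
  </channel>
</rss>"""


def filter_feed(feed_xml, keywords):
    """Return (title, link) pairs for feed items that mention any keyword."""
    root = ET.fromstring(feed_xml)
    wanted = [k.lower() for k in keywords]
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        description = item.findtext("description", default="")
        # Case-insensitive substring match over title and description.
        text = (title + " " + description).lower()
        if any(k in text for k in wanted):
            hits.append((title, link))
    return hits


if __name__ == "__main__":
    for title, link in filter_feed(SAMPLE_FEED, ["telecom", "equipment"]):
        print(title, link)
```

A simple keyword match like this is only the crudest way to sort the wheat from the chaff, but it shows the division of labor: the source does the delivering, and the recipient does the filtering.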

Pull sources require effort on the part of the collector to acquire. Typically the collector must search for and identify a pull source and then attempt to make "contact" (though this contact might be as fleeting as clicking on a hyperlink) with that source in order to get the information necessary.

The question itself determines much of the difficulty inherent in collecting from a pull source. The answer to an easy question, such as the number of telephone lines in a foreign country, can be pulled from a variety of sources including encyclopedias, the CIA World Factbook or the International Telecommunication Union’s database. A more difficult question, such as the specific brand name of the equipment purchased by a specific telephone company in a foreign country, would require a more specific pull source.

As a result, operational security is generally a much more important consideration with pull sources than with push sources. A specific question to a sensitive pull source is very likely to generate the question, “Why are YOU so interested in this?” from the source. The mere asking of a question might, in this way, reveal or even compromise some of your own organization’s plans.

Push sources are generally easier to manage than pull sources. Getting websites, listening devices or agents to push information ensures that the intelligence unit is staying “in the loop”, that potentially relevant information is coming into the unit in a regular stream and that the main problem is sorting the wheat from the chaff. This, in turn, allows the intelligence unit to better focus its collection efforts on pull sources for information that is outside of the routine information flows and to manage the oftentimes considerable operational security risks associated with the most sensitive of these sources.

What struck me is that I could not think of a single system that dealt as effectively with push sources as Google Reader AND as effectively with pull sources as Zotero COMBINED with an easy storage, search and retrieval function like Scrapbook AND a citation management system (think Zotero again). I think the first company that does this will have a killer app on its hands.


Al said...

Robert Heinlein briefly mentions the traffic analysis issues of
push-vs-pull INT sources in his novel 'Friday'. He specifically writes
about message boards (probably equivalent to analysing USENET traffic
in the present).

The basic countermeasure was: download everything. I guess the basic
justification was one could claim to be relaying it for others. The
major problem with this approach is the amount of storage needed. This
is the dilemma any Internet-packrat faces when seeing something
potentially interesting:
1) download it now, lose it on the local storage (disk) for a while
and maybe never get around to reading it, or
2) come back to it later and find out it's been DMCA'd out of
existence (or NSL'd or pulled-down-for-non-payment or...etc).

Kristan J. Wheaton said...

Many thanks for the comment!

I recently read Snow Crash and it had that same sort of feel to it. It is amazing how often good science fiction comes close to predicting the future.