I have occasionally read individual profiles on Right Web, and they have this to say about their efforts to track the influence of those who advocate excessive application of military force:
Right Web is a program of the Institute for Policy Studies (IPS) that assesses the work of prominent organizations and individuals—both in and out of government—who promote militarist U.S. foreign and defense policies, with a special focus on the “war on terror” and the Middle East. Right Web aims to foster informed public debate about these policies with feature articles and profiles of individuals and organizations that examine political discourses and institutional allegiances over time.
I happened to be looking at a profile there yesterday morning and I noticed that the content was amenable to being processed using Maltego and the natural language processing features of Alchemy and OpenCalais. I was right about accessibility, but the well-tended set of profiles on 325 individuals was so link-rich that it choked Maltego.
I saved the profiles page and with about fifteen minutes' work I got a CSV file containing the name of each person and the link to their profile. I think it took Maltego about ten minutes to process the calls to the external services, which extracted a large set of names, locations, and phrases. Once the content was there the trouble began. I pruned obvious mistakes and merged entities that were clearly the same. As an example, some profiles mention “President Obama” and others say “Barack Obama”.
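For anyone who wants to script the CSV step rather than do it by hand, a minimal Python sketch might look like the following. It is only an illustration under stated assumptions: the saved page is assumed to list each person as an ordinary link whose `href` contains a recognizable marker (the `/profile/` default here is a guess, not the site's confirmed URL scheme), and `ProfileLinkParser` and `profiles_to_csv` are hypothetical names.

```python
import csv
import io
from html.parser import HTMLParser


class ProfileLinkParser(HTMLParser):
    """Collect (name, href) pairs from <a> tags whose href contains a marker."""

    def __init__(self, marker):
        super().__init__()
        self.marker = marker      # assumed substring that identifies profile links
        self.rows = []            # collected (name, href) tuples
        self._href = None         # href of the link currently being read, if any
        self._text = []           # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.marker in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            name = " ".join("".join(self._text).split())
            if name:
                self.rows.append((name, self._href))
            self._href = None


def profiles_to_csv(html, marker="/profile/"):
    """Return CSV text with one name,url row per matching link."""
    parser = ProfileLinkParser(marker)
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "url"])
    writer.writerows(parser.rows)
    return out.getvalue()
```

Reading the saved page into a string and passing it to `profiles_to_csv` would give a two-column file ready for import; the marker would need adjusting to whatever the saved page actually uses.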
Even with this cleanup the content was cumbersome: 325 individuals with between fifteen and twenty-five entities found in each profile basically brings Maltego to a standstill. Trying to scroll in the entity list is painful and the visualization modes freeze for minutes at a time, or simply fail to redraw altogether.
There were a large number of countries and quite a few specific cities that appeared as locations. Afghanistan and Pakistan, Iraq and Iran, Egypt and Gaza; these were common pairings for the profiles. I sorted them by creating regional location points and linking them, but all this accomplished was creating a dozen more nodes on an already overwhelming graph.
Selecting the Person entities that had been discovered and moving them along with the associated profiles to a new subgraph was equally problematic. I removed the Person entities with only one mention, then two, and finally, when I had eliminated everyone with fewer than ten mentions, I had a graph that was tolerable to explore.
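The same merge-and-prune pass can be done outside Maltego once the entities are exported. The sketch below is an assumption-laden illustration, not a Maltego feature: it assumes the export can be reduced to a dictionary mapping each profile to the person names found in it, counts a "mention" as one profile naming that person, and uses a hand-built alias table. `ALIASES`, `count_mentions`, and `prune` are hypothetical names.

```python
from collections import Counter

# Hand-maintained alias table (illustrative): variant name -> canonical name.
ALIASES = {"President Obama": "Barack Obama"}


def canonical(name, aliases=ALIASES):
    """Map a variant spelling to its canonical form, if one is listed."""
    return aliases.get(name, name)


def count_mentions(profile_entities, aliases=ALIASES):
    """Count how many profiles mention each (canonicalized) person."""
    counts = Counter()
    for names in profile_entities.values():
        # A set ensures a person counts once per profile, even under two aliases.
        counts.update({canonical(n, aliases) for n in names})
    return counts


def prune(profile_entities, min_mentions, aliases=ALIASES):
    """Keep only people mentioned in at least min_mentions profiles."""
    counts = count_mentions(profile_entities, aliases)
    keep = {name for name, c in counts.items() if c >= min_mentions}
    return {
        profile: sorted({canonical(n, aliases) for n in names} & keep)
        for profile, names in profile_entities.items()
    }
```

Calling `prune(data, 10)` would correspond to the final cut described above, dropping everyone with fewer than ten mentions; raising the threshold step by step mirrors the one, two, ten progression.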
President Obama was mentioned often, as were both Bushes, Clinton, and Reagan. Funniest were #15, Islamophobe Frank Gaffney, and #16, Condoleezza Rice, with one fewer link. A man whom Grover Norquist famously described as a “sick little bigot” swings as much weight as a former secretary of state? Other LoonWatch favorites mentioned frequently include Pamela Geller and Robert Spencer.
Maltego was a good starting point thanks to the named entity recognition support, but the size of the response chokes it. The next sensible thing to do is export the content, but a simple-minded approach to handling it won’t yield a lot of value. Gephi can swallow a dataset that size, but this is a few groups of entities of a specific type, and that’s not really a place to use Gephi beyond initial reconnaissance. Sentinel Visualizer is the closest fit in my data visualization toolbox, but importation will be a lot of work: some of the linked individuals are there because they agree, while others get mentioned because they provide opposition.
This is the perennial problem: you can have content, you can get it into a system, but there is no substitute for a human who follows the happenings. Good tools expand the reach of good analysts; they are not a substitute for having and developing that good analyst in the first place.