In the aftermath of the recent Boston Marathon terrorist attacks, I came across a very interesting article by FCW which provides insight into the latest technology and IT trends being deployed by government agencies. In prior blogs, I’ve mentioned more than once how the use of “big data” seems to be proliferating more than ever. In simple terms, big data is a means of gathering insights from large data sets and then turning those insights into strategic and tactical courses of action. It is not surprising that big data practices are being used to help solve crimes: in today’s age of counter-terrorism, what goes on “behind the firewall” is almost as crucial as what happens at the actual crime scene.
Ultimately, the FBI investigation gave the public a glimpse of how the deployment of big data and data analytics practices is only scratching the surface of the large-scale use to come. Here is a recap of the key takeaways from the article:¹
- Less than 24 hours after the two explosions killed three people and injured dozens more at the April 15 Boston Marathon, the FBI had compiled 10 terabytes (TB) of data in hopes of finding needles in haystacks of information that might lead to the suspects.
- The FBI-led investigation analyzed mountains of cell phone tower call logs, text messages, social media data, photographs and video surveillance footage to quickly pinpoint the suspects.
- Facial recognition software was being used to compare faces in photographs and video against visa, passport, driver’s license and other databases.
- While the 10TB of data gathered by investigators may seem like a drop in the bucket (the Feds usually work with petabytes of data), the investigation still presented officials with a serious data crunch because of the sheer volume, the variety of media types, the overall complexity of the information, and the tight window they had to analyze it.
Dealing with multiple terabytes or more of video, digital images, text messages and cell phone records is complex enough as it is. Just imagine how much more of a quagmire is created when you bring social media into the fray. What I found most interesting about this article was that investigators used the services of a company called Topsy Labs to sift through billions of tweets. Topsy has stored every tweet generated since July of 2010, and in the case of this terrorist investigation, allowed investigators to run big-data analytics of Boston-related tweets against hundreds of billions of past and present messages. Topsy’s database analytic software allowed investigators to search every reference ever made on Twitter to the word “bomb” in a specific region, including Boston and its adjoining suburbs.
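To make the idea of a keyword search restricted to a region a bit more concrete, here is a minimal sketch in Python. It is not Topsy’s software or API, which the article does not describe; the `Tweet` and `BoundingBox` types, the `search_region` function and the rough coordinates for greater Boston are all assumptions made up for illustration, and a real system would run this kind of filter over an index rather than a simple loop.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Tweet:
    """Hypothetical tweet record; lat/lon are None when no location is known."""
    user: str
    text: str
    lat: Optional[float] = None
    lon: Optional[float] = None


@dataclass(frozen=True)
class BoundingBox:
    """Simple latitude/longitude rectangle (hypothetical helper)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Rough, illustrative box around Boston and nearby suburbs (an assumption,
# not a surveyed boundary).
GREATER_BOSTON = BoundingBox(south=42.2, west=-71.3, north=42.5, east=-70.9)


def search_region(tweets: Iterable[Tweet], keyword: str, box: BoundingBox) -> Iterator[Tweet]:
    """Yield tweets that mention `keyword` as a whole word inside `box`."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    for tweet in tweets:
        # Tweets without any location cannot be placed in the region.
        if tweet.lat is None or tweet.lon is None:
            continue
        if box.contains(tweet.lat, tweet.lon) and pattern.search(tweet.text):
            yield tweet
```

Matching on whole words keeps a term like “bombastic” out of the results, and skipping tweets with no coordinates shows why a way of estimating location matters so much when so few tweets carry one.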
Ultimately, this type of detailed search turned up deleted bomb references from both suspects’ Twitter accounts. This type of search through public records likely revealed additional clues that proved valuable to the investigation, including which users re-tweeted the bomb mentions or engaged in incriminating dialogue with the terrorist suspects. Furthermore, Topsy has “geo-inferencing” technology which allowed the investigators to accurately map where specified tweets originated (pretty cool considering only about 1% of Twitter users geo-tag their tweets). According to Topsy, those capabilities make it 20 times more accurate than standard Twitter location data.
How amazing is that?
Emulex – We ‘get’ big data
At Emulex, we believe the heart of big data lies within the framework of an organization’s network. There are thousands of servers performing parallel processing to create value, and those servers talk to each other over Ethernet and Fibre Channel protocols. As such, the latency and throughput of the network’s traffic are the critical path for fast results in big data deployments. Emulex solves these latency issues and is the vendor of choice for organizations worldwide because we provide the right I/O solution to maximize data clusters and allow for the seamless deployment of big data solutions. For a more in-depth view of Emulex’s big data expertise, please reference my earlier blog here.
It is unfortunate that we live in a world of uncertainty, fear and carnage at the hands of a few loathsome individuals. But it’s also refreshing to remember our community is capable of greatness and benevolence in times of need, just as the citizens of Boston demonstrated during and after this terrible incident. Even though the deployment of big data practices was crucial to the investigation, let’s not forget it was ultimately a tip from a citizen that finally led investigators to the two perpetrators. In the end, no technology, no matter how advanced, can replace the fortitude and good-will of mankind.
¹ FCW, APR, 2013