The tool that allowed a 3-letter agency to extract unstructured Facebook chat data for criminal investigations
Most people associate data prep and data analytics with business, finance, and accounting. However, using data means much more than crunching numbers.
Our lives are surrounded by data – every digital activity leaves a footprint that gives us the opportunity to understand behavioral patterns and predict outcomes in many areas.
How a 3-letter agency extracts Facebook chat data for criminal investigations
Recently, we have been working with an agency to extract Facebook chat data for use in criminal investigations.
This organization received massive PDF reports from Facebook containing chat histories, IP addresses, timestamps, and similar information. The data was unstructured and could not be placed directly into rows and columns for analysis.
The first step was to extract the data into a format that could be used for further manipulation and analysis.
Entering the data manually into Excel would have taken days, if not weeks, of tedious labor.
Our colleague Geoff created a model and workspace in Datawatch Monarch to extract the data from these PDF reports and convert the information into rows and columns for easy investigation and reporting.
After importing the information into Monarch, we were able to categorize the data into groups such as contact names and numbers, photo comments and info, admin data, IP addresses and access times, and wall post info.
After the information was formatted into rows and columns, it was exported to Excel for further analysis.
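To make the idea of turning report text into rows and columns concrete, here is a minimal sketch in Python. It is not the Monarch model Geoff built, and the layout of the sample text (the "Author", "Sent", "IP", and "Body" labels) is a hypothetical stand-in, since the actual Facebook report format is not shown here. It assumes the PDF text has already been extracted to plain text.

```python
import csv
import io
import re

# Hypothetical report text. The field labels and layout are assumptions
# made for illustration, not the real Facebook report format.
SAMPLE_TEXT = """\
Author: Alice Example
Sent: 2015-03-01 14:22:05 UTC
IP: 203.0.113.7
Body: Are we still meeting tonight?
Author: Bob Example
Sent: 2015-03-01 14:23:41 UTC
IP: 198.51.100.24
Body: Yes, usual place.
"""

COLUMNS = ["Author", "Sent", "IP", "Body"]
FIELD_PATTERN = re.compile(r"^(Author|Sent|IP|Body):\s*(.*)$")


def parse_messages(text):
    """Group labelled lines into one dict (row) per message."""
    rows = []
    current = {}
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if not match:
            continue  # skip lines that carry no recognised field
        key, value = match.groups()
        # An "Author" line marks the start of a new message.
        if key == "Author" and current:
            rows.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        rows.append(current)
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_messages(SAMPLE_TEXT)
csv_text = to_csv(rows)
print(csv_text)
```

Each message becomes one row with a fixed set of columns, which is the shape a spreadsheet needs for sorting and filtering. A real report would likely need more patterns and handling for messages that span several lines.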
Unlocking the power of data for criminal investigation
There are many ways data can be used in criminal investigation.
Data mining uncovers hidden relational patterns among entities to help investigators discover useful links.
Text mining helps generate insights from unstructured data and discover knowledge and associations from textual information.
Surveillance logs, telephone records, location-based social networks, financial transaction data, and crime incident reports can be used for relational analysis.
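As a rough illustration of relational analysis, the sketch below counts how often pairs of people communicate and how many distinct contacts each person has. The names and message pairs are invented sample data, and the two helper functions are hypothetical; real link analysis would draw on far richer records and methods.

```python
from collections import Counter

# Invented (sender, recipient) pairs standing in for structured chat
# or telephone records. These are not real data.
messages = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("bob", "alice"),
    ("dave", "carol"),
]


def link_strengths(pairs):
    """Count messages per pair of people, ignoring direction."""
    counts = Counter()
    for sender, recipient in pairs:
        link = tuple(sorted((sender, recipient)))
        counts[link] += 1
    return counts


def contacts_per_person(pairs):
    """Count the distinct people each person has exchanged messages with."""
    neighbours = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {person: len(others) for person, others in neighbours.items()}


links = link_strengths(messages)
degree = contacts_per_person(messages)

# Strongest links first, then each person's number of contacts.
for link, count in links.most_common():
    print(link, count)
print(degree)
```

Ranking links by message count points to the pairs in closest contact, while the per-person contact count hints at who sits at the center of a network.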
Data from social media such as Facebook, Twitter, LinkedIn, and blogs can be examined for positional analysis.
The challenge isn’t the availability of raw data but the ability to transform it into a suitable format for further analysis so that meaningful insights can be extracted.
The use of self-service data prep software has made this possible by replacing hours of tedious, manual work and dramatically reducing the time it takes to extract insights from raw data.
In addition, Datawatch Monarch can link to data directly from the Internet, helping investigators tap into the most up-to-date information.
In the criminal justice world, reducing or eliminating time-consuming data preparation tasks means investigators can identify suspects in less time. Not only does this help close cases faster, but it can also prevent suspects from committing more crimes.