TellFinder simplifies the process of investigating publicly available Internet data. Through efficient visual analytics that automatically characterize and organize data, TellFinder helps users quickly locate or discover potentially illicit activity by the entities and organizations that post it.
Scraping and Feature Extraction
Each TellFinder deployment is powered by an extensive historical database of domain-specific postings scraped from publicly available websites. To ensure that users have the most current information, TellFinder supports regular updates with newly scraped and processed data.
Natural language processing techniques are applied to the scraped postings to extract key identifying attributes, which are stored and indexed with Elasticsearch. Additional extraction and cleaning steps capture obfuscated attributes (e.g., a phone number written as 508 5five5 38four1) and normalize the extracted features.
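As a rough illustration of the cleaning step, the sketch below normalizes a phone number that mixes numerals with spelled-out digits. The function name `normalize_phone` and the word-to-digit table are hypothetical, not part of TellFinder, and the function assumes it is handed a span already identified as a candidate phone number. Applied to arbitrary text, it would also rewrite words that merely contain a digit name.

```python
import re

# Hypothetical lookup table; TellFinder's actual extraction rules are not documented here.
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
_WORD_PATTERN = re.compile("|".join(WORD_TO_DIGIT), re.IGNORECASE)


def normalize_phone(candidate: str) -> str:
    """Replace spelled-out digits with numerals, then drop every non-digit character."""
    with_numerals = _WORD_PATTERN.sub(
        lambda match: WORD_TO_DIGIT[match.group(0).lower()], candidate
    )
    return re.sub(r"\D", "", with_numerals)


print(normalize_phone("508 5five5 38four1"))  # 5085553841
```

Normalizing to a digits-only string means the same number written in different styles is indexed under a single value.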
Using the scraped and cleaned attributes, sophisticated clustering algorithms group postings that repeat the same identifying information (e.g., phone numbers, email addresses, websites, or usernames). Each clustered group represents a “persona,” an entity that exhibits the common features in the related postings.
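The grouping idea can be sketched with a simple connected-components pass: any two postings that share an attribute end up in the same group. This is an assumed simplification, not TellFinder's clustering algorithm, which is not described here. Strict merging of this kind would collapse every pair of groups that share any attribute, so the production clustering is presumably more selective. The function name `group_postings` and the `type:value` attribute strings are illustrative only.

```python
from collections import defaultdict


def group_postings(postings: dict[str, set[str]]) -> list[set[str]]:
    """Group posting IDs that share at least one attribute (union-find)."""
    parent = {posting_id: posting_id for posting_id in postings}

    def find(node: str) -> str:
        # Follow parent links to the root, shortening the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link each posting to the first posting seen with the same attribute.
    first_seen: dict[str, str] = {}
    for posting_id, attributes in postings.items():
        for attribute in attributes:
            if attribute in first_seen:
                union(first_seen[attribute], posting_id)
            else:
                first_seen[attribute] = posting_id

    groups: dict[str, set[str]] = defaultdict(set)
    for posting_id in postings:
        groups[find(posting_id)].add(posting_id)
    return list(groups.values())


example = {
    "p1": {"phone:5085553841"},
    "p2": {"phone:5085553841", "email:a@example.com"},
    "p3": {"email:a@example.com"},
    "p4": {"phone:6175550100"},
}
print(group_postings(example))
```

Here p1 and p3 share no attribute directly but land in the same group because both overlap with p2.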
Each posting belongs to one persona, but because entities and organizations can share resources, the same identifying attributes can appear across multiple personas.
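The relationship described above (one persona per posting, but possibly many personas per attribute) can be sketched as an inverted index from attribute to personas. The persona assignments are taken as given input. The function name `shared_attributes` and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def shared_attributes(
    posting_persona: dict[str, str],
    posting_attributes: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Return the attributes that appear in postings of more than one persona."""
    attribute_personas: dict[str, set[str]] = defaultdict(set)
    for posting_id, persona in posting_persona.items():
        for attribute in posting_attributes.get(posting_id, ()):
            attribute_personas[attribute].add(persona)
    return {
        attribute: personas
        for attribute, personas in attribute_personas.items()
        if len(personas) > 1
    }


# Each posting maps to exactly one persona (assumed assignments for illustration).
posting_persona = {"p1": "persona-A", "p2": "persona-A", "p3": "persona-B"}
posting_attributes = {
    "p1": {"phone:5085553841"},
    "p2": {"phone:5085553841", "email:a@example.com"},
    "p3": {"phone:5085553841", "website:example.org"},
}
print(shared_attributes(posting_persona, posting_attributes))
```

In this sample the phone number is the only attribute that links the two personas, which is the kind of shared resource the paragraph above refers to.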
TellFinder is a document search engine with meaningful visual aggregations that enable users to research cases in significantly less time. Automatically aggregated personas give users access to all the data that matches a tip in seconds, eliminating the need to manually click through every result as they would with a traditional search engine.
Attributes extracted from postings allow users to visualize, slice, and combine results in many ways. A facets sidebar summarizes the top occurring attributes, which can be used to highlight, narrow down, or augment the most relevant entities or organizations. Other views plot results on an interactive geographic map that can reveal movements.
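A facet summary of this kind amounts to counting attribute values across the current results and then filtering by a chosen value. The sketch below does this in plain Python over hypothetical result records. The field names and the functions `top_facets` and `narrow` are assumptions for illustration, and a deployment backed by Elasticsearch would more likely compute such counts inside the index.

```python
from collections import Counter


def top_facets(results: list[dict], field: str, n: int = 3) -> list[tuple[str, int]]:
    """Count how often each value of `field` occurs and return the n most common."""
    counts = Counter(value for result in results for value in result.get(field, []))
    return counts.most_common(n)


def narrow(results: list[dict], field: str, value: str) -> list[dict]:
    """Keep only the results whose `field` contains `value`."""
    return [result for result in results if value in result.get(field, [])]


# Hypothetical search results; real TellFinder records are not documented here.
results = [
    {"id": "p1", "locations": ["Boston"], "phones": ["5085553841"]},
    {"id": "p2", "locations": ["Boston"], "phones": ["5085553841"]},
    {"id": "p3", "locations": ["Toronto"], "phones": ["6175550100"]},
]
print(top_facets(results, "locations"))
print([r["id"] for r in narrow(results, "locations", "Boston")])
```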
Simple case management capabilities allow users to store relevant results for future use and receive alerts when newly ingested postings match previously saved persona attributes.
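The alerting behavior can be pictured as a set intersection between saved attributes and the attributes of each newly ingested posting. The following is a minimal sketch under that assumption. The function name `find_alerts`, the case names, and the data layout are hypothetical rather than TellFinder's actual interface.

```python
def find_alerts(
    saved_cases: dict[str, set[str]],
    new_postings: dict[str, set[str]],
) -> list[tuple[str, str]]:
    """Return (case name, posting ID) pairs where a new posting shares a saved attribute."""
    alerts = []
    for posting_id, attributes in new_postings.items():
        for case_name, saved_attributes in saved_cases.items():
            if attributes & saved_attributes:  # non-empty intersection means a match
                alerts.append((case_name, posting_id))
    return alerts


saved_cases = {"case-17": {"phone:5085553841", "email:a@example.com"}}
new_postings = {
    "p10": {"phone:5085553841"},
    "p11": {"phone:6175550100"},
}
print(find_alerts(saved_cases, new_postings))
```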
Review the System Description for more information on the individual components that make up the TellFinder workflow.