Case Study

Using AI to Identify Story Leads in Investigative Journalism


Analytics Engines is supporting a market-leading media organisation, and in particular its investigative journalism unit. The organisation's News and Current Affairs programming is the most popular news source in its country, with 77% of the public regarding it as their main source of both national and international news.

Analytics Engines secured the contract to provide a data platform for journalists that queries and investigates multiple data sources. Further to the original scope of work, the Analytics Engines team has been augmenting the platform with additional feature sets.

The client required a system that could acquire, integrate, interrogate, and present large banks of openly accessible public-interest data to help generate story leads and support research. The client also required a news, research and investigative database that could handle all of its big-data projects and internet research at a scale that was not previously possible. The aims of the project include finding, gathering and interrogating data that has news value, and allowing story leads to be found more efficiently and more consistently, without the need for direct input from journalists.

The solution

The solution automates and replicates complex investigative queries to identify potential leads, drawing on a variety of external sources as well as internal datasets. A data extraction layer transforms the underlying data structures so that datasets can be combined and linked, and so that they are understandable to the typical end user. The solution supports users of different skill levels and access permissions: underlying databases can be interrogated without any knowledge of coding on the user’s part, while advanced users have tools to run complex queries across the database.
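As a rough illustration of what combining and linking datasets can involve, the sketch below joins two small record sets on a normalised entity name. It is not the platform's implementation: the `normalise` and `link` functions, the field names, and the sample records are all hypothetical, and a production system would likely need far more robust entity matching.

```python
import re
from collections import defaultdict

# Hypothetical suffixes to ignore when comparing organisation names.
IGNORED_SUFFIXES = {"ltd", "limited", "plc"}


def normalise(name: str) -> str:
    """Lower-case a name and drop punctuation and common company suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    tokens = [t for t in cleaned.split() if t not in IGNORED_SUFFIXES]
    return " ".join(tokens)


def link(contracts: list[dict], donations: list[dict]) -> list[dict]:
    """Return contracts whose supplier also appears as a donor."""
    donations_by_key = defaultdict(list)
    for donation in donations:
        donations_by_key[normalise(donation["donor"])].append(donation)

    leads = []
    for contract in contracts:
        key = normalise(contract["supplier"])
        if key in donations_by_key:
            leads.append({
                "entity": contract["supplier"],
                "contract": contract,
                "donations": donations_by_key[key],
            })
    return leads


# Invented sample records, for illustration only.
contracts = [
    {"supplier": "Acme Holdings Ltd.", "value": 250000},
    {"supplier": "Birch & Co", "value": 90000},
]
donations = [
    {"donor": "ACME HOLDINGS LIMITED", "amount": 5000},
    {"donor": "Cedar Group", "amount": 1200},
]

leads = link(contracts, donations)
for lead in leads:
    print(lead["entity"], len(lead["donations"]))
```

Here the two spellings of the same company resolve to one key, so the record pair surfaces as a candidate lead for a journalist to verify.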

The solution includes a Thematic Search feature, powered by AI algorithms trained on large volumes of English text, which allows users to search documents for concepts and ideas rather than simple keywords.
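To make the difference between concept search and keyword search concrete, here is a toy sketch that ranks documents by vector similarity to a query. It is only an assumption about the general technique, not a description of the Thematic Search feature: the three-dimensional `TOY_EMBEDDINGS` table is hand-written and stands in for a language model trained on large volumes of text, and all function names are hypothetical.

```python
import math

# Hand-written stand-in for learned word vectors.
# Dimensions (assumed for illustration): corruption, health, environment.
TOY_EMBEDDINGS = {
    "corruption": (1.0, 0.0, 0.0),
    "bribe": (1.0, 0.0, 0.0),
    "kickback": (0.9, 0.0, 0.1),
    "hospital": (0.0, 1.0, 0.0),
    "waiting": (0.0, 0.8, 0.0),
    "pollution": (0.0, 0.0, 1.0),
    "river": (0.0, 0.0, 0.9),
}


def embed(text: str) -> tuple:
    """Average the vectors of the known words in a text."""
    vectors = [TOY_EMBEDDINGS[w] for w in text.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def cosine(a: tuple, b: tuple) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def thematic_search(query: str, documents: list[str]) -> list[str]:
    """Return documents ordered from most to least similar to the query."""
    query_vector = embed(query)
    return sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)


documents = [
    "Hospital waiting lists grow",
    "Official accepted a kickback from the contractor",
    "River pollution rises",
]
ranked = thematic_search("corruption", documents)
print(ranked[0])
```

The top-ranked document never contains the word "corruption"; it is retrieved because its vocabulary sits close to the query in the vector space, which is the behaviour a keyword match cannot provide.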

The system addresses one of the most difficult problems in contemporary research: how to efficiently and comprehensively interrogate large volumes of data for newsworthy content. Our solution opens up the possibility of finding and delivering more targeted, relevant material to audiences in a more efficient, comprehensive, and cost-effective way.

The main difference was that Analytics Engines understood and listened to the problem, rather than trying to map our problem to a solution that already existed.

Investigative Journalist
National Broadcaster

by PJ Kirk
