Pop, Soda or Coke? Understanding intent through Semantic Search

The US is a cultural and linguistic melting pot. How you describe your soft drink says a lot about who you are and where you’re from. So what is it: Pop, Soda or Coke?

The site popvssoda.com (which Alan McConchie describes as his 15 minutes of fame) contains an interactive data visualisation exploring the different ways soft drinks are described across the US. Vox described it as the ‘great American soft drink debate’.

Data visualisations

So what did Alan’s visualisation show?

[Figure: Pop, Soda, Coke map]

Along the north-eastern and south-western coasts, Soda is the winner. Move towards the Gulf of Mexico, and Coke is on top. Make your way through the Midwest, and you’ll be ordering a Pop.

This simple visualisation illustrates how individuals and groups can express the same idea in different ways.

It also serves to highlight the unique challenge that such inconsistencies present to data scientists and software engineers.

Meaningful insights

People often express the same idea in very different ways. What does that mean for organisations sifting through large volumes of documents trying to identify meaningful insights?

Take the great American soft drink debate.

A search for ‘Soda’ wouldn’t necessarily surface results for ‘Pop’ and ‘Coke’ due to regional linguistic variations. But those terms refer to the same thing. How do you connect these ideas and link them together?

Understanding intent and meaning

Semantic search allows users to explore text data using natural language. The technique can be applied to any group of digitised documents – internal records, meeting minutes, customer correspondence, news articles, scientific reports.

Our semantic search model ingests the search term and converts it to a numeric representation that captures its meaning and context. The search engine then finds the documents in the dataset with a similar numeric representation and, by extension, a similar meaning.
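To make the idea of "similar numeric representation" concrete, here is a minimal sketch in Python. It is not our production model: the three-dimensional vectors in TOY_EMBEDDINGS are hand-made for illustration, and the function names are hypothetical. A real system would obtain its vectors from a trained language model, but the comparison step (cosine similarity between vectors) works the same way.

```python
import math

# Hypothetical, hand-made vectors for illustration only.
# A real system would get these from a trained language model.
TOY_EMBEDDINGS = {
    "soda": [0.90, 0.10, 0.05],
    "pop": [0.85, 0.15, 0.10],
    "coke": [0.88, 0.05, 0.12],
    "fraud": [0.05, 0.90, 0.20],
    "tractor": [0.10, 0.15, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, embeddings):
    """Rank every other term by how close its vector is to the query's."""
    query_vec = embeddings[query]
    scores = [
        (term, cosine_similarity(query_vec, vec))
        for term, vec in embeddings.items()
        if term != query
    ]
    # Highest similarity first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_similarity("soda", TOY_EMBEDDINGS)
for term, score in ranking:
    print(f"{term}: {score:.3f}")
```

With these toy vectors, a search for "soda" ranks "pop" and "coke" well above the unrelated terms, even though the words share no letters in common.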

Keyword v Semantic Search

For a recent project, we harvested hundreds of thousands of news articles on risks to the food supply chain. A typical search for this dataset would be “fraud”.

Traditional keyword search would return results like these:

Search Query: Fraud
1. …reported fraud in connection to a…
2. …there have been some frauds recorded…
3. …to be treated as fraudulent or deceptive…

For the same query, our Semantic Search engine would return results like these:

Search Query: Fraud
1. …reported fraud in connection to a…
2. …was known to be using false identities to facilitate…
3. …conspiracy in connection to a scheme to conceal…

Semantic Search goes beyond keywords – it goes to the heart of the idea and its context. This approach provides deeper, richer, more insightful and useful results.
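The limitation of keyword matching can be sketched in a few lines of Python. This is an illustrative example only: the snippets are the excerpts quoted above, and keyword_search is a hypothetical helper that does a plain case-insensitive substring match, which is a simplification of how a real keyword engine works.

```python
# Excerpts quoted in the two result lists above.
SNIPPETS = [
    "reported fraud in connection to a",
    "there have been some frauds recorded",
    "to be treated as fraudulent or deceptive",
    "was known to be using false identities to facilitate",
    "conspiracy in connection to a scheme to conceal",
]


def keyword_search(query, documents):
    """Return the documents that literally contain the query string."""
    needle = query.lower()
    return [doc for doc in documents if needle in doc.lower()]


hits = keyword_search("fraud", SNIPPETS)
missed = [doc for doc in SNIPPETS if doc not in hits]

print("Matched by keyword:")
for doc in hits:
    print("  ", doc)

print("Relevant but missed:")
for doc in missed:
    print("  ", doc)
```

The excerpts about false identities and a scheme to conceal never mention the word "fraud", so a literal match cannot find them. Surfacing them requires comparing meaning rather than characters.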

Fine-tuning

A common problem is that documents contain words and phrases that have a particular meaning within a specific domain. To account for these differences, we use a process known as fine-tuning.

Fine-tuning works by taking a model that has been trained on documents from a more general domain, such as Wikipedia, and using Transfer Learning to adapt it to the specific language of the user’s documents. Fine-tuning enables the model to identify and filter results based on the specific requirements of the user.
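The principle can be shown with a heavily simplified Python sketch. It is not our fine-tuning pipeline: real fine-tuning updates the weights of a neural model, whereas this toy nudges a small lookup table of hand-made vectors. The vectors, the domain pair and the fine_tune function are all assumptions made for illustration. The example starts from "general" vectors in which "fraud" and "adulteration" are far apart, then uses gradient steps on their squared distance to pull them together, as a food supply chain dataset might require.

```python
import math


def cosine(a, b):
    """Return the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical "general-domain" vectors, hand-made for illustration.
general = {
    "fraud": [0.9, 0.1, 0.0],
    "adulteration": [0.1, 0.9, 0.1],
    "harvest": [0.0, 0.1, 0.9],
}

# Hypothetical domain knowledge: these terms should mean similar things.
domain_pairs = [("fraud", "adulteration")]


def fine_tune(vectors, pairs, learning_rate=0.1, steps=20):
    """Start from existing vectors and pull each related pair together."""
    # Copy so the general-purpose vectors are left untouched.
    tuned = {term: list(vec) for term, vec in vectors.items()}
    for _ in range(steps):
        for a, b in pairs:
            va, vb = tuned[a], tuned[b]
            for i in range(len(va)):
                # Gradient step on 0.5 * (va[i] - vb[i]) ** 2 for both vectors.
                diff = va[i] - vb[i]
                va[i] -= learning_rate * diff
                vb[i] += learning_rate * diff
    return tuned


tuned = fine_tune(general, domain_pairs)
before = cosine(general["fraud"], general["adulteration"])
after = cosine(tuned["fraud"], tuned["adulteration"])
print(f"similarity before: {before:.3f}, after: {after:.3f}")
```

Only the terms named in the domain pairs move; "harvest" keeps its general-purpose vector, which mirrors how fine-tuning aims to adapt a model to a domain without discarding what it already knows.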

Conclusions

Semantic search is a powerful tool that enables organisations to understand their documents in a more comprehensive way. It enables organisations to surface insights that might otherwise have been missed.

Semantic Search is just one of the ways by which data science is helping organisations to transform how they operate. In our Innovate UK White Paper, we looked at how Natural Language Processing has helped the Applications and Assessment Team at Innovate UK respond to some of the unique challenges they have faced as a result of the COVID-19 crisis.

Find out more

Our Semantic Search solutions add context to data in a way that enables users to quickly and easily extract the information that’s valuable and important to them. To find out more about us and how our experience with Semantic Search, Transfer Learning and Natural Language Processing might help your organisation, contact us using the form below.

Published by PJ Kirk, 13/06/2023