Have you heard of Natural Language Processing (NLP)?
NLP has become a buzzword in eCommerce as well as in many other industries, such as customer service. It is by no means a new field, but there has been a new surge of interest due to the benefits that it appears to bring to organizations that adopt it.
Just as with other technological advancements, the innovation surrounding it captures the public’s imagination, as do the terms associated with it – such as deep learning, sentiment analysis, and topic modeling. However, while everyone joining the parade can boost development, it can also give rise to misconceptions. In the case of NLP, many have embraced the promise it holds but have overlooked the complexity of human languages and the difficulty of working with them.
The realization that things aren’t as easy as first imagined has led some to be impatient and others to consider abandoning their efforts entirely. Pie-in-the-sky fantasies have led to disappointment with reality. According to Gartner, NLP has in fact entered a “trough of disillusionment” phase – yikes. Expectations of this technology could start to wane as experiments and implementations fail to deliver.
The true potential of this technology can be unlocked, but only by setting realistic expectations. Let’s get real about NLP and talk about what it really means in the context of eCommerce – it has a lot to offer.
What is Natural Language Processing (NLP)?
First things first: what is NLP then? Well, it’s not the pseudoscience that sometimes goes by the same acronym.
Now that we’ve swept pseudoscience aside, we can talk about the real deal, starting with a natural language processing definition. NLP refers to the comprehension by computers of the structure and meaning of human languages, which in turn allows humans to successfully interact with the computer using natural sentences.
Simple, right? Maybe for an AI scientist, but enabling interactions between computers and humans is hardly straightforward. Your conversations with Siri are more complex than you think.
We use natural language as an everyday means to communicate with other humans, through our innate ability to understand, process and utilize words. We use English, French, Italian, and the language known only to you and your childhood best friend.
These languages are very different from formal languages, such as mathematical or logical notations, or computer languages, such as Java, LISP, and C++. This makes interactions between computer and human (natural) languages challenging. NLP is a multidisciplinary approach that brings machine learning, statistics, and linguistics together to make these valuable interactions possible.
NLU vs. NLG: A Meaningful Distinction
Given the broad definition of NLP presented above, and the fact that “interact” can mean quite a number of things, it should come as no surprise that NLP actually has many subfields.
An important distinction needs to be drawn between two of these subfields: Natural Language Understanding (NLU) and Natural Language Generation (NLG).
NLU refers to the process of reading and interpreting language (translating text) and NLG to the process of generating natural language. NLU is what allows Siri to understand that you want to know what the weather is in Montreal, and NLG allows her to respond with “really cold” or something even more complex.
In the context of eCommerce, NLU drives site search. This is where users speak, or type the way they speak, and expect to be understood and given the information they need. Chatbots, on the other hand, speak back to the user. They generate language and are therefore driven by NLG.
NLU and eCommerce Search: Handling the Five Finicky Features of Natural Language
Across different eCommerce verticals, the way we communicate with a website is very different from how we talk to other people – most of the time. The majority of all search queries actually contain two words or fewer, which is not very “natural” at all.
Given our unnatural search tendencies, natural language product search wasn’t really a motivating factor in the adoption of AI for eCommerce (Gartner 2019). Understanding natural language wasn’t prioritized because that’s not how people were communicating online, so applications of AI were focused elsewhere and simplistic machine translation remained the norm.
Things are changing though. While people searching how they speak is still far from being commonplace, this behavior has been increasing in recent years. Google’s search, Apple’s Siri, and Facebook’s Graph Search are perhaps some of the most well-known cases and are setting the bar very high for eCommerce players.
Analysts agree that we are approaching a conversational commerce revolution, which means NLU will prove to be a competitive advantage. Microphone icons have appeared on eCommerce websites over the past few years, but these tools cannot really help unless the search has real NLU capabilities.
While speaking is something people do without thinking much, or sometimes at all, natural language is actually highly complex and can be broken down into many features. There are five features in particular that need to be taken into account for NLU to really work in eCommerce.
(1) Natural Languages are Creative
“The most striking aspect of linguistic competence is what we may call the ‘creativity of language,’ that is, the speaker’s ability to produce new sentences, sentences that are immediately understood by other speakers although they bear no physical resemblance to sentences which are ‘familiar’.” ~ Noam Chomsky, linguist extraordinaire.
You may not realize it, but the fact that you can generate and understand sentences that you’ve never heard before is incredible – and you do it all the time.
This makes life very unpredictable for a search engine. Roughly 15% of the searches Google sees each day have never been seen before. As we have seen, eCommerce searches are still fairly unnatural. However, as natural language searches become more common, the ability of your search engine to return relevant results for new and unpredictable queries will not be praised, but expected.
(2) Natural Languages are Structured
The fact that we arrange words in a sentence linearly one after the other shouldn’t cause us to forget that different words play very different roles in our languages and searches. In the context of eCommerce search, understanding the role and weight that different words play in different types of searches is the only way to uncover their true meaning.
For example, if I’m searching for “brown gloves” and there aren’t any in stock, I might be interested to see some orange gloves, but not brown purses. Not every word in my original search held the same importance. The product I was looking for (gloves) was far more important than the attribute (brown) I included.
The ability to distinguish between products and attributes, and attach the correct level of significance to different terms, is essential – especially as queries become more complex.
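To make the “brown gloves” example concrete, here is a minimal sketch of role-based term weighting. The lexicon, the weights, and the tiny catalog are all hypothetical and chosen only for illustration; a real search engine would learn roles and weights from data rather than hard-code them.

```python
from dataclasses import dataclass

# Hypothetical weights: the product type matters more than an attribute.
ROLE_WEIGHTS = {"product": 3.0, "attribute": 1.0}

# Hypothetical hand-made lexicon mapping query terms to their role.
LEXICON = {
    "gloves": "product",
    "purses": "product",
    "brown": "attribute",
    "orange": "attribute",
}


@dataclass
class Item:
    name: str
    product: str
    color: str


def score(query: str, item: Item) -> float:
    """Sum the role weights of the query terms that the item matches."""
    total = 0.0
    for term in query.lower().split():
        role = LEXICON.get(term)
        if role == "product" and term == item.product:
            total += ROLE_WEIGHTS[role]
        elif role == "attribute" and term == item.color:
            total += ROLE_WEIGHTS[role]
        # Terms with no known role contribute nothing.
    return total


catalog = [
    Item("Brown leather purse", "purses", "brown"),
    Item("Orange wool gloves", "gloves", "orange"),
]

# Rank the catalog for the query, highest score first.
ranked = sorted(catalog, key=lambda item: score("brown gloves", item), reverse=True)
print([item.name for item in ranked])
```

Because the product match outweighs the color match, the orange gloves rank above the brown purse, which is the behavior the shopper in the example would expect.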
If I query, “how much will it cost me to buy a cashmere sweater”, the terms cost, cashmere, and sweater are the most important – they contain more “semantic weight”. Everything else is just noise, but those additional words can lead to irrelevant results if they are given the same level of importance.
NLP is centered around the reality that words carry meaning, and it can help determine which words are worth taking into account through word segmentation and weighting of terms based on significance. That way, I can happily spend my money rather than get a result for how a cashmere sweater…is made. I’m curious, but not that curious.
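The simplest way to picture “noise” removal is a stop-word filter. The sketch below assumes a tiny, hand-picked stop-word list (hypothetical, not taken from any particular library); real systems typically weight terms statistically instead of dropping them outright.

```python
# Hypothetical stop-word list, hand-picked for this one example.
STOP_WORDS = {"how", "much", "will", "it", "me", "to", "buy", "a", "the", "for"}


def content_terms(query: str) -> list[str]:
    """Keep only the words that are not in the stop-word list."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


terms = content_terms("how much will it cost me to buy a cashmere sweater")
print(terms)  # ['cost', 'cashmere', 'sweater']
```

A fixed list like this is brittle (“buy” is noise here but could matter elsewhere), which is one reason weighting by significance tends to work better than plain filtering.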
(3) Natural Languages are Inferential
Understanding a sentence also involves being able to draw a number of inferences. This ultimately comes down to using search words as clues to make additional conclusions about what someone may or may not want to buy.
For example, if you hear “Jane loves all kinds of shoes” you can draw the inference that “Jane loves ballerina shoes”, as ballerina shoes fall within the category “shoes” or, more precisely, are a hyponym of shoes (and “shoes” is their hypernym). Similarly, if I were looking for a “formal dress” you might be able to infer that I’m not interested in “casual dresses”, as formal and casual are antonyms here.
The ability to understand synonyms is particularly important in eCommerce. There will always be a discrepancy between what a consumer calls a product and how the metadata describes it. All possible variations of an attribute or product must be understood and must return the same results. Whether a customer uses “waterproof” or “Gore-Tex” when looking for their winter boots, they expect to see what they want to buy.
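The shoes inference can be sketched as a walk up a product taxonomy. The category tree below is hypothetical and hand-built; it only illustrates the idea that a narrower category inherits membership in every broader category above it.

```python
from typing import Optional

# Hypothetical taxonomy: each category points to its broader parent category.
PARENT = {
    "ballerina shoes": "shoes",
    "sneakers": "shoes",
    "shoes": "footwear",
    "purses": "accessories",
}


def is_a(category: str, ancestor: str) -> bool:
    """Return True if `category` equals `ancestor` or descends from it."""
    current: Optional[str] = category
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)  # None once we pass the top of the tree
    return False


print(is_a("ballerina shoes", "shoes"))  # True
print(is_a("purses", "shoes"))           # False
```

With a check like this, a query for “shoes” can legitimately return ballerina shoes, while purses stay out of the results.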
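One simple way to make different wordings return the same results is to map every variant to a single canonical term before searching. The synonym table below is a hypothetical, hand-written example that follows the article's boots scenario; real catalogs need far larger, curated or learned tables.

```python
# Hypothetical synonym table: each surface form maps to one canonical attribute.
SYNONYMS = {
    "waterproof": "waterproof",
    "gore tex": "waterproof",
    "gore-tex": "waterproof",
}


def normalize(query: str) -> str:
    """Lowercase the query, collapse whitespace, and canonicalize known synonyms."""
    text = " ".join(query.lower().split())
    # Try longer phrases first so multi-word entries are not split up.
    for phrase in sorted(SYNONYMS, key=len, reverse=True):
        text = text.replace(phrase, SYNONYMS[phrase])
    return text


print(normalize("Gore Tex winter boots"))    # waterproof winter boots
print(normalize("waterproof winter boots"))  # waterproof winter boots
```

Plain substring replacement is a deliberate simplification here: it can misfire when a synonym appears inside a longer word, so a production version would match on token boundaries.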
(4) Natural Languages are Ambiguous
There are two kinds of ambiguity that make natural languages difficult to understand: lexical and syntactic/semantic.
Lexical ambiguity refers to words that share the same spelling but have different meanings. For example, the word “squash” can be used to refer to a vegetable, a sport, a British fruit drink, and so on, making it a highly polysemic word.
Understanding polysemy can be an important factor in determining user intent. It might not be much of a concern for smaller websites, but eCommerce sites with a wide catalogue of products will find that the same word can actually be used to describe a variety of different products. And someone looking for squash sports equipment doesn’t want to see results for vegetables or fruit drinks.
In other cases, ambiguity is not lexical but syntactic and semantic. Ambiguities can for instance be generated by conjunctions, such as in “Jane is looking for discounted gloves and hats”. It’s not clear whether she’s looking for “discounted gloves and discounted hats” or “discounted gloves and both discounted and undiscounted hats”. However, in the world of eCommerce, Jane expects that you will know exactly which variation she intends to find.
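For the lexical case, one rough way to guess which “squash” a shopper means is to look at the other words in the query. The sketch below uses a simple overlap count against hypothetical, hand-written cue words for each sense; it is an illustration of the idea, not how any particular search engine resolves word senses.

```python
from typing import Optional

# Hypothetical cue words for each sense of "squash"; a real system would learn these.
SENSES = {
    "sport": {"racket", "racquet", "ball", "court", "goggles"},
    "vegetable": {"butternut", "organic", "fresh", "soup", "seeds"},
    "drink": {"bottle", "cordial", "litre", "blackcurrant"},
}


def guess_sense(query: str, ambiguous: str = "squash") -> Optional[str]:
    """Pick the sense whose cue words overlap most with the rest of the query."""
    words = set(query.lower().split())
    if ambiguous not in words:
        return None
    context = words - {ambiguous}
    overlaps = {sense: len(context & cues) for sense, cues in SENSES.items()}
    best = max(overlaps, key=lambda sense: overlaps[sense])
    # With no overlapping cue word at all, admit that the query stays ambiguous.
    return best if overlaps[best] > 0 else None


print(guess_sense("squash racket"))          # sport
print(guess_sense("butternut squash soup"))  # vegetable
print(guess_sense("squash"))                 # None
```

Note that the bare query “squash” stays unresolved, which is the honest answer: without context, the engine needs other signals, such as browsing history or the current category.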
(5) Natural Languages are Context-Based
Words and phrases take on different meanings depending on who is using them in a certain situation.
Consider the use of subjective qualifiers. Gradable adjectives like “tall” or “expensive” don’t have a single meaning. My conception of what “expensive” means might be very different from yours.
This conception might also change depending on what store I’m in at the moment. Something considered to be expensive at a dollar store would seem like a steal in comparison at a luxury shop.
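One way to picture a store-relative “expensive” is to tie it to each catalog's own price distribution. The sketch below assumes, purely for illustration, that “expensive” means the top quarter of a store's prices; the price lists are made up.

```python
from statistics import quantiles


def expensive_threshold(prices: list[float]) -> float:
    """Return the upper-quartile price, used here as an assumed cutoff for "expensive"."""
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is the third.
    return quantiles(prices, n=4)[2]


def is_expensive(price: float, prices: list[float]) -> bool:
    """Judge a price against the store it is sold in, not against a global number."""
    return price >= expensive_threshold(prices)


# Hypothetical price lists for two very different stores.
dollar_store = [1.0, 1.0, 1.5, 2.0, 3.0, 5.0, 8.0]
luxury_shop = [250.0, 400.0, 600.0, 900.0, 1500.0, 3000.0, 7000.0]

print(is_expensive(8.0, dollar_store))  # True
print(is_expensive(8.0, luxury_shop))   # False
```

The same eight-dollar item counts as expensive in one store and as a steal in the other, which is exactly the shift in meaning described above.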
Additionally, we typically rely on unarticulated and contextual information to understand what people say and mean. For example, if you’re in the bathroom section of a large department store, your statement, “I’m looking for a mirror” actually means, “I’m looking for a mirror for my bathroom”. If the store clerk showed you a mirror for living rooms, that would frustrate and confuse you.
There needs to be a way to tap into those contextual clues online to avoid customers having those negative feelings of irrelevance.
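Online, the closest thing to “standing in the bathroom section” is the category a shopper is currently browsing. The sketch below nudges results from that category upward; the result list, the scores, and the size of the boost are all hypothetical.

```python
from typing import Optional

# Hypothetical search results: (product name, category, base relevance score).
RESULTS = [
    ("Gilded living room mirror", "living room", 0.82),
    ("Fog-free bathroom mirror", "bathroom", 0.80),
    ("Hallway mirror", "hallway", 0.75),
]


def rerank(results, browsing_category: Optional[str], boost: float = 0.1):
    """Add an assumed fixed boost to results from the category being browsed."""

    def contextual_score(result):
        _name, category, base_score = result
        return base_score + (boost if category == browsing_category else 0.0)

    return sorted(results, key=contextual_score, reverse=True)


print(rerank(RESULTS, browsing_category="bathroom")[0][0])  # Fog-free bathroom mirror
print(rerank(RESULTS, browsing_category=None)[0][0])        # Gilded living room mirror
```

A fixed boost is the bluntest possible tool; it simply shows how one unspoken signal can change which mirror comes first.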
Why You Need Cognitive Search to Lead in eCommerce
Many eCommerce players still rely on legacy search technologies and age-old methods such as rule-based keyword search. These technologies fail to ensure relevance with simple queries, let alone when more complex and “natural” queries are involved. These methods don’t even come close to being able to accommodate the five finicky features of natural languages and natural language searches that we described above.
A wave of semantic search engines has made eCommerce search more meaningful and contextual by addressing some of the limitations of keyword search. However, what is truly needed to enable successful interactions between humans and eCommerce search engines, especially with the rise of conversational searches, is cognitive search.
That is why Coveo acquired Tooso – to deepen its natural language understanding, develop its NLP algorithms, and boost its cognitive search capabilities. Successful interactions are key to exceptional intelligent experiences, so this technology is the only way forward.
The above graphic displays just the beginning of what this technology has to offer. Cognitive search not only turbocharges search engines’ inferential and word-sense disambiguation capabilities, but it also learns individual customer characteristics and desires in real time by leveraging multiple digital data sources. This provides better results that only improve over time.
Only by combining machine learning, statistics, and linguistics does it become possible to process the spoken and unspoken signals to decode natural languages and ensure that Jane finds exactly what she wants when she searches for “a cheap elegant dress for my daughter”.
Jane already expects this. Are you able to deliver?