In our last article we looked at what enterprise search should do for the workplace. In this piece we will take a deep dive into how it works and, most importantly, how to make it work for you. 

As a reminder, enterprise search (also known as federated search) is a system that collects and classifies both structured and unstructured data from across all enterprise data sources, so that it can be returned in search results to the users looking for it. 

In practice, enterprise search is made up of various subsystems operating simultaneously in the front and back end. Thinking of the system as a process helps cut through the complexity without getting bogged down in technical details. That process starts with gathering content. 


The promise of enterprise search is that you can search everything. So if you are going to search everything, you need to connect to everything. That is done through connectors.

A “connector” enables you to plug in to a content source, and this can be done in two different ways. The first way involves pulling content from a source using a “crawler”. As its name implies, a crawler crawls through a source to extract its data, regardless of whether that data is structured or unstructured. 

  • Structured data is that which is formatted in a way that makes it easily searchable. For example: Excel files, product inventory, and customer names. 
  • Unstructured data is that which is not formatted in a way that makes it easily searchable. For example: text files, audio, video, and social media postings. 

The second way involves pushing content by forming an API connection between two systems, allowing content to be gathered from its source and sent to where it is being collected without the source itself needing to be accessed. 
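To make the pull and push models concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `Collector`, `crawl`, and `push` are hypothetical names, the "source" is just a dictionary standing in for a file share, and a real connector would talk to an actual system over its own API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    body: str


@dataclass
class Collector:
    """Hypothetical staging area where gathered content waits to be indexed."""
    received: dict = field(default_factory=dict)

    def accept(self, doc: Document) -> None:
        self.received[doc.doc_id] = doc


def crawl(source: dict, collector: Collector) -> int:
    """Pull model: the crawler walks the source and extracts every item."""
    for doc_id, body in source.items():
        collector.accept(Document(doc_id, body))
    return len(source)


def push(doc: Document, collector: Collector) -> None:
    """Push model: the source system sends its own content, e.g. through an API call."""
    collector.accept(doc)


collector = Collector()

# Pull: the search side reaches into the source (a dict standing in for a file share).
file_share = {"faq.txt": "Common customer issues", "inventory.xlsx": "sku,qty"}
crawled_count = crawl(file_share, collector)

# Push: the source hands over one new item without being crawled.
push(Document("ticket-1", "Watch battery drains fast"), collector)

print(crawled_count, sorted(collector.received))
```

Either way, the content ends up in the same place. The difference is only in who initiates the transfer: the crawler or the source system.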


After all the content is pulled or pushed out of its source system, it needs to be processed and stored in a single location: the index.


If you are going to find, you need to have an index. The index can be simple, containing just the name of a piece of content, let’s say. Or it can be comprehensive, containing every word, number or string within a piece of content. For the sake of this piece, we are assuming the more comprehensive type of index, called a Unified Index.
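The gap between a simple and a comprehensive index can be sketched with a small inverted index, which maps each term to the documents that contain it. This is an illustrative sketch only: the sample documents, `tokenize`, and `build_index` are made up for the example and do not describe how any particular product builds its index.

```python
import re
from collections import defaultdict

# Sample content, invented for illustration.
docs = {
    "doc1": {
        "title": "Common Customer Issues",
        "body": "Watch battery life is short. Replace the watch battery.",
    },
    "doc2": {"title": "Store Hours", "body": "We open at nine."},
}


def tokenize(text: str) -> list:
    """Lowercase the text and split it into words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict, fields: list) -> dict:
    """Map every term found in the chosen fields to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field_name in fields:
            for term in tokenize(doc[field_name]):
                index[term].add(doc_id)
    return dict(index)


simple_index = build_index(docs, ["title"])                 # names only
comprehensive_index = build_index(docs, ["title", "body"])  # every word

print("battery" in simple_index)                            # title-only index misses it
print(sorted(comprehensive_index.get("battery", set())))    # full index finds doc1
```

With the title-only index, a search for "battery" finds nothing, because the word never appears in a title. The comprehensive index finds the document because the body was indexed too.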


Once everything is brought into the unified index, further processing takes place in order to identify and categorize content so as to make it searchable. Essentially, content is analyzed and annotated based on the information it contains, so that it can later be matched to a related query entered by a user in the search box. 

The “information” captured in this step is content metadata which includes the keywords, ownership, department, version history, etc. associated with a piece of content. Metadata helps to make content retrievable by telling the enterprise search software, “If you need to know something about x, choose me!”

For example, a document meant to help someone resolve issues they’re having with the battery life of the watch they just bought may be titled “Common Customer Issues”. Neither “watch” nor “battery life” appears in the title, but both are repeated constantly throughout the document itself. The metadata would indicate that this document is indeed relevant to the query “issues with watch battery life”, so that it can be returned to the user who entered it. 
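Here is a rough sketch of how that annotation step might look, using the watch example. It assumes the simplest possible approach, counting the most frequent non-trivial words in the body, which is a stand-in for whatever analysis a real system performs. The `annotate` function, the stopword list, the sample body text, and the owner and department values are all hypothetical.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list; real systems use far larger ones.
STOPWORDS = {"the", "a", "is", "if", "your", "to", "of", "and", "it", "for"}


def annotate(title: str, body: str, owner: str, department: str, top_n: int = 3) -> dict:
    """Build a metadata record whose keywords are the most frequent terms in the body."""
    terms = [t for t in re.findall(r"[a-z]+", body.lower()) if t not in STOPWORDS]
    keywords = [term for term, _ in Counter(terms).most_common(top_n)]
    return {
        "title": title,
        "owner": owner,
        "department": department,
        "keywords": keywords,
    }


body = (
    "If the watch battery drains quickly, check the watch settings. "
    "Battery life improves when the watch display dims. "
    "Replace the battery if battery life stays short."
)

metadata = annotate(
    "Common Customer Issues",
    body,
    owner="support-team",           # hypothetical value
    department="Customer Support",  # hypothetical value
)
print(metadata["keywords"])
```

The title never mentions watches or batteries, yet the keywords pulled from the body do. That is the information the search software relies on when the title alone would not produce a match.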

After all content is collected and classified, and even further enriched through processes such as text summarization, concept extraction, and optical character recognition, search queries can then be processed and relevant information can be returned to the user in the search results. 

Processing Queries and Returning Results

Queries come in various forms: 

  • Questions: “why isn’t my watch battery working”
  • Phrases: “issues with watch battery life”
  • Or even just Keywords: “watch battery”

Enterprise search engines process any of the above query types and compare the terms used in the query (e.g. “issues with watch battery life”) to indexed information (e.g. “watch”, “battery life”) in order to match the query to any relevant content (e.g. “Common Customer Issues”). 

All relevant content is then ranked based on its degree of relevance to the query in question and returned to the user in that order as a list of search results on the search results page. 
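As a minimal sketch of matching and ranking, the example below scores each document by how many query terms it shares with the indexed terms, then sorts by that score. Term overlap is an assumption chosen for simplicity, since production engines use much richer relevance models. The indexed documents, the stopword list, and the `search` function are invented for the example.

```python
import re

# Indexed terms per document, invented for illustration.
indexed = {
    "Common Customer Issues": {"watch", "battery", "life", "issues"},
    "Watch Strap Sizing Guide": {"watch", "strap", "size"},
    "Store Hours": {"store", "hours", "open"},
}

# Words that carry little meaning on their own in a query.
STOPWORDS = {"with", "why", "isn", "t", "my", "the", "a"}


def search(query: str, indexed: dict) -> list:
    """Return titles sharing at least one term with the query, most overlap first."""
    terms = {t for t in re.findall(r"[a-z]+", query.lower()) if t not in STOPWORDS}
    scored = []
    for title, doc_terms in indexed.items():
        score = len(terms & doc_terms)
        if score:
            scored.append((score, title))
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]


phrase_results = search("issues with watch battery life", indexed)
keyword_results = search("watch battery", indexed)
print(phrase_results)
print(keyword_results)
```

Both the phrase and the keyword query return “Common Customer Issues” first, because it shares the most terms with each of them, while “Store Hours” is never returned at all.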

Why Enterprise Search Often Fails 

If you have engineered the above system, you may wonder why your workplace brethren are still unhappy. Chances are it’s because the metadata you may have added at one time has become outdated and stale. 

Enterprise search helps users find content and information, which is valuable. However, static findability alone is not enough to improve user proficiency and satisfaction, as there is no guarantee that what users find is what they actually need – especially as their needs change in different contexts over time. 

For example, two different users might be looking for vastly different things when searching for “watch battery.” One may be looking for information regarding how long a battery will last and another may be looking to buy a new one. 

If enterprise search operates at baseline functionality, as described above, both users will be presented with the same content first – that which is ranked as being the most relevant to the query “watch battery” rather than to the context and intent of the individual that enters that query. 
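The limitation can be sketched in a few lines. In the example below, the baseline ranking looks only at a relevance score, so both users see the same order. The second function adds a hand-written boost for content that matches the user's intent. That boost is not machine learning, only a stand-in for the kind of signal a learned model might supply, and the documents, scores, intent labels, and boost value are all assumptions made up for the illustration.

```python
# Two results for the query "watch battery", with invented relevance scores.
docs = [
    {"title": "How long does a watch battery last?", "relevance": 0.9, "intent": "learn"},
    {"title": "Buy replacement watch batteries", "relevance": 0.8, "intent": "buy"},
]


def baseline_rank(docs: list) -> list:
    """Rank by query relevance alone; the user plays no part."""
    ordered = sorted(docs, key=lambda d: d["relevance"], reverse=True)
    return [d["title"] for d in ordered]


def contextual_rank(docs: list, user_intent: str, boost: float = 0.2) -> list:
    """Rank by relevance plus a fixed boost when the content matches the user's intent."""
    def score(d: dict) -> float:
        return d["relevance"] + (boost if d["intent"] == user_intent else 0.0)

    ordered = sorted(docs, key=score, reverse=True)
    return [d["title"] for d in ordered]


same_for_everyone = baseline_rank(docs)
for_buyer = contextual_rank(docs, user_intent="buy")
for_researcher = contextual_rank(docs, user_intent="learn")
print(same_for_everyone[0])
print(for_buyer[0])
print(for_researcher[0])
```

The baseline puts the battery-life article first for everyone. Once intent is taken into account, the user looking to buy sees the purchase page first, while the user researching battery life still sees the article.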

Towards Intelligent Search with Machine Learning 

Individual relevance and personalized results are where the value-add potential of search lies, and that only becomes a guarantee when enterprise search is transformed into intelligent search through the application of machine learning. Only then will users be able to find exactly what they need to fulfill their intent. 

However, just as enterprise (or federated) search alone is not enough to achieve outcomes of proficiency and satisfaction, simply applying machine learning in name alone is not enough either. It must be done strategically to be effective. 

This is why it is necessary to become very familiar with how enterprise search itself works (as described above). Only then will you be able to understand where and how you need to apply machine learning in order to take site search to the next level, intelligent search, and truly make it work for you and your business. 


With 60% of shoppers reporting that they are frustrated when site search results aren’t tailored to their past online behavior or search query, it’s clear that this shift needs to happen right now to meet user expectations and make them stay. 

To learn more about the transformative qualities of machine learning, check out our next post on intelligent search.


About Emily Hunt

Emily Hunt is a political scientist turned Senior Content Specialist at Coveo. As a data-driven content marketer, she combines her analytical skills with a passion for storytelling to produce compelling content across all areas. If she’s not writing, she’s exploring the great outdoors, jamming out to rock ‘n’ roll, or reading Harry Potter...again.
