Diagnosing Enterprise Search

As a digital workplace consultant, I often find myself in workshops with employees to gather requirements. Invariably, within a few minutes someone will say “we can never find stuff, the search is awful”, and the whole group will nod in agreement.

My company, ClearBox Consulting, has been working in the intranet and digital workplace space since 2007, with clients ranging from small charities to multinationals with over 100,000 employees. Although we don’t specialise in search, we fully appreciate that to most users the intranet is the front door to their enterprise search, and if search isn’t working then, in their eyes, it is the intranet’s fault.

There are many reasons why enterprise search can fail to perform, but non-expert users tend to fixate on the search engine as the underlying culprit. To overcome this perception, we created a simple diagnostic tool. We use it with intranet managers, knowledge managers and content publishers to help them understand other potential causes, and – crucially – appreciate that there are positive actions they can take.

Searching step by step

Consider the search process as four basic steps:

  1. Content is published
  2. The search engine indexes it
  3. A query retrieves a selection from the content
  4. The user uses the query results to complete their search

This greatly simplifies what really happens, but from a diagnostic point of view it gives us four useful starting points for things that might go wrong.
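
To make the four steps concrete, here is a toy sketch in Python. The function names, document ID and content are purely illustrative, not any real search engine’s API; a real engine does vastly more at each step.

```python
from collections import defaultdict

documents = {}                      # step 1: published content lives here
inverted_index = defaultdict(set)   # step 2: term -> IDs of documents containing it

def publish(doc_id, text):
    """Step 1: content is published."""
    documents[doc_id] = text

def index(doc_id):
    """Step 2: the engine indexes it (naive whitespace tokens)."""
    for term in documents[doc_id].lower().split():
        inverted_index[term].add(doc_id)

def query(terms):
    """Step 3: retrieve the documents matching every query term."""
    hits = [inverted_index.get(t.lower(), set()) for t in terms]
    return set.intersection(*hits) if hits else set()

publish("hr-001", "Parental leave policy for all employees")
index("hr-001")
# Step 4: the user reviews what comes back
print(query(["parental", "leave"]))   # {'hr-001'}
```

The value of the sketch is diagnostic: each function is a separate place where things can go wrong, which is exactly how the tool below is organised.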

Figure 1: Search diagnostic (a hierarchical view of search failure causes)

Using the tool

For each step in the process, there are things that need to go right, such as metadata, security settings and results presentation. In Figure 1, these appear in the second column. The last two columns reflect underlying symptoms.

It’s not practical to go through the diagnostic for all the content in an enterprise. Instead, when you get feedback that “search isn’t working”, use the tool to check for systemic issues that might apply broadly to sets of content. In particular, note that only a few underlying causes are ‘technical issues’ (green), indicating a search engine problem.

1. Failures of content

It sounds obvious, but often the big issue in enterprise search is that the thing somebody is searching for just doesn’t exist (1.1 in the figure).

Metadata (1.2) can often be poor or lacking. Just using good writing principles for headlines and subheads can help.

Language (1.3) can also present a barrier. A technical document may be written in jargon (“variable performance related pay”) when a user searches in plain English (“bonus”). Harder still, we may expect everything to be in our own language and overlook others (“2016 sales results for Spain” wouldn’t necessarily find a document called “Resultados de ventas de España 2016”).
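
One mitigation for the jargon gap is synonym expansion at query time. Below is a minimal sketch, assuming a hand-maintained synonym map; the map and its terms are invented for illustration, not drawn from any real thesaurus.

```python
# Hypothetical jargon map: plain-English term -> internal equivalents
SYNONYMS = {
    "bonus": ["variable performance related pay", "incentive pay"],
}

def expand_query(query):
    """Return the original query plus any known jargon equivalents."""
    terms = [query]
    terms.extend(SYNONYMS.get(query.lower(), []))
    return terms

print(expand_query("bonus"))
# ['bonus', 'variable performance related pay', 'incentive pay']
```

Most enterprise search products support some form of thesaurus or synonym list; the hard part is curating it, which is an editorial task, not a technical one.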

2. Indexing failures

The first failure point for indexing (2.1) is that the content needed isn’t indexed. Unlike the web, a great deal of enterprise content might have security controls in place, blocking the indexer from seeing it.

More fundamentally, content may exist in a system that the crawler can’t access, such as a network drive or an application. For example, HR departments may move all their guidelines into an employee self-service system, but if there is no connector with the enterprise search engine then routine content like “Parental leave policy” won’t get indexed.
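
A simple way to surface this gap is a coverage check: compare the systems where content actually lives against the sources the crawler is configured to index. The system names below are made up for illustration.

```python
# Systems where content lives (assumed names, for illustration)
content_sources = {"intranet", "hr-self-service", "network-drive-g"}

# Sources the enterprise search crawler actually indexes
crawled_sources = {"intranet"}

# Anything in the first set but not the second is invisible to search
uncrawled = content_sources - crawled_sources
for source in sorted(uncrawled):
    print(f"Not indexed: {source}")
# Not indexed: hr-self-service
# Not indexed: network-drive-g
```

In practice this is an audit exercise rather than a script, but the logic is the same: list where content lives, list what is crawled, and investigate the difference.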

3. Retrieval failures

Largely we rely on the search engine technology to get this right (3.1). However, too many results can be a symptom of duplicate content or ROT (Redundant, Outdated, Trivial), meaning a clean-up is in order. It may also point to the absence of good refiners to whittle results down, say, to the last six months or to sales collateral only (see Metadata, 1.2).
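
A refiner is easy to picture as a filter over the result set. Here is a sketch of a “last six months” refiner; the result records and dates are invented for illustration.

```python
from datetime import date, timedelta

# Invented result records, each with a last-modified date
results = [
    {"title": "Sales collateral 2024", "modified": date(2024, 11, 1)},
    {"title": "Old price list",        "modified": date(2019, 3, 5)},
]

def refine_last_six_months(results, today):
    """Keep only results modified in (roughly) the last six months."""
    cutoff = today - timedelta(days=182)
    return [r for r in results if r["modified"] >= cutoff]

for r in refine_last_six_months(results, date(2025, 1, 1)):
    print(r["title"])   # Sales collateral 2024
```

The refiner only works, of course, if the modified-date metadata is reliable, which loops back to point 1.2.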

Retrieval also relies on user search skills. Google is so good that we’ve grown lazy, but enterprise search sometimes demands real skill, such as the use of logical operators. If that’s unrealistic, consider ready-made search interfaces to reduce the cognitive load on the user.
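
Logical operators are easiest to explain as set operations on the results each term would match on its own. A sketch, with invented document IDs:

```python
# Invented term -> matching-document sets
matches_policy  = {"doc1", "doc2", "doc3"}
matches_leave   = {"doc2", "doc3", "doc4"}
matches_expired = {"doc3"}

# "policy AND leave NOT expired" narrows the results step by step:
# AND keeps the intersection, NOT subtracts the unwanted set
result = (matches_policy & matches_leave) - matches_expired
print(sorted(result))   # ['doc2']
```

A ready-made interface effectively builds expressions like this behind the scenes, so the user never has to type an operator.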

4. Search results

Finally we get to the results page. If you’ve ever done observational user testing, you’ll know that sometimes people seem to fly straight past the answer and reach for their phone to ask for help. So the layout of the results page matters (4.1), and the good news is that this can usually be changed quite readily.

Hits within long documents can make results harder to scan (4.4). If the answer is on page 52 of a document, consider breaking it into HTML pages. If the document exists but isn’t shown, check the security settings (4.3).

Finally, users may find the right result but carry on searching because they don’t trust it (4.5). Governance and publisher training can help here, for example by showing owner and expiry details on content. Ratings and feedback can help too.
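
Some of that governance can even be automated. Here is a sketch of an expiry check that flags pages past their review date; the page records, fields and dates are invented for illustration.

```python
from datetime import date

# Invented page records with an owner and a review-by date
pages = [
    {"title": "Expenses policy",  "owner": "Finance", "review_by": date(2023, 6, 1)},
    {"title": "Brand guidelines", "owner": "Comms",   "review_by": date(2030, 1, 1)},
]

def overdue(pages, today):
    """Titles of pages whose review date has passed."""
    return [p["title"] for p in pages if p["review_by"] < today]

print(overdue(pages, date(2025, 1, 1)))   # ['Expenses policy']
```

A periodic report like this, sent to content owners, keeps expiry details honest so that users have a reason to trust what they find.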
