The Future of Search – Search Solutions Conference 26th November 2015

Future of Search - a theme of Search Solutions 2015

Search Solutions 2015 took place at the British Computer Society (BCS) headquarters in Southampton Street, Covent Garden, London, UK, and was jointly convened with the International Society for Knowledge Organization (ISKO). The event was a sell-out, with over 80 attendees from business and academia. The morning's two sessions focused first on the consumer web (with talks from Google, Yahoo and Microsoft) and then on search in the workplace. After lunch the sessions covered systematic reviews and annotating & searching images, followed by the challenges of organizing and presenting news content. A lively panel discussion (with wine) closed out the conference, with topics ranging from Boolean Blackbelts to weak and strong Artificial Intelligence (AI).

Consumer web

Google: Behshad Behzadi from Google opened the conference with a talk on conversational search. The comment was made that Star Trek-like conversations with a computer could be less than 20 years away. The changing nature of search was discussed: in 2015, for the first time, the number of searches made on mobile devices exceeded those made on the desktop. With mobile devices the need for conversational search is heightened by smaller screens and the fact that people may be doing something else (e.g. cooking, driving). Recent breakthroughs in voice recognition mean that error rates in conversational search have halved (from 16% to about 8%) in the last couple of years. Behshad encouraged people who perhaps had a poor experience of conversational search a few years ago to give it another go. The remainder of Behshad's talk was a demonstration of natural language conversational search using his mobile phone. Questions such as "Where is my hotel?", "I want a picture of the Empire State Building", "Who built it?" and "When?" were duly answered in an impressive demo.

The audience Q&A session raised some interesting points on the nature of ‘facts and knowledge’ and who decides what to pick, as many ‘facts’ involve some subjectivity. Google is extending the coverage of question areas for its mobile assistant, although these remain of the simpler ‘fact’ based type.

Yahoo: Fabrizio Silvestri from Yahoo followed with a talk on query rewriting: how the search engine can use 'deep learning' to transform a query entered by a searcher into queries that are more likely to meet the searcher's intent. This is of particular interest for sponsored search and advertising. A novel hybrid technique was presented, based on word embedding concepts from Google's word2vec (https://code.google.com/p/word2vec/), with vector space analysis used to identify content and context similarity not just of words or queries but of whole sessions. The methods were tested on 12 billion sessions from the US Yahoo website, using the 45 million most frequent queries. Examples were shown, such as the query "snow boarding" resulting in an ad for "snowboarding gear". Results using Normalized Discounted Cumulative Gain (NDCG), which measures ranking quality, indicated that the method outperformed non-hybrid methods and could increase coverage by 20%, and empirical evidence indicated that the technique had generated additional (undisclosed) revenue through increased ad click-through.
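
To make the idea concrete, here is a minimal, hypothetical sketch of embedding-based query rewriting in Python: each query is represented as the average of its word vectors, and candidate rewrites are ranked by cosine similarity. The toy vectors and candidate list below are illustrative stand-ins, not Yahoo's actual model or data.

```python
# Illustrative sketch of embedding-based query rewriting (not Yahoo's pipeline):
# represent each query as the average of its word vectors and pick the candidate
# rewrite with the highest cosine similarity to the user's query.
import numpy as np

# Hypothetical word vectors (in practice, word2vec-style embeddings trained on sessions).
word_vectors = {
    "snow":         np.array([0.90, 0.10, 0.00]),
    "boarding":     np.array([0.80, 0.20, 0.10]),
    "snowboarding": np.array([0.85, 0.15, 0.05]),
    "gear":         np.array([0.70, 0.30, 0.20]),
    "hotels":       np.array([0.10, 0.90, 0.30]),
    "cheap":        np.array([0.20, 0.80, 0.40]),
}

def embed(query):
    """Average the vectors of the query's words (ignore out-of-vocabulary words)."""
    vecs = [word_vectors[w] for w in query.split() if w in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

user_query = "snow boarding"
candidate_rewrites = ["snowboarding gear", "cheap hotels"]

# Rank candidate (sponsored) queries by similarity to the user's query.
ranked = sorted(candidate_rewrites,
                key=lambda q: cosine(embed(user_query), embed(q)),
                reverse=True)
print(ranked[0])  # expected: "snowboarding gear"
```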

Microsoft: Bhaskar Mitra of Microsoft Research/Bing continued the theme of 'deep learning' techniques to represent 'text embeddings' (typically dense vector representations). A Google word2vec example was discussed where the user can take the query ('king' minus 'man' plus 'woman') and arrive at 'queen'. Vector spaces were discussed, where some distance metric typically defines a notion of relatedness between items. Bhaskar also discussed transitions, i.e. the direction of the edges between items: for example, the offset between 'Seattle' and 'Seattle Seahawks' is similar to the offset between 'Denver' and 'Denver Broncos'. Search sessions as part of embedding spaces were discussed as a potential area for further research. Pointwise Mutual Information (PMI) was also presented as a method for learning a text embedding that does not involve neural networks. The talk concluded with different notions of relatedness (type vs topical), which may suit different purposes, and a word of caution on using pre-trained models that may already have a particular sense of relatedness embedded in them.
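
As a rough illustration of the non-neural route Bhaskar mentioned, the sketch below computes PMI from word co-occurrence counts over a tiny toy corpus; a word's row of PMI scores against the vocabulary can then serve as its embedding. The corpus and code are hypothetical, not from the talk.

```python
# Illustrative PMI computation over sentence-level co-occurrence counts.
import math
from collections import Counter
from itertools import combinations

corpus = [
    "seattle seahawks won the game",
    "denver broncos won the game",
    "seattle is a city",
    "denver is a city",
]

word_counts = Counter()
pair_counts = Counter()
for sentence in corpus:
    words = sentence.split()
    word_counts.update(words)
    # Count each unordered word pair co-occurring in the same sentence.
    pair_counts.update(frozenset(p) for p in combinations(set(words), 2))

total_words = sum(word_counts.values())
total_pairs = sum(pair_counts.values())

def pmi(w1, w2):
    """PMI(w1, w2) = log( p(w1, w2) / (p(w1) * p(w2)) )."""
    p_pair = pair_counts[frozenset((w1, w2))] / total_pairs
    p_w1 = word_counts[w1] / total_words
    p_w2 = word_counts[w2] / total_words
    return math.log(p_pair / (p_w1 * p_w2)) if p_pair > 0 else float("-inf")

print(pmi("seattle", "seahawks"))  # positive: the words co-occur
print(pmi("seattle", "broncos"))   # -inf here: never co-occur in this toy corpus
```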

Search in the workplace

Tony Russell-Rose (UXLabs) discussed a tool (http://www.2dsearch.com/) that had been built under an Innovate UK grant to look at complex query formulation. A further Smart Award-funded project looked at the market for such a tool. A review was undertaken of business sectors that perform complex queries. Communities already studied in the literature include patent agents, lawyers, librarians, media monitoring and healthcare. A community that had not been studied before was identified: recruitment professionals. An online survey of 64 professional recruiters was conducted. Results included an average of three hours per search task and five queries per task, much shorter than for patent agents. Boolean queries were commonplace and rated as most desirable. Queries were written on paper or in a text editor, so the tooling was pretty basic. Evidence of satisficing was present, with recruiters examining far fewer results on average than patent searchers (30 compared to 100). Tony concluded his talk by suggesting the possible creation of a new Text REtrieval Conference (TREC) test collection for this community, so that it could be researched further and search solutions optimized.
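
For readers unfamiliar with the style of query involved, here is a hypothetical recruiter-style Boolean query evaluated with simple set logic over toy candidate profiles; it is purely illustrative and not taken from the survey.

```python
# Hypothetical example of a recruiter-style Boolean query, evaluated with set
# logic over toy profiles (real queries run against job boards and CV databases).
profiles = {
    "cand1": "senior java developer london spring hibernate",
    "cand2": "python data scientist remote",
    "cand3": "java engineer manchester contractor",
}

def terms(profile_text):
    return set(profile_text.split())

# Roughly: (java OR python) AND (london OR remote) NOT contractor
matches = [
    cid for cid, text in profiles.items()
    if (terms(text) & {"java", "python"})
    and (terms(text) & {"london", "remote"})
    and "contractor" not in terms(text)
]
print(matches)  # ['cand1', 'cand2']
```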

Charlie Hull (Flax) called for improvements in the methodology of how we test and improve search relevance. He presented four reasons why testing may not be given the consideration it deserves: (1) 'search is magic', (2) 'the new search tool is better than the old one', (3) 'search doesn't affect the bottom line' and (4) 'we can just fix this one issue'. The use of Excel spreadsheets to track queries and score relevance was presented as commonplace, as was poor communication between the business and developers. For one client, Charlie adapted an open source based tool called Quepid (http://quepid.com/), which enabled manual assessments of query relevance to be captured more efficiently. In conclusion, Charlie suggested that we should test search in a methodical way, that collaboration between the business and developers is key, and that some tools already exist which may be first steps to better relevance tuning.
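
A minimal sketch of the kind of repeatable relevance test being advocated (not Quepid itself): store judged queries with graded relevance and score each ranking with a metric such as NDCG, so that tuning changes can be compared over time. The judgments below are hypothetical.

```python
# Score a set of judged queries with NDCG so relevance tuning can be tracked.
import math

def dcg(relevances):
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# Hypothetical judgments: relevance grades for the documents a query returned,
# in the order the search engine ranked them (3 = perfect, 0 = irrelevant).
judged_queries = {
    "red shoes":   [3, 2, 0, 1],
    "winter coat": [0, 3, 3, 2],
}

for query, grades in judged_queries.items():
    print(query, round(ndcg(grades), 3))
```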

Plenty of questions throughout the day

In the audience Q&A, the issue of personalization was raised: what is relevant for one person may be different for another. This level of complexity may be crucial to a retailer and to other sectors. It was stated that many organizations that deploy enterprise search have yet to do even the basics, so personalization is a level of sophistication they have not got to (yet).

Systematic Reviews using text mining

Alison Weightman from the Cochrane Information Retrieval Methods Group at Cardiff University presented lessons learned from search strategies and techniques used in healthcare, searching repositories such as PubMed. The aim is to minimize the number of databases searched and to boost the ranking of the most relevant 'jewels in the crown'. The emphasis was on text mining for word combinations that lead to other related items. Recommendations to use automatic text extraction have been made in the industry. The area of 'musculoskeletal care pathways for adults with hip and knee pain at the interface between primary and secondary care' was used as an example. This was translated into a query of musculoskeletal+pathway+primary/secondary. The twelve relevant reports already known were used to help seed further queries. Tools such as PubReMiner, TerMine and the Yale MeSH Analyzer were used. Abstrackr (http://abstrackr.cebm.brown.edu) was also found to be very useful. Terms and phrases such as "conversion rate" (relating to physiotherapy or surgery) and "Referral and Consultation" were identified that had not been suggested by the experts beforehand. In conclusion, free text-mining tools saved time and improved the accuracy of systematic reviews. A final point was made regarding whether using two screens (rather than one) speeds up the search process; existing research indicates it does not, but Alison felt this was probably an area that would benefit from more research.
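
The underlying idea can be sketched in a few lines of Python (the talk used dedicated tools such as PubReMiner and TerMine, not code like this): mine the seed set of known relevant abstracts for frequent terms and bigrams that the original query did not contain, as candidates for extending the search strategy. The abstracts below are invented for illustration.

```python
# Mine seed abstracts for candidate terms and bigrams missing from the query.
from collections import Counter

seed_abstracts = [
    "referral and consultation pathways for hip pain in primary care",
    "conversion rate from physiotherapy to surgery in knee osteoarthritis",
    "musculoskeletal care pathway between primary and secondary care",
]
original_query_terms = {"musculoskeletal", "pathway", "primary", "secondary"}
stopwords = {"and", "for", "in", "to", "from", "the", "of", "between"}

term_counts = Counter()
bigram_counts = Counter()
for abstract in seed_abstracts:
    words = [w for w in abstract.split() if w not in stopwords]
    term_counts.update(w for w in words if w not in original_query_terms)
    bigram_counts.update(zip(words, words[1:]))

# Candidate terms/phrases to add to the search strategy, most frequent first.
print(term_counts.most_common(5))
print(bigram_counts.most_common(3))
```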

A question was asked about whether items were found that had potentially been misclassified or had incomplete metadata. This was apparently commonplace in PubMed, and relying on metadata searches alone was not sufficient.

Image annotation

Tom Crane (Digirati) presented the International Image Interoperability Framework (IIIF) (http://iiif.io), which uses a common URI pattern to request pixels from any participating image library. The API allows parameters such as region, rotation and size to be specified in the request. IIIF supports sharing and citation of images and can drive interoperable deep-zoom viewers such as Mirador or the Universal Viewer. This enables easy comparison, annotation, transcription and translation of images. Images can be combined from different locations (e.g. dispersed letters) into a single virtual object for viewing. There are over 100 libraries and cultural heritage institutions using IIIF, with over one billion images available. Open Annotation (W3C) is another standard that allows annotations to be added to images (e.g. to a region within an image). These annotations may support classifying, highlighting, reviewing, commenting or bookmarking, for example, and can also be searched using a search API. There is a working Google Group for IIIF with two weekly calls.
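
The IIIF Image API URI pattern is {server}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}; the small helper below builds such URLs. The host and image identifier are hypothetical, but any IIIF-compliant image server should accept requests of this shape.

```python
# Build IIIF Image API request URLs from the standard path segments.
def iiif_image_url(server, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    return f"{server}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Full image, and a 1000x800 pixel region scaled to 500px wide.
print(iiif_image_url("https://images.example.org/iiif", "ms-folio-42r"))
print(iiif_image_url("https://images.example.org/iiif", "ms-folio-42r",
                     region="2048,1024,1000,800", size="500,"))
```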

A question was asked regarding video, audio and 3D/4D objects. This is an area the community will be addressing over the next couple of years.

Dave Clarke (Synaptica) built on Tom's talk, presenting a tool built by his company (OASIS Deep Semantic Indexing) that combines image annotation (using IIIF), knowledge organization (e.g. taxonomies, ontologies), semantic indexing and linked data. The use case presented was artwork such as Renaissance painting: highly symbolic, hard to understand, dense with figurative detail and iconography (stories within stories). A clear need was presented to explain the ideas behind imagery (which also applies in medical, engineering and intelligence-gathering contexts). The workflow was explained: content is ingested into IIIF servers and open vocabularies into a triple store, followed by custom Knowledge Organization System (KOS) work, before images are annotated and tagged to the open vocabularies (Linked Data). Searching is achieved through SPARQL queries in real time. Examples were presented of annotating just 'Mona Lisa's smile' and how that can be expressed as an IIIF URI. A demonstration was given of 'browsing', 'searching' and 'discovering' the images and annotations, with visually impressive elements such as monochrome masking to emphasise annotated regions during image browsing.
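
The general pattern of storing annotations as linked data and retrieving them with SPARQL might look roughly like the sketch below, which uses rdflib over a tiny in-memory graph. The namespace, properties and URIs are hypothetical and are not Synaptica's actual data model.

```python
# Toy linked-data annotation store queried with SPARQL (hypothetical schema).
from rdflib import Graph, Namespace, URIRef, Literal

EX = Namespace("http://example.org/annotations/")
g = Graph()

# One toy annotation: a region of an image tagged with a concept.
g.add((EX.anno1, EX.targetsImageRegion,
       URIRef("https://images.example.org/iiif/mona-lisa/2048,2560,600,300/full/0/default.jpg")))
g.add((EX.anno1, EX.taggedWith, EX.Smile))
g.add((EX.anno1, EX.comment, Literal("The sitter's famously ambiguous smile")))

# Find all image regions annotated with the concept ex:Smile.
query = """
PREFIX ex: <http://example.org/annotations/>
SELECT ?region ?comment WHERE {
    ?anno ex:taggedWith ex:Smile ;
          ex:targetsImageRegion ?region ;
          ex:comment ?comment .
}
"""
for row in g.query(query):
    print(row.region, "-", row.comment)
```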

The audience asked questions around the 'subjectivity' of annotation, who decides what is curated, and full image recognition. Dave mentioned that full image recognition is an area that is being investigated.

Searching news content

Tessa Radwan (Newspaper Licensing Agency (NLA)) discussed the challenges of being both a digital news distributor and a news archiver. Using the search tool ClipShare (based on Solr), they provide search over 56 million news articles. These articles can be about almost any topic, so categorization is a challenge: they come from many different parties and journalists' tagging is not standard or consistent across the industry. There are also diverse user groups with different perceptions of relevance. A big challenge is receiving multiple versions of news articles that evolve as the story evolves. Solr's field collapsing function was originally used to match versions and show only a single one, but it was considered too slow. A set of heuristic rules (looking at title, date, publication and common words) was developed to assess duplicates, which was more effective. Another challenge was to consolidate 4,000 categories into a meaningful number that could be presented to users in a faceted search interface; this was reduced to 100 categories (e.g. World, Business, Lifestyle, Celebrities). For example, in some publications 'Kardashians' is a category, whereas in NLA ClipShare it is not a category but an 'example' within the 'Celebrities' category.
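
A heuristic rule set of the kind described might look something like the sketch below (purely illustrative, not the NLA's production rules): compare title, publication, date and word overlap, and treat a pair as versions of the same story when enough signals agree.

```python
# Heuristic duplicate detection based on title, publication, date and word overlap.
def word_overlap(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def looks_like_same_story(art1, art2, overlap_threshold=0.6):
    same_publication = art1["publication"] == art2["publication"]
    same_day = art1["date"] == art2["date"]
    similar_title = word_overlap(art1["title"], art2["title"]) >= overlap_threshold
    similar_body = word_overlap(art1["body"], art2["body"]) >= overlap_threshold
    # Require most signals to agree before collapsing versions into one.
    return sum([same_publication, same_day, similar_title, similar_body]) >= 3

a = {"publication": "Daily Example", "date": "2015-11-26",
     "title": "Markets rally after rate decision",
     "body": "Shares rose sharply after the central bank held rates steady"}
b = {"publication": "Daily Example", "date": "2015-11-26",
     "title": "Markets rally after surprise rate decision",
     "body": "Shares rose sharply after the central bank held interest rates steady"}
print(looks_like_same_story(a, b))  # True
```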

David Corney (Signal Media Ltd, a three-year-old tech start-up) presented his company's software, SIGNAL, and how multiple versions (duplicates) of news articles are identified. They collect (one source being the NLA), analyse and distribute over one million news articles a day. Their search solution is based on Elasticsearch, with third-party tools for semantic analysis and entity extraction based on Wikipedia; this can detect from the surrounding words that a mention of 'David Cameron' refers to the Prime Minister rather than a different 'David Cameron'. Clients use the service for brand monitoring, PR and market intelligence. Duplicate news articles cause problems and get in the way. String matching works but is very sensitive to small changes; cosine similarity is better but would take too long on the available infrastructure. A novel technique was used involving shortlist generation and pairwise analysis, with hashing and the search engine itself helping with duplicate reduction. Of the one million news articles a day, 500,000 have a title already seen, 100,000 have content already seen and 400,000 are distinct articles. Social media was discussed and may be an area to integrate in some way in the future.
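
The shortlist-then-compare pattern can be sketched as follows (a rough illustration, not Signal's actual pipeline): hash a cheap key, here the normalised title, to bucket candidate duplicates, then run a more expensive pairwise shingle comparison only within each bucket.

```python
# Shortlist candidate duplicates by title hash, then compare bodies pairwise.
from collections import defaultdict
from itertools import combinations

def title_key(article):
    # hash() is fine for within-process bucketing; a stable hash would be used in practice.
    return hash(article["title"].lower().strip())

def shingles(text, n=3):
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    return len(a & b) / max(len(a | b), 1)

articles = [
    {"id": 1, "title": "PM announces new policy",
     "body": "the prime minister announced a new policy on housing today"},
    {"id": 2, "title": "PM announces new policy",
     "body": "the prime minister announced a new policy on housing this morning"},
    {"id": 3, "title": "Storm hits the coast",
     "body": "a severe storm hit the east coast overnight"},
]

# Shortlist: only articles sharing a title key are ever compared pairwise.
buckets = defaultdict(list)
for art in articles:
    buckets[title_key(art)].append(art)

for bucket in buckets.values():
    for a, b in combinations(bucket, 2):
        if jaccard(shingles(a["body"]), shingles(b["body"])) > 0.5:
            print(f"articles {a['id']} and {b['id']} look like duplicates")
```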

Panel session

Same procedure as every year: fishbowl session

The interactive panel session debated the tension between 'old-fashioned' Boolean search, which seems as prevalent today as it was twenty years ago, and 'state of the art' conversational search. Research was cited indicating that non-legal staff often found relevant items using 'Google-like' search that expert legal Boolean searchers missed. Also, using Boolean brackets automatically switches off most search engine 'smarts'; it can be a harsh method.

It was argued that techniques like conversational (AI-based) search have 'democratised search' and that 'we are all using AI'. At the same time, it was felt that not everyone likes the search engine reformulating their initial query to match what it considers to be the intent; there is something 'black box' about it, and 'lawyers would be aghast'. Is education a critical point?

The debate then centred on AI, with some practitioners pushing back at terms like 'AI' and 'cognitive computing' being used to sell software: 'it is not true AI'. Tony Russell-Rose made the point that when he did his PhD in AI, voice recognition was considered AI; that problem is all but solved, so the goalposts for what counts as AI constantly move. There is, though, a clear distinction between weak AI (which we are all using today) and strong AI (which moves into the philosophical debate about consciousness). A point was made that to deny that 'conversational search' is AI may be to argue fine academic points; to most of the population, that is exactly what it is.

In summary, regarding the future of search, there are clearly many different wants and needs. There also appears to be a convergence of many traditionally separate disciplines, such as information retrieval, apps/workflow tools, image, text & data analytics, social media, knowledge organization, AI and business intelligence. Needs range from looking up a known fact, through pragmatic or systematic exploration and research of a topic with thousands of results, to being stimulated by the unexpected. How these are implemented on mobile devices and desktops, for the consumer, the enterprise and the information provider or broker, may vary significantly. Whatever the future holds, it is likely to be rich, diverse and exciting!