Search Solutions 2022 Conference report

Search Solutions is managed by the Information Retrieval Specialist Group of the British Computer Society and is the only broad-spectrum search event outside of the USA. The conference was held at the BCS London office on 23 November, preceded by a day of Tutorials.

It was the first on-site Search Solutions event since 2019! What follows is a brief summary of the presentations, based on heavy note taking, with a link to each author and links to research papers and web sites mentioned by the authors in the course of their presentations.

The conference opened with a presentation by Natasha den Dekker on the approach being taken by LexisNexis to understand the expectations of users and the extent to which the search applications meet them. In the process Natasha gave a very good introduction to user research, describing the differences between behavioural and attitudinal techniques, with an emphasis on the benefits and challenges of A/B testing. She also highlighted the importance of diary studies, which take a lot of effort to set up and execute but bring substantial rewards in understanding the day-by-day use of a search application. (See also https://www.nngroup.com/articles/guide-ux-research-methods/)

The next paper was presented by Amy Walduck over a Zoom link from Brisbane, Australia. Amy started with a moving acknowledgement of the debt that Queensland owed to its antecedents. She then described a topographical approach to understanding large-scale user logs of over 8 million searches a year on the Library Catalogue, all based on open source software and open data that had been redacted to remove any personal information. Amy remarked that there had been a steady trend over the last few years of queries being framed as questions, in particular ‘How’ and ‘What’ question formats.

After a break Brammert Ottens (Spotify) outlined the search strategy that had been adopted by the company, supporting both text and voice search. He framed his presentation around Mindsets (Focused, Open and Exploratory) and Intents (Listen, Organise, Share and Fact Check). Spotify is fortunate in being able to follow the history of a search, as it has data on what the user then listened to and for how long, making it easier (but still very challenging at scale) to optimize the search experience. (See also https://dl.acm.org/doi/10.1145/3290605.3300529)

Another large-scale search implementation was described by Mohamed Yahya from Bloomberg. He focused on recent efforts to develop question answering functionality, with the criterion that the outcome has to be correct at the time of presentation and explainable. The target was high precision rather than high recall. The system took a view on whether the question was answerable, given the scope of the repository, and if there was not adequate confidence the response was presented as a display of results rather than a narrative text response.

Of course, when it comes to scale Google takes the accolade. Filip Radlinski talked about the increasingly blurred boundary between search and recommendation, focusing on the challenges of searching for film information based on soft attributes, such as scary, uplifting and boring. This comes down to the issue of subjectivity, which Filip discussed in terms of degree, semantic and compositional. Filip reflected on a number of overarching issues in his paper, including transparency (data, model and algorithm) and the lack of an adequate range of corpora to work on natural language search. (See also https://arxiv.org/abs/2205.09403)

After lunch Farhad Shokraneh gave a quite impassioned paper, entitled ‘Futures of Systematic Searching’, about the problems that systematic searching gives rise to. The plural in the title was not a spelling mistake! Farhad started out describing the process of setting up a systematic review and the challenges of coping with a situation where the review process was in effect invalidated because one or more research papers had been published since the original scope of the review had been finalized. He emphasized that it was not just a matter of rerunning the search, as more recent research might require the scope and strategy to be reconsidered. Another issue he mentioned arose when a machine learning routine decided to downgrade the relevance of papers that did not have an abstract. Farhad concluded by presenting four versions of the future of systematic reviews. (See also https://www.sciencedirect.com/science/article/pii/S266730532200031X)

Gavin Moore (University Hospitals Coventry & Warwickshire NHS Trust) continued the healthcare theme with an application that he and Andrew Doyle had developed to store and search clinical guidelines. I know from a project I carried out a few years ago for a major hospital that this is far from a trivial challenge, as there are both Trust and NHS-wide guidelines, which up until March 2022 were maintained by NICE. The solution was based on the Google app and was an excellent example of how a very effective search solution could be developed with very limited resources.

The final session of the day was on enterprise search, which started out with Cedric Ulmer and Julien Massiera giving a demonstration of integrating spaCy into the Datafari open source application to give an enhanced semantic search capability, including entity extraction and refinement. (See also https://irsg.bcs.org/informer/2022/11/the-evolution-of-datafari-a-european-open-source-enterprise-search-application-cedric-ulmer-ceo/)

This was followed by Phil Lewis describing a project that he and his colleagues at Pureinsights were working on at the Publications Office of the European Commission. Currently this is working in just two languages (English and French) but in time will be expanded to most, if not all, of the official EU languages. What was notable about this implementation was the use of a knowledge graph developed out of the Oracle RDF repository, together with a quite complex content processing stack to deliver a very high-quality search experience. Both this presentation and the previous one from Datafari highlighted the move towards hybrid search applications built on a stack of individual components.

The conference concluded with a number of lightning presentations, each lasting five minutes, from Andy Neill and Richard Giazzi (the Thomson Reuters HighQ deal support application), René Kriegler (OpenSource Connections) on the effective management of e-commerce search and Sean MacAvaney (University of Glasgow) on rethinking reranking. Cedric Ulmer reminded everyone of the four freedoms of open source software, namely the freedom to use, the freedom to distribute, the freedom to modify and the freedom to understand (exemplary documentation).

Next up were the Search Industry Awards, managed by Tony Russell-Rose.

The winners were

Best Search User Experience – Reza Rawassizadeh and Yi Rong working on ODSearch at Boston University https://paperswithcode.com/paper/odsearch-a-fast-and-resource-efficient-on

Most Promising Startup – Giotto (Matteo Caorsi, Chief Technology Officer) https://compliance.giotto.ai/

Search Professional of the Year – Adam Tocock, The Hillingdon Hospitals Library Services

Best paper at Search Solutions 2022 (voted by the audience) – Filip Radlinski, with Farhad Shokraneh and Phil Lewis tied for second place
