The Spring issue of Informer contained a number of articles on the very successful ECIR 2021 event, so I decided to carry over some reflections on the Industry Day to this issue. My first visit to ECIR was the 2011 Dublin event, where I presented my SearchCheck methodology, which was eventually launched in 2020. But that’s another story! The Industry Day has always been important to me as a search practitioner, as it gives me a glimpse into how search has a direct impact on business and society. There were ten speakers at the Industry Day at ECIR 2021 and several of the presentations were of considerable direct interest to me. I’m going to focus on these as they are such good examples of search in the real world. The full programme for the day (1 April) can be found on the ECIR 2021 Program pages.
The day began with a keynote paper from Edgar Meij, Head of AI Discovery at Bloomberg, about the work that he and his colleagues are undertaking to maintain the company’s competitive advantage. The company has over 20,000 employees, of whom 6,500 are software engineers and 200 work on various aspects of AI. For Bloomberg the challenge is in extracting financially relevant signals from noisy, complex, tangentially related data sets.
Among the topics he covered were Named Entity Recognition (in particular temporally-informed analysis and multi-domain), co-reference resolution, word-sense disambiguation and autocomplete suggestions. I was also very interested in NSTM: Real-Time Query-Driven News Overview Composition as a means of managing the presentation of upwards of 1,500 news stories about Apple that are released worldwide in a single day. You can read about the approach in this paper.
Equally interesting to me, as an active user of Amazon for book supplies (as well as of my local Waterstones, I would add!), was the presentation by Hugo Zaragoza. As with Bloomberg, the scale of the operation at Amazon is immense. There are 10 million e-books in the USA, with hundreds of releases each day and often multiple changes to the manuscript of an e-book in the course of production. Hugo talked about the differences between book searching and book shopping and how a wide range of query types is being managed, especially when the text of the book has been indexed and is searchable. What both presentations had in common was the challenge of working at scale. Much of IR research is carried out on fairly small collections, and the issue that always interests me is how these technological innovations scale in the commercial world, where collections and indexes run to many millions of items and beyond.
Managing health services overloaded during the pandemic is a core objective of many innovative companies. Kira Radinsky, CTO of Diagnostic Robotics, described the development of an application that triages patients using a conversational question-and-response approach to ensure that they are directed to the most appropriate health care professional. Take a look at this short video, which gives a good sense of the approach and of Kira’s commitment to this objective.
Of the other papers, I would encourage you to take a look at Jina, an open source neural search application, and at the talk by Aleksandra Piktus on KILT, a retrieval-augmented generator for knowledge-intensive NLP tasks, described in this paper.
The Industry Day papers are not included in the ECIR 2021 Proceedings. Although I can appreciate the reasons for this, it does mean that these contributions are not on record unless, like me, you decide to venture out into GoogleLand and find some related content and author links.