BIRDS is an interdisciplinary workshop for students, practitioners and researchers in Data Science, Information Retrieval and Information Science, held at SIGIR 2020. The aim of BIRDS was to foster the cross-fertilization of Information Science (IS), Information Retrieval (IR) and Data Science (DS). The idea behind BIRDS was the observation that all three disciplines operate in the well-known Data–Information–Knowledge continuum, with commonalities and differences. Simply put, DS is data-driven while IS is user-driven. In between is IR, combining ideas from both worlds. With IR as a kind of bridge, some of the questions BIRDS tried to address were: how can we make Data Science more user-oriented, learning from IS and also IR (e.g., by introducing the intrinsic notion of best match vs exact match to general DS or by extending cognitive IR models to DS), and how can Information Science learn or benefit from the rapid developments in Data Science?
Due to the COVID-19 situation, BIRDS was held as an online event, with 33 participants at peak times. The programme comprised 2 keynotes, 2 invited talks and a number of long, short and position papers that were selected after a peer-reviewing process. After a brief welcome note by the organisers Haiming Liu, Ingo Frommholz and Massimo Melucci, Carlos Castillo from the Universitat Pompeu Fabra in Barcelona presented his keynote on fairness and transparency in ranking. He first asked whether algorithms can discriminate and looked at different forms of biases, discrimination and fairness for searchers and those searched. He then discussed how we can measure fairness in rankings before looking at how we can create fairer rankings and improve ranking transparency. After the first keynote came the first invited talk by Riccardo Guidotti, who discussed the lack of transparency in AI and Machine Learning systems and gave an overview of research in eXplainable AI (XAI). This talk was followed by four presentations of long, short and position papers. In their talk, Amit Kumar Jaiswal, Haiming Liu and Ingo Frommholz discussed how reinforcement learning and the formalism of quantum probabilities can be used to model information seeking based on Information Foraging. Steven Zimmerman, Stefan Herzog, Jon Chamberlain, David Elsweiler and Udo Kruschwitz presented their ideas for a framework for harm prevention in Web search. After a break, Kritika Agrawal and Vikram Pudi looked at how to find grand challenges and saturated problems in the scientific literature. The last presentation of the first session was given by Sehrish Sher Khan and Haiming Liu, who explored the impact of user information search behaviour by Affective Design.
After a long break, the second BIRDS keynote was given by Nick Belkin. The title of Nick’s keynote was Challenges and Opportunities for IS, IR & DS in an Era of Information Ubiquity. He remarked that while IS and IR have a long history together, there seems to be less interaction between DS on the one hand and IR and IS on the other. Apart from DS, IR and IS, Nick brought another important player into the game, Human-Computer Interaction (HCI), asking how IS, IR, DS and HCI can support Radical Personalisation: support for ubiquitous information and data interaction will need to become radically personalised. DS, IR, IS and HCI all have different expertise they can bring to the table to address different aspects of radical personalisation (which poses pragmatic and ethical challenges), but no single discipline can address the goal as a whole. As a side note, Nick also recommended Fahrenheit 451 as a summer read to think about “how even the prescience of Ray Bradbury could not imagine the degree to which we are already immersed in information pushed at us.” Nick’s keynote was followed by an invited talk from Xi (Sunshine) Niu, who introduced faceted search as an example where IS, IR and DS complement each other. Faceted search is offered by most search interfaces, for instance in e-commerce websites, digital libraries, and governmental open information and data portals. In her talk, Xi looked at users' real-time interactions with facets from an IR, DS and human factors perspective, adopting a Random Forest model to predict facet use. After a short break, the final four BIRDS talks were again presentations of long, short and position papers. Hong Qing Yu discussed his approach for extracting causal knowledge from UK health web sites with the aim of creating an AI-enabled healthcare system.
Tuomas Ketola and Thomas Roelleke extended the well-known BM25 formula and proposed BM25-FIC as an enhanced BM25F method that combines information-oriented search and parameter estimation. Mahmoud Artemi and Haiming Liu discussed a new CBIR system design based on Vakkari’s three-stage model to capture users' feedback at the query formulation stage for content-based image retrieval. In the final presentation, Massimo Melucci looked at Structural Equation Modelling as a methodology to investigate the causal relationships underlying search engines and recommender systems, for instance, to understand when a system produces biased results.
The closing session concluded the first BIRDS workshop. It contained a discussion about the overall interdisciplinary topic of BIRDS and the differences, but also commonalities, between the system/computer science aspects of DS on the one hand and the Information Science domain on the other. We had participants who had migrated from an IS background to a computer science faculty and the other way around, which shows that some cross-fertilisation between the disciplines is already happening. IS, DS, IR and HCI are all contributing to the holistic view of data, information and knowledge, and to making sense of the flood of data and information we are facing today. The success of BIRDS@SIGIR2020 encourages us to continue the series with a second BIRDS event, hopefully meeting in person next time.
BIRDS 2020 was supported by the EU H2020 ITN QUARTZ (Quantum Information Access and Retrieval Theory).