There were three keynote presentations at ECIR 2021, carefully scheduled for the end of the day for the benefit of participants in North and South America and early risers in the Asia-Pacific region. I’m not going to try to summarise the talks in any depth but hopefully I will encourage you to sit down with your favourite video terminal and spend an hour or so in the company of three brilliant presenters and researchers.
The opening keynote came from Professor Ricardo Baeza-Yates, Northeastern University, who graciously stood in at the last moment when Francesca Rossi was unable to participate. The title of his presentation was Ethics in AI – A challenging task. The video of his presentation is on YouTube.
He explained at the outset that after many years looking at bias in search results he had now extended his interests into the ethics of AI. One of his many affiliations is with the Institute for Experiential AI at Northeastern University. He also made the interesting observation that bias is often regarded as a ‘bad thing’ whereas in reality it is neutral. We impose our own view on its potential negativity or positivity.
In the first section of the talk Ricardo Baeza-Yates commented relatively quickly on a range of current topics, including facial recognition, biometric-based predictions, unfair digital commerce and what he referred to as ‘stupid models’. These are models that cannot deal with ambiguous semantics or irrational behaviour or are too sensitive to changes in parameters. In the second section he discussed data provenance, completeness, usability, transparency and responsibility, also touching on cultural differences and privacy.
Not surprisingly the presentation gave rise to many questions and comments from the audience, all of which were handled with great care and insight.
The second keynote was given by Ahmed H. Awadallah, the winner of the Karen Spärck Jones Award and a Senior Principal Research Manager at Microsoft. The title of his presentation was “Learning with Limited Labeled Data: The Role of User Interactions”. The video of his presentation is on YouTube. The issue at the core of his presentation was that we still require massive amounts of hand-labelled training data even with pre-trained models. He also highlighted that training deep networks with noisy labels is challenging since they tend to fit and memorize the noise.
When it came to semantic parsing errors, Ahmed showed that many of these are minor and can be corrected if humans have a way to continue interacting with the system. One of the threads of his research at Microsoft which he wove into the presentation was the development of more interactive natural language interfaces, and there is a very good presentation from his research team, available to download, which is well worth reading if you want more detail.
The third keynote was presented by Ophir Frieder from the Department of Computer Sciences, Georgetown University, Washington DC. The title of his presentation was “Untraditional (Computer) Medicine” and, as with the other keynotes, it can be found on YouTube. He focused on the use of patient records in electronic health care to support the early (and hopefully more accurate) diagnosis of diseases and clinical conditions. The technology has been awarded a patent and has been taken up by Maxeler Technologies. There has of course been a great deal of interest in mining social media, in particular Twitter, to identify disease outbreaks, and indeed this was the theme of the Strix Award presentation in 2020. The goal is to develop a computer system that acts as an assistant, predicting the efficacy of a proposed prescription for a determined ailment for a specific patient. The predictive system must resist high noise and variance, support high accuracy and provide model interpretability. Black boxes have no role in medical care!
In the second half of his presentation Ophir discussed the role of machine learning in public health care, looking at word sets with a significant increase in prevalence over time. Inevitably this work has moved towards understanding the potential value in a pandemic like Covid-19. If you would like to read more about the underlying technology take a look at US Patent 10902955. Overall a very timely talk and one that emphasized the societal benefits of the technologies that are now being developed, trialed and brought into daily use.
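For readers who like to see an idea in code, here is a minimal sketch of what "looking for words with an increase in prevalence over time" might look like in its simplest form. It is not the method described in the talk or in the patent. The function name `rising_terms`, the two-window comparison, the add-one smoothing and the threshold values are all my own assumptions for illustration, and a plain ratio like this is not a test of statistical significance.

```python
from collections import Counter


def rising_terms(baseline_docs, recent_docs, min_ratio=2.0, min_count=2):
    """Return {term: ratio} for terms that appear in a larger share of
    recent documents than of baseline documents.

    Hypothetical helper for illustration only; thresholds are arbitrary.
    """
    def doc_freq(docs):
        # Count each term once per document (document frequency).
        counts = Counter()
        for doc in docs:
            counts.update(set(doc.lower().split()))
        return counts

    base = doc_freq(baseline_docs)
    recent = doc_freq(recent_docs)
    n_base, n_recent = len(baseline_docs), len(recent_docs)

    rising = {}
    for term, count in recent.items():
        if count < min_count:
            continue  # ignore terms that are too rare to say anything about
        # Add-one smoothing so terms unseen in the baseline do not divide by zero.
        base_rate = (base[term] + 1) / (n_base + 1)
        recent_rate = (count + 1) / (n_recent + 1)
        ratio = recent_rate / base_rate
        if ratio >= min_ratio:
            rising[term] = ratio
    return rising


if __name__ == "__main__":
    earlier = ["sunny day today", "traffic is bad", "nice day out", "coffee time"]
    later = ["fever and cough", "bad cough today", "cough and fever again", "nice day"]
    for term, ratio in sorted(rising_terms(earlier, later).items()):
        print(term, round(ratio, 2))
```

A real system would need proper tokenisation, stop-word handling (note that a toy like this will happily flag "and") and a sound statistical test, but the sketch shows the basic shape of comparing a recent window of posts against an earlier baseline.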