Conference Review: SIGIR 2012

Choices, choices, choices: Bites of Oregon or SIGIR Tutorials?

SIGIR 2012 was hosted by Oregon Health & Science University in Portland, Oregon. The conference was held at the Marriott Downtown Waterfront Hotel. While many attendees stayed at the conference venue, others were spread across lodgings in the downtown area. The busy five-day schedule consisted of three days of main conference proceedings (including an industry day), bookended by tutorials and workshops respectively.

Sunday – Tutorials
Although the conference began on the Sunday at 8:45am with tutorials and a Doctoral Consortium, I took the chance to stroll around ‘The Bite of Oregon’ food, beer, and wine festival, which was in full swing just five minutes’ walk from the conference venue. While the beer and wine were refreshingly good, they did not come close to a nice cool Guinness (see SIGIR 2013) or a Spanish coffee, which could be sampled at Huber’s Cafe (the oldest bar in Portland). In the afternoon, Emine Yilmaz, Evangelos Kanoulas and Ben Carterette delivered an excellent up-to-date overview of IR evaluation in their tutorial “Advances on the Development of Evaluation Measures”. The presenters’ quote of Lord Kelvin, “What you can’t measure you can’t improve”, was fair motivation to attend. The tutorial mainly focused on offline evaluation, covering the latest trends in formalising and defining new metrics and methodologies that capture search result effectiveness from wider and more realistic perspectives. The first part of the tutorial gave a good account of probabilistic models of user browsing behavior as a formal foundation for the definition of new IR metrics, as well as a retrospective, unifying view of traditional ones. The tutorial went on to present the latest progress on evaluation techniques from a session perspective, and the implications for the task definition, the test collections, and the metrics. The last part of the tutorial combined clarity and depth in a well-presented overview of recent advances in evaluating IR diversity and novelty.

Monday – Main conference & Poster Session
On Monday morning a new Salton award winner was unveiled to the community. The Gerard Salton Award is a triennial award presented to those who have made “significant, sustained and continuing contributions to research in information retrieval”. The first duty of the recipient, Prof. Norbert Fuhr, was to deliver a keynote talk in which he outlined the importance of theory in a field with an increasing demand for research advances that are immediately practical and applicable. A very important point made by Prof. Fuhr was the distinction between ‘How’ and ‘Why’ experiments. ‘Why’ experiments, he explained, are those that aim to validate a theory or explain phenomena, while ‘How’ experiments aim to improve some aspect of the performance of the IR task regardless of whether there is a known theory behind ‘Why’ it works. The broad Science (theory) versus Engineering (practice) debate would later be re-ignited during the Industry day panel session.

SIGIR Audience

A personal highlight of the afternoon sessions was the paper “Time-Based Calibration of Effectiveness Measures” by Mark Smucker and Charles Clarke, which received the best paper award. Smucker and Clarke presented an interesting extension of recent utility-based IR metric schemes, which build on user behavior models, to account for the role of time in such behaviors. In the proposed model, time is the factor that determines both the cost to the user of their browsing experience and the point at which the user stops examining (and drawing gain from) a result list. This may be an interesting direction for further progress in the user model-based strand of research on IR metrics.
The demo and poster session began on Monday evening with a gaggle of presenters jostling for the best positions to affix their posters before the start of the session. The session lasted well into the evening, an indication of the success of the event. People then dispersed around the downtown area to discuss research and relax over some food.

Tuesday – Main Conference & Banquet
The banquet was held at the Portland World Trade Center. As the evening went on, James Allan presented the SIGIR 2012 paper awards: the best paper distinction was awarded to Mark Smucker and Charles Clarke for “Time-Based Calibration of Effectiveness Measures”. Additionally, Lidan Wang, Paul Bennett, and Kevyn Collins-Thompson received an honorable mention for their paper “Robust Ranking Models via Risk-Sensitive Optimization”. The best student paper was awarded to Shuzi Niu, Jiafeng Guo, Yanyan Lan, and Xueqi Cheng for “Top-k Learning to Rank: Labeling, Ranking and Evaluation”.

Wednesday – Main Conference & Industry Day

Industry Track Panel

On Wednesday I attended the Industry Track and was treated to an entertaining and informative opening presentation entitled “The Jeopardy! Challenge and Beyond” by Eric Browne of the IBM T. J. Watson Research Center. The industry day culminated in a panel session that fielded questions regarding the type of research that should be conducted during a PhD. Stephen Robertson argued that a PhD student should learn the general skills necessary to become a researcher rather than simply being encouraged to improve upon the best baseline. Diane Kelly espoused a more physiological/sociological line of research that aimed to discover if, and how, ‘search’ has led to a change in our behaviour. The ‘slow search’ movement may have been born at this session.

Thursday – Workshops
Faced with a compelling array of choices on the last conference day, I opted for the workshop on Open Source Information Retrieval, where I was pleased to find a high-quality programme of technical papers, invited talks and lively debate on open IR software development. Grant Ingersoll from LucidWorks presented a very interesting update on Lucene and the ecosystem of technologies around it, covering significant recent improvements and perspectives on open collaboration models. Jamie Callan gave an insightful and entertaining overview of the Lemur project and the ClueWeb09/12 datasets, sharing the motivations behind such initiatives and his experience of them, from the technical work to the administrative challenges, along with more than a few anecdotes about building and delivering a collection such as ClueWeb. A fair array of additional systems, tools, and initiatives was presented throughout the day by paper authors. The workshop provided a good sample of the latest progress in open source IR technologies and solutions at large, and a good perspective on where the bar stands on issues such as performance optimization and scaling.

See you in Dublin next year!

 

Authors:

Ronan Cummins (National University of Ireland, Galway) & Pablo Castells (Universidad Autónoma de Madrid)