Conference Review – Haystack US 2019 – Relevance Avengers Assemble!

Last year I attended the Haystack search relevance conference in Charlottesville, USA as a guest of our partners OpenSource Connections (OSC). In 2019 we merged my old business Flax with OSC so I returned as one of the conference organisers.

Haystack is a conference all about search relevance – making sure that the results your users see fit their requirements and your business needs. Unlike some events Haystack has no sponsors and no vendor pitches and we try hard to keep the price low to promote accessibility. It’s a great chance to network with other search people – and no-one will ask you if what you do for a living ‘is a bit like Google’!

This year the venue was a cinema in downtown Charlottesville, which gave us much needed extra space and easier access to the Downtown Mall and its array of restaurants, snack shops and bars. Plus points included reclining seats, an onsite cafe and some very big screens, although we did discover some issues with WiFi coverage (perhaps an aid to concentration however?) and the movie projector didn’t always play nice with presenters’ laptops. We’ll sort this out for next time I’m sure – of course the affected presenters were professionals and coped admirably with the glitches. Also, I’m hoping none of the conference attendees felt they missed out on seeing Avengers Endgame, on show in one of the other theatres…but just in case they did I’ll introduce some of the marvel-lous characters we saw onstage at Haystack.

The first day was introduced by Max “Ironman” Irwin of OSC who gave us a keynote on What is Search Relevance? Max showed us the three aspects of search quality: performance, experience and of course relevance, and went on to discuss how we can score judgements, cope with disagreements between human raters and fold in user engagement data. He also showed us a list of the speakers to come and welcomed over 140 attendees from the USA and Europe to Haystack.

The next talk I saw was by Alessandro “Dr Strange” Benedetti of Sease Ltd. (OK, I’ll stop the Avengers references now before I imply one of our speakers was green and angry) on the Rated Ranking Evaluator relevance testing tool. He showed us the hierarchical model for test queries they have developed and how the open source RRE can be used to run a huge number of tests on a Solr or Elasticsearch instance as part of the Maven build process, producing a set of relevance metrics. These metrics in turn can be emitted to a spreadsheet, RRE’s own server dashboard or as JSON (RRE also uses JSON for the relevance judgements that must be provided to it).
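
For anyone new to this kind of testing, here is a minimal sketch of how relevance metrics can be computed from a set of judgements. It is not RRE's code or its judgement format: the function names, the dictionary of graded judgements and the document IDs are all my own assumptions, chosen only to illustrate the idea of scoring a ranked result list against human ratings.

```python
import math


def precision_at_k(ranked_ids, judgements, k, relevant_threshold=1):
    """Fraction of the top-k results whose grade meets the threshold."""
    top = ranked_ids[:k]
    hits = sum(1 for doc_id in top if judgements.get(doc_id, 0) >= relevant_threshold)
    return hits / k


def dcg_at_k(gains, k):
    """Discounted cumulative gain: later positions count for less."""
    return sum(gain / math.log2(position + 2) for position, gain in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, judgements, k):
    """DCG of the actual ranking divided by the DCG of the ideal ranking."""
    gains = [judgements.get(doc_id, 0) for doc_id in ranked_ids]
    ideal_gains = sorted(judgements.values(), reverse=True)
    ideal = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical judgements for one query: document ID -> grade (0 = bad, 3 = perfect).
judgements = {"d1": 3, "d2": 0, "d3": 2}
ranked = ["d2", "d1", "d3"]  # what the search engine returned, in order

print(precision_at_k(ranked, judgements, k=2))
print(ndcg_at_k(ranked, judgements, k=3))
```

Unjudged documents are treated as grade 0 here, which is a simplifying assumption; a real tool may handle missing judgements differently.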

Tara Diedrichsen & Tito Sierra of LexisNexis followed with a fascinating talk on best practices for gathering human judgements for relevance testing. It’s clear that LexisNexis have put huge amounts of work into this area to help them identify problem areas to focus on and to evaluate new algorithms. I’m pleased they stressed that it’s important to record why a search result is good or bad – this is essential information for relevance engineers who may be unfamiliar with the subject area.

Lunch followed, and conference attendees scattered to the various restaurants on the Downtown Mall – luckily as far as I can tell they all came back afterwards. The next talk I saw came from René Kriegler on Query Relaxation which was fascinating – René showed us various ways to remove terms from a query to increase the number of results and eventually suggested using a neural network to work out the best term to lose.
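
To make the idea concrete, here is a minimal sketch of the simplest form of query relaxation: if a query returns nothing, try dropping one term at a time and keep the version that finds the most results. This is my own toy illustration and not René's implementation; the in-memory "index", the function names and the "most hits wins" rule are all assumptions, and a real system would ask the search engine for the counts.

```python
def relaxed_candidates(query):
    """Yield (dropped_term, relaxed_query) for every single-term removal."""
    terms = query.split()
    for i, term in enumerate(terms):
        remaining = terms[:i] + terms[i + 1:]
        if remaining:
            yield term, " ".join(remaining)


def count_hits(query, documents):
    """Toy AND search: a document matches if it contains every query term."""
    terms = set(query.lower().split())
    return sum(1 for doc in documents if terms <= set(doc.lower().split()))


def relax(query, documents):
    """Return the query unchanged if it has hits, else the best one-term relaxation."""
    if count_hits(query, documents) > 0:
        return query
    best = max(
        relaxed_candidates(query),
        key=lambda candidate: count_hits(candidate[1], documents),
        default=None,
    )
    if best is None or count_hits(best[1], documents) == 0:
        return query  # nothing helped, so leave the query alone
    return best[1]


documents = ["red leather sofa", "blue leather chair", "red fabric sofa"]
print(relax("purple leather sofa", documents))
```

Picking the relaxation with the most hits is only one possible rule, and often not the best one, since it tends to throw away the most specific term.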

Unfortunately I missed the next session as I was preparing to run the Lightning Talks, our last session of the day. The Lightning Talks started with a moving tribute to Ted Sullivan by his friend and colleague Erik Hatcher – sadly we lost Ted this year, and I was very privileged to have met him at last year’s Haystack.

The talks featured speakers on subjects including ZooKeeper on AWS, the new Quaerite relevance test tool, Solr on Kubernetes and the challenges of full text search across 17 million documents at the HathiTrust. Thanks to everyone who volunteered to speak at such short notice!

The first day concluded with dinner at Kardinal Hall nearby which was great fun and a chance to network and chat with other attendees.

Jeremiah Via of the New York Times gave the first presentation I attended on Day 2. Jeremiah described how Elasticsearch is used to index 18 million items at the Times and how they developed both online and offline metrics to improve relevance. The Times’ index contains over 22 million unique tokens and nearly 2 million tags. He stressed the importance of being able to easily iterate through configuration changes – as he said “improving search is about making lots of little improvements”.

Next up was Tom Burgmans, describing how his team established a relevance-focused culture at Wolters Kluwer. I particularly enjoyed seeing a screenshot of their advanced relevance testing tool which showed relevance judgements and also broke down the various contributions to relevance scores – I hope, as he does, that this tool eventually becomes open source. Wolters Kluwer have also developed a set of loosely coupled reusable search components which help to share knowledge and experience across the organisation. His last point was ‘don’t stop’ – relevance improvement is never finished!

My colleague Bertrand Rigaldies of OSC then talked about Solr query parsers (he noted that there are no fewer than 29 different query parsers supplied with Solr, including a good few I’d never heard of). He showed how to build a simple proximity query parser (to handle queries like “‘fish’ within 3 words of ‘chips’”) and stressed that although custom parsers can be very powerful, they are complex to write and one should try to use an out-of-the-box parser where possible.

Lunch followed, attendees again taking advantage of the various outlets in Charlottesville’s Downtown Mall.

John Berryman, one half of the team behind the Relevant Search book and now at Eventbrite, gave an engaging talk on automatic tagging using search logs and machine learning. His system creates a training set from user interactions (the events that users clicked after a particular query) then attempts to predict what tags to apply to other events – the tags being the search queries themselves.
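
As a rough illustration of the first step, here is a minimal sketch of turning a click log into labelled training examples, where the label is the query itself. It is not John's code: the log format, the event texts, the function name and the `min_clicks` cut-off are all assumptions of mine, and the machine learning model that would be trained on the result is left out.

```python
from collections import Counter


def build_training_set(click_log, event_texts, min_clicks=2):
    """Turn (query, clicked_event_id) pairs into (event_text, tag) examples.

    A query becomes a tag for an event only if that query led to a click
    on the event at least min_clicks times, which filters out stray clicks.
    """
    counts = Counter((query.strip().lower(), event_id) for query, event_id in click_log)
    return [
        (event_texts[event_id], query)
        for (query, event_id), clicks in sorted(counts.items())
        if clicks >= min_clicks and event_id in event_texts
    ]


# Hypothetical search log: which event a user clicked after typing a query.
click_log = [
    ("Jazz", "e1"),
    ("jazz", "e1"),
    ("yoga", "e2"),
    ("jazz", "e3"),
    ("jazz ", "e3"),
]
event_texts = {
    "e1": "Live jazz night",
    "e2": "Sunrise yoga",
    "e3": "Jazz brunch",
}

for text, tag in build_training_set(click_log, event_texts):
    print(tag, "<-", text)
```

A classifier trained on pairs like these could then suggest tags for events that nobody has searched for yet.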

The next session was a panel discussion on Does Learning to Rank Actually Work (my alternative title ‘Learning to Rank – or learning to tank?’ was sadly discarded 🙂) with René Kriegler, Doug Turnbull, Xun Wang (Snag) and Erik Bernhardson (Wikimedia). The audience provided some great questions for the panel.

I sadly missed most of the talk by Simon Hughes of DHI on Search with Vectors but what I did see was very interesting, including how he had built a special query parser for Lucene that stored vectors as payloads. Luckily there’s lots of detail in this GitHub repository.

The conference ended with thanks to all the speakers, organisers and most importantly the attendees – without whom Haystack would of course not be possible! Thanks to everyone who came and made it such a great event. Haystack will return!

If you’d like a richer description of the conference including some of the talks I missed please do read Jettro Coenradie’s blog. Alessandro Benedetti of Sease has also written about his experience of the event. You can also join many of the conference attendees in Relevance Slack – there’s a #haystack-conference channel.

You’ll be glad to know we will be releasing the slides for all the main talks and the Lightning Talks very soon, and unlike last year we managed to video all the sessions – so anything you (or I) missed (or simply didn’t understand well enough at the time) will be available to peruse at your leisure. Keep watching the conference website for updates.

You might want to know that we’ll be running a Haystack EU conference in Berlin on October 28th 2019 – do keep an eye on the Haystack website and follow me on Twitter for more updates.
