Dynamic Information Retrieval Modeling

Figure: Evolution of IR from static to interactive to dynamic.

Change is at the heart of a modern Information Retrieval (IR) system. Advances in IR interfaces, personalization and ad display demand models that can react intelligently to users and their context in real time. Many current problems in IR research, for instance session search and recommender systems, involve dynamic systems. The aim of Dynamic Information Retrieval Modeling research is to find IR solutions that are responsive to a changing environment, learn from past interactions and predict future utility.

We can conceptualize IR systems into three categories: static, interactive and dynamic, each a generalization of the previous and exhibiting a natural progression of complexity. We observe other such trends in IR research, for example the early term-based vector space retrieval model giving way to more complex models such as BM25 (Robertson, 2009) and then language models (Ponte, 1998). Likewise, the evolution from static to interactive and then dynamic IR reflects the increasing complexity of search problems and the need for responsive solutions.

Static IR

Figure: The independent stages in a static IR system.

Static IR encompasses problems in information retrieval that are resolved in a single time step or interaction, so there is no need to consider how the state of the system has changed following the interaction. Many traditional areas in IR can be described as static, for instance ad hoc ranking and retrieval, where document relevance scores for query terms are typically generated in advance of retrieval and then fixed. These scores and the rankings they give rise to are independent of the user’s preceding or future actions and of the state of the system. Re-calculating the scores based on real-time user input would be a slow and expensive operation.

Figure: Static IR example: 2 pages of search results for the query 'jaguar'. 3 subtopics are represented by the results: 'cars', 'animals' and 'guitars'.

A static system is illustrated in our example above. Here, a retrieval system generates two pages of search results for the query 'jaguar'. This ambiguous query returns car, animal and guitar related webpages. These results are diversified across both pages in the static system so that they cater to the widest range of users and search intents. In this static system, the results on page 2 are unaffected by any interaction that occurs on page 1; a different user searching for the same query would encounter the same results.

We could improve this particular search by using interactive IR to personalize the results on the second page.

Interactive IR

Figure: In interactive IR, each stage is dependent on the previous stage.

An interactive search system is one that extends a static system by incorporating user feedback. Broadly, there are three types of feedback:

  • Explicit – Direct actions made by the user to inform the search system of their intent and satisfaction, e.g. a movie rating or a click on a ‘like’ button.
  • Implicit – User actions recorded by the IR system as the user interacts with it, most commonly clickthroughs and dwell time, although many other behaviours such as mouse movement and scrolling are also used. These signals are unobtrusive and cheap to collect, but require careful interpretation.
  • Pseudo – Simulated user actions, such as assuming the top ranked documents in a search are relevant (Cao, 2008).

Once feedback from the user has been observed, an interactive search system can use it to improve the search experience. For example, the well-known Rocchio algorithm (Rocchio, 1971) uses explicit feedback to improve the user’s query. Recommender systems, ad selection and query auto completion are examples of modern systems that incorporate feedback to improve performance. New web capable devices and IR interfaces continue to make further kinds of user interaction and feedback signal available.
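As a rough illustration of how explicit feedback can reshape a query, the sketch below applies the standard Rocchio update, which moves the query vector towards the centroid of documents judged relevant and away from the centroid of those judged non-relevant. The function name `rocchio`, the plain-list vector representation and the default weights `alpha`, `beta` and `gamma` are assumptions made for illustration and are not taken from this article.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated query vector using the Rocchio feedback formula.

    All vectors are equal-length lists of term weights. The default
    weights are illustrative assumptions, not tuned values.
    """
    dim = len(query)

    def centroid(vectors):
        # Mean vector of a set of documents; zero vector if the set is empty.
        if not vectors:
            return [0.0] * dim
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    rel_centroid = centroid(relevant)
    non_centroid = centroid(non_relevant)
    updated = [
        alpha * q + beta * r - gamma * n
        for q, r, n in zip(query, rel_centroid, non_centroid)
    ]
    # Negative term weights are commonly clipped to zero.
    return [max(0.0, w) for w in updated]


# Hypothetical 3-term vocabulary: the user marked two documents relevant, one not.
new_query = rocchio(
    query=[1.0, 0.0, 0.0],
    relevant=[[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    non_relevant=[[0.0, 0.0, 1.0]],
)
print(new_query)
```

In this toy run the second term gains weight because it appears in a relevant document, while the third term, which appears only in the non-relevant document, is clipped to zero.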

In IR research, Interactive Information Retrieval can also refer to work that improves upon the Cranfield methodology (Cleverdon, 1968) in IR evaluation. Here, the rigid assumptions of Cranfield are relaxed in order to understand the user’s interactive strategies with a (typically static) search system. In the context of dynamic IR, we instead consider interactivity from the system’s viewpoint, that is, a system that is responsive to a user.

Interactive IR is a progression from static IR in that it deals with the complexity of user interaction by operating over multiple stages, making use of user feedback. The stages may represent multiple queries in a search session, multiple sessions in a user’s search history, different users in a search log and so on. An interactive system may begin by using a static method but will then continue to adapt to the user after each stage.

Figure: Interactive IR example: Clicked webpages lead to the personalization of the second page of results based on the subtopic clicked, but not all of the subtopics are represented.

When interactivity is added to the static example, the second page of search results can be personalized based on the user’s subtopic preference. For instance, if a user clicked on a car related webpage on page 1, then the second page of results can be updated to show similar webpages. This is a user-targeted improvement over the static ranker’s second page, which continued to display a mixture of subtopics.
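The personalization step described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the candidate pool, its scores, the page size and the additive `boost` are all hypothetical, and the scores are chosen so that the static ordering happens to interleave the three subtopics rather than being produced by a real diversification method.

```python
# Hypothetical candidate pool for the query "jaguar": (doc_id, subtopic, score).
# Scores are made up so that the static ordering interleaves the subtopics.
CANDIDATES = [
    ("car-1", "cars", 0.90), ("animal-1", "animals", 0.85), ("guitar-1", "guitars", 0.80),
    ("car-2", "cars", 0.70), ("animal-2", "animals", 0.65), ("guitar-2", "guitars", 0.60),
    ("car-3", "cars", 0.50), ("car-4", "cars", 0.40), ("animal-3", "animals", 0.30),
]
PAGE_SIZE = 3


def static_pages(candidates):
    """Static ranking: both pages come from the fixed scores alone."""
    ranked = sorted(candidates, key=lambda doc: doc[2], reverse=True)
    return ranked[:PAGE_SIZE], ranked[PAGE_SIZE:2 * PAGE_SIZE]


def interactive_page_two(candidates, page_one, clicked_subtopic, boost=0.5):
    """Re-rank the documents not yet shown, boosting the clicked subtopic."""
    shown = {doc_id for doc_id, _, _ in page_one}
    remaining = [doc for doc in candidates if doc[0] not in shown]
    ranked = sorted(
        remaining,
        key=lambda doc: doc[2] + (boost if doc[1] == clicked_subtopic else 0.0),
        reverse=True,
    )
    return ranked[:PAGE_SIZE]


page_one, static_two = static_pages(CANDIDATES)
personalized_two = interactive_page_two(CANDIDATES, page_one, clicked_subtopic="cars")
print([doc_id for doc_id, _, _ in static_two])
print([doc_id for doc_id, _, _ in personalized_two])
```

With these made-up scores, the static second page still mixes all three subtopics, whereas the personalized second page contains only car pages once a car result has been clicked.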

Nonetheless, the second page in the static IR example contained a guitar related webpage, which is now inaccessible to users of the interactive system. A dynamic IR system can help resolve this situation.

Dynamic IR

Figure: The stages in dynamic IR are dependent on both past and predicted future interactions.

A dynamic IR system responds to the dynamics of its real world setting so that it can achieve its goal. Such systems are resistant to adverse change or error and are able to learn and adapt. There are three defining characteristics of dynamic IR systems:

  1. User Interactions – A dynamic system must be able to perceive its environment through some stimulus, such as user feedback.
  2. Temporal dependency – Dynamic systems operate over distinct stages and have the ability to adapt and change their behaviour in response to user interactions.
  3. Overall Goal – A long-term goal or reward drives a dynamic system, e.g. maximising an IR metric such as NDCG or the return on investment of an ad campaign.

Dynamic IR is a natural evolution of the described static and interactive models. An interactive system may collect feedback from a static system and respond accordingly, thus exhibiting two of the attributes described above: user interactions and temporal dependency. A key difference is how the goal is defined and optimized; in interactive retrieval only the immediate reward at each stage is considered, whereas in dynamic retrieval the overall reward across all stages is optimized. As a result, the action chosen at each time step in a dynamic system is made in consideration of all past and future interactions.
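One way to write this distinction down is shown below. The notation is an assumption introduced here for illustration and does not come from the tutorial: $a_t$ is the action (for example, a ranking) chosen at stage $t$, $r_t$ is the reward received at that stage, $h_t$ is the history of interactions observed so far, $T$ is the number of stages and $\pi$ is a policy mapping histories to actions.

```latex
\begin{align*}
\text{Interactive:}\quad & a_t^{*} = \arg\max_{a} \; \mathbb{E}\left[ r_t \mid h_t, a \right] \\
\text{Dynamic:}\quad & \pi^{*} = \arg\max_{\pi} \; \mathbb{E}\left[ \sum_{t=1}^{T} r_t \,\middle|\, \pi \right]
\end{align*}
```

In the first line each stage is optimized on its own given the past; in the second a single objective couples all stages, so an action with a lower immediate reward can be preferred if it raises the expected reward of later stages.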

Figure: Dynamic IR example: The first page ranking has been diversified so that the search system is better able to learn the user's second page preference.

In the interactive example, the results on the first page followed those of the static ranking and the second page was enhanced by interactively incorporating user clicks, but at the cost of alienating users with a preference for guitars. This can be resolved using a dynamic approach that determines optimal rankings for both pages of search results together. Using the ‘Interactive Exploratory Search’ (Jin, 2013) dynamic IR approach, a first page of results can be found that balances learning the user’s preference against displaying relevant documents.

When used, this approach finds that a diversified first page ranking maximizes its learning potential, so that improved, targeted results can be returned for the next page. This can also be considered in light of the explore/exploit trade-off common to dynamic IR problems: documents are explored on the first page, then relevant documents are exploited on the second page. Any performance loss suffered on the first page due to diversification is outweighed by the gains on the second page.
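The explore/exploit trade-off can be made concrete with a toy calculation. The sketch below is not the method of Jin et al. (2013); it is a simplified model built on assumptions introduced here: the user has one hidden intent drawn from a made-up prior, clicks exactly those page 1 results that match it, and the system then fills page 2 with its best guess. It compares only two candidate first pages.

```python
# Assumed prior over the user's hidden intent for "jaguar" (illustrative numbers).
PRIOR = {"cars": 0.40, "animals": 0.35, "guitars": 0.25}
PAGE_SIZE = 3


def page_one_value(page_one):
    """Expected number of relevant documents on page 1."""
    return sum(PRIOR[topic] * n_docs for topic, n_docs in page_one.items())


def total_value(page_one):
    """Expected relevant documents over both pages for a page-1 allocation."""
    total = 0.0
    # Intent is on page 1: the user clicks it, so page 2 can be fully targeted.
    for topic, n_docs in page_one.items():
        if n_docs > 0:
            total += PRIOR[topic] * (n_docs + PAGE_SIZE)
    # No click: the intent must be an unseen subtopic; page 2 bets on the likeliest.
    unseen = {t: p for t, p in PRIOR.items() if page_one[t] == 0}
    miss = sum(unseen.values())
    if miss > 0:
        best_guess = max(unseen.values()) / miss  # posterior of the likeliest intent
        total += miss * PAGE_SIZE * best_guess
    return total


greedy = {"cars": 3, "animals": 0, "guitars": 0}       # maximises page-1 reward only
diversified = {"cars": 1, "animals": 1, "guitars": 1}  # explores every subtopic

for name, page in (("greedy", greedy), ("diversified", diversified)):
    print(name, round(page_one_value(page), 2), round(total_value(page), 2))
```

Under these assumptions the greedy first page earns more on page 1 (about 1.2 expected relevant documents against 1.0), but the diversified first page earns more over both pages (about 4.0 against 3.45), because it always reveals the user's intent before page 2 is chosen.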

Conclusion

In summary, the differences between the three types of IR system in the conceptual model are: Static systems are those that operate over a single stage or otherwise multiple stages which are independent of one another. Interactive systems extend static systems by introducing local dependency from one stage to the next and individual goals per stage. A dynamic system extends an interactive system by focusing on a single goal that forces dependency across all stages.

A dynamic system is one that changes or adapts over time. It can take many forms, from a ranking and retrieval algorithm to an advert recommender or a query suggestion model. The increased complexity of search systems has resulted in the need for dynamic modelling, as documents, relevance, users and tasks all exhibit dynamic behaviour. Dynamic IR is the modelling of adaptive, responsive, goal-oriented information retrieval systems, and is an emerging and promising sub-field of information retrieval research.

The content of this article is taken from the tutorial ‘Dynamic Information Retrieval Modeling’. The slides for the tutorial, a list of related work and more information on the subject can be found at http://www.dynamic-ir-modeling.org/. A lecture on the subject will also be published by Morgan & Claypool in the ‘Synthesis Lectures on Information Concepts, Retrieval and Services’ series this August/September.

References

  1. Robertson, S. E. (2009) The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval, 3(4):333-389.
  2. Ponte, J. M. and Croft, W. B. (1998) A Language Modeling Approach to Information Retrieval. SIGIR ’98, 275-281.
  3. Cao, G., Nie, J., Gao, J. and Robertson, S. (2008) Selecting Good Expansion Terms for Pseudo-Relevance Feedback. SIGIR ’08, 243-250.
  4. Rocchio, J. J. (1971) Relevance Feedback in Information Retrieval. The SMART Retrieval System – Experiments in Automatic Document Processing.
  5. Cleverdon, C. and Keen, M. (1968) Factors Determining the Performance of Indexing Systems. Aslib Cranfield Research Project, Cranfield, England.
  6. Jin, X., Sloan, M. and Wang, J. (2013) Interactive Exploratory Search for Multi Page Search Results. WWW ’13, 655-666.