{"id":5597,"date":"2018-08-06T19:08:06","date_gmt":"2018-08-06T18:08:06","guid":{"rendered":"https:\/\/irsg.bcs.org\/informer\/?p=5597"},"modified":"2018-08-06T19:08:06","modified_gmt":"2018-08-06T18:08:06","slug":"visualizing-search-strategies","status":"publish","type":"post","link":"https:\/\/archive-irsg.bcs.org\/informer\/?p=5597","title":{"rendered":"Visualizing search strategies"},"content":{"rendered":"<div id=\"block-81a0daff02da60646182\" class=\"sqs-block html-block sqs-block-html\" data-block-type=\"2\">\n<div class=\"sqs-block-content\">\n<p>According to the IDC whitepaper,\u00a0<a href=\"http:\/\/www.ejitime.com\/materials\/IDC%20on%20The%20High%20Cost%20Of%20Not%20Finding%20Information.pdf\">The High Cost of Not Finding Information<\/a>, knowledge workers spend 2.5 hours per day searching for information. Whether they eventually find what they are looking for or just stop and make a sub-optimal decision, there is a high cost to both outcomes. The recruitment industry, for example, relies on\u00a0<a href=\"http:\/\/booleanblackbelt.com\/2008\/12\/basic-boolean-search-operators-and-query-modifiers-explained\/\">Boolean search<\/a>\u00a0as the foundation of the candidate sourcing process, and yet finding candidates with appropriate skills and experience\u00a0<a href=\"https:\/\/devskiller.com\/50-recruitment-stats-hr-pros-must-know-2017\/\">remains an ongoing challenge<\/a>. 
Similarly, patent agents rely on accurate prior art search as the foundation of their due diligence process, and yet infringement suits are being filed at a rate of more than 10 a day due to the later\u00a0<a href=\"https:\/\/pdfs.semanticscholar.org\/a7f4\/6accffdfb06ce61333ae1bd811460345733d.pdf\">discovery of prior art which their original search tools missed<\/a>.<\/p>\n<\/div>\n<\/div>\n<p><!--more--><\/p>\n<div id=\"block-81a0daff02da60646182\" class=\"sqs-block html-block sqs-block-html\" data-block-type=\"2\">\n<div class=\"sqs-block-content\">\n<p>What these professions have in common is a need to develop search strategies that are accurate, repeatable and transparent. The traditional solution to this problem is to use line-by-line query builders which require the user to enter\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Boolean_algebra\">Boolean strings<\/a>\u00a0that may then be combined to form a\u00a0<a href=\"https:\/\/isquared.wordpress.com\/2018\/06\/14\/think-outside-the-search-box\/\">multi-line search strategy<\/a>:<\/p>\n<\/div>\n<\/div>\n<div id=\"block-yui_3_17_2_1_1529596950077_24229\" class=\"sqs-block image-block sqs-block-image sqs-text-ready\" data-block-type=\"5\">\n<div id=\"yui_3_17_2_1_1533313493339_604\" class=\"sqs-block-content\">\n<div id=\"yui_3_17_2_1_1533313493339_603\" class=\"image-block-outer-wrapper layout-caption-below design-layout-inline\">\n<div id=\"yui_3_17_2_1_1533313493339_602\" class=\"intrinsic\">\n<div id=\"yui_3_17_2_1_1533313493339_601\" class=\"image-block-wrapper has-aspect-ratio\" data-description=\"\"><img decoding=\"async\" class=\"thumb-image loaded\" src=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcd50352f5309fc648a9e\/1529597281932\/who+ictrp+search+builder.PNG?format=1000w\" alt=\"who ictrp search builder.PNG\" data-src=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcd50352f5309fc648a9e\/1529597281932\/who+ictrp+search+builder.PNG\" 
data-image=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcd50352f5309fc648a9e\/1529597281932\/who+ictrp+search+builder.PNG\" data-image-dimensions=\"968x622\" data-image-focal-point=\"0.5,0.5\" data-load=\"false\" data-image-id=\"5b2bcd50352f5309fc648a9e\" data-type=\"image\" data-position-mode=\"standard\" data-image-resolution=\"1000w\" \/><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"block-yui_3_17_2_1_1529596950077_24508\" class=\"sqs-block html-block sqs-block-html\" data-block-type=\"2\">\n<div class=\"sqs-block-content\">\n<p>However, such query builders typically offer limited support for error checking or query optimization, and their output is often\u00a0<a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pubmed\/16980145\">compromised by errors and inefficiencies<\/a>. In this post, we review three early but highly original and influential alternatives, and discuss their contribution to contemporary issues and design challenges.<\/p>\n<h2>Alternative approaches<\/h2>\n<p>The application of data visualization to search query formulation can offer\u00a0<a href=\"https:\/\/patents.google.com\/patent\/US7383513B2\/en\">significant benefits<\/a>, such as fewer zero-hit queries, improved query comprehension, and better support for exploration of an unfamiliar database. An early example of such an approach is that of\u00a0<a href=\"https:\/\/dl.acm.org\/citation.cfm?id=98015\">Anick et al.<\/a> (1989), who developed a system that could parse natural language queries and represent them using a \u201cQuery Reformulation Workspace\u201d. 
Although this was early work, the system introduced a number of key design ideas:<\/p>\n<ul>\n<li>The query was represented as a set of \u2018tiles\u2019 on a visual canvas, which could be (re)arranged by direct manipulation<\/li>\n<li>Query elements could be made \u2018active\u2019 or \u2018inactive\u2019<\/li>\n<li>The layout had a left-to-right reading, with tiles that overlapped vertically being ORed and those that did not being ANDed<\/li>\n<\/ul>\n<p>For example, the natural language query <em>\u2018Copying backup savesets from tape under ~5.0\u2019<\/em>\u00a0would be represented as follows:<\/p>\n<\/div>\n<\/div>\n<div id=\"block-yui_3_17_2_1_1529596950077_29149\" class=\"sqs-block image-block sqs-block-image sqs-text-ready\" data-block-type=\"5\">\n<div id=\"yui_3_17_2_1_1533313493339_626\" class=\"sqs-block-content\">\n<div id=\"yui_3_17_2_1_1533313493339_625\" class=\"image-block-outer-wrapper layout-caption-below design-layout-inline\">\n<div id=\"yui_3_17_2_1_1533313493339_624\" class=\"intrinsic\">\n<div id=\"yui_3_17_2_1_1533313493339_623\" class=\"image-block-wrapper has-aspect-ratio\" data-description=\"\"><img decoding=\"async\" class=\"thumb-image loaded\" src=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcd8c6d2a73c675814da7\/1529597332759\/anick.png?format=750w\" alt=\"anick.png\" data-src=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcd8c6d2a73c675814da7\/1529597332759\/anick.png\" data-image=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcd8c6d2a73c675814da7\/1529597332759\/anick.png\" data-image-dimensions=\"621x409\" data-image-focal-point=\"0.5,0.5\" data-load=\"false\" data-image-id=\"5b2bcd8c6d2a73c675814da7\" data-type=\"image\" data-position-mode=\"standard\" data-image-resolution=\"750w\" \/><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"block-yui_3_17_2_1_1529596950077_29428\" class=\"sqs-block html-block sqs-block-html\" 
data-block-type=\"2\">\n<div class=\"sqs-block-content\">\n<p>and the Boolean semantic interpretation (shown in the lower half) would be:<\/p>\n<pre>(\u201ccopy\u201d AND \u201cBACKUP saveset\u201d AND \u201ctape\u201d AND (\u201c~5.0\u201d OR \u201cversion 5.0\u201d))<\/pre>\n<p>The set of results retrieved was defined as all those documents that contained some combination of terms from any possible left-to-right path through the chart. Crucially, the user was at liberty to rearrange those tiles to reformulate the expression, and to activate or deactivate alternative elements to optimize the query. In addition, the system supported integration with thesauri and displayed the number of hits in the lower left corner of each tile. These are remarkably prescient ideas, and themes to which we return in\u00a0<a href=\"https:\/\/www.2dsearch.com\/\">our own work<\/a>.<\/p>\n<p>In subsequent work,\u00a0<a href=\"http:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary?doi=10.1.1.37.6381\">Fishkin and Stone<\/a>\u00a0(1995) investigated the application of direct manipulation techniques to the problem of database query formulation, using a system of \u2018lenses\u2019 to refine and filter the data. Lenses could be combined by stacking them and applying a suitable operator, e.g. AND or OR. 
For example, a user could search a database of US census data to find cities that have high salaries (the upper filter) AND low taxes (the lower filter):<\/p>\n<\/div>\n<\/div>\n<div id=\"block-yui_3_17_2_1_1529596950077_41807\" class=\"sqs-block image-block sqs-block-image sqs-text-ready\" data-block-type=\"5\">\n<div id=\"yui_3_17_2_1_1533313493339_643\" class=\"sqs-block-content\">\n<div id=\"yui_3_17_2_1_1533313493339_642\" class=\"image-block-outer-wrapper layout-caption-below design-layout-inline\">\n<div id=\"yui_3_17_2_1_1533313493339_641\" class=\"intrinsic\">\n<div id=\"yui_3_17_2_1_1533313493339_640\" class=\"image-block-wrapper has-aspect-ratio\" data-description=\"\"><img decoding=\"async\" class=\"thumb-image loaded\" src=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcde42b6a28994f63b4ed\/1529597420613\/fishkin.PNG?format=1500w\" alt=\"fishkin.PNG\" data-src=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcde42b6a28994f63b4ed\/1529597420613\/fishkin.PNG\" data-image=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcde42b6a28994f63b4ed\/1529597420613\/fishkin.PNG\" data-image-dimensions=\"1133x707\" data-image-focal-point=\"0.5,0.5\" data-load=\"false\" data-image-id=\"5b2bcde42b6a28994f63b4ed\" data-type=\"image\" data-position-mode=\"standard\" data-image-resolution=\"1500w\" \/><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"block-yui_3_17_2_1_1529596950077_42086\" class=\"sqs-block html-block sqs-block-html\" data-block-type=\"2\">\n<div class=\"sqs-block-content\">\n<p>Moreover, these lenses could be combined to create\u00a0<em>compound lenses<\/em>, and hence support the encapsulation of queries of arbitrary complexity. 
This is a further theme to which we return in\u00a0<a href=\"https:\/\/app.2dsearch.com\/\" target=\"_blank\" rel=\"noopener\">our own work<\/a>.<\/p>\n<p>A further influential work is that of\u00a0<a href=\"https:\/\/dl.acm.org\/citation.cfm?id=288595\">Jones<\/a>\u00a0(1998), who reflected upon the difficulties that users experience in dealing with Boolean logic, noting in particular the disconnect between query specification and result browsing and the inefficiency caused by a lack of feedback regarding the effectiveness of individual terms. He proposed an alternative in which concepts are expressed using a\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Venn_diagram\">Venn diagram<\/a>\u00a0notation combined with integrated query result previews. Queries could be formulated by overlapping objects within the workspace to create intersections and disjunctions, and subsets could be selected to facilitate execution of subcomponents of an overall query:<\/p>\n<\/div>\n<\/div>\n<div id=\"block-yui_3_17_2_1_1529596950077_45900\" class=\"sqs-block image-block sqs-block-image sqs-text-ready\" data-block-type=\"5\">\n<div id=\"yui_3_17_2_1_1533313493339_660\" class=\"sqs-block-content\">\n<div id=\"yui_3_17_2_1_1533313493339_659\" class=\"image-block-outer-wrapper layout-caption-below design-layout-inline\">\n<div id=\"yui_3_17_2_1_1533313493339_658\" class=\"intrinsic\">\n<div id=\"yui_3_17_2_1_1533313493339_657\" class=\"image-block-wrapper has-aspect-ratio\" data-description=\"\"><img decoding=\"async\" class=\"thumb-image loaded\" src=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcdfa758d46956ba70cf7\/1529597442783\/jones.jpg?format=1500w\" alt=\"jones.jpg\" data-src=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcdfa758d46956ba70cf7\/1529597442783\/jones.jpg\" data-image=\"https:\/\/static1.squarespace.com\/static\/5a8c30620abd04527cc381ec\/t\/5b2bcdfa758d46956ba70cf7\/1529597442783\/jones.jpg\" 
data-image-dimensions=\"1306x708\" data-image-focal-point=\"0.5,0.5\" data-load=\"false\" data-image-id=\"5b2bcdfa758d46956ba70cf7\" data-type=\"image\" data-position-mode=\"standard\" data-image-resolution=\"1500w\" \/><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"block-yui_3_17_2_1_1529596950077_46179\" class=\"sqs-block html-block sqs-block-html\" data-block-type=\"2\">\n<div class=\"sqs-block-content\">\n<p>Crucially, Jones noted that although the representation offered a degree of universality of expression, the semantic interpretation would need to be tied to that of the particular collection being searched, and thus independent adapters would be required for each such database. This is also a theme to which we return\u00a0<a href=\"https:\/\/app.2dsearch.com\/\" target=\"_blank\" rel=\"noopener\">in our own work<\/a>.<\/p>\n<h2>In summary<\/h2>\n<p>In this short piece we have reviewed some of the challenges involved in articulating complex search strategies and Boolean expressions, and studied three early but highly original alternative approaches. Given the decade in which these systems were developed (the first of which pre-dates the web by several years), this is extraordinary work, offering design insights and principles of enduring value. In our next post, we\u2019ll review some of the more recent approaches, and reflect on how their design ideas and insights may be used to address contemporary search challenges.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>According to the IDC whitepaper,\u00a0The High Cost of Not Finding Information, knowledge workers spend 2.5 hours per day searching for information. Whether they eventually find what they are looking for or just stop and make a sub-optimal decision, there is a high cost to both outcomes. 
The recruitment industry, for example, relies on\u00a0Boolean search\u00a0as the&hellip; <a class=\"more-link\" href=\"https:\/\/archive-irsg.bcs.org\/informer\/?p=5597\">Continue reading <span class=\"screen-reader-text\">Visualizing search strategies<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[201,224],"tags":[],"class_list":["post-5597","post","type-post","status-publish","format-standard","hentry","category-feature-article","category-summer-2018","entry"],"_links":{"self":[{"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/posts\/5597","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5597"}],"version-history":[{"count":0,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/posts\/5597\/revisions"}],"wp:attachment":[{"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5597"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5597"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5597"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}