{"id":7833,"date":"2023-04-19T12:25:48","date_gmt":"2023-04-19T11:25:48","guid":{"rendered":"https:\/\/irsg.bcs.org\/informer\/?p=7833"},"modified":"2023-04-19T12:25:48","modified_gmt":"2023-04-19T11:25:48","slug":"academia-and-the-enterprise-steve-zimmerman","status":"publish","type":"post","link":"https:\/\/archive-irsg.bcs.org\/informer\/?p=7833","title":{"rendered":"Academia and the Enterprise &#8211; Steve Zimmerman"},"content":{"rendered":"<p><strong>Academia and the Enterprise<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">It is an honour to be asked by a highly respected contributor to the enterprise search community to share my journey from academia into the enterprise. \u00a0 Admittedly, it has been an unusual journey, so perhaps it\u2019s best to say a bit about where things are at the exact moment before diving into the details.\u00a0\u00a0<\/span><\/p>\n<p><em><span style=\"font-weight: 400;\">Now<\/span><\/em><\/p>\n<p><span style=\"font-weight: 400;\">Currently, I am a Senior Data Scientist in the NLP team at a large multinational, and there has never been a more interesting time to work in search and NLP. \u00a0 This is a strong statement, given that my journey into search and NLP, which began 10 years ago,\u00a0 has always been fascinating. 
\u00a0 So what makes this journey even more fascinating now?\u00a0 \u00a0 Probably not surprising to you, the latest generation of large language models (LLMs) is what has made the work even more interesting.\u00a0 A former colleague told me about ChatGPT on December 1st and said it would be as big as Google.\u00a0\u00a0\u00a0<\/span><\/p>\n<p><!--more--><\/p>\n<p><span style=\"font-weight: 400;\">It\u2019s now just over 4 months later, and I tend to agree with my former colleague\u2019s assessment that the release of ChatGPT is indeed just as big as the release of Google (more specifically the release of PageRank).\u00a0 The initial impact of ChatGPT is so large that South Park recently aired an entire episode about its powers and the related dangers (co-written with ChatGPT, no less), and I note there is still yet to be an episode devoted to the release of Google.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In actuality, there is nothing that new with respect to ChatGPT: it builds upon an existing body of research in the space of generative AI, so it isn\u2019t that revolutionary. Yes, there has been some buzz over the past few years around models like DALL-E, as well as the topic of deepfakes. And in recent years, usage of this flavour of models has greatly simplified development of solutions to many difficult problems. \u00a0 But this is the first generative LLM that got the attention of the masses and further permitted the masses to easily interact with it.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Personally, it blew my mind, as it was the first AI-based interactive dialogue system that had a feeling of being \u201creal\u201d.\u00a0 \u00a0 Very quickly though, my antennae began to wiggle a bit, as I found big holes in many of the legitimate-sounding responses it gave. 
\u00a0 Of course, those in the business of AI and NLP refer to these legitimate-sounding big holes as \u201challucination\u201d.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Most of us, I assume, were taught at a quite young age the inherent risks of\u00a0 hallucination in ourselves and\/or those around us.\u00a0 \u00a0 So one might ask, should there be so much belief in a capability whose designers caution us that it\u00a0 will \u201challucinate\u201d from time to time? \u00a0 I leave that to you to decide.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For me, this question ties directly back to my academic research, centred in the recently emerged field of interactive information retrieval (IIR), which focused on mitigating the risk of harms on the Web.\u00a0 \u00a0 And in my view, due to this latest technology, there has never been a greater potential for harm, and paradoxically there has never been a greater potential for benefit.\u00a0 It turns out that\u00a0 there has never been a more important time for IIR to play a role in the development of methods and evaluation approaches for the safe usage of this capability.\u00a0 Yes, I might be biased given my research background in IIR, but it is my research background that explains a huge part of the excitement I currently feel. \u00a0 ChatGPT opens up the door to many new research avenues to explore; the research possibilities on the Web and in the Enterprise are not only massive, but highly important.<\/span><\/p>\n<p><em><span style=\"font-weight: 400;\">Before Now<\/span><\/em><\/p>\n<p><span style=\"font-weight: 400;\">Perhaps interesting to some, I come from a family of computer scientists who worked at some large players in the tech industry. 
\u00a0 Admittedly, I was reluctant to go down this path as my parents worked insane hours well into my adulthood.\u00a0 I thought they were mad!\u00a0 So I stayed away from this area for a while.\u00a0 And here I find myself today working insane (but not quite as insane) hours in computing.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">My first dive into computing was in the post-9\/11 era when I finished undergrad and found jobs were in short supply.\u00a0 While working various menial jobs as a contractor, I took a few computing courses at Northeastern in Boston and in a very short time found myself working full-time as a programmer at a large financial company.\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">After 5 years in technology, I took a pause to explore the possibility of graduate studies in atmospheric physics through coursework at Cornell.\u00a0 After a couple of years building up the fundamentals of atmospheric science, I found myself much more interested in the computing aspects and much less interested in deriving the fluid dynamics of the atmosphere (though that is still very interesting too).\u00a0 Though the rigour and demands at this university developed my abilities to solve difficult problems independently (a necessary skill for a PhD), I no longer felt excitement about an academic career in atmospheric sciences.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">It was around this time in 2013 that I first heard about NLP and the emergent career of data science (via a <\/span><a href=\"https:\/\/hbr.org\/2012\/10\/data-scientist-the-sexiest-job-of-the-21st-century\"><span style=\"font-weight: 400;\">well-known article<\/span><\/a><span style=\"font-weight: 400;\"> on the topic), and it immediately sparked a flame in me.\u00a0 Lo and behold, a well-timed life event\u00a0 led me to relocate to England with a simultaneous opportunity arising to join a newly created MSc programme that focused on NLP and 
search.\u00a0 I remember telling my classmates in the atmospheric sciences lab about my plans, to which one replied \u201cIt sounds like you are going to work for SkyNet\u201d.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Well, I don\u2019t work for <\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/Skynet_(Terminator)\"><span style=\"font-weight: 400;\">SkyNet<\/span><\/a><span style=\"font-weight: 400;\"> (yet), but the main takeaway here is that the events of early 2013 set off the chain of events that led to this article.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">And now I have a PhD in the field of IIR, so what happened?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Sure, maybe I have some aptitude for work and research in modern computing, but I also admit a lot of it comes down to good timing and lucky connections. \u00a0 For instance, my NLP course was taught by Udo Kruschwitz, and his eager presentation of the topics in NLP and IR was infectious, which made this area even more interesting. 
\u00a0 He encouraged our class to attend the London Text Analytics Meetup, which he co-ran with other well-known folks in NLP\/IR such as Tony Russell-Rose.\u00a0 And it was at these events that I connected with many different companies that were building interesting products and were all hiring!\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">It was through the Text Analytics Meetup that I connected with Miguel Martinez, and this connection led to my first job in NLP as an intern between the first and second years of my MSc.\u00a0 \u00a0 What a fun time!\u00a0 It was during the seed-round funding period for a small startup in a garage in Belsize Park, which has now turned into a much larger company called <\/span><a href=\"https:\/\/www.signal-ai.com\/\"><span style=\"font-weight: 400;\">Signal AI<\/span><\/a><span style=\"font-weight: 400;\">.\u00a0 \u00a0 After completing my MSc, I found full-time work in the data science team of a large newspaper, developing document classification pipelines and prototype recommender engines.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Here again, timing played an important role.\u00a0 Udo Kruschwitz contacted me about an ESRC-funded research grant that looked at human rights in the digital age, and the timing was perfect given my concerns about the world.\u00a0 In particular, I was very concerned about unmitigated online misinformation campaigns on various topics. 
For example, the dialogue surrounding the justification for Brexit included many false claims that spread like wildfire.\u00a0 \u00a0 It would be dishonest to say that June 23rd 2016 did not\u00a0 help me clarify the main points of my PhD application to focus on harm mitigation on the Web.\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Initially, my research focused on hate speech mitigation, but I very quickly realised the research problem was potentially intractable due to free-speech concerns.\u00a0 Nonetheless, my initial research was published at LREC, which got me off on the right foot. \u00a0 A key lesson from this experience was that my ideology of\u00a0 AI solving the world\u2019s ugly problems would not be realised without taking into consideration the psychology of humans.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Around the time I submitted my paper to LREC for review, two things happened.\u00a0 First, my application to the Autumn School for Information Retrieval and Foraging (ASIRF) at Dagstuhl was confirmed.\u00a0 Second, a fellow PhD student in the Psychology department researching judgement and decision-making in medicine lent me his copy of Daniel Kahneman\u2019s \u201cThinking, Fast and Slow\u201d.\u00a0 Attendance at ASIRF introduced me to many great researchers, most notably David Elsweiler, who lectured on the fundamentals of IIR studies. 
\u00a0 The book and the Autumn School were the foundation for a rapid update to my PhD research plan to include the consideration of the human in the system.\u00a0 \u00a0 This shift in research led to co-authored papers with David Elsweiler and the aforementioned PhD student (Alistair Thorpe).\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Concurrently with my PhD research, my PhD advisor (who has many industry links) encouraged me to explore avenues in the private sector.\u00a0 \u00a0 He connected me with an enterprise search expert at a large energy company based in London, which led to an internship during my PhD.\u00a0 This internship transitioned into my current full-time role as a search and NLP researcher in the private sector.\u00a0 \u00a0 At the moment, my research is predominantly in the private sector, and heavily focused on enterprise search.\u00a0 Applications of NLP and search have been interesting to me from the first day I set foot in the field, and I find them more interesting now than at any point in my career.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">I close with some key learnings from my experience.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For those considering an advanced degree in Search\/NLP<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Consider an interdisciplinary approach to your research.\u00a0 Though at the core my research was in computer science, it drew on research in a broad set of fields. 
\u00a0 In the modern age, my view is that we cannot afford to take a narrow view of the problems we face.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">PhDs are a huge commitment, and I strongly recommend against self-funding.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Ideology is a great motivator for research, but be prepared to let it go. \u00a0 My experiences with hate speech research taught me a lot about this matter.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">For those in a PhD (or recently signed up for one)<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Get your hands dirty early in your PhD.\u00a0 Build some experiments and try publishing your findings as soon as you can.\u00a0\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Apply for a doctoral consortium. \u00a0 \u00a0 SIGIR kindly accepted my application and furthermore covered my expenses to\/from the event.\u00a0 This is an experience that you should not miss!<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Apply for and attend \u201csummer\u201d schools. \u00a0 In addition to ASIRF, I attended the summer school for Bounded Rationality at the Max Planck Institute for Human Development. \u00a0 Both of these experiences provided strong foundational knowledge for my PhD, as well as connections to several co-authors.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Take a pause and do an internship \/ placement at a company. 
\u00a0 It\u2019s important to get a feel for whether you want to be in academia, the private sector, or a bit of both.\u00a0\u00a0<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Academia and Industry &#8211; It\u2019s a spectrum; find what\u2019s right for you after your PhD.\u00a0 Some considerations and possibilities:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Evaluation is much more straightforward in academia than in the private sector. \u00a0 Academia is contained and offers great experimental control.\u00a0 Industry has many moving parts, and many people to work with.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Pure industry will pay a lot more, but pure academia will give you a lot more freedom (though freedom has eroded greatly in recent years).<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Like academia, industry offers the opportunity to investigate interesting research problems in search and NLP. 
\u00a0 However, in industry the problem is typically business-driven, and thus much easier to define.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Some private-sector companies offer research positions which allocate some time for academic work outside of the company.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">It is quite common for folks with full-time academic appointments to do side research in the private sector.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">You can work in the private sector and still keep an academic affiliation to conduct research you wish to continue doing on the side.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">If you really feel the pull of a full-time academic appointment, talk to people in those roles and understand fully what is involved: it is much more than research.\u00a0 You will also be responsible for creating course syllabuses and teaching slides, marking assignments, and administrative work, which is very different from a PhD or post-doc.\u00a0<\/span><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p>NOTE:\u00a0 A ChatGPT-cleansed version of this article can be <a href=\"https:\/\/irsg.bcs.org\/informer\/2023\/04\/chatgpt-take-on-academia-and-enterprise\">found here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Academia and the Enterprise It is an honour to be asked by a highly respected contributor to the enterprise search community to share my journey from academia into the enterprise. 
\u00a0 Admittedly, it has been an unusual journey, so perhaps it\u2019s best to say a bit about where things are at the exact moment before&hellip; <a class=\"more-link\" href=\"https:\/\/archive-irsg.bcs.org\/informer\/?p=7833\">Continue reading <span class=\"screen-reader-text\">Academia and the Enterprise &#8211; Steve Zimmerman<\/span><\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[201,217],"tags":[],"class_list":["post-7833","post","type-post","status-publish","format-standard","hentry","category-feature-article","category-spring-2023","entry"],"_links":{"self":[{"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/posts\/7833","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7833"}],"version-history":[{"count":0,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=\/wp\/v2\/posts\/7833\/revisions"}],"wp:attachment":[{"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7833"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7833"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive-irsg.bcs.org\/informer\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7833"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}