In the course of writing Funnelback and Me, I took the opportunity to revisit my 30-year learning journey through the world of Information Retrieval and commercial search. Several lessons stand out, and I want to tell the story of how they came about and who helped me learn them. Throughout my IR career I benefited from interactions with many IR luminaries, and it will be a pleasure to acknowledge their contributions. Funnelback customers also taught me a thing or two! The lessons covered will likely come under the following headings:
1. The value of mathematics, statistics, and experimental design.
2. The importance of efficient algorithms to climate change.
3. The differences between Web and Enterprise Search and Text Retrieval.
4. When metadata is and isn't useful.
5. Measurement in theory and practice.
6. The contributions and limitations of TREC.
7. Potential for machine learning.
David Hawking took his first steps in Information Retrieval in 1991 while Head Programmer in the ANU Computer Science Department. From 1994 to 2003 he participated in TREC using his retrieval system PADRE, and for many years he was a track coordinator on the TREC Very Large Collection and Web tracks. In 1998 he completed a PhD by published work and joined CSIRO Mathematical and Information Sciences as a research scientist. In 1999 he launched the first enterprise search installation based on PADRE. Between 1999 and 2009, he worked on the refinement and commercialisation of the enterprise search system (including PADRE) known first as P@NOPTIC and later as Funnelback. He joined Funnelback Pty Ltd in 2009 as Chief Scientist, before giving in to the attraction of working on web-scale search and joining Bing in 2013. There he worked on an efficient system for identifying song lyric queries and a system for emulating information retrieval test collections. Since retiring in 2018, he has written three books and is working on a fourth. One is on the use of simulation in Information Retrieval; one is The History of ANU Computing; and the third is Funnelback and Me, on which this talk is based. You can find more detail at https://david-hawking.net/
Today’s neural NLP can do amazing things, leading some people to expect human-level performance soon. But it also fails spectacularly, in ways we find hard to predict and explain. Is perfection just a matter of further neural architecture engineering and more advanced training, or are there deeper reasons for the failures? I argue that couching the necessary operations in terms of symbolic reasoning is a good way to understand the nature of these failures, and to discover what neural networks will remain unable to do despite additional architecture engineering and training.