The talk begins with an analysis of the tradition of IR research based on the TREC approach. The structure of IR experiments, so characteristic of IR research, is discussed in detail, with IR effectiveness as the dominant dependent variable. The strengths of the approach in research and in practice are acknowledged. The talk then points out mounting challenges to IR (evaluation) and IR system development:
All this suggests that, in addition to search engine development, IR (experimentation) might deserve other serious foci. Such studies would not focus almost exclusively, and certainly not primarily, on search engine effectiveness as the paramount dependent variable. The talk finishes by discussing a cognitive approach to IR as a way of developing material theories of information access with more varied designs of dependent and independent variables. It is the author's view that, by lifting one's eyes from the search engine effectiveness fixation, one readily sees that 80% of the IR terrain is unmapped, even from the CS viewpoint. Or, if no alternative approaches matter, we might close up shop.
Kal Järvelin is an Academy Professor at the Academy of Finland, working at the Dept. of Information Studies, University of Tampere. He holds a PhD in Information Studies (1987) from the same university. Kal's research covers information seeking and retrieval, database management, and structured documents, as well as linguistic and conceptual methods in IR. He has authored over 200 scholarly publications and supervised fourteen doctoral dissertations. Kal has served the ACM SIGIR Conferences as a program committee member (1992-2005), Conference Chair (2002) and Program Co-Chair (2004, 2006). He is an Associate Editor of Information Processing and Management (USA).
Traditionally, information retrieval examines the search query in isolation: a query is used to retrieve documents, and the relevance of the documents returned is evaluated in relation to that query. The query itself is assumed to consist of a bag of words, without any grammatical structure. However, queries can also be shown to exhibit grammatical structure, often consisting of telegraphic noun phrases. In addition, users typically conduct web and other types of searches in sessions, issuing a query, examining results, and then re-issuing a modified query to improve the results. We describe the properties of real web search sessions, and show that users conduct searches for both broad and finer-grained tasks, which can be both interleaved and nested. Reformulations reflect many relationships, including synonymy, hypernymy and hyponymy. We show that user search reformulations can be mined to identify related terms, and that we can identify the boundaries between tasks with greater accuracy than previous methods.
Rosie Jones is a Senior Research Scientist at Yahoo!. Her research interests include web search, geographic information retrieval, and natural language processing. She received her PhD from the School of Computer Science at Carnegie Mellon University under the supervision of Tom Mitchell. She is co-organizing the WSDM 2009 Workshop on Web Search Click Data (WSCD09). She served on the Senior PC for SIGIR in 2007 and 2008, and is a Senior Member of the ACM.
9:00 | Registration opens |
9:30 | Keynote: Quo vadis information retrieval research (Kal Järvelin) |
10:20 | Morning tea |
Session 1 (chair: Paul Thomas) | |
10:45 | Term-frequency surrogates in text similarity computations (Stefan Pohl and Alistair Moffat) |
11:00 | MetaView: Dynamic metadata based views of user files (James Bunton, Judy Kay, and Bob Kummerfeld) |
11:15 | On the relevance of documents for semantic representation (Laurianne Sitbon and Peter Bruza) |
11:30 | Exploring the benefit of contextual information for boosting TREC Genomic IR performance (Bader Aljaber, Nicola Stokes, James Bailey, and Yi Li) |
11:45 | WebKnox: Web knowledge extraction (David Urbansky and Jamie Thom) |
12:00 | Posters & lunch |
Session 2 (chair: Alistair Moffat) | |
1:30 | Anonymous folksonomies for small enterprise webs: a case study (Tom Rowlands, David Hawking, and Ramesh Sankaranarayana) |
1:45 | The effect of using pitch and duration for symbolic music retrieval (Iman S. H. Suyoto and Alexandra L. Uitdenbogerd) |
2:00 | Extraction of named entities from tables in gene mutation literature (Wern Wong, David Martinez, and Lawrence Cavedon) |
2:15 | Afternoon tea |
Session 3 (chair: Justin Zobel) | |
2:35 | Facilitating biomedical systematic reviews using ranked text retrieval and classification (David Martinez, Sarvnaz Karimi, Lawrence Cavedon, and Timothy Baldwin) |
2:50 | Parameter sensitivity in rank-biased precision (Yuye Zhang, Laurence A. F. Park, and Alistair Moffat) |
3:05 | Business meeting/informal chats - grants |
Session 4 (joint with ALTA) (chair: Nicola Stokes) | |
3:45 | Querying linguistic annotations (Sumukh Ghodke and Steven Bird) |
4:00 | Using collaboratively constructed document collections to simulate real world object comparisons (Karl Grieser, Timothy Baldwin, Fabian Bohnert, and Liz Sonenberg) |
5:00 | Keynote: Syntactic and semantic structure in web search queries (Rosie Jones) |
6:00 | Drinks |
7:00 | Dinner |