ADCS 2008 Programme

Keynotes

Kal Järvelin: Quo vadis information retrieval research

The talk begins with an analysis of the tradition of IR research based on the TREC approach. The structure of IR experiments, so characteristic of IR research, is discussed in detail, with IR effectiveness as the dominant dependent variable. The strengths of the approach in research and in practice are acknowledged. The talk then moves on to point out mounting challenges to IR (evaluation) and IR system development:

Sometimes there are no articulated information needs preceding information access. The searcher first needs to find a focus, then develop questions. Is it possible to support the searcher?
Sometimes there are neither unique questions / queries nor unique right answers. The searchers are individual and inconsistent throughout. Is it possible to support the searcher?
Practical IR is often a process, not a single shot at the database, while IR evaluation is by and large based on one-query sessions. IR processes have rarely been sufficiently described in recent times. Therefore they cannot be understood, properly supported, or evaluated.
Practical IR is rarely performed in a vacuum at the center of the universe. Rather, it is highly integrated with the other components of the searcher's information environment. This is unfortunately not reflected in IR evaluation.
Relevance is dynamic and multidimensional, yet it is measured in the opposite way, as stable, topical, and binary. Simple measures like searchers' clicks are increasingly taken as indications of relevance, although they are insufficient and do not reliably predict relevance.
Searchers' task performance is independent of IR system effectiveness. People cope with clearly degraded retrieval systems as well as with better ones; only their behavior changes. Are we investing our research efforts optimally?

All this suggests that, in addition to search engine development, IR (experimentation) might deserve other serious foci. Such studies would not focus almost exclusively, and certainly not primarily, on search engine effectiveness as the paramount dependent variable. The talk finishes by discussing a cognitive approach to IR as a way of developing material theories on information access with more varied designs of dependent and independent variables. It is the author's view that, by lifting one's eyes from the search engine effectiveness fixation, it is readily understood that 80% of the IR terrain is unmapped, even from the CS viewpoint. Or, if no alternative approaches matter, we might close up shop.

Kal Järvelin is an Academy Professor at the Academy of Finland, working at the Dept. of Information Studies, University of Tampere. He holds a PhD in Information Studies (1987) from the same university. Kal's research covers information seeking and retrieval, database management, structured documents, and linguistic and conceptual methods in IR. He has authored over 200 scholarly publications and supervised fourteen doctoral dissertations. Kal has served the ACM SIGIR Conferences as a program committee member (1992-2005), Conference Chair (2002) and Program Co-Chair (2004, 2006). He is an Associate Editor of Information Processing and Management (USA).

Rosie Jones: Syntactic and semantic structure in web search queries

Traditionally, information retrieval examines the search query in isolation: a query is used to retrieve documents, and the relevance of the documents returned is evaluated in relation to that query. The query itself is assumed to consist of a bag of words, without any grammatical structure. However, queries can also be shown to exhibit grammatical structure, often consisting of telegraphic noun phrases. In addition, users typically conduct web and other types of searches in sessions, issuing a query, examining results, and then re-issuing a modified query to improve the results. We describe the properties of real web search sessions, and show that users conduct searches for both broad and finer-grained tasks, which can be both interleaved and nested. Reformulations reflect many relationships, including synonymy, hypernymy and hyponymy. We show that user search reformulations can be mined to identify related terms, and that we can identify the boundaries between tasks with greater accuracy than previous methods.

Rosie Jones is a Senior Research Scientist at Yahoo!. Her research interests include web search, geographic information retrieval, and natural language processing. She received her PhD from the School of Computer Science at Carnegie Mellon University under the supervision of Tom Mitchell. She is co-organizing the WSDM 2009 Workshop on Web Search Click Data (WSCD09). She served on the Senior PC for SIGIR in 2007 and 2008, and is a Senior Member of the ACM.

Programme

9:00	Registration opens
9:30	Keynote: Quo vadis information retrieval research	Kal Järvelin
10:20	Morning tea
	Session 1 (chair: Paul Thomas)
10:45	Term-frequency surrogates in text similarity computations	Stefan Pohl and Alistair Moffat
11:00	MetaView: Dynamic metadata based views of user files	James Bunton, Judy Kay, and Bob Kummerfeld
11:15	On the relevance of documents for semantic representation	Laurianne Sitbon and Peter Bruza
11:30	Exploring the benefit of contextual information for boosting TREC Genomic IR performance	Bader Aljaber, Nicola Stokes, James Bailey, and Yi Li
11:45	WebKnox: Web knowledge extraction	David Urbansky and Jamie Thom
12:00	Posters & lunch
	Session 2 (chair: Alistair Moffat)
1:30	Anonymous folksonomies for small enterprise webs: a case study	Tom Rowlands, David Hawking, and Ramesh Sankaranarayana
1:45	The effect of using pitch and duration for symbolic music retrieval	Iman S. H. Suyoto and Alexandra L. Uitdenbogerd
2:00	Extraction of named entities from tables in gene mutation literature	Wern Wong, David Martinez, and Lawrence Cavedon
2:15	Afternoon tea
	Session 3 (chair: Justin Zobel)
2:35	Facilitating biomedical systematic reviews using ranked text retrieval and classification	David Martinez, Sarvnaz Karimi, Lawrence Cavedon, and Timothy Baldwin
2:50	Parameter sensitivity in rank-biased precision	Yuye Zhang, Laurence A. F. Park, and Alistair Moffat
3:05	Business meeting/informal chats - grants
	Session 4 (joint with ALTA) (chair: Nicola Stokes)
3:45	Querying linguistic annotations	Sumukh Ghodke and Steven Bird
4:00	Using collaboratively constructed document collections to simulate real world object comparisons	Karl Grieser, Timothy Baldwin, Fabian Bohnert, and Liz Sonenberg

5:00	Keynote: Syntactic and semantic structure in web search queries	Rosie Jones
6:00	Drinks
7:00	Dinner