Open-Domain Question Answering
Mark Andrew Greenwood
Submitted in partial fulfilment of the requirements
for the degree of Doctor of Philosophy
Department of Computer Science
University of Sheffield, UK
Dedicated to the memory of
David Lowndes (1979-2003)
“Life is hard somehow”
C & R Macdonald (1999)
Table of Contents
I What Is Question Answering?
1 Question Answering: An Overview 3
1.1 Questions.................................. 3
1.2 Answers................................... 5
1.3 The Process of Question Answering.................... 5
1.4 The Challenges of Question Answering.................. 6
1.5 Thesis Aims and Objectives........................ 7
1.6 Thesis Structure............................... 9
2 A Brief History of Question Answering 11
2.1 Natural Language Database Systems.................... 11
2.2 Dialogue Systems.............................. 12
2.3 Reading Comprehension Systems..................... 15
2.4 Open-Domain Factoid Question Answering................ 18
2.5 Definition Questions............................ 20
3 Evaluating Question Answering Systems 23
3.1 End-to-End Evaluation........................... 23
3.1.1 Factoid Questions.......................... 24
3.1.2 List Questions........................... 26
3.1.3 Definition Questions........................ 27
3.2 Evaluating IR Systems for Question Answering.....
12.1 Indirect evaluation of phrase removal techniques.............. 138
D.1 Summary of TREC 2003 factoid performance................ 166
E.1 Summary of TREC 2004 factoid performance................ 171
F.1 Examples of combining questions and their associated target........ 176
F.2 Summary of TREC 2005 factoid performance................ 177
9.1 AnswerFinder: an open-domain factoid question answering system.... 111
9.2 Comparison of AnswerBus, AnswerFinder, IONAUT, and PowerAnswer.. 113
Question answering aims to develop techniques that can go beyond the retrieval of relevant documents in order to return exact answers to natural language questions, such as “How tall is the Eiffel Tower?”, “Which cities have a subway system?”, and “Who is Alberto Tomba?”. Answering natural language questions requires more complex processing of text than that employed by current information retrieval systems. A number of question answering systems capable of carrying out this processing have been developed and achieve high levels of accuracy. However, little work has been reported on techniques for quickly finding exact answers.
This thesis investigates a number of novel techniques for performing open-domain question answering. Investigated techniques include: manual and automatically constructed question analysers, document retrieval specifically for question answering, semantic type answer extraction, answer extraction via automatically acquired surface matching text patterns, principled target processing combined with document retrieval for definition questions, and various approaches to sentence simplification which aid in the generation of concise definitions.
The novel techniques in this thesis are combined to create two end-to-end question answering systems which allow answers to be found quickly. AnswerFinder answers factoid questions such as “When was Mozart born?”, whilst Varro builds definitions for terms such as “aspirin”, “Aaron Copland”, and “golden parachute”. Both systems allow users to find answers to their questions using web documents retrieved by Google™. Together these two systems demonstrate that the techniques developed in this thesis can be successfully used to provide quick, effective open-domain question answering.
Since before the dawn of language, humans have hungered after knowledge. We have explored the world around us, asking questions about what we can see and feel. As time progressed we became more and more interested in acquiring knowledge, constructing libraries to hold a permanent record of our ever-expanding knowledge and founding schools and universities to teach each new generation things their forefathers could never have imagined. From the walls of caves to papyrus, from clay tablets to the finest parchment, we have recorded our thoughts and experiences for others to share. With modern computer technology it is now easier to access that information than at any point in the history of human civilization.
When the World Wide Web (WWW) exploded onto the scene during the late 1980s and early 1990s, it allowed access to a vast amount of predominantly unstructured electronic documents. Effective search engines were rapidly developed to allow a user to find a ‘needle’ in this ‘electronic haystack’.
The continued increase in the amount of electronic information available shows no sign of abating, with the WWW effectively tripling in size between 2000 and 2003 to approximately 167 terabytes of information (Lyman and Varian, 2003). Although modern search engines are able to cope with this volume of text, they are most useful when a query returns only a handful of documents which the user can then quickly read to find the information they are looking for. It is, however, increasingly the case that a simple query given to a modern search engine will result in hundreds if not thousands of documents being returned – more than can possibly be searched by hand; even ten documents is often too many for the time people have available to find the information they are looking for. Clearly a new approach is needed to allow easier and more focused access to this vast store of information.
With this explosive growth in the number of available electronic documents we are entering an age where effective question answering technology will become an essential part of everyday life. In an ideal world a user could ask a question such as “What is the state flower of Hawaii?”, “Who was Aaron Copland?”, or “How do you cook a Christmas Pudding?”, and instead of being presented with a list of possibly relevant documents, question answering technology would simply return the answer or answers, with a link back to the most relevant documents for those users who want further information or explanation.
The Gigablast web search engine (http://www.gigablast.com) has started to move towards question answering with the introduction of what it refers to as Giga bits – essentially concepts which are related to the user’s search query. For example, in response to the search query “Who invented the barometer?” Gigablast, as well as returning possibly relevant documents, lists a number of concepts which it believes may answer the question. The first five of these (along with a confidence level) are Torricelli (80%), mercury barometer (64%), Aneroid Barometer (63%), Italian physicist Evangelista Torricelli (54%) and 1643 (45%). Whilst the first Giga bit is indeed the correct answer to the question, it is clear that many of the other concepts are not even of the correct semantic type to answer it. Selecting one of these Giga bits does not result in a single document justifying the answer but rather adds the concept to the original search query in the hope that the documents retrieved will be relevant to both the question and answer. While this approach seems to be a step in the right direction, it is unclear how far using related concepts can move towards full question answering.
One recent addition to the set of available question answering systems, aimed squarely at the average web user, is BrainBoost (http://www.brainboost.com). BrainBoost presents short sentences as answers to questions, although like most question answering (QA) systems it is not always able to return an answer. From the few implementation details that are available (Rozenblatt, 2003) it appears that BrainBoost works like many other QA systems in that it classifies questions based upon ‘lexical properties’ of the expected answer type. This enables it to locate possible answers in documents retrieved using up to four web search engines.
Whilst such systems are becoming more common, none has yet appeared which is capable of returning exact answers to every question imaginable. The natural language processing (NLP) community has experience of numerous techniques which could be applied to the problem of providing effective question answering. This thesis reports the results of research investigating a number of approaches to QA with a view to advancing the current state of the art; in time, together with the research of many other individuals and organizations, this work will hopefully lead to effective question answering technology being made available to the millions of people who would benefit from it.
Acknowledgements
Working towards my PhD in the Natural Language Processing (NLP) group at The University of Sheffield has been an enjoyable experience, and I am indebted to my supervisor Robert Gaizauskas, not only for his continued support but also for giving me the opportunity to carry out this research. My thanks also to my two advisors, Mark Hepple and Mike Holcombe, for making sure I kept working and for asking those questions aimed at making me think that little bit harder.
Although the NLP research group at Sheffield University is quite large, the number of people actively involved in question answering research is relatively small, and I owe a large debt of thanks to all of them for always having the time to discuss my research ideas.
They are also the people with whom I have collaborated on a number of academic papers and the annual QA evaluations held as part of the Text Retrieval Conference (TREC);
without them this research would not have been as enjoyable or as thorough. They are:
Robert Gaizauskas, Horacio Saggion, Mark Hepple, and Ian Roberts.
I owe a huge thank you to all those members of the NLP group who over a number of years have worked hard to develop the GATE framework. Using this framework made it possible for me to push my research further, as I rarely had to think about the lower-level NLP tasks which are a standard part of the framework.
The TREC QA evaluations have been an invaluable source of both data and discussion, and I am indebted to the organisers, especially Ellen Voorhees.
I’d like to say a special thank you to a number of researchers from around the globe with whom I’ve had some inspiring conversations which have led to me trying new ideas or approaches to specific problems: Tiphaine Dalmas, Donna Harman, Jimmy Lin, and Matthew Bilotti.
I would also like to thank Lucy Lally for her administrative help during my PhD; I have no idea how I would have made it to most of the conferences I attended without her help.
Of course, none of the research presented in this thesis would have been carried out without the financial support provided by the UK’s Engineering and Physical Sciences Research Council (EPSRC) as part of their studentship programme, for which I am especially grateful.
Any piece of technical writing as long as this thesis clearly requires a number of people who are willing to proofread various drafts in an effort to remove all the technical and language mistakes. In this respect I would like to thank Robert Gaizauskas, Mark Stevenson, Pauline Greenwood, John Edwards, Angus Roberts, Bryony Edwards, Horacio Saggion, and Emma Barker. I am also grateful to my examiners, Louise Guthrie and John Tait, for their constructive criticism, which led to immeasurable improvements in this thesis. I claim full responsibility for any remaining mistakes.
On a more personal note I would like to thank the family and friends who have given me encouragement and provided support during my time at University. I would specifically like to thank my parents, without whose help I would never have made it to University in the first place. Their continued support and encouragement has helped me maintain my self-confidence throughout the four years of this research. Without the unwavering support of my fiancée I do not know if I would have made it this far, and I am eternally grateful for her belief in my ability to finish this thesis.
We all know what a question is and often we know what an answer is. If, however, we were asked to explain what questions are or how we go about answering them then many people would have to stop and think about what to say. This chapter gives an introduction to what we mean by question answering and hence the challenges that the approaches introduced in this thesis are designed to overcome.
1.1 Questions
One definition of a question could be ‘a request for information’. But how do we recognise such a request? In written language we often rely on question marks to denote questions.
However, this clue is misleading: rhetorical questions do not require an answer but are often terminated by a question mark, while statements asking for information may not be phrased as questions. For example, the question “What cities have underground railways?” could also be written as the statement “Name cities which have underground railways”. Both ask for the same information, but one is a question and one an instruction.
People can easily handle these different expressions as we tend to focus on the meaning (semantics) of an expression and not the exact phrasing (syntax). We can, therefore, use the full complexities of language to phrase questions knowing that when they are asked other people will understand them and may be able to provide an answer.