Publication


Featured research published by Edward W. D. Whittaker.


cross language evaluation forum | 2007

CLEF2006 Question Answering Experiments at Tokyo Institute of Technology

Edward W. D. Whittaker; Josef R. Novak; Pierre Chatain; Paul R. Dixon; Matthias H. Heie; Sadaoki Furui

In this paper we present the experiments performed at Tokyo Institute of Technology for the CLEF2006 Multiple Language Question Answering (QA@CLEF) track. Our approach to QA centres on a non-linguistic, data-driven, statistical classification model that uses the redundancy of the web to find correct answers. For the cross-language aspect we employed publicly available web-based text translation tools to translate the question from the source into the corresponding target language, then used the corresponding mono-lingual QA system to find the answers. The hypothesised correct answers were then projected back onto the appropriate closed-domain corpus. Correct and supported answer performance on the mono-lingual tasks was around 14% for both Spanish and French. Performance on the cross-language tasks ranged from 5% for Spanish-English to 12% for French-Spanish. Our method of projecting answers onto documents was shown not to work well: in the worst case, on the French-English task, we lost 84% of our otherwise correct answers. Ignoring the need for correct support information, the exact answer accuracy increased to 29% and 21% correct on the Spanish and French mono-lingual tasks, respectively.


international conference on acoustics, speech, and signal processing | 2006

Automatic Sentence Segmentation of Speech for Automatic Summarization

Joanna Mrozinski; Edward W. D. Whittaker; Pierre Chatain; Sadaoki Furui

This paper presents an automatic sentence segmentation method for an automatic speech summarization system. The segmentation method is based on combining word- and class-based statistical language models to predict sentence and non-sentence boundaries. We study both the performance of the sentence segmentation system itself and the effect of the segmentation on the summarization accuracy. The sentence segmentation is done by modelling the probability of a sentence boundary given a certain word history with language models trained on transcriptions and texts from several sources. The resulting segmented data is used as the input to an existing automatic summarization system to determine the effect it has on the summarization process. We conduct all our experiments with two types of evaluation data: broadcast news and lecture transcriptions. The automatic summarizations are created with different sentence segmentations and different summarization ratios (30% and 40%) and evaluated by comparing them to human-made summaries. We show that proper sentence segmentation is essential to achieve good performance with an automatic summarization system.
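
As an illustration only, the idea of modelling the probability of a sentence boundary given a word history can be sketched with a one-word history and add-alpha smoothing. This is not the authors' system: the abstract describes combined word- and class-based language models, whereas the sketch below uses a single word-level estimate. The function names, the smoothing constant `alpha`, and the decision `threshold` are all assumptions made for the example.

```python
from collections import Counter


def train_boundary_model(sentences):
    """Count how often each word is immediately followed by a sentence boundary."""
    followed_by_boundary = Counter()
    word_total = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            word_total[word] += 1
            if i == len(words) - 1:
                followed_by_boundary[word] += 1
    return followed_by_boundary, word_total


def boundary_probability(model, word, alpha=1.0):
    """Add-alpha smoothed estimate of P(boundary | previous word)."""
    followed_by_boundary, word_total = model
    return (followed_by_boundary[word] + alpha) / (word_total[word] + 2 * alpha)


def segment(model, words, threshold=0.5):
    """Insert a boundary after each word whose boundary probability exceeds the threshold."""
    sentences, current = [], []
    for word in words:
        current.append(word)
        if boundary_probability(model, word.lower()) > threshold:
            sentences.append(current)
            current = []
    if current:  # keep any trailing words as a final segment
        sentences.append(current)
    return sentences


# Toy training data (hypothetical); real systems train on large transcribed corpora.
model = train_boundary_model(["we met today", "it rained today", "we left"])
segments = segment(model, "we met today it rained today".split())
```

A longer word history or a class-based back-off would replace `boundary_probability` in this sketch without changing the segmentation loop.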


MLQA '06 Proceedings of the Workshop on Multilingual Question Answering | 2006

Monolingual web-based factoid question answering in Chinese, Swedish, English and Japanese

Edward W. D. Whittaker; Julien Hamonic; Dong Yang; T. Klingberg; Sadaoki Furui

In this paper we extend the application of our statistical pattern classification approach to question answering (QA), which has previously been applied successfully to English and Japanese, to develop two prototype QA systems in Chinese and Swedish. We show what data is necessary to achieve this and also evaluate the performance of the two new systems using a translation of the TREC 2003 factoid QA task. While performance for Chinese and Swedish is found to be lower than that for the more developed English and Japanese systems, we explain why this is the case and offer solutions for their improvement. All systems form the basis of our publicly accessible web-based multilingual QA system at http://asked.jp.


cyberworlds | 2005

A statistical classification approach to question answering using Web data

Edward W. D. Whittaker; Sadaoki Furui; Dietrich Klakow

In this paper we treat question answering (QA) as a classification problem. Our motivation is to build systems for many languages without the need for highly tuned linguistic modules. Consequently, word tokens and Web data are used extensively but no explicit linguistic knowledge is incorporated. A mathematical model for answer retrieval, answer classification and answer length prediction is derived. The TREC 2002 QA task is used for system development, where 33% of questions are answered correctly. Performance is then evaluated on the factoid questions of the TREC 2003 QA task, where 23% of questions are answered correctly, which would rank the system in the top 10 of contemporary QA systems on the same task.


Computer Speech & Language | 2012

Question answering using statistical language modelling

Matthias H. Heie; Edward W. D. Whittaker; Sadaoki Furui

In this paper we present a statistical approach to question answering (QA). Our motivation is to build robust systems for many languages without the need for highly tuned linguistic modules. Consequently, word tokens and web data are used extensively but neither explicit linguistic knowledge nor annotated data is incorporated. A mathematical model for answer retrieval and answer classification is derived. Experiments are conducted by searching for answers in the AQUAINT corpus, as well as in web data. The redundancy inherent in web data outperforms retrieval from a fixed corpus, where there are typically relatively few answer occurrences for any given question. We participated with an implementation of this framework in the TREC 2006 QA evaluations, where we ranked 9th among 27 participants on the factoid task.


international conference on acoustics, speech, and signal processing | 2006

Topic and Stylistic Adaptation for Speech Summarisation

Pierre Chatain; Edward W. D. Whittaker; J.A. Mrozinski; Sadaoki Furui

Contemporary approaches to automatic speech summarisation comprise several components, among them a linguistic model (LiM) component, which is unrelated to the language model used during the recognition process. This LiM component assigns a probability to word sequences from the source text according to their likelihood of appearing in the summarised text. In this paper we investigate LiM topic and stylistic adaptation using combinations of LiMs, each trained on different adaptation data. Experiments are performed on 9 talks from the TED corpus of Eurospeech conference presentations, as well as 5 news stories from CNN broadcast news data, for all of which human (TRS) and speech recogniser (ASR) transcriptions along with human summaries were used. In all ASR cases, summarisation accuracy (SumACCY) of automatically generated summaries was significantly improved by automatic LiM adaptation, with relative improvements of at least 2.5% in all experiments.


language and technology conference | 2006

Factoid Question Answering with Web, Mobile and Speech Interfaces

Edward W. D. Whittaker; Joanna Mrozinski; Sadaoki Furui

In this paper we describe the web and mobile-phone interfaces to our multi-language factoid question answering (QA) system together with a prototype speech interface to our English-language QA system. Using a statistical, data-driven approach to factoid question answering has allowed us to develop QA systems in five languages in a matter of months. In the web-based system, which is accessible at http://asked.jp, we have combined the QA system output with standard search-engine-like results by integrating it with an open-source web search engine. The prototype speech interface is based around a VoiceXML application running on the Voxeo developer platform. Recognition of the user's question is performed on a separate speech recognition server dedicated to recognizing questions. An adapted version of the Sphinx-4 recognizer is used for this purpose. Once the question has been recognized correctly it is passed to the QA system and the resulting answers are read back to the user by speech synthesis. Our approach is modular and makes extensive use of open-source software. Consequently, each component can be easily and independently improved and extended to other languages.


north american chapter of the association for computational linguistics | 2006

Class Model Adaptation for Speech Summarisation

Pierre Chatain; Edward W. D. Whittaker; Joanna Mrozinski; Sadaoki Furui

The performance of automatic speech summarisation has been improved in previous experiments by using linguistic model adaptation. We extend such adaptation to the use of class models, whose robustness further improves summarisation performance on a wider variety of objective evaluation metrics such as ROUGE-2 and ROUGE-SU4 used in the text summarisation literature. Summaries made from automatic speech recogniser transcriptions benefit from relative improvements ranging from 6.0% to 22.2% on all investigated metrics.


european conference on information retrieval | 2006

Rapid development of web-based monolingual question answering systems

Edward W. D. Whittaker; Julien Hamonic; Dong Yang; Tor Klingberg; Sadaoki Furui

In this paper we describe the application of our statistical pattern classification approach to question answering (QA) to the rapid development of monolingual QA systems. We show how the approach has been applied successfully to QA in English, Japanese, Chinese, Russian and Swedish to form the basis of our publicly accessible web-based multilingual QA system at http://asked.jp.


meeting of the association for computational linguistics | 2014

Correcting Preposition Errors in Learner English Using Error Case Frames and Feedback Messages

Ryo Nagata; Mikko Vilenius; Edward W. D. Whittaker

This paper presents a novel framework called error case frames for correcting preposition errors. They are case frames specially designed for describing and correcting preposition errors. Their most distinct advantage is that they can correct errors with feedback messages explaining why the preposition is erroneous. This paper proposes a method for automatically generating them by comparing learner and native corpora. Experiments show (i) automatically generated error case frames achieve a performance comparable to conventional methods; (ii) error case frames are intuitively interpretable and manually modifiable to improve them; (iii) feedback messages provided by error case frames are effective in language learning assistance. Considering these advantages and the fact that it has been difficult to provide feedback messages by automatically generated rules, error case frames will likely be one of the major approaches for preposition error correction.

Collaboration


Dive into Edward W. D. Whittaker's collaboration.

Top Co-Authors

Sadaoki Furui

Tokyo Institute of Technology

Pierre Chatain

Tokyo Institute of Technology

Joanna Mrozinski

Tokyo Institute of Technology

Matthias H. Heie

Tokyo Institute of Technology

Josef R. Novak

Tokyo Institute of Technology

Julien Hamonic

Tokyo Institute of Technology

Arnar Thor Jensson

Tokyo Institute of Technology

Dong Yang

Tokyo Institute of Technology
