Publication


Featured research published by Andrew Kehoe.


Archive | 2007

WebCorp: an integrated system for web text search

Antoinette Renouf; Andrew Kehoe; Jayeeta Banerjee

The web has unique potential to yield large-volume data on up-to-date language use, obvious shortcomings notwithstanding. Since 1998, we have been developing a tool, WebCorp, to allow corpus linguists to retrieve raw and analysed linguistic output from the web. Based on internal trials and user feedback gleaned from our site (http://www.webcorp.org.uk/), we have established a working system which supports thousands of regular users world-wide. Many of the problems associated with the nature of web text have been accommodated, but problems remain, some due to the non-implementation of standards on the Internet, and others to reliance on commercial search engines, whose mediation slows average WebCorp response time and places constraints on linguistic search. To improve WebCorp performance, we are in the process of creating a tailored search engine, an infrastructure in which WebCorp will play an integral and enhanced role. In this paper, we shall give a brief description of WebCorp, the nature and level of its current functionality, the linguistic and procedural problems in web text search which remain, and the benefits of replacing the commercial search engine with tailored websearch architecture.


Archive | 2004

The accidental corpus: some issues in extracting linguistic information from the Web

Antoinette Renouf; Andrew Kehoe; David Mezquiriz

The Web is a text store which can potentially supplement traditional corpora as a source of up-to-date linguistic data. The WebCorp project investigates this potential, and in its second year tackles some residual problems inherent in the nature of Web text, thereby refining its retrieval and analysis tool for the facilitation of corpus linguistic study.


Archive | 2006

Diachronic linguistic analysis on the web with WebCorp

Andrew Kehoe

The WebCorp project has demonstrated how the Web may be used as a source of linguistic data. One feature of standard corpus analysis tools hitherto missing in WebCorp is the ability to filter and sort results by date. This paper discusses the dating mechanisms available on the Web and the date query facilities offered by standard Web search engines. The new date heuristics built into WebCorp are then discussed and illustrated with a case study.


Archive | 2009

Weaving web data into a diachronic corpus patchwork

Andrew Kehoe; Matt Gee

This paper offers a reassessment of the role of web data in diachronic linguistic analysis. We introduce the diachronic search facilities provided by the WebCorp Linguist’s Search Engine, including the use of a new ‘heat map’ graph for the analysis of changes in collocational patterns over time. We illustrate how web data can be used to supplement data from standard corpora in lexicological studies. Our focus is on the vogue phrase credit crunch and the paper compares examples from standard corpora (BNC, Brown, LOB, Frown, FLOB) with those found in web-accessible newspaper texts. Contrary to previous studies, we do not rely on the web solely for the most up-to-date usage examples. Instead, we show how web-accessible texts dating back to the beginning of the 20th Century can be used to fill gaps in and sharpen the picture provided by standard corpora.


Archive | 2006

The corpus-user’s chorus: (Based on The Major General's Song from Gilbert and Sullivan's The Pirates of Penzance)

Antoinette Renouf; Andrew Kehoe

This volume is witness to a spirited and fruitful period in the evolution of corpus linguistics. In twenty-two articles written by established corpus linguists, members of the ICAME (International Computer Archive of Modern and Mediaeval English) association, this new volume brings the reader up to date with the cycle of activities which make up this field of study as it is today, dealing with corpus creation, language varieties, diachronic corpus study from the past to present, present-day synchronic corpus study, the web as corpus, and corpus linguistics and grammatical theory. It thus serves as a valuable guide to the state of the art for linguistic researchers, teachers and language learners of all persuasions. After over twenty years of evolution, corpus linguistics has matured, incorporating nowadays not just small, medium and large primary corpus building but also specialised and multi-dimensional secondary corpus building; not just corpus analysis, but also corpus evaluation; not just an initial application of theory, but self-reflection and a new concern with theory in the light of experience. The volume also highlights the growing emphasis on language as a changing phenomenon, both in terms of established historical study and the newer short-range diachronic study of 20th century and current English; and the growing area of overlap between these two. Another section of the volume illustrates the recent changes in the definition of ‘corpus’ which have come about due to the emergence of new technologies and in particular of the availability of texts on the world wide web. The volume culminates in the contributions by a group of corpus grammarians to a timely and novel discussion panel on the relationship between corpus linguistics and grammatical theory.


Archive | 2006

The changing face of corpus linguistics

Antoinette Renouf; Andrew Kehoe


Archive | 2009

Corpus linguistics: refinements and reassessments

Antoinette Renouf; Andrew Kehoe


International Journal of Corpus Linguistics | 2013

Filling the gaps: Using the WebCorp Linguist’s Search Engine to supplement existing text resources

Antoinette Renouf; Andrew Kehoe


Archive | 2012

Reader comments as an aboutness indicator in online texts: introducing the Birmingham Blog Corpus

Andrew Kehoe; Matt Gee


International World Wide Web Conferences | 2003

Linguistic Research with XML/RDF-aware WebCorp Tool

Antoinette Renouf; Barry Morley; Andrew Kehoe

Collaboration


Dive into Andrew Kehoe's collaborations.

Top Co-Authors

Ursula Lutzky

Birmingham City University

Matt Gee

Birmingham City University
