Richard Xiao
Lancaster University
Publication
Featured research published by Richard Xiao.
Literary and Linguistic Computing | 2004
Paul Baker; Andrew Hardie; Tony McEnery; Richard Xiao; Kalina Bontcheva; Hamish Cunningham; Robert J. Gaizauskas; Oana Hamza; Diana Maynard; Valentin Tablan; Cristian Ursu; B. D. Jayaram; Mark Leisher
This paper describes the work carried out on the EMILLE Project (Enabling Minority Language Engineering), which was undertaken by the Universities of Lancaster and Sheffield. The primary resource developed by the project is the EMILLE Corpus, which consists of a series of monolingual corpora for fourteen South Asian languages, totalling more than 96 million words, and a parallel corpus of English and five of these languages. The EMILLE Corpus also includes an annotated component, namely, part-of-speech tagged Urdu data, together with twenty written Hindi corpus files annotated to show the nature of demonstrative use in Hindi. In addition, the project has had to address a number of issues related to establishing a language engineering (LE) environment for South Asian language processing, such as translating 8-bit language data into Unicode and producing a number of basic LE tools. The development of tools for EMILLE has contributed to the ongoing development of the LE architecture GATE, which has been extended to make use of Unicode. GATE thus plugs some of the gaps for language processing R&D necessary for the exploitation of the EMILLE corpora.
English Studies | 2005
Tony McEnery; Richard Xiao
In this paper, we will examine a range of factors that may potentially influence a language user's choice of a full or bare infinitive following HELP. The factors include language variety, language change, the spoken/written distinction, semantic distinction, and syntactic conditions, namely, an intervening noun phrase or adverbial, the number of intervening words, “to” preceding HELP, the passive construction, inflections of HELP, and “it” as the subject. Six corpora are used in this paper: four written corpora (LOB, Brown, FLOB and Frown) and two spoken corpora (the speech section of the BNC and the Corpus of Professional Spoken American English, CPSA).
Archive | 2015
Richard Xiao; Xianyao Hu
Notes for transcription; Abbreviations; Acknowledgements; Introduction; Corpus-based Translation Studies: An evolving paradigm; Exploring the features of translational language; Corpora and corpus tools in use; The macro-statistic features of translational Chinese; The lexical features of translational Chinese; The grammatical features of translational Chinese; The features of translational Chinese and Translation Universals; Conclusive remarks.
Corpus Linguistics and Linguistic Theory | 2014
Richard Xiao; Guangrong Dai
Corpus-based Translation Studies focuses on translation as a product by comparing comparable corpora of translated and non-translated texts. A number of distinctive features of translations have been posited including, for example, explicitation, simplification, normalisation, levelling out, source language interference, and under-representation of target language unique items. Nevertheless, research in this area has until recently been confined largely to translational English and closely related European languages. If the features of translational language that have been reported on the basis of these languages are to be generalised as “translation universals”, the language pairs involved must not be restricted to English and closely related European languages. Evidence from a genetically distant language pair such as English and Chinese is arguably more convincing, if not indispensable. This article explores, in the broad context of translation universal research, lexical and grammatical properties of translational Chinese on the basis of two one-million-word balanced comparable corpora of translated and non-translated native Chinese texts. The findings of this empirical study of the properties of translational Chinese have enabled a reevaluation, from the perspective of translational Chinese, of largely English-based translation universal hypotheses.
Archive | 2010
Tony McEnery; Richard Xiao
1. Introduction 2. Aspect Marking in English and Chinese 3. Temporal Adverbials and Telicity in English and Chinese 4. Quantifying Constructions in English and Chinese 5. Passives in English and Chinese 6. Negation in English and Chinese: Variants and Variations 7. Negation in English and Chinese: Special Usages 8. Challenge and Promise, and the Way Forward
Corpus Linguistics and Linguistic Theory | 2014
Richard Xiao; Naixing Wei
Corpora have revolutionized nearly all areas of linguistic research over the past four decades (McEnery, Xiao and Tono 2006; McEnery and Hardie 2012). Translation studies and contrastive linguistics are no exceptions. Indeed, the rapid development of bilingual parallel corpora as well as monolingual and multilingual comparable corpora since the early 1990s has been of particular relevance and crucial importance to translation studies and contrastive linguistics. This special issue of Corpus Linguistics and Linguistic Theory focuses on corpus-based translation and contrastive linguistic studies involving two genetically different languages, namely English and Chinese, which we believe have formed an important interface with its unique features as a result of the mutual interaction between the two languages. This introduction will first contextualize the special issue by exploring the state of the art in using corpora in translation and contrastive linguistic studies, particularly in the context of the two languages covered, and then provide a synopsis of each article included in this volume and comment on its significance and implications for linguistic theorization.
Corpus Linguistics and Linguistic Theory | 2016
Xianyao Hu; Richard Xiao; Andrew Hardie
This paper discusses the debatable hypotheses of “Translation Universals”, i.e. the recurring common features of translated texts in relation to original utterances. We propose that, if translational language does have some distinctive linguistic features in contrast to non-translated writings in the same language, those differences should be statistically significant, consistently distributed and systematically co-occurring across registers and genres. Based on the balanced Corpus of Translational English (COTE) and its non-translated English counterpart, the Freiburg-LOB corpus of British English (FLOB), and by deploying a multi-feature statistical analysis on 96 lexical, syntactic and textual features, we try to pinpoint those distinctive features in translated English texts. We also propose that the stylo-statistical model developed in this study will not only be effective in analysing the translational variation of English but also be capable of clustering those variational features into a “translational” dimension, which will facilitate a crosslinguistic comparison of translational languages (e.g. translational Chinese) to test the Translation Universals hypotheses.
Archive | 2015
Richard Xiao
Translational language as a “third code” has been found to differ from both source and target languages. Recent corpus-based studies have proposed a number of translation universal (TU) hypotheses including, for example, simplification, explicitation and normalisation. This article investigates the “source language shining through” hypothesis put forward by Teich (2003: 207) by exploring source language interference in translated texts, at both lexical and grammatical levels, in English-to-Chinese translation on the basis of comparable corpora and parallel corpora of the two languages. The evidence from the two genetically distant languages is of critical importance in generalising the source language interference as a potential translation universal.
Corpus Linguistics and Linguistic Theory | 2006
Richard Xiao; Tony McEnery
Telicity is an important concept in the study of aspect. While the compatibility tests with completive and durative adverbials have long been in operation as a diagnostic for telicity, their validity and reliability have rarely been questioned. This article critically explores the validity and reliability of such tests and discusses such temporal expressions in English and Chinese on the basis of written and spoken corpora of the two languages. It proposes a scheme of usage categories of completive and durative adverbials with their respective proportions, which adequately explains the phenomena observed in attested language use and enables compatibility tests with completive and durative adverbials to achieve improved accuracy and reliability.
Archive | 2015
Richard Xiao; Xianyao Hu
We have so far analysed and compared translational and non-translational or native Chinese as represented by our corpora LCMC and ZCTC in terms of their macro-statistic features in Chap. 5 and the lexical and grammatical characteristics in Chaps. 6 and 7. The present chapter is an interface between the empirical findings and theoretical hypotheses, that is, it is a combination of descriptive translation studies with the “pure translation theory” (Holmes 1972/1988). It is important to find these connections because, without a higher level of generalisation, empirical and quantitative discoveries can be meaningless or aimless. We will first of all summarise the discriminatory features of translational Chinese at different levels and then discuss the implications, if any, of these translation-specific features for the translation universals hypotheses reviewed in Chap. 3. Because the translated corpus (ZCTC) used as the basis of this research consists mostly of translated texts from English, and because the parallel corpus (GCEPC), which is used whenever necessary, is a corpus of English and Chinese translation, our generalisation for the sake of translation universals should be limited to the particular realm of English-to-Chinese translation.