Publications


Featured research published by Marc Alexander.


Digital Scholarship in the Humanities | 2015

Metaphor, Popular Science, and Semantic Tagging: Distant Reading with the Historical Thesaurus of English

Marc Alexander; Fraser Dallachy; Scott Piao; Alistair Baron; Paul Rayson

The use of metaphor in popular science is widespread to aid readers’ conceptions of the scientific concepts under discussion. Almost all research in this area has been done by careful close reading of the text(s) in question, but this article describes—for the first time—a digital ‘distant reading’ analysis of popular science, using a system created by a team from Glasgow and Lancaster. This team, as part of the SAMUELS project, has developed semantic tagging software which is based upon the UCREL Semantic Analysis System developed by Lancaster University’s University Centre for Computer Corpus Research on Language, but using the uniquely comprehensive Historical Thesaurus of English (published in 2009 as The Historical Thesaurus of the Oxford English Dictionary) as its knowledge base, in order to provide fine-grained meaning distinctions for use in word-sense disambiguation. In addition to analyzing metaphors in highly abstract book-length popular science texts from physics and mathematics, this article describes the technical underpinning to the system and the methods employed to hone the word-sense disambiguation procedure.
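
The tagging approach described above rests on mapping each word to candidate senses drawn from a thesaurus-style knowledge base and then disambiguating among them using the surrounding context. The short Python sketch below illustrates that general idea with an invented toy lexicon and a simple context-overlap heuristic; it is not the SAMUELS/UCREL implementation, and all names and data in it are hypothetical.

# Toy illustration of thesaurus-backed semantic tagging (hypothetical data,
# not the SAMUELS/UCREL system). Each lemma maps to candidate semantic
# categories; we pick the category whose cue words best overlap the context.
TOY_THESAURUS = {
    "wave": [("01.02 Water", {"sea", "ocean", "tide"}),
             ("02.04 Physics", {"particle", "light", "frequency"})],
    "field": [("01.05 Farming", {"crop", "plough", "harvest"}),
              ("02.04 Physics", {"force", "magnetic", "quantum"})],
}

def tag_token(token, context_words):
    """Return the best semantic category for a token given its context."""
    candidates = TOY_THESAURUS.get(token.lower())
    if not candidates:
        return "UNMATCHED"
    context = {w.lower() for w in context_words}
    # Choose the sense whose cue words share the most items with the context.
    best_category, _ = max(candidates, key=lambda c: len(c[1] & context))
    return best_category

sentence = "The quantum field interacts with every particle".split()
for i, word in enumerate(sentence):
    window = sentence[max(0, i - 3):i] + sentence[i + 1:i + 4]
    print(word, "->", tag_token(word, window))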


Studia Neophilologica | 2017

Linguistic DNA: Investigating Conceptual Change in Early Modern English Discourse

Susan M. Fitzmaurice; Justyna Robinson; Marc Alexander; Iona Hine; Seth Mehl; Fraser Dallachy

This article describes the background and premises of the AHRC-funded project, ‘The Linguistic DNA of Modern Western Thought’. We offer an empirical, encyclopaedic approach to historical semantics regarding ‘conceptual history’, i.e. the history of concepts that shape thought, culture and society in a particular period. We relate the project to traditional work in conceptual and semantic history and define our object of study as the discursive concept, a category of meaning encoded linguistically as a cluster of expressions that co-occur in discourse. We describe our principal data source, EEBO-TCP, and introduce our key research interests, namely, the contexts of conceptual change, the semantic structure of lexical fields and the nature of lexicalisation pressure. We outline our computational processes, which build upon the theoretical definition of discursive concepts, to discover the linguistically encoded forms underpinning the discursive concepts we seek to identify in EEBO-TCP. Finally, we share preliminary results via a worked example, exploring the discursive contexts in which paradigmatic terms of key cultural concepts emerge. We consider the extent to which particular genres, discourses and users in the early modern period make paradigms, and examine the extent to which these contexts determine the characteristics of key concepts.
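
Since a discursive concept is defined here as a cluster of expressions that co-occur in discourse, one plausible first computational step is to count which words co-occur within the same document and score the strength of those associations. The sketch below does this with pointwise mutual information over a tiny invented corpus; it illustrates the general technique only and is not the Linguistic DNA processing pipeline.

# Document-level co-occurrence and PMI over a toy corpus (illustrative only;
# the corpus and scoring choices are invented, not the project's pipeline).
import math
from collections import Counter
from itertools import combinations

corpus = [
    "liberty conscience law obedience".split(),
    "liberty conscience religion toleration".split(),
    "law obedience magistrate power".split(),
]
n_docs = len(corpus)
doc_freq = Counter(w for doc in corpus for w in set(doc))
pair_freq = Counter()
for doc in corpus:
    for a, b in combinations(sorted(set(doc)), 2):
        pair_freq[(a, b)] += 1

def pmi(a, b):
    """Pointwise mutual information of a and b co-occurring in a document."""
    p_ab = pair_freq[(a, b)] / n_docs
    return math.log2(p_ab / ((doc_freq[a] / n_docs) * (doc_freq[b] / n_docs)))

# Rank word pairs: high-PMI clusters are candidate discursive concepts.
ranked = sorted(pair_freq, key=lambda p: pmi(*p), reverse=True)
for a, b in ranked[:4]:
    print(f"{a} + {b}: PMI = {pmi(a, b):.2f}")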


International Conference on Emerging Technologies | 2011

Implementing MapReduce over language and literature data over the UK National Grid Service

Muhammad S. Sarwar; Marc Alexander; Jean Anderson; J. Green; Richard O. Sinnott

Humanities researchers are producing large volumes and heterogeneous varieties of language and literature data collections in digital format. These collections include dictionaries, thesauri, corpora, images, audio and video resources. The increased availability of these datasets, brought about by advances and adaptations of the Internet and the increased digitisation of humanities data resources, poses new challenges for humanities researchers. Many of these challenges are related to data access and usage, and include security, integrity, interoperability, information retrieval, sharing, licensing and copyright. The JISC-funded project Enhancing Repositories for Language and Literature Research (ENROLLER; https://www.enroller.org.uk) is addressing these issues through the development of a targeted e-Research environment. A key component of this effort is supporting large-scale analysis of diverse language and literature data sets. To this end, this paper presents the application of the MapReduce algorithm, which supports information retrieval and linguistic analysis on those datasets. In particular, we describe how MapReduce is used to provide advanced bulk search capabilities exploiting a range of high-performance computing resources, including the UK National Grid Service (www.ngs.ac.uk) and ScotGrid (www.scotgrid.ac.uk), to offer a step change in the kinds of research that can be undertaken by this community. We also present performance analysis results based on the application of these systems.
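
MapReduce splits a job into a map phase that emits key-value pairs from each input chunk and a reduce phase that combines the values for each key, which is what makes it suitable for bulk search over many independently stored texts. A minimal, single-machine Python sketch of that pattern follows; the grid deployment described in the paper would distribute the map calls across compute nodes, and the corpus and search term here are invented.

# Single-machine sketch of the MapReduce pattern used for bulk term search.
# In the grid setting, each map_phase() call would run on a separate node.
from collections import defaultdict

documents = {
    "text_a.txt": "the thesaurus records every recorded sense of a word",
    "text_b.txt": "a corpus is a body of text gathered for linguistic study",
    "text_c.txt": "the word thesaurus comes from the Greek for treasury",
}

def map_phase(doc_id, text, term):
    """Emit (term, doc_id) once per occurrence of the search term."""
    return [(term, doc_id) for token in text.split() if token == term]

def reduce_phase(pairs):
    """Group emitted pairs by key and count hits per document."""
    grouped = defaultdict(lambda: defaultdict(int))
    for term, doc_id in pairs:
        grouped[term][doc_id] += 1
    return {term: dict(hits) for term, hits in grouped.items()}

emitted = []
for doc_id, text in documents.items():   # map phase (parallelisable)
    emitted.extend(map_phase(doc_id, text, "thesaurus"))
print(reduce_phase(emitted))              # reduce phase (aggregation)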


Computer Speech & Language | 2017

A time-sensitive historical thesaurus-based semantic tagger for deep semantic annotation

Scott Piao; Fraser Dallachy; Alistair Baron; Jane Demmen; Steve Wattam; Philip Durkin; James McCracken; Paul Rayson; Marc Alexander

Automatic extraction and analysis of meaning-related information from natural language data has been an important issue in a number of research areas, such as natural language processing (NLP), text mining, corpus linguistics, and data science. An important aspect of such information extraction and analysis is the semantic annotation of language data using a semantic tagger. In practice, various semantic annotation tools have been designed to carry out different levels of semantic annotation, such as topics of documents, semantic role labeling, named entities or events. Currently, the majority of existing semantic annotation tools identify and tag partial core semantic information in language data, but they tend to be applicable only for modern language corpora. While such semantic analyzers have proven useful for various purposes, a semantic annotation tool that is capable of annotating deep semantic senses of all lexical units, or all-words tagging, is still desirable for a deep, comprehensive semantic analysis of language data. With large-scale digitization efforts underway, delivering historical corpora with texts dating from the last 400 years, a particularly challenging aspect is the need to adapt the annotation in the face of significant word meaning change over time. In this paper, we report on the development of a new semantic tagger (the Historical Thesaurus Semantic Tagger), and discuss challenging issues we faced in this work. This new semantic tagger is built on existing NLP tools and incorporates a large-scale historical English thesaurus linked to the Oxford English Dictionary. Employing contextual disambiguation algorithms, this tool is capable of annotating lexical units with a historically-valid highly fine-grained semantic categorization scheme that contains about 225,000 semantic concepts and 4,033 thematic semantic categories. In terms of novelty, it is adapted for processing historical English data, with rich information about historical usage of words and a spelling variant normalizer for historical forms of English. Furthermore, it is able to make use of knowledge about the publication date of a text to adapt its output. In our evaluation, the system achieved encouraging accuracies ranging from 77.12% to 91.08% on individual test texts. Applying time-sensitive methods improved results by as much as 3.54% and by 1.72% on average.
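
The time-sensitive behaviour described above amounts to restricting a word's candidate senses to those attested around the text's publication date before disambiguation takes place. The sketch below shows that restriction step in isolation, using an invented miniature sense inventory; it is a simplified illustration, not the Historical Thesaurus Semantic Tagger itself.

# Illustrative date filtering of candidate senses (toy data, not the HTST).
# Each sense carries first/last attestation years from a thesaurus-like source.
SENSES = {
    "awful": [
        {"category": "Inspiring awe", "first": 1300, "last": 1800},
        {"category": "Very bad",      "first": 1809, "last": 2024},
    ],
}

def candidate_senses(word, publication_year, slack=25):
    """Keep only senses attested within `slack` years of the text's date."""
    kept = [s for s in SENSES.get(word, [])
            if s["first"] - slack <= publication_year <= s["last"] + slack]
    # Fall back to the full sense list if the date filter removes everything.
    return kept or SENSES.get(word, [])

print([s["category"] for s in candidate_senses("awful", 1650)])  # ['Inspiring awe']
print([s["category"] for s in candidate_senses("awful", 1950)])  # ['Very bad']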


Proceedings of the 1st Workshop on Programming Language Evolution | 2014

Programming language feature agglomeration

Jeremy Singer; Callum Cameron; Marc Alexander

Feature-creep is a well-known phenomenon in software systems. In this paper, we argue that feature-creep also occurs in the domain of programming languages. Recent languages are more expressive than earlier languages. However, recent languages generally extend rather than replace the syntax (sometimes) and semantics (almost always) of earlier languages. We demonstrate this trend of agglomeration in a sequence of languages comprising Pascal, C, Java, and Scala. These are all block-structured Algol-derived languages, with earlier languages providing explicit inspiration for later ones. We present empirical evidence from several language-specific sources, including grammar definitions and canonical manuals. The evidence suggests that there is a trend of increasing complexity in modern languages that have evolved from earlier languages.
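
One way to put numbers on such an agglomeration claim is to take each language's grammar definition and count its productions or reserved words as a rough complexity proxy. The sketch below counts rule definitions in a tiny BNF-like grammar fragment; the grammar text is invented, and this only illustrates the kind of measurement involved, not the paper's actual methodology or data.

# Rough complexity proxy: count rule definitions in a BNF-like grammar text.
# The grammar fragment below is invented for illustration.
import re

pascal_like_grammar = """
program      ::= block '.'
block        ::= declarations compound_statement
declarations ::= 'var' (identifier_list ':' type ';')*
compound_statement ::= 'begin' statement_list 'end'
"""

def count_productions(grammar_text):
    """Count lines that define a nonterminal (i.e. contain '::=')."""
    return len(re.findall(r"^\s*\w+\s*::=", grammar_text, flags=re.MULTILINE))

print("productions:", count_productions(pascal_like_grammar))  # -> 4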


Archive | 2013

‘In Countries so Unciviliz’d as Those?’: The Language of Incivility and the British Experience of the World

Marc Alexander; Andrew Struan

‘Civilisation’, wrote Arnold J. Toynbee in the 1950s, ‘is a movement, not a condition; it is a voyage, not a harbour.’ In a similar vein, the ways in which peoples and nations have thought others to be civilised, or uncivilised, have altered and changed over time. This development is true particularly of the contact over the past 1,000 years between the British and those they thought to be, and deemed, ‘uncivilised’. The ways in which British writers represented and constructed these ‘uncivilised’ peoples in their factual narratives and explanations, and the extent to which those writers engaged with shifting and changing conceptions of such people, allow an insight into the reactions and attitudes of the British towards those they encountered through imperial expansions and travel abroad. This chapter therefore seeks to analyse the ways in which the English-speaking peoples have sought to conceptualise those deemed uncivil, through an investigation into the word choices which scholars now know were available to them at each stage in the evolution of the English language.


Proceedings of the 1st International Workshop on Search and Mining Entity-Relationship Data | 2011

Data mining and search enhancements using the Historical Thesaurus of English

Jean Anderson; Marc Alexander; Christian Kay; Muhammad S. Sarwar

In this paper, we describe a new and unique thesaurus which allows us to address temporal aspects in search and data mining, and to improve context-based retrieval. We discuss the application of the Historical Thesaurus of English (HTE) to semantic tagging of texts from AD 700 to the current day, and present several use-cases where the power of HTE is highly exploitable for linguistics, information retrieval and computational approaches to meaning.
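
One concrete way a thesaurus such as the HTE can enhance retrieval is query expansion: a search term is broadened to the other words recorded under the same semantic category, including historical synonyms that a modern query would miss. The sketch below shows that idea with an invented miniature category; it illustrates the general technique rather than the systems described in the paper.

# Query expansion via a thesaurus-style category lookup (hypothetical data).
TOY_CATEGORIES = {
    "Ill health": {"illness", "sickness", "malady", "distemper", "unhealth"},
}
WORD_TO_CATEGORY = {w: cat for cat, words in TOY_CATEGORIES.items() for w in words}

def expand_query(term):
    """Return the term plus all words filed under the same semantic category."""
    category = WORD_TO_CATEGORY.get(term)
    return TOY_CATEGORIES.get(category, set()) | {term}

documents = ["a grievous distemper swept the town", "the harvest failed that year"]
query_terms = expand_query("illness")
hits = [d for d in documents if query_terms & set(d.split())]
print(sorted(query_terms))
print(hits)  # the historical synonym 'distemper' matches the first document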


Language Resources and Evaluation | 2014

Experiences with Parallelisation of an Existing NLP Pipeline: Tagging Hansard

Stephen Wattam; Paul Rayson; Marc Alexander; Jean Anderson


Archive | 2014

Foregrounding, burying and plot construction

Catherine Emmott; Marc Alexander


Archive | 2013

Rhetorical control of readers' attention: psychological and stylistic perspectives on foreground and background in narrative

Catherine Emmott; Alison Sanford; Marc Alexander

Collaboration


Dive into Marc Alexander's collaborations.

Top Co-Authors

J. Green

University of Glasgow
