Publication


Featured research published by Andrew Caines.


Text, Speech and Dialogue | 2015

Incremental Dependency Parsing and Disfluency Detection in Spoken Learner English

Russell Moore; Andrew Caines; Calbert Graham; Paula Buttery

This paper investigates the suitability of state-of-the-art natural language processing (NLP) tools for parsing the spoken language of second language learners of English. The task of parsing spoken learner language is important to the domains of automated language assessment (ALA) and computer-assisted language learning (CALL). Because spoken language is non-canonical, containing filled pauses, non-standard grammatical variations, hesitations and other disfluencies, and because training data are scarce, spoken language parsing has been a challenge for standard NLP tools. Recently the Redshift parser (Honnibal et al. In: Proceedings of CoNLL 2013) has been shown to be successful in identifying grammatical relations and certain disfluencies in native speaker spoken language, returning unlabelled dependency accuracy of 90.5% and a disfluency F-measure of 84.1% (Honnibal & Johnson: TACL 2, 131-142, 2014). We investigate how this parser handles spoken data from learners of English at various proficiency levels. Firstly, we find that Redshift's parsing accuracy on non-native speech data is comparable to Honnibal & Johnson's results, with 91.1% of dependency relations correctly identified. However, disfluency detection is markedly down, with an F-measure of just 47.8%. We attempt to explain why this should be, and investigate the effect of proficiency level on parsing accuracy. We relate our findings to the use of NLP technology for CALL and ALA applications.


Recent Advances in Intrusion Detection | 2018

Characterizing Eve: Analysing Cybercrime Actors in a Large Underground Forum

Sergio Pastrana; Alice Hutchings; Andrew Caines; Paula Buttery

Underground forums contain many thousands of active users, but the vast majority will be involved, at most, in minor levels of deviance. The number who engage in serious criminal activity is small. That being said, underground forums have played a significant role in several recent high-profile cybercrime activities. In this work we apply data science approaches to understand criminal pathways and characterize key actors related to illegal activity in one of the largest and longest-running underground forums. We combine the results of a logistic regression model with k-means clustering and social network analysis, verifying the findings using topic analysis. We identify variables relating to forum activity that predict the likelihood that a user will become an actor of interest to law enforcement, and would therefore benefit the most from intervention. This work provides the first step towards identifying ways to deter young people from a career in cybercrime.


Proceedings of the 2010 Workshop on NLP and Linguistics: Finding the Common Ground | 2010

You Talking to Me? A Predictive Model for Zero Auxiliary Constructions

Andrew Caines; Paula Buttery


Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages | 2014

The effect of disfluencies and learner errors on the parsing of spoken learner language

Andrew Caines; Paula Buttery


Archive | 2012

Normalising frequency counts to account for 'opportunity of use' in learner corpora

Paula Buttery; Andrew Caines


Language Resources and Evaluation | 2016

Crowdsourcing a Multi-lingual Speech Corpus: Recording, Transcription and Annotation of the CrowdIS Corpora

Andrew Caines; Christian Bentz; Calbert Graham; Tim Polzehl; Paula Buttery


Language Resources and Evaluation | 2016

Predicting Author Age from Weibo Microblog Posts

Wanru Zhang; Andrew Caines; Dimitrios Alikaniotis; Paula Buttery


International Conference on Computational Linguistics | 2016

Automated speech-unit delimitation in spoken learner English

Russell Moore; Andrew Caines; Calbert Graham; Paula Buttery


Workshop on Innovative Use of NLP for Building Educational Applications | 2017

Collecting fluency corrections for spoken learner English

Andrew Caines; Emma Flint; Paula Buttery


Empirical Methods in Natural Language Processing | 2017

Parsing transcripts of speech

Andrew Caines; Michael McCarthy; Paula Buttery

Collaboration


Dive into Andrew Caines's collaborations.

Top Co-Authors

Tim Polzehl

Technical University of Berlin
