Publication


Featured research published by Marek Rei.


Meeting of the Association for Computational Linguistics | 2016

Automatic Text Scoring Using Neural Networks

Dimitrios Alikaniotis; Helen Yannakoudakis; Marek Rei

Automated Text Scoring (ATS) provides a cost-effective and consistent alternative to human marking. However, in order to achieve good performance, the predictive features of the system need to be manually engineered by human experts. We introduce a model that forms word representations by learning the extent to which specific words contribute to the text's score. Using Long Short-Term Memory networks to represent the meaning of texts, we demonstrate that a fully automated framework is able to achieve excellent results over similar approaches. In an attempt to make our results more interpretable, and inspired by recent advances in visualizing neural networks, we introduce a novel method for identifying the regions of the text that the model has found more discriminative.
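The core idea above is a recurrent network that reads the essay word by word and regresses a score from its final state. A minimal sketch, assuming a single-layer LSTM with stacked gate weights and a linear scoring head read from the last hidden state; the weight names and this exact configuration are illustrative, not the paper's reported setup.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_score(embeddings, params):
    """Run a single-layer LSTM over a sequence of word embeddings and map
    the final hidden state to a scalar essay score (illustrative sketch)."""
    W, U, b = params["W"], params["U"], params["b"]   # gate weights stacked i|f|o|g
    w_out, b_out = params["w_out"], params["b_out"]   # linear scoring head
    d = U.shape[1]
    h, c = np.zeros(d), np.zeros(d)
    for x in embeddings:
        z = W @ x + U @ h + b
        i, f, o = sigmoid(z[:d]), sigmoid(z[d:2*d]), sigmoid(z[2*d:3*d])
        g = np.tanh(z[3*d:])
        c = f * c + i * g          # update cell state
        h = o * np.tanh(c)         # gated hidden state
    return float(w_out @ h + b_out)
```

With trained parameters, `lstm_score` would be fitted to minimize squared error against human marks; here the weights are random, so only the shapes and data flow are meaningful.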


Conference on Computational Natural Language Learning | 2014

Looking for Hyponyms in Vector Space

Marek Rei; Ted Briscoe

The task of detecting and generating hyponyms is at the core of semantic understanding of language, and has numerous practical applications. We investigate how neural network embeddings perform on this task, compared to dependency-based vector space models, and evaluate a range of similarity measures on hyponym generation. A new asymmetric similarity measure and a combination approach are described, both of which significantly improve precision. We release three new datasets of lexical vector representations trained on the BNC and our evaluation dataset for hyponym generation.
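Since hyponymy is a directed relation, a symmetric measure like cosine cannot distinguish "dog is a kind of animal" from the reverse. The sketch below shows one simple way to break that symmetry by mixing cosine with a directional "containment" term; this is an illustrative construction, not the specific asymmetric measure proposed in the paper.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def directional_sim(hypernym, hyponym, alpha=0.5):
    """Asymmetric similarity sketch: interpolate plain cosine with the
    length of the hyponym's projection onto the hypernym, relative to the
    hypernym's own length. Swapping the arguments changes the score."""
    containment = float(hypernym @ hyponym) / float(hypernym @ hypernym)
    return alpha * cosine(hypernym, hyponym) + (1.0 - alpha) * containment
```

Because the containment term normalizes by only one of the two vectors, `directional_sim(a, b)` and `directional_sim(b, a)` generally differ, which is the property needed for ranking hyponym candidates.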


Meeting of the Association for Computational Linguistics | 2017

Semi-supervised Multitask Learning for Sequence Labeling

Marek Rei

We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. This language modeling objective incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks. The architecture was evaluated on a range of datasets, covering the tasks of error detection in learner texts, named entity recognition, chunking and POS-tagging. The novel language modeling objective provided consistent performance improvements on every benchmark, without requiring any additional annotated or unannotated data.
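The training objective described above can be reduced to a weighted sum of the main labeling loss and an auxiliary word-prediction loss. A minimal sketch, showing only one language-modeling direction; the weighting `gamma=0.1` and the function names are assumptions for illustration.

```python
import numpy as np

def softmax_xent(logits, target):
    """Cross-entropy of a softmax over `logits` against integer `target`."""
    z = logits - logits.max()                 # stabilize the exponentials
    log_probs = z - np.log(np.exp(z).sum())
    return -float(log_probs[target])

def multitask_loss(label_logits, labels, lm_logits, next_words, gamma=0.1):
    """Main sequence-labeling loss plus a gamma-weighted auxiliary loss for
    predicting the next word at every position (sketch of the combined
    objective; only the forward LM direction is shown)."""
    label_loss = sum(softmax_xent(l, t) for l, t in zip(label_logits, labels))
    lm_loss = sum(softmax_xent(l, t) for l, t in zip(lm_logits, next_words))
    return label_loss + gamma * lm_loss
```

Setting `gamma=0` recovers a plain sequence-labeling objective, which makes the auxiliary term easy to ablate.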


International Conference on Computational Linguistics | 2016

Attending to Characters in Neural Sequence Labeling Models

Marek Rei; Gamal K. O. Crichton; Sampo Pyysalo

Sequence labeling architectures use word embeddings for capturing similarity, but suffer when handling previously unseen or rare words. We investigate character-level extensions to such models and propose a novel architecture for combining alternative word representations. By using an attention mechanism, the model is able to dynamically decide how much information to use from a word- or character-level component. We evaluated different architectures on a range of sequence labeling datasets, and character-level extensions were found to improve performance on every benchmark. In addition, the proposed attention-based architecture delivered the best results even with a smaller number of trainable parameters.
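The dynamic decision described above can be sketched as a learned gate: a sigmoid produces a per-dimension weight in (0, 1), and the token representation is the corresponding convex combination of the word-level and character-level vectors. The single gate matrix `Wz` is a simplification of the paper's attention mechanism.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def combine(word_vec, char_vec, Wz):
    """Gate between word-level and character-level representations of the
    same token: z weights each dimension, so the output lies between the
    two input vectors elementwise (illustrative simplification)."""
    z = sigmoid(Wz @ np.concatenate([word_vec, char_vec]))
    return z * word_vec + (1.0 - z) * char_vec
```

For frequent words the gate can lean on the word embedding; for rare or unseen words, where the word embedding is uninformative, it can shift weight to the character-level view.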


Conference on Computational Natural Language Learning | 2010

Combining Manual Rules and Supervised Learning for Hedge Cue and Scope Detection

Marek Rei; Ted Briscoe

Hedge cues were detected using a supervised Conditional Random Field (CRF) classifier exploiting features from the RASP parser. The CRF's predictions were filtered using known cues, and unseen instances were removed, increasing precision while retaining recall. Rules for scope detection, based on the grammatical relations of the sentence and the part-of-speech tag of the cue, were developed manually. A second supervised CRF classifier was then used to refine these predictions. As a final step, scopes were constructed from the classifier output using a small set of post-processing rules. Development of the system revealed a number of issues with the annotation scheme adopted by the organisers.
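The precision filter described above amounts to a lexicon check over the classifier's output: a predicted cue survives only if its token was seen as a cue in training. A minimal sketch with illustrative label names.

```python
def filter_cues(tokens, predictions, known_cues):
    """Keep a predicted hedge cue only if the (lowercased) token appears in
    the known-cue lexicon; everything else becomes 'O'. Sketch of the
    post-filtering step, not the full system."""
    return ["CUE" if pred == "CUE" and tok.lower() in known_cues else "O"
            for tok, pred in zip(tokens, predictions)]
```

Removing unseen cue candidates trades a little recall for precision, which matches the behaviour reported above.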


Meeting of the Association for Computational Linguistics | 2016

A Joint Model for Word Embedding and Word Morphology

Kris Cao; Marek Rei

This paper presents a joint model for performing unsupervised morphological analysis on words, and learning a character-level composition function from morphemes to word embeddings. Our model splits individual words into segments, and weights each segment according to its ability to predict context words. Our morphological analysis is comparable to dedicated morphological analyzers at the task of morpheme boundary recovery, and also performs better than word-based embedding models at the task of syntactic analogy answering. Finally, we show that incorporating morphology explicitly into character-level models helps them produce embeddings for unseen words which correlate better with human judgments.


Meeting of the Association for Computational Linguistics | 2016

Compositional Sequence Labeling Models for Error Detection in Learner Writing

Marek Rei; Helen Yannakoudakis

In this paper, we present the first experiments using neural network models for the task of error detection in learner writing. We perform a systematic comparison of alternative compositional architectures and propose a framework for error detection based on bidirectional LSTMs. Experiments on the CoNLL-14 shared task dataset show the model is able to outperform other participants on detecting errors in learner writing. Finally, the model is integrated with a publicly deployed self-assessment system, leading to performance comparable to human annotators.


Empirical Methods in Natural Language Processing | 2015

Online Representation Learning in Recurrent Neural Language Models

Marek Rei

We investigate an extension of continuous online learning in recurrent neural network language models. The model keeps a separate vector representation of the current unit of text being processed and adaptively adjusts it after each prediction. The initial experiments give promising results, indicating that the method is able to increase language modelling accuracy, while also decreasing the parameters needed to store the model along with the computation required at each step.
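The adaptive adjustment described above can be caricatured as a running context vector that is nudged toward each word as it is processed. The fixed-rate exponential update below is an illustrative stand-in: the actual model learns how to update the representation rather than using a constant rate.

```python
import numpy as np

def online_update(context_vec, word_vec, rate=0.1):
    """Move the running context vector a fraction `rate` of the way toward
    the embedding of the word just processed (illustrative stand-in for the
    learned online update)."""
    return (1.0 - rate) * context_vec + rate * word_vec
```

Because the context vector is a single extra vector per document rather than extra network weights, this style of adaptation adds very little storage, consistent with the parameter savings mentioned above.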


Patent Information Retrieval | 2011

Intelligent Information Access from Scientific Papers

Ted Briscoe; K. Harrison; Andrew Naish; Andy Parker; Marek Rei; Advaith Siddharthan; David Sinclair; M. Slater; Rebecca Watson

We describe a novel search engine for scientific literature. The system allows for sentence-level search starting from portable document format (PDF) files, and integrates text and image search, thus, for example, facilitating the retrieval of information present in tables and figures using both image and caption content. In addition, the system allows the user to generate in an intuitive manner complex queries for search terms that are related through particular grammatical (and thus implicitly semantic) relations. Grid processing techniques are used to parallelise the analysis of large numbers of scientific papers. We are currently conducting user evaluations, but here we report some preliminary evaluation and comparison with Google Scholar, demonstrating the potential utility of the novel features. Finally, we discuss future work and the potential and complementarity of the system for patent search.


Workshop on Innovative Use of NLP for Building Educational Applications | 2016

Sentence Similarity Measures for Fine-Grained Estimation of Topical Relevance in Learner Essays

Marek Rei; Ronan Cummins

We investigate the task of assessing sentence-level prompt relevance in learner essays. Various systems using word overlap, neural embeddings and neural compositional models are evaluated on two datasets of learner writing. We propose a new method for sentence-level similarity calculation, which learns to adjust the weights of pre-trained word embeddings for a specific task, achieving substantially higher accuracy compared to other relevant baselines.
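The proposed method adjusts per-word weights before averaging embeddings into a sentence vector. In the sketch below a plain dict stands in for the learned, task-specific weights, and cosine similarity compares the resulting sentence vectors; all names are illustrative.

```python
import numpy as np

def sentence_vec(tokens, embeddings, weights):
    """Weighted average of word embeddings. `weights` is a dict standing in
    for the learned task-specific re-weighting; out-of-vocabulary tokens
    are skipped and unlisted tokens get weight 1.0."""
    vecs = [weights.get(t, 1.0) * embeddings[t] for t in tokens if t in embeddings]
    return np.mean(vecs, axis=0)

def sentence_similarity(s1, s2, embeddings, weights):
    """Cosine similarity between two weighted sentence vectors."""
    u = sentence_vec(s1, embeddings, weights)
    v = sentence_vec(s2, embeddings, weights)
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
```

For prompt-relevance scoring, one sentence would be the essay sentence and the other the prompt; a low similarity flags off-topic content.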

Collaboration


Dive into Marek Rei's collaborations.

Top Co-Authors

Ted Briscoe (University of Cambridge)
Zheng Yuan (University of Cambridge)
Ronan Cummins (National University of Ireland)
Andy Parker (University of Cambridge)
Daniela Gerz (University of Cambridge)
Douwe Kiela (University of Cambridge)