Publication


Featured research published by Vivian Tsang.


Meeting of the Association for Computational Linguistics | 2002

A Multilingual Paradigm for Automatic Verb Classification

Paola Merlo; Suzanne Stevenson; Vivian Tsang; Gianluca Allaria

We demonstrate the benefits of a multilingual approach to automatic lexical semantic verb classification based on statistical analysis of corpora in multiple languages. Our research incorporates two interrelated threads. In one, we exploit the similarities in the crosslinguistic classification of verbs, to extend work on English verb classification to a new language (Italian), and to new classes within that language, achieving an accuracy of 86.4% (baseline 33.9%). Our second strand of research exploits the differences across languages in the syntactic expression of semantic properties, to show that complementary information about English verbs can be extracted from their translations in a second language (Chinese). The use of multilingual features improves classification performance of the English verbs, achieving an accuracy of 83.5% (baseline 33.3%).


Computational Linguistics | 2010

A graph-theoretic framework for semantic distance

Vivian Tsang; Suzanne Stevenson

Many NLP applications entail that texts are classified based on their semantic distance (how similar or different the texts are). For example, comparing the text of a new document to that of documents of known topics can help identify the topic of the new text. Typically, a distributional distance is used to capture the implicit semantic distance between two pieces of text. However, such approaches do not take into account the semantic relations between words. In this article, we introduce an alternative method of measuring the semantic distance between texts that integrates distributional information and ontological knowledge within a network flow formalism. We first represent each text as a collection of frequency-weighted concepts within an ontology. We then make use of a network flow method which provides an efficient way of explicitly measuring the frequency-weighted ontological distance between the concepts across two texts. We evaluate our method in a variety of NLP tasks, and find that it performs well on two of three tasks. We develop a new measure of semantic coherence that enables us to account for the performance difference across the three data sets, shedding light on the properties of a data set that lends itself well to our method.
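
As a rough illustration of the idea in this abstract, the sketch below treats two texts as bags of concept counts placed on a toy ontology graph and computes the cheapest way to move one bag onto the other with a small successive-shortest-path minimum cost flow solver. It is not the authors' implementation. The toy ontology, the unit edge costs, the integer counts with equal totals, and the names `min_cost_flow` and `semantic_distance` are all assumptions made for the example.

```python
from collections import deque

INF = float("inf")

def min_cost_flow(n, edges, source, sink):
    """Successive shortest paths on a residual graph.

    edges is a list of (u, v, capacity, cost). Returns (flow, cost).
    """
    graph = [[] for _ in range(n)]   # node -> ids of outgoing residual edges
    to, cap, cost = [], [], []

    def add(u, v, c, w):
        # Forward edge gets an even id, its reverse the next odd id.
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    for u, v, c, w in edges:
        add(u, v, c, w)

    total_flow = total_cost = 0
    while True:
        # Queue-based Bellman-Ford: cheapest residual path from source.
        dist = [INF] * n
        dist[source] = 0
        prev_edge = [-1] * n
        in_queue = [False] * n
        queue = deque([source])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for eid in graph[u]:
                v = to[eid]
                if cap[eid] > 0 and dist[u] + cost[eid] < dist[v]:
                    dist[v] = dist[u] + cost[eid]
                    prev_edge[v] = eid
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if dist[sink] == INF:
            break
        # Find the bottleneck capacity along the path, then push it.
        push, v = INF, sink
        while v != source:
            eid = prev_edge[v]
            push = min(push, cap[eid])
            v = to[eid ^ 1]
        v = sink
        while v != source:
            eid = prev_edge[v]
            cap[eid] -= push
            cap[eid ^ 1] += push
            v = to[eid ^ 1]
        total_flow += push
        total_cost += push * dist[sink]
    return total_flow, total_cost

def semantic_distance(ontology_edges, text_a, text_b):
    """Cost of moving the concept counts of text_a onto those of text_b."""
    total = sum(text_a.values())
    if total != sum(text_b.values()):
        raise ValueError("this sketch assumes equal total counts")
    concepts = sorted({c for e in ontology_edges for c in e} | set(text_a) | set(text_b))
    index = {c: i for i, c in enumerate(concepts)}
    source, sink = len(concepts), len(concepts) + 1
    edges = []
    for u, v in ontology_edges:      # undirected ontology link, unit cost
        edges.append((index[u], index[v], total, 1))
        edges.append((index[v], index[u], total, 1))
    for c, count in text_a.items():  # supply side
        edges.append((source, index[c], count, 0))
    for c, count in text_b.items():  # demand side
        edges.append((index[c], sink, count, 0))
    flow, cost = min_cost_flow(len(concepts) + 2, edges, source, sink)
    if flow != total:
        raise ValueError("some concepts are not connected in the ontology")
    return cost

# Hypothetical toy ontology and concept counts, for illustration only.
ONTOLOGY = [("entity", "animal"), ("entity", "artifact"),
            ("animal", "dog"), ("animal", "cat"), ("artifact", "car")]
text_a = {"dog": 2, "cat": 1}
text_b = {"cat": 2, "car": 1}
print(semantic_distance(ONTOLOGY, text_a, text_b))  # prints 6
```

In this toy case one "cat" stays in place, one "dog" moves two links to "cat", and the other "dog" moves four links to "car", for a total cost of 6. Real concept frequencies would need to be normalized or scaled to integers before a solver like this one could be applied.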


International Conference on Computational Linguistics | 2002

Crosslinguistic transfer in automatic verb classification

Vivian Tsang; Suzanne Stevenson; Paola Merlo

We investigate the use of multilingual data in the automatic classification of English verbs, and show that there is a useful transfer of information across languages. Specifically, we experiment with three lexical semantic classes of English verbs. We collect statistical features over a sample of English verbs from each of the classes, as well as over Chinese translations of those verbs. We use the English and Chinese data, alone and in combination, as training data for a machine learning algorithm whose output is an automatic verb classifier. We demonstrate that Chinese data is indeed useful in helping to classify the English verbs (at 82% accuracy), and furthermore that a multilingual combination of data outperforms the English data alone (85% accuracy). Moreover, our results using monolingual corpora show that it is not necessary to use a parallel corpus to extract the translations in order for this technique to be successful.


Workshop on Graph Based Methods for Natural Language Processing | 2006

Context Comparison as a Minimum Cost Flow Problem

Vivian Tsang; Suzanne Stevenson

Comparing word contexts is a key component of many NLP tasks, but rarely is it used in conjunction with additional ontological knowledge. One problem is that the amount of overhead required can be high. In this paper, we provide a graphical method which easily combines an ontology with contextual information. We take advantage of the intrinsic graphical structure of an ontology for representing a context. In addition, we turn the ontology into a metric space, such that subgraphs within it, which represent contexts, can be compared. We develop two variants of our graphical method for comparing contexts. Our analysis indicates that our method performs the comparison efficiently and offers a competitive alternative to non-graphical methods.


Conference on Computational Natural Language Learning | 2001

Automatic verb classification using multilingual resources

Vivian Tsang; Suzanne Stevenson

We propose the use of multilingual corpora in the automatic classification of verbs. We extend the work of Merlo and Stevenson (2001), in which statistics over simple syntactic features extracted from textual corpora were used to train an automatic classifier for three lexical semantic classes of English verbs. We hypothesize that some lexical semantic features that are difficult to detect superficially in English may manifest themselves as easily extractable surface syntactic features in another language. Our experimental results combining English and Chinese features show that a small bilingual corpus may provide a useful alternative to using a large monolingual corpus for verb classification.


Human Computer Interaction with Mobile Devices and Services | 2014

Interaction for reading comprehension on mobile devices

Rafael Veras; Erik Paluka; Meng-Wei Chang; Vivian Tsang; Fraser Shein; Christopher Collins

This paper introduces a touch-based reading interface for tablets designed to support vocabulary acquisition, text comprehension, and reduction of reading anxiety. Touch interaction is leveraged to allow direct replacement of words with synonyms, easy access to word definitions, and seamless dialogue with a personalized model of the reader's vocabulary. We discuss how fluid interaction and direct manipulation coupled with natural language processing can help address the reading needs of audiences such as school-age children and English as a Second Language learners.


North American Chapter of the Association for Computational Linguistics | 2015

Building a Lexicon of Formulaic Language for Language Learners

Julian Brooke; Adam Hammond; David Jacob; Vivian Tsang; Graeme Hirst; Fraser Shein

Though the multiword lexicon has long been of interest in computational linguistics, most relevant work is targeted at only a small portion of it. Our work is motivated by the needs of learners for more comprehensive resources reflecting formulaic language that goes beyond what is likely to be codified in a dictionary. Working from an initial sequential segmentation approach, we present two enhancements: the use of a new measure to promote the identification of lexicalized sequences, and an expansion to include sequences with gaps. We evaluate using a novel method that allows us to calculate an estimate of recall without a reference lexicon, showing that good performance in the second enhancement depends crucially on the first, and that our lexicon conforms much more with human judgment of formulaic language than alternatives.


Conference on Computational Natural Language Learning | 2004

Calculating Semantic Distance between Word Sense Probability Distributions

Vivian Tsang; Suzanne Stevenson


International Conference on Computational Linguistics | 2014

Unsupervised Multiword Segmentation of Large Corpora using Prediction-Driven Decomposition of n-grams

Julian Brooke; Vivian Tsang; Graeme Hirst; Fraser Shein


Archive | 2013

System and method for enhancing comprehension and readability of text

Vivian Tsang; David Jacob; Fraser Shein

Collaboration


Dive into Vivian Tsang's collaborations.

Top Co-Authors

Christopher Collins

University of Ontario Institute of Technology

Erik Paluka

University of Ontario Institute of Technology

Meng-Wei Chang

University of Ontario Institute of Technology

Rafael Veras

University of Ontario Institute of Technology

Adam Hammond

San Diego State University
