Mark Dras
Macquarie University
Publication
Featured research published by Mark Dras.
meeting of the association for computational linguistics | 2009
Stephen Wan; Mark Dras; Robert Dale; Cécile Paris
Abstract-like text summarisation requires a means of producing novel summary sentences. In order to improve the grammaticality of the generated sentence, we model a global (sentence) level syntactic structure. We couch statistical sentence generation as a spanning tree problem in order to search for the best dependency tree spanning a set of chosen words. We also introduce a new search algorithm for this task that models argument satisfaction to improve the linguistic validity of the generated tree. We treat the allocation of modifiers to heads as a weighted bipartite graph matching (or assignment) problem, a well studied problem in graph theory. Using BLEU to measure performance on a string regeneration task, we found an improvement, illustrating the benefit of the spanning tree approach armed with an argument satisfaction model.
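As a rough illustration of the assignment-problem framing mentioned in this abstract, the sketch below picks the one-to-one allocation of modifiers to head slots that maximises total weight. It is not the authors' algorithm: the function name `best_assignment`, the integer scores, and the brute-force search are all assumptions made for a toy example, and it assumes there are at least as many head slots as modifiers. A polynomial-time method such as the Hungarian algorithm would be used on real data.

```python
from itertools import permutations


def best_assignment(weights):
    """Return (total, assignment) with the highest summed weight.

    weights[m][h] is a hypothetical score for attaching modifier m
    to head slot h. Each slot is used at most once. Brute force over
    permutations is only suitable for toy inputs.
    """
    n_mod = len(weights)
    n_head = len(weights[0])
    best_total, best_perm = float("-inf"), None
    # Try every way of giving each modifier its own distinct head slot.
    for perm in permutations(range(n_head), n_mod):
        total = sum(weights[m][h] for m, h in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical scores: rows are modifiers, columns are head slots.
scores = [
    [9, 1, 3],
    [8, 7, 2],
    [2, 4, 6],
]
total, assignment = best_assignment(scores)
print(total, assignment)
```

Here `assignment[m]` is the head slot chosen for modifier `m`. Note that modifier 1 scores highest on slot 0, but the best global allocation gives it slot 1, which is the point of treating allocation as a matching problem rather than a sequence of greedy choices.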
meeting of the association for computational linguistics | 1999
Mark Dras
In applications such as translation and paraphrase, operations are carried out on grammars at the meta level. This paper shows how a meta-grammar, defining structure at the meta level, is useful in the case of such operations; in particular, how it solves problems in the current definition of Synchronous TAG (Shieber, 1994) caused by ignoring such structure in mapping between grammars, for applications such as translation. Moreover, essential properties of the formalism remain unchanged.
empirical methods in natural language processing | 2014
Shervin Malmasi; Mark Dras
Language transfer, the characteristic second language usage patterns caused by native language interference, is investigated by Second Language Acquisition (SLA) researchers seeking to find overused and underused linguistic features. In this paper we develop and present a methodology for deriving ranked lists of such features. Using very large learner data, we show our method’s ability to find relevant candidates using sophisticated linguistic features. To illustrate its applicability to SLA research, we formulate plausible language transfer hypotheses supported by current evidence. This is the first work to extend Native Language Identification to a broader linguistic interpretation of learner data and address the automatic extraction of underused features on a per-native language basis.
empirical methods in natural language processing | 2014
Shervin Malmasi; Mark Dras
In this paper we present the first application of Native Language Identification (NLI) to Arabic learner data. NLI, the task of predicting a writer’s first language from their writing in other languages, has been mostly investigated with English data, but is now expanding to other languages. We use L2 texts from the newly released Arabic Learner Corpus and, with a combination of three syntactic features (CFG production rules, Arabic function words and Part-of-Speech n-grams), demonstrate that these features are useful for this task. Our system achieves an accuracy of 41% against a baseline of 23%, providing the first evidence for classifier-based detection of language transfer effects in L2 Arabic. Such methods can be useful for studying language transfer, developing teaching materials tailored to students’ native language and forensic linguistics. Future directions are discussed.
conference of the european chapter of the association for computational linguistics | 2014
Shervin Malmasi; Mark Dras
We present the first application of Native Language Identification (NLI) to non-English data. Motivated by theories of language transfer, NLI is the task of identifying a writer’s native language (L1) based on their writings in a second language (the L2). An NLI system was applied to Chinese learner texts using topic-independent syntactic models to assess their accuracy. We find that models using part-of-speech tags, context-free grammar production rules and function words are highly effective, achieving a maximum accuracy of 71%. Interestingly, we also find that when applied to equivalent English data, the model performance is almost identical. This finding suggests a systematic pattern of cross-linguistic transfer may exist, where the degree of transfer is independent of the L1 and L2.
international conference on computational linguistics | 2008
Simon Zwarts; Mark Dras
One style of Multi-Engine Machine Translation architecture involves choosing the best of a set of outputs from different systems. Choosing the best translation from an arbitrary set, even in the presence of human references, is a difficult problem; it may prove better to look at mechanisms for making such choices in more restricted contexts. In this paper we take a classification-based approach to choosing between candidates from syntactically informed translations. The idea is that using multiple parsers as part of a classifier could help detect syntactic problems in this context that lead to bad translations; these problems could be detected on either the source side---perhaps sentences with difficult or incorrect parses could lead to bad translations---or on the target side---perhaps the output quality could be measured in a more syntactically informed way, looking for syntactic abnormalities. We show that there is no evidence that the source side information is useful. However, a target-side classifier, when used to identify particularly bad translation candidates, can lead to significant improvements in BLEU score. Improvements are even greater when combined with existing language and alignment model approaches.
Expert Systems | 2008
Bhavna Orgun; Mark Dras; Abhaya C. Nayak; Geoff James
Domain ontologies and knowledge-based systems have become very important in the agent and semantic web communities. As their use has increased, providing means of resolving semantic differences has also become very important. In this paper we survey the approaches that have been proposed for providing interoperability among domain ontologies. We also discuss some key issues that still need to be addressed if we are to move from semi-automated to fully automated approaches to providing consensus among heterogeneous ontologies.
ant colony optimization and swarm intelligence | 2006
Stephen Gilmour; Mark Dras
For solving combinatorial optimisation problems, exact methods accurately exploit the structure of the problem but are tractable only up to a certain size; approximation or heuristic methods are tractable for very large problems but may possibly be led into a bad solution. A question that arises is, From where can we obtain knowledge of the problem structure via exact methods that can be exploited on large-scale problems by heuristic methods? We present a framework that allows the exploitation of existing techniques and resources to integrate such structural knowledge into the Ant Colony System metaheuristic, where the structure is determined through the notion of kernelization from the field of parameterized complexity. We give experimental results using vertex cover as the problem instance, and show that knowledge of this type of structure improves performance beyond previously defined ACS algorithms.
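To make the notion of kernelization concrete, here is a minimal sketch of one standard reduction for vertex cover, the high-degree rule usually attributed to Buss: any vertex whose degree exceeds the remaining budget must be in every cover of that size. The abstract does not say which kernelization rules the authors used or how the result is passed to the Ant Colony System, so the function name `buss_kernel`, its return shape, and the example graph are assumptions. The sketch also assumes a simple graph with comparable vertex labels.

```python
def buss_kernel(edges, k):
    """Apply the high-degree rule for vertex cover with budget k.

    Returns (forced, remaining_edges, remaining_k), where `forced` holds
    vertices that must be in any cover of size at most k, or None if no
    such cover can exist.
    """
    edges = {frozenset(e) for e in edges}
    forced = set()
    while True:
        # Recompute degrees on the current reduced graph.
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        budget = k - len(forced)
        high = [v for v, d in degree.items() if d > budget]
        if not high:
            break
        if budget <= 0:
            return None  # edges remain but nothing is left to spend
        v = min(high)  # deterministic choice among high-degree vertices
        forced.add(v)
        edges = {e for e in edges if v not in e}
    remaining_k = k - len(forced)
    # Every remaining vertex has degree <= remaining_k, so a cover of
    # size remaining_k can touch at most remaining_k ** 2 edges.
    if len(edges) > remaining_k ** 2:
        return None
    return forced, edges, remaining_k


# Hypothetical graph: a star centred on vertex 0 plus one separate edge.
graph = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (6, 7)]
print(buss_kernel(graph, 2))
```

On this graph the rule forces vertex 0 into the cover and leaves a single edge with a budget of 1, which is the kind of structural knowledge a heuristic could then exploit on the reduced instance.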
meeting of the association for computational linguistics | 2003
Stephen Wan; Mark Dras; Cécile Paris; Robert Dale
We explore the problem of single sentence summarisation. In the news domain, such a summary might resemble a headline. The headline generation system we present uses Singular Value Decomposition (SVD) to guide the generation of a headline towards the theme that best represents the document to be summarised. In doing so, the intuition is that the generated summary will more accurately reflect the content of the source document. This paper presents SVD as an alternative method to determine if a word is a suitable candidate for inclusion in the headline. The results of a recall-based evaluation comparing three different strategies for word selection indicate that thematic information does help improve recall.
australasian joint conference on artificial intelligence | 2005
Stephen Gilmour; Mark Dras
Ant Colony Optimization (ACO) is a collection of metaheuristics inspired by foraging in ant colonies, whose aim is to solve combinatorial optimization problems. We identify some principles behind the metaheuristics’ rules; and we show that ensuring their application, as a correction to a published algorithm for the vertex cover problem, leads to a statistically significant improvement in empirical results.
Collaboration
Dive into Mark Dras's collaborations.
Commonwealth Scientific and Industrial Research Organisation