Publication


Featured research published by Masood Ghayoomi.


Language Resources and Evaluation | 2011

Lessons from building a Persian written corpus: Peykare

Mahmood Bijankhan; Javad Sheykhzadegan; Mohammad Bahrani; Masood Ghayoomi

This paper addresses some of the lessons learned while building a written language resource for contemporary Persian, called ‘Peykare’. After defining five linguistic varieties and 24 registers based on these varieties, we collected texts for Peykare to carry out a linguistic analysis, including cross-register differences. For the tokenization of Persian, we propose a descriptive generalization to normalize the orthographic variations found in texts. To annotate Peykare, we use the EAGLES guidelines, which result in a hierarchy of part-of-speech tags. To this end, we apply a semi-automatic annotation approach. We also give special attention to the Ezafe construction and to homographs, both of which are important in Persian text analysis.


International Multiconference on Computer Science and Information Technology | 2010

PerGram: A TRALE implementation of an HPSG fragment of Persian

Stefan Müller; Masood Ghayoomi

In this paper, we discuss an HPSG grammar of Persian (PerGram) that is implemented in the TRALE system. We describe some of the phenomena that are currently covered. While working on the grammar, we developed a test suite with positive and negative examples from the linguistic literature. To test the grammar's coverage of naturally occurring sentences, we use a subcorpus of a large Persian corpus.


International Conference on NLP | 2012

Word Clustering for Persian Statistical Parsing

Masood Ghayoomi

Syntactically annotated data such as treebanks are used to train statistical parsers. One of the main issues in developing statistical parsers is their sensitivity to the training data. Since data sparsity is the biggest challenge in data-oriented analyses, parsers perform poorly if they are trained on a small data set, or when the training and test data come from different genres. In this paper, we propose a word-clustering approach using the Brown algorithm to overcome these problems. The proposed class-based model yields a lexicon that is coarser than the word-level lexicon. In addition, we propose an extension to the clustering approach in which the POS tags of the words are also taken into consideration during clustering. We prove that adding this information improves the performance of clustering, especially for homographs. Standard word clustering treats all occurrences of a homograph identically, whereas the proposed extended model treats them as distinct and assigns them to different clusters. The experimental results show that class-based parsing outperforms word-based parsing in general. Moreover, we show that the proposed extension of class-based parsing is superior to the model that uses only words for clustering.
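The POS-aware extension described in the abstract can be illustrated with a minimal sketch. It covers only a possible preprocessing step, not the Brown algorithm itself, and it rests on assumptions: the abstract does not say how POS tags are combined with words, so joining them as `word_POS` is a hypothetical choice, as are the function name `to_clustering_units`, the tag names, and the toy English sentence used in place of Persian data.

```python
from collections import Counter

def to_clustering_units(tagged_sentence, use_pos=True, sep="_"):
    """Map (word, pos) pairs to the units a clustering algorithm would see.

    Hypothetical scheme: with use_pos=True, each unit is "word_POS", so
    homographs with different tags become different units. With
    use_pos=False, only the word form is kept.
    """
    if use_pos:
        return [f"{word}{sep}{pos}" for word, pos in tagged_sentence]
    return [word for word, _ in tagged_sentence]

# Toy corpus (assumed data): "saw" occurs once as a verb and once as a noun.
corpus = [
    [("she", "PRON"), ("saw", "VERB"), ("the", "DET"), ("saw", "NOUN")],
]

# Word-only units: both readings of "saw" collapse into one unit.
word_units = Counter(
    unit for sent in corpus for unit in to_clustering_units(sent, use_pos=False)
)

# Word+POS units: the two readings stay separate and could be
# assigned to different clusters by a downstream clustering step.
pos_units = Counter(
    unit for sent in corpus for unit in to_clustering_units(sent, use_pos=True)
)
```

In the word-only setting a clustering algorithm sees a single unit for `saw`, so both readings necessarily end up in the same cluster. With the tag attached, the two readings are separate units, which is the precondition for assigning them to different clusters.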


Linguistic Issues in Language Technology | 2012

Bootstrapping the Development of an HPSG-based Treebank for Persian

Masood Ghayoomi


Int. J. of Asian Lang. Proc. | 2010

A Study of Corpus Development for Persian

Masood Ghayoomi; Saeedeh Momtazi; Mahmood Bijankhan


North American Chapter of the Association for Computational Linguistics | 2010

Using Variance as a Stopping Criterion for Active Learning of Frame Assignment

Masood Ghayoomi


Language Resources and Evaluation | 2014

Constituency Parsing of Bulgarian: Word- vs Class-based Parsing

Masood Ghayoomi; Kiril Simov; Petya Osenova


Language Resources and Evaluation | 2012

From Grammar Rule Extraction to Treebanking: A Bootstrapping Approach

Masood Ghayoomi


Language Resources and Evaluation | 2014

Converting an HPSG-based Treebank into its Parallel Dependency-based Treebank

Masood Ghayoomi; Jonas Kuhn


Signal and Data Processing | 2017

A Comparative Study on the Impact of Part-of-Speech Tagging on Parsing for the Persian Language Processing

Masood Ghayoomi

Collaboration


Dive into Masood Ghayoomi's collaboration.

Top Co-Authors

Jonas Kuhn

University of Stuttgart

Ludwig Linhuber

Karlsruhe Institute of Technology

Sebastian Stüker

Karlsruhe Institute of Technology

Kiril Simov

Bulgarian Academy of Sciences

Petya Osenova

Bulgarian Academy of Sciences

Kay Berkling

University of Edinburgh
