Samar Husain

International Institute of Information Technology, Hyderabad

Publication

Featured research published by Samar Husain.

international workshop conference on parsing technologies | 2009

Two stage constraint based hybrid approach to free word order language dependency parsing

Akshar Bharati; Samar Husain; Dipti Misra; Rajeev Sangal

The paper describes the overall design of a new two-stage, constraint-based hybrid approach to dependency parsing. We define the two stages and show how different grammatical constructs are parsed at the appropriate stage. This division leads to selective identification and resolution of specific dependency relations at each stage. Furthermore, we show how the use of hard and soft constraints helps us build an efficient and robust hybrid parser. Finally, we evaluate the implemented parser on Hindi and compare the results with those of two data-driven dependency parsers.

meeting of the association for computational linguistics | 2005

Comparison, Selection and Use of Sentence Alignment Algorithms for New Language Pairs

Anil Kumar Singh; Samar Husain

Several algorithms are available for sentence alignment, but there is a lack of systematic evaluation and comparison of these algorithms under different conditions. In most cases, the factors that can significantly affect the performance of a sentence alignment algorithm have not been considered during evaluation. We have used an evaluation method that gives a better estimate of a sentence alignment algorithm's performance, so that the best one can be selected. We have compared four approaches using this method. These have mostly been tried on European language pairs. We have evaluated them on manually checked and validated English-Hindi aligned parallel corpora under different conditions. We also suggest some guidelines on actual alignment.

international conference on asian language processing | 2009

A Modular Cascaded Approach to Complete Parsing

Samar Husain; Phani Gadde; Bharat Ram Ambati; Dipti Misra Sharma; Rajeev Sangal

In this paper, we propose a modular cascaded approach to data-driven dependency parsing. Each module or layer leading to the complete parse produces a linguistically valid partial parse. We do this by introducing an artificial root node in the dependency structure of a sentence and by catering to distinct dependency label sets that reflect the function of the set internal labels vis-à-vis a distinct and identifiable linguistic unit, at different layers. The linguistic unit in our approach is a clause. The output (partial parse) from each layer can be accessed independently. We applied this approach to Hindi, a morphologically rich, free word order language, using MST Parser. We ran all our experiments on a part of the Hyderabad Dependency Treebank. The final results show an increase of 1.35% in unlabeled attachment and 1.36% in labeled attachment accuracies over the state-of-the-art data-driven Hindi parser.
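
The unlabeled and labeled attachment accuracies mentioned in this abstract are standard dependency parsing metrics. The following is a minimal sketch of how they are usually computed, not the authors' evaluation script; the function name `attachment_scores`, the `(head, label)` token encoding, and the toy labels are assumptions made for illustration.

```python
def attachment_scores(gold, predicted):
    """Return (unlabeled, labeled) attachment accuracy for one parse.

    Each parse is a list with one (head_index, label) pair per token.
    This encoding is an assumption for the sketch, not a treebank format.
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("parses must be non-empty and of equal length")
    total = len(gold)
    # Unlabeled: the predicted head matches the gold head.
    head_hits = sum(1 for (gh, _), (ph, _) in zip(gold, predicted) if gh == ph)
    # Labeled: both the head and the dependency label match.
    full_hits = sum(1 for g, p in zip(gold, predicted) if g == p)
    return head_hits / total, full_hits / total


# Toy four-token sentence with hypothetical labels; head 0 is the artificial root.
gold = [(2, "subj"), (0, "root"), (2, "obj"), (3, "mod")]
pred = [(2, "subj"), (0, "root"), (2, "mod"), (2, "mod")]

uas, las = attachment_scores(gold, pred)
print(uas, las)  # 0.75 0.5
```

In the toy parse, the third token has the right head but the wrong label, so it counts toward the unlabeled score only, while the fourth token has the wrong head and counts toward neither.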

international conference on computational linguistics | 2011

Identification of conjunct verbs in Hindi and its effect on parsing accuracy

Rafiya Begum; Karan Jindal; Ashish Jain; Samar Husain; Dipti Misra Sharma

This paper introduces work on the identification of conjunct verbs in Hindi. The paper first investigates which noun-verb combinations form a conjunct verb in Hindi, using a set of linguistic diagnostics. We then examine which of these diagnostics can be used as features in a MaxEnt-based automatic identification tool. Finally, we use this tool to incorporate certain features into a graph-based dependency parser and show an improvement over the previous best Hindi parsing accuracy.

international conference on computational linguistics | 2010

Issues in analyzing Telugu sentences towards building a Telugu treebank

Chaitanya Vempaty; Viswanatha Naidu; Samar Husain; Ravi Kiran; Lakshmi Bai; Dipti Misra Sharma; Rajeev Sangal

This paper describes an effort towards building a Telugu Dependency Treebank. We discuss the basic framework and the issues we encountered while annotating. A total of 1487 sentences have been annotated in the Paninian framework. We also discuss how some of the annotation decisions would affect the development of a parser for Telugu.

meeting of the association for computational linguistics | 2007

Simple Preposition Correspondence: A Problem in English to Indian Language Machine Translation

Samar Husain; Dipti Misra Sharma; Manohar Reddy

The paper describes an approach to automatically selecting the appropriate Indian language lexical correspondence of an English simple preposition. The task is described from a Machine Translation (MT) perspective. We use the properties of the head and complement of the preposition to select the appropriate sense in the target language. We then show that the results obtained from this approach are promising.

international conference natural language processing | 2008

A Graph Based Method for Building Multilingual Weakly Supervised Dependency Parsers

Jagadeesh Gorla; Anil Kumar Singh; Rajeev Sangal; Karthik Gali; Samar Husain; Sriram Venkatapathy

The structure of a sentence can be seen as a spanning tree in a linguistically augmented graph of syntactic nodes. This paper presents an approach for unlabeled dependency parsing based on this view. The first step involves marking the chunks and the chunk heads of a given sentence and then identifying the intra-chunk dependency relations. The second step involves learning to identify the inter-chunk dependency relations. For this, we use an initialization technique based on a measure we call Normalized Conditional Mutual Information (NCMI), in addition to a few linguistic constraints. We present the results for Hindi. We have achieved a precision of 80.83% for sentences shorter than 10 words and 66.71% overall. This is significantly better than the baseline in which random initialization is used.
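
The spanning-tree view of sentence structure in this abstract can be illustrated with a small sketch. It selects a maximum spanning tree over a complete undirected graph with Prim's algorithm. The association scores are invented for the example and are not NCMI values, the nodes are toy stand-ins for chunk heads, and the paper's learning procedure and linguistic constraints are not reproduced.

```python
def maximum_spanning_tree(nodes, score):
    """Grow a maximum spanning tree from nodes[0] with Prim's algorithm.

    `score` maps frozenset({a, b}) to an association weight for every
    pair of nodes. Returns edges as (node_in_tree, newly_attached_node).
    """
    in_tree = {nodes[0]}
    edges = []
    while len(in_tree) < len(nodes):
        # Pick the highest-scoring edge that leaves the current tree.
        best = max(
            ((a, b) for a in in_tree for b in nodes if b not in in_tree),
            key=lambda edge: score[frozenset(edge)],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Hypothetical pairwise scores between toy nodes (not NCMI values).
nodes = ["ate", "Ram", "mango", "ripe"]
pair_scores = {
    ("ate", "Ram"): 0.9,
    ("ate", "mango"): 0.8,
    ("mango", "ripe"): 0.7,
    ("ate", "ripe"): 0.2,
    ("Ram", "mango"): 0.1,
    ("Ram", "ripe"): 0.05,
}
score = {frozenset(pair): weight for pair, weight in pair_scores.items()}

tree = maximum_spanning_tree(nodes, score)
print(tree)  # [('ate', 'Ram'), ('ate', 'mango'), ('mango', 'ripe')]
```

Because the tree is grown from the first node, each returned edge can be read as a head followed by its dependent when that node is taken as the root.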

international joint conference on natural language processing | 2008