Rajeev Sangal
International Institute of Information Technology, Hyderabad
Publication
Featured research published by Rajeev Sangal.
meeting of the association for computational linguistics | 1993
Akshar Bharati; Rajeev Sangal
There is a need to develop a suitable computational grammar formalism for free word order languages for two reasons: first, a suitably designed formalism is likely to be more efficient; second, such a formalism is also likely to be linguistically more elegant and satisfying. In this paper, we describe such a formalism, called the Paninian framework, that has been successfully applied to Indian languages. This paper shows that the Paninian framework applied to modern Indian languages gives an elegant account of the relation between surface form (vibhakti) and semantic (karaka) roles. The mapping is elegant and compact. The same basic account also explains active-passives and complex sentences. This suggests that the solution is not just ad hoc but has a deeper underlying unity. A constraint-based parser is described for the framework. Because of the nature of the constraints, the constraint problem reduces to a bipartite graph matching problem, for which efficient solutions are known. It is interesting to observe that such a parser (designed for free word order languages) compares well in asymptotic time complexity with parsers for context-free grammars (CFGs), which are basically designed for positional languages.
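The reduction to bipartite matching mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' parser: the constraint table `ALLOWED`, the noun list `NOUNS`, and the function name are hypothetical, and the textbook augmenting-path method stands in for whatever matching algorithm the paper uses. Each karaka role demanded by the verb is matched to a distinct noun whose vibhakti (surface marker) the role permits.

```python
# Toy constraint table (illustrative only, not taken from the paper):
# which vibhakti markers each karaka role accepts. "0" means no marker.
ALLOWED = {"karta": {"ne", "0"}, "karma": {"ko", "0"}}

# Nouns of the toy sentence "raam ne phal khaayaa" with their vibhakti.
NOUNS = {"phal": "0", "raam": "ne"}


def max_bipartite_matching(candidates):
    """Return a maximum matching role -> noun using augmenting paths.

    candidates maps each role to the list of nouns allowed to fill it.
    """
    noun_to_role = {}

    def try_assign(role, visited):
        for noun in candidates[role]:
            if noun in visited:
                continue
            visited.add(noun)
            # Take the noun if it is free, or if the role currently
            # holding it can be re-assigned to some other noun.
            if noun not in noun_to_role or try_assign(noun_to_role[noun], visited):
                noun_to_role[noun] = role
                return True
        return False

    for role in candidates:
        try_assign(role, set())
    return {role: noun for noun, role in noun_to_role.items()}


# Build the bipartite graph: an edge wherever the noun's marker is allowed.
candidates = {
    role: [noun for noun, marker in NOUNS.items() if marker in allowed]
    for role, allowed in ALLOWED.items()
}
assignment = max_bipartite_matching(candidates)
```

In this toy input, "phal" is first taken as karta and then re-assigned to karma once "raam" is found to be the only other filler for karta, which is the augmenting step that makes the matching maximal.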
international workshop conference on parsing technologies | 2009
Akshar Bharati; Samar Husain; Dipti Misra; Rajeev Sangal
The paper describes the overall design of a new two-stage, constraint-based hybrid approach to dependency parsing. We define the two stages and show how different grammatical constructs are parsed at appropriate stages. This division leads to selective identification and resolution of specific dependency relations at the two stages. Furthermore, we show how the use of hard constraints and soft constraints helps us build an efficient and robust hybrid parser. Finally, we evaluate the implemented parser on Hindi and compare the results with those of two data-driven dependency parsers.
international conference on asian language processing | 2009
Samar Husain; Phani Gadde; Bharat Ram Ambati; Dipti Misra Sharma; Rajeev Sangal
In this paper, we propose a modular cascaded approach to data-driven dependency parsing. Each module or layer leading to the complete parse produces a linguistically valid partial parse. We do this by introducing an artificial root node in the dependency structure of a sentence and by catering to distinct dependency label sets that reflect the function of the set-internal labels vis-à-vis a distinct and identifiable linguistic unit, at different layers. The linguistic unit in our approach is a clause. Output (partial parse) from each layer can be accessed independently. We applied this approach to Hindi, a morphologically rich free word order language, using MST Parser. We did all our experiments on a part of the Hyderabad Dependency Treebank. The final results show an increase of 1.35% in unlabeled attachment and 1.36% in labeled attachment accuracies over the state-of-the-art data-driven Hindi parser.
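For readers unfamiliar with the two reported metrics, the sketch below computes unlabeled and labeled attachment accuracy under their usual definitions: the share of tokens with the correct head, and the share with both the correct head and the correct label. It assumes the paper uses these standard definitions. The function name, the token data, and the labels in the example are illustrative and are not results from the paper.

```python
def attachment_scores(gold, predicted):
    """Return (unlabeled, labeled) attachment accuracy as fractions.

    gold and predicted are equal-length lists with one (head_index, label)
    pair per token; head_index 0 denotes the artificial root.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equal in length")
    total = len(gold)
    # Unlabeled: only the head has to match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # Labeled: head and dependency label both have to match.
    full_hits = sum(g == p for g, p in zip(gold, predicted))
    return head_hits / total, full_hits / total


# Made-up four-token example; the labels are placeholders.
gold = [(2, "k1"), (0, "root"), (2, "k2"), (2, "k7")]
predicted = [(2, "k1"), (0, "root"), (2, "k1"), (1, "k7")]
uas, las = attachment_scores(gold, predicted)
```

Here three of the four heads are correct and two of the four head-label pairs are correct, so the unlabeled score is 0.75 and the labeled score is 0.5.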
international conference on asian language processing | 2012
Abhijeet Gupta; Arjun R. Akula; Deepak Kumar Malladi; Puneeth Kukkadapu; Vinay Ainavolu; Rajeev Sangal
This paper presents a novel approach to building a natural language interface to databases (NLIDB) based on Computational Paninian Grammar (CPG). It uses two distinct stages of processing, namely, syntactic processing followed by semantic processing. Syntactic processing makes the processing more general and robust. CPG is a dependency framework in which the analysis is in terms of syntactico-semantic relations. The closeness of these relations makes semantic processing easier and more accurate. It also makes the systems more portable.
international conference on computational linguistics | 2010
Chaitanya Vempaty; Viswanatha Naidu; Samar Husain; Ravi Kiran; Lakshmi Bai; Dipti Misra Sharma; Rajeev Sangal
This paper describes an effort towards building a Telugu Dependency Treebank. We discuss the basic framework and the issues we encountered while annotating. A total of 1487 sentences have been annotated in the Paninian framework. We also discuss how some of the annotation decisions would affect the development of a parser for Telugu.
meeting of the association for computational linguistics | 2000
Akshar Bharati; Vineet Chaitanya; Rajeev Sangal
Computational linguistics activities in India are being carried out at many institutions. The activities are centred around development of machine translation systems and lexical resources.
international conference natural language processing | 2010
Pawan Kumar; Arun Kumar Rathaur; Rashid Ahmad; Mukul K. Sinha; Rajeev Sangal
The paper presents a software integration, testing and visualization tool, called Dashboard, which is based on a pipelined backboard architecture for a family of natural language processing (NLP) applications. The Dashboard helps in testing a module in isolation, facilitating the training and tuning of a module, integrating and testing a set of heterogeneous modules, and building and testing a complete integrated system. It is also equipped with a user-friendly visualization tool to build, test, and integrate a system (or a subsystem) and to view its component-wise performance and step-wise processing. The Dashboard is being successfully used by a consortium of eleven academic institutions to develop a suite of bi-directional machine translation (MT) systems for nine pairs of Indic languages, and six MT systems have already been deployed on the web. The MT systems are being developed by reusing or re-engineering NLP modules previously developed by different institutions in different programming languages, using Dashboard as the testing and integration tool. The paper also discusses the experience of developing MT products in consortium mode, using Dashboard as the integration and testing platform, and proposed enhancements to it.
international conference on computational linguistics | 2014
Akshar Bharati; Rajeev Sangal; Dipti Misra Sharma; Anil Kumar Singh
We describe a representation scheme and an analysis engine using that scheme, both of which have been used to develop infrastructure for HLT. The Shakti Standard Format is a readable and robust representation scheme for analysis frameworks and other purposes. The representation is highly extensible. This representation scheme, based on the blackboard architectural model, allows a very wide variety of linguistic and non-linguistic information to be stored in one place and operated upon by any number of processing modules. We show how it has been successfully used for building machine translation systems for several language pairs using the same architecture. It has also been used for creation of language resources such as treebanks and for different kinds of annotation interfaces. There is even a query language designed for this representation. Easily wrappable into XML, it can be used equally well for distributed computing.
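The blackboard model named in this abstract can be sketched generically: a single shared store that independent modules read from and write to in turn. The sketch below does not reproduce the Shakti Standard Format or its query language. The dictionary layout, module names, and `run_pipeline` helper are all hypothetical and serve only to show the architectural idea.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for the shared representation: a plain dictionary.
Blackboard = Dict[str, Any]
Module = Callable[[Blackboard], None]


def tokenizer(board: Blackboard) -> None:
    """Read the raw text and post a token list to the blackboard."""
    board["tokens"] = board["text"].split()


def token_counter(board: Blackboard) -> None:
    """Read the tokens posted by an earlier module and add a count."""
    board["token_count"] = len(board["tokens"])


def run_pipeline(board: Blackboard, modules: List[Module]) -> Blackboard:
    """Run each module in order; every module sees everything posted so far."""
    for module in modules:
        module(board)
    return board


board = run_pipeline({"text": "raam ne phal khaayaa"}, [tokenizer, token_counter])
```

Because modules communicate only through the shared store, a module can be added, replaced, or tested in isolation as long as it reads and writes the agreed keys.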
service oriented software engineering | 2013
Pawan Kumar; Rashid Ahmad; Banshi Dhar Chaudhary; Rajeev Sangal
A machine translation (MT) system is a complex natural language processing (NLP) system composed of a large number of heterogeneous modules. Deploying such a complex system, even on a stand-alone system, is a cumbersome, knowledge-intensive and time-consuming task, taking hours to load, configure and run the system. As an MT system goes through frequent and regular updates, mainly to improve its accuracy and performance, the cumbersome task of deployment has to be repeated on the release of each new version. Further, when such a system needs to be deployed on a cloud infrastructure, mainly to facilitate auto-scaling of computational resources under varying load conditions, deployment becomes even more complicated and time-consuming. This paper proposes that every software version of a complex NLP application such as an MT system should be built and released as a virtual appliance that can be deployed with very little setup time and with ease, even by a common user. It discusses the experiments performed to build the MT system into a virtual appliance, for stand-alone system deployment as well as for cloud deployment, and reports the deployment time measurements in both scenarios. Deployment of the virtual MT appliance took 130 seconds on a stand-alone system, and its deployment on a large number of virtual machines in the cloud environment took 150 seconds on average, in contrast to the several hours previously taken to deploy MT applications.
international conference on conceptual structures | 2013
Rajeev Sangal; Soma Paul; P. Kiran Mayee
We show how purpose can be used as a central guiding principle for organizing knowledge about artifacts. It allows the actions in which the artifact participates to be related naturally to other objects. Similarly, the structure or parts of the artifact can also be related to the actions.