Publication


Featured research published by Rohit J. Kate.


Artificial Intelligence in Medicine | 2005

Comparative experiments on learning information extractors for proteins and their interactions

Razvan C. Bunescu; Ruifang Ge; Rohit J. Kate; Edward M. Marcotte; Raymond J. Mooney; Arun K. Ramani; Yuk Wah Wong

OBJECTIVE: Automatically extracting information from biomedical text holds the promise of easily consolidating large amounts of biological knowledge in computer-accessible form. This strategy is particularly attractive for extracting data relevant to genes of the human genome from the 11 million abstracts in Medline. However, extraction efforts have been frustrated by the lack of conventions for describing human genes and proteins. We have developed and evaluated a variety of learned information extraction systems for identifying human protein names in Medline abstracts and subsequently extracting information on interactions between the proteins.
METHODS AND MATERIAL: We used a variety of machine learning methods to automatically develop information extraction systems for extracting information on gene/protein name, function and interactions from Medline abstracts. We present cross-validated results on identifying human proteins and their interactions by training and testing on a set of approximately 1000 manually-annotated Medline abstracts that discuss human genes/proteins.
RESULTS: We demonstrate that machine learning approaches using support vector machines and maximum entropy are able to identify human proteins with higher accuracy than several previous approaches. We also demonstrate that various rule induction methods are able to identify protein interactions with higher precision than manually-developed rules.
CONCLUSION: Our results show that it is promising to use machine learning to automatically build systems for extracting information from biomedical text. The results also give a broad picture of the relative strengths of a wide variety of methods when tested on a reasonably large human-annotated corpus.


Meeting of the Association for Computational Linguistics | 2006

Using String-Kernels for Learning Semantic Parsers

Rohit J. Kate; Raymond J. Mooney

We present a new approach for mapping natural language sentences to their formal meaning representations using string-kernel-based classifiers. Our system learns these classifiers for every production in the formal language grammar. Meaning representations for novel natural language sentences are obtained by finding the most probable semantic parse using these string classifiers. Our experiments on two real-world data sets show that this approach compares favorably to other existing systems and is particularly robust to noise.


Data Mining and Knowledge Discovery | 2016

Using dynamic time warping distances as features for improved time series classification

Rohit J. Kate

Dynamic time warping (DTW) has proven itself to be an exceptionally strong distance measure for time series. DTW in combination with one-nearest neighbor, one of the simplest machine learning methods, has been difficult to convincingly outperform on the time series classification task. In this paper, we present a simple technique for time series classification that exploits DTW’s strength on this task. But instead of directly using DTW as a distance measure to find nearest neighbors, the technique uses DTW to create new features which are then given to a standard machine learning method. We experimentally show that our technique improves over one-nearest neighbor DTW on 31 out of 47 UCR time series benchmark datasets. In addition, this method can be easily extended to be used in combination with other methods. In particular, we show that when combined with the symbolic aggregate approximation (SAX) method, it improves over it on 37 out of 47 UCR datasets. Thus the proposed method also provides a mechanism to combine distance-based methods like DTW with feature-based methods like SAX. We also show that combining the proposed classifiers through ensembles further improves the performance on time series classification.
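
The idea of using DTW distances as features rather than directly for nearest-neighbor lookup can be illustrated with a minimal sketch. It assumes univariate series, an absolute-difference local cost, and no warping window; the function names `dtw_distance` and `dtw_features` and the toy data are illustrative and not taken from the paper. The resulting feature vector would then be handed to any standard classifier.

```python
from math import inf

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption: the local cost is the absolute difference of two values.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def dtw_features(series, train_series):
    """Represent a series by its DTW distance to every training series."""
    return [dtw_distance(series, t) for t in train_series]

# Toy training set (hypothetical data, for illustration only).
train = [[0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0], [2, 2, 2, 2]]
features = dtw_features([0, 1, 1, 2, 1, 0], train)
print(features)
```

Each series becomes a vector with one entry per training series, so a feature-based learner can weight the reference series differently instead of relying only on the single nearest one.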


North American Chapter of the Association for Computational Linguistics | 2007

Semi-Supervised Learning for Semantic Parsing using Support Vector Machines

Rohit J. Kate; Raymond J. Mooney

We present a method for utilizing unannotated sentences to improve a semantic parser which maps natural language (NL) sentences into their formal meaning representations (MRs). Given NL sentences annotated with their MRs, the initial supervised semantic parser learns the mapping by training Support Vector Machine (SVM) classifiers for every production in the MR grammar. Our new method applies the learned semantic parser to the unannotated sentences and collects unlabeled examples which are then used to retrain the classifiers using a variant of transductive SVMs. Experimental results show the improvements obtained over the purely supervised parser, particularly when the annotated training set is small.


International Conference on Computational Linguistics | 2014

UWM: Disorder Mention Extraction from Clinical Text Using CRFs and Normalization Using Learned Edit Distance Patterns

Omid Ghiasvand; Rohit J. Kate

This paper describes Team UWM’s system for Task 7 of SemEval 2014, which performs disorder mention extraction and normalization from clinical text. For the disorder mention extraction (Task A), the system was trained using Conditional Random Fields with features based on words, their POS tags and semantic types, as well as features based on MetaMap matches. For the disorder mention normalization (Task B), variations of disorder mentions were considered whenever exact matches were not found in the training data or in the UMLS. Suitable types of variations for disorder mentions were automatically learned using a new method based on edit distance patterns. Among nineteen participating teams, UWM ranked third in Task A with 0.755 strict F-measure and second in Task B with 0.66 strict accuracy.
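
The abstract does not say how the edit distance patterns are derived, so the following is only a sketch of one plausible building block: a standard Levenshtein alignment that also reports which edit operations turn one string into another. The function name `edit_operations` and the example strings are hypothetical, and this is not the paper's learning method.

```python
def edit_operations(source, target):
    """Return (Levenshtein distance, list of edit operations).

    Each operation is a tuple (kind, source_item, target_item), where kind is
    "substitute", "delete" or "insert". Works on strings or token lists.
    """
    n, m = len(source), len(target)
    # dist[i][j] = edit distance between source[:i] and target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # delete
                             dist[i][j - 1] + 1,        # insert
                             dist[i - 1][j - 1] + sub)  # match or substitute

    # Walk back from the bottom-right cell to recover one optimal edit script.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1  # match: no operation recorded
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("substitute", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", source[i - 1], ""))
            i -= 1
        else:
            ops.append(("insert", "", target[j - 1]))
            j -= 1
    ops.reverse()
    return dist[n][m], ops

distance, ops = edit_operations("tumour", "tumor")
print(distance, ops)
```

Edit scripts collected over many pairs of a mention and its known normalized form could, under this assumption, be counted to find which variations recur often enough to be worth applying to unseen mentions.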


Physiological Measurement | 2016

Comparative evaluation of features and techniques for identifying activity type and estimating energy cost from accelerometer data

Rohit J. Kate; Ann M. Swartz; Whitney A. Welch; Scott J. Strath

Wearable accelerometers can be used to objectively assess physical activity. However, the accuracy of this assessment depends on the underlying method used to process the time series data obtained from accelerometers. Several methods have been proposed that use this data to identify the type of physical activity and estimate its energy cost. Most of the newer methods employ some machine learning technique along with suitable features to represent the time series data. This paper experimentally compares several of these techniques and features on a large dataset of 146 subjects doing eight different physical activities while wearing an accelerometer on the hip. Besides features based on statistics, distance-based features and simple discrete features taken straight from the time series were also evaluated. On the physical activity type identification task, the results show that using more features significantly improves results. The choice of machine learning technique was also found to be important. However, on the energy cost estimation task, the choice of features and machine learning technique was found to be less influential. On that task, separate energy cost estimation models trained specifically for each type of physical activity were found to be more accurate than a single model trained for all types of physical activities.
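
As an illustration of what features based on statistics might look like, the sketch below summarizes one window of accelerometer readings. The abstract does not list the actual features used, so the particular statistics, the function name `window_features`, and the sample values are assumptions.

```python
from statistics import mean, median, pstdev

def window_features(readings):
    """Summarize one window of accelerometer readings with simple statistics.

    Assumption: `readings` is a non-empty list of numeric samples from one axis.
    """
    return {
        "mean": mean(readings),
        "std": pstdev(readings),   # population standard deviation
        "median": median(readings),
        "min": min(readings),
        "max": max(readings),
    }

# Hypothetical window of five samples.
print(window_features([1, 2, 3, 4, 5]))
```

A dictionary like this, computed per window, could serve as the input row for either an activity-type classifier or an energy-cost regressor.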


Empirical Methods in Natural Language Processing | 2008

A Dependency-based Word Subsequence Kernel

Rohit J. Kate

This paper introduces a new kernel which computes similarity between two natural language sentences as the number of paths shared by their dependency trees. The paper gives a very efficient algorithm to compute it. This kernel is also an improvement over the word subsequence kernel because it only counts linguistically meaningful word subsequences which are based on word dependencies. It overcomes some of the difficulties encountered by syntactic tree kernels as well. Experimental results demonstrate the advantage of this kernel over word subsequence and syntactic tree kernels.
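
To make the notion of counting shared dependency paths concrete, here is a deliberately naive sketch. It is not the efficient algorithm the paper describes, and it makes simplifying assumptions that may differ from the paper: only downward paths from an ancestor to a descendant are enumerated, a single word counts as a path, and paths are compared by their word sequences. The tree encoding and the names `downward_paths` and `path_kernel` are hypothetical.

```python
from collections import Counter

def downward_paths(tree):
    """Count every ancestor-to-descendant word path in a dependency tree.

    `tree` maps node id -> (word, parent id), with parent id None for the root.
    """
    paths = Counter()
    for node in tree:
        words = []
        current = node
        while current is not None:
            word, parent = tree[current]
            words.append(word)
            # `words` runs descendant -> ancestor; store it ancestor -> descendant.
            paths[tuple(reversed(words))] += 1
            current = parent
    return paths

def path_kernel(tree_a, tree_b):
    """Number of pairs of identical word paths, one from each tree."""
    paths_a, paths_b = downward_paths(tree_a), downward_paths(tree_b)
    return sum(count * paths_b[path] for path, count in paths_a.items())

# "dogs chase cats" and "dogs chase mice", with the verb as the root.
tree_a = {0: ("chase", None), 1: ("dogs", 0), 2: ("cats", 0)}
tree_b = {0: ("chase", None), 1: ("dogs", 0), 2: ("mice", 0)}
print(path_kernel(tree_a, tree_b))
```

In this toy pair the shared paths are "chase", "dogs", and "chase → dogs"; a word sequence such as "dogs cats", which a plain word subsequence kernel would count, is not a dependency path and so contributes nothing.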


International Conference on Multimedia and Expo | 2001

Audio driven facial animation for audio-visual reality

Tanveer A. Faruquie; Ashish Kapoor; Rohit J. Kate; Nitendra Rajput; L.V. Subramaniam

In this paper, we demonstrate a morphing-based automated audio-driven facial animation system. Based on an incoming audio stream, a face image is animated with full lip synchronization and expression. An animation sequence using optical flow between visemes is constructed, given an incoming audio stream and still pictures of a face speaking different visemes. Rules are formulated based on coarticulation and the duration of a viseme to control the continuity in terms of shape and extent of lip opening. In addition, new viseme-expression combinations are synthesized to generate animations with new facial expressions. Finally, various applications of this system are discussed in the context of creating audio-visual reality.


Conference on Computational Natural Language Learning | 2008

Transforming Meaning Representation Grammars to Improve Semantic Parsing

Rohit J. Kate

A semantic parser learning system learns to map natural language sentences into their domain-specific formal meaning representations, but if the constructs of the meaning representation language do not correspond well with the natural language then the system may not learn a good semantic parser. This paper presents approaches for automatically transforming a meaning representation grammar (MRG) to make it conform better to the natural language semantics. It introduces grammar transformation operators and meaning representation macros which are applied in an error-driven manner to transform an MRG while training a semantic parser learning system. Experimental results show that the automatically transformed MRGs lead to better learned semantic parsers, which perform comparably to the semantic parsers learned using manually engineered MRGs.


International Conference on Machine Learning and Applications | 2011

Unsupervised Grammar Induction of Clinical Report Sublanguage

Rohit J. Kate

Clinical reports are written using a subset of natural language while employing many domain-specific terms. Such a language is also known as a sublanguage for a scientific or a technical domain. In this paper, we present a method which automatically induces a grammar for the sublanguage of a given genre of clinical reports from a corpus of reports with no annotations. The method first identifies the semantic classes of the clinical terms used in the reports, then it induces a grammar that is based on these semantic classes and part-of-speech tags. Experiments show that the induced grammar is able to parse novel sentences and obtains a reasonable accuracy.

Collaboration

Top Co-Authors

Raymond J. Mooney, University of Texas at Austin
Omid Ghiasvand, University of Wisconsin–Milwaukee
Ann M. Swartz, University of Wisconsin–Milwaukee
Arun K. Ramani, University of Texas at Austin
Edward M. Marcotte, University of Texas at Austin
Ruifang Ge, University of Texas at Austin
Scott J. Strath, University of Wisconsin–Milwaukee