Publication


Featured research published by Meenakshi Mishra.


IEEE Transactions on Knowledge and Data Engineering | 2012

Knowledge Transfer with Low-Quality Data: A Feature Extraction Issue

Brian Quanz; Jun Huan; Meenakshi Mishra

Effectively utilizing readily available auxiliary data to improve predictive performance on new modeling tasks is a key problem in data mining. In this research, the goal is to transfer knowledge between sources of data, particularly when ground-truth information for the new modeling task is scarce or expensive to collect, so that leveraging auxiliary sources of data becomes a necessity. Toward seamless knowledge transfer among tasks, effective representation of the data is a critical but not yet fully explored research area for the data engineer and data miner. Here, we present a technique based on the idea of sparse coding, which essentially attempts to find an embedding for the data by assigning feature values based on subspace cluster membership. We modify the idea of sparse coding by focusing on the identification of shared clusters between data sets when source and target data may have different distributions. In our paper, we point out cases where a direct application of sparse coding will lead to a failure of knowledge transfer. We then present the details of our extension to sparse coding, which incorporates distribution distance estimates for the embedded data, and show that the proposed algorithm can overcome the shortcomings of the sparse coding algorithm on synthetic data and achieve improved predictive performance on a real-world chemical toxicity transfer learning task.
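As a rough illustration of the baseline the abstract builds on, the sketch below computes a plain L1 sparse code for one data point against a fixed dictionary using iterative soft-thresholding (ISTA). This is standard sparse coding only; it does not include the distribution distance term the paper adds, and the function names, the objective scaling, and the default parameters are assumptions made for this example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step for the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(x, dictionary, lam=0.1, n_iter=200):
    """Approximately minimize 0.5 * ||x - sum_k a_k * d_k||^2 + lam * ||a||_1.

    x          : list of floats (one data point)
    dictionary : list of atoms d_k, each a list of floats with len(x) entries
    Returns the list of coefficients a_k (hypothetical helper, not the paper's method).
    """
    n_atoms = len(dictionary)
    dim = len(x)
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so its inverse is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for atom in dictionary for v in atom)
    codes = [0.0] * n_atoms
    for _ in range(n_iter):
        recon = [sum(codes[k] * dictionary[k][i] for k in range(n_atoms))
                 for i in range(dim)]
        resid = [r - xi for r, xi in zip(recon, x)]
        grad = [sum(dictionary[k][i] * resid[i] for i in range(dim))
                for k in range(n_atoms)]
        codes = [soft_threshold(c - step * g, step * lam)
                 for c, g in zip(codes, grad)]
    return codes


# Toy usage: with an identity dictionary, small entries of x are zeroed out.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(sparse_code([3.0, 0.2, -1.5], identity, lam=0.5))
```

The nonzero coefficients indicate which dictionary atoms a point is assigned to, which is the sense in which the embedding reflects cluster membership.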


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2013

Text Categorization of Biomedical Data Sets Using Graph Kernels and a Controlled Vocabulary

Said Bleik; Meenakshi Mishra; Jun Huan; Min Song

Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph kernels to classify those articles into high-level categories. In our representation, common biomedical concepts and semantic relationships are identified with the help of an existing ontology and are used to build a rich graph structure that provides a consistent feature set and preserves additional semantic information that could improve a classifier's performance. We attempt to classify the graphs using both a set-based graph kernel that is capable of dealing with the disconnected nature of the graphs and a simple linear kernel. Finally, we report the results comparing the classification performance of the kernel classifiers to common text-based classifiers.
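To make the idea of a set-based kernel concrete, here is a minimal sketch that compares two concept graphs by counting shared node labels and shared labeled edges. It is not the kernel used in the paper; the graph encoding (a pair of Python sets), the function names, and the toy concepts are assumptions chosen for illustration. Because it only intersects sets and never walks paths, it is unaffected by whether a graph is connected.

```python
from math import sqrt


def set_kernel(graph_a, graph_b):
    """Count concepts and relations shared by two graphs.

    Each graph is a (nodes, edges) pair: nodes is a set of concept labels,
    edges is a set of (source, target) label tuples.
    """
    nodes_a, edges_a = graph_a
    nodes_b, edges_b = graph_b
    return len(nodes_a & nodes_b) + len(edges_a & edges_b)


def normalized_set_kernel(graph_a, graph_b):
    """Scale the kernel to [0, 1] so large documents do not dominate."""
    denom = sqrt(set_kernel(graph_a, graph_a) * set_kernel(graph_b, graph_b))
    return set_kernel(graph_a, graph_b) / denom if denom else 0.0


# Hypothetical document graphs built from controlled-vocabulary concepts.
doc_a = ({"gene", "protein", "disease"}, {("gene", "protein")})
doc_b = ({"gene", "disease", "drug"}, {("drug", "disease")})
print(normalized_set_kernel(doc_a, doc_b))
```

A kernel matrix built this way could be passed to any kernel classifier that accepts precomputed similarities.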


International Conference on Data Engineering | 2011

Knowledge transfer with low-quality data: A feature extraction issue

Brian Quanz; Jun Huan; Meenakshi Mishra

Effectively utilizing readily available auxiliary data to improve predictive performance on new modeling tasks is a key problem in data mining. In this research, the goal is to transfer knowledge between sources of data, particularly when ground-truth information for the new modeling task is scarce or expensive to collect, so that leveraging auxiliary sources of data becomes a necessity. Toward seamless knowledge transfer among tasks, effective representation of the data is a critical but not yet fully explored research area for the data engineer and data miner. Here, we present a technique based on the idea of sparse coding, which essentially attempts to find an embedding for the data by assigning feature values based on subspace cluster membership. We modify the idea of sparse coding by focusing on the identification of shared clusters between data sets when source and target data may have different distributions. In our paper, we point out cases where a direct application of sparse coding will lead to a failure of knowledge transfer. We then present the details of our extension to sparse coding, which incorporates distribution distance estimates for the embedded data, and show that the proposed algorithm can overcome the shortcomings of the sparse coding algorithm on synthetic data and achieve improved predictive performance on a real-world chemical toxicity transfer learning task.


Data Mining in Bioinformatics | 2012

Biomedical text categorization with concept graph representations using a controlled vocabulary

Meenakshi Mishra; Jun Huan; Said Bleik; Min Song

Recent work using graph representations for text categorization has shown improved performance over conventional bag-of-words representations of text documents. In this paper, we investigate a graph representation of texts for the task of text categorization. In our representation, we identify high-level concepts extracted from a database of controlled biomedical terms and build a rich graph structure that contains important concepts and relationships. This procedure ensures that graphs are described with a regular vocabulary, which makes them easier to compare. We then classify document graphs by applying a set-based graph kernel that is intuitively sensible and able to deal with the disconnectedness of the constructed concept graphs. We compare this approach to standard approaches using non-graph, text-based features. We also compare different kernels to see which performs better.


Bioinformatics and Biomedicine | 2010

Computational prediction of toxicity

Meenakshi Mishra; Hongliang Fei; Jun Huan

As the number of new chemicals developed and used keeps growing every year, obtaining the toxicity profile of each chemical becomes a daunting challenge. To close this information gap, the EPA suggested that certain in vitro assays and computational methods, which predict toxicity-related information in much less time and at much lower cost than traditional in vivo methods, may be used. In this paper, we apply computational techniques to results from certain in vitro assays on 309 chemicals (whose toxicity profiles are readily available), along with the molecular descriptors and other computed physical-chemical properties of the chemicals, to predict the toxicity caused by a chemical at a particular endpoint. The dataset is available online from the EPA TOXCAST group. We show that Random Forest and Naïve Bayes perform well on this dataset. We also show that using small and related trees in the random forest helps to further improve performance.


Bioinformatics and Biomedicine | 2011

Bayesian Classifiers for Chemical Toxicity Prediction

Meenakshi Mishra; Brian Potetz; Jun Huan

A major concern across the globe is the growing number of new chemicals that are brought into use on a regular basis without any knowledge of their toxic behavior. The challenge here is that the number of chemicals is growing fast, while the traditional standards for toxicity testing involve a slow and expensive process of in vivo animal testing. Hence, a number of attempts are being made to find alternative methods of toxicity testing. In this paper, we explore Bayesian classifiers and show that if we approximate the posterior in the Bayesian classifier with specially crafted basis functions, we can improve performance. We have tested our methods using data sets from the Environmental Protection Agency (EPA). Our experimental study demonstrated the utility of the advanced Bayesian classification approach.


Frontiers in Environmental Science | 2016

Predictive Toxicology: Modeling Chemical Induced Toxicological Response Combining Circular Fingerprints with Random Forest and Support Vector Machine

Alexios Koutsoukas; Joseph St.Amand; Meenakshi Mishra; Jun Huan

Modern drug discovery and toxicological research are under pressure, as the cost of developing and testing new chemicals for potential toxicological risk is rising. Extensive evaluation of chemical products for potential adverse effects is a challenging task, due to the large number of chemicals and the possible hazardous effects on human health. Safety regulatory agencies around the world are dealing with two major challenges: first, the growing number of chemicals introduced every year in household products and medicines that need to be tested, and second, the need to protect public welfare. Hence, alternative and more efficient toxicological risk assessment methods are in high demand. The Toxicology in the 21st Century (Tox21) consortium, a collaborative effort, was formed to develop and investigate alternative assessment methods. A collection of 10,000 compounds composed of environmental chemicals and approved drugs was screened for interference in biochemical pathways and released for crowdsourced data analysis. The physicochemical space covered by the Tox21 library was explored, measured by molecular weight (MW) and the octanol/water partition coefficient (cLogP). On average, chemical structures had an MW of 272.6 Daltons and a cLogP of 2.476. Next, relationships between assays were examined based on the compounds' activity profiles across the assays, using the Pearson correlation coefficient r. A cluster was observed between the Androgen and Estrogen Receptors and their respective ligand binding domains, indicating the presence of cross-talk among the receptors. The highest correlations observed were between NR.AR and NR.AR_LBD (r=0.66) and between NR.ER and NR.ER_LBD (r=0.5). Our approach to modeling the Tox21 data consisted of using circular molecular fingerprints combined with Random Forest and Support Vector Machine, modeling each assay independently. In all 12 sub-challenges, our modeling approach achieved a ROC-AUC equal to or higher than 0.7, showing strong overall performance. The best performance was achieved in sub-challenges NR.AR_LBD, NR.ER_LBD and NR.PPAR_gamma, where ROC-AUC values of 0.756, 0.790 and 0.803 were achieved, respectively. These results show that computational methods based on machine learning techniques are well suited to support, and play a critical role in, toxicological research.
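The assay-to-assay comparison described in this abstract relies on the Pearson correlation coefficient between activity profiles. The sketch below shows that computation on two made-up binary activity vectors (1 = active, 0 = inactive); the data and the function name are hypothetical and are not taken from the Tox21 set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Hypothetical activity of six compounds in two assays.
assay_a = [1, 0, 1, 1, 0, 0]
assay_b = [1, 0, 1, 0, 0, 0]
print(round(pearson_r(assay_a, assay_b), 3))
```

Computing this value for every pair of assays gives the correlation matrix in which related assays would be expected to group together.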


International Conference on Data Mining | 2013

Multitask Learning with Feature Selection for Groups of Related Tasks

Meenakshi Mishra; Jun Huan

Multitask learning has been shown to improve generalization performance given a set of related tasks. Most multitask learning algorithms assume that all tasks are related. However, if not all the tasks are related, negative transfer of information occurs among the tasks, and the performance of traditional multitask learning algorithms worsens. Thus, we design an algorithm that simultaneously groups the related tasks and trains only the related tasks together. There are different approaches to training the related tasks in multitask learning, depending on which information is shared across the tasks. These approaches either assume that the parameters of the tasks are situated close together, or assume that there is a common underlying latent space in the features of the tasks. Most multitask learning algorithms use either regularization methods or matrix-variate priors. In our algorithm, the related tasks are tied together by a set of common features selected by each task. Thus, to train the related tasks together, we use a spike-and-slab prior to select a common set of features for the related tasks, and a mixture-of-Gaussians prior to select the set of related tasks. For validation, the developed algorithm is tested on toxicity prediction and handwritten digit recognition data sets. The results show a significant improvement over multitask learning with feature selection for larger numbers of tasks. Further, the developed algorithm is compared against another state-of-the-art algorithm that similarly groups related tasks together, and is shown to be more accurate.


Conference on Information and Knowledge Management | 2015

Learning Task Grouping using Supervised Task Space Partitioning in Lifelong Multitask Learning

Meenakshi Mishra; Jun Huan

Lifelong multitask learning is a multitask learning framework in which a learning agent faces the tasks to be learnt in an online manner. The lifelong multitask learning framework may be applied to a variety of applications such as image annotation, robotics, and automated machines, and hence may prove to be a highly promising direction for further investigation. However, the lifelong learning framework comes with its own challenges. The biggest challenge is that the characteristics of the future tasks the learning agent might encounter are entirely unknown. If all the tasks are assumed to be related, there is a risk of training from unrelated tasks, resulting in negative transfer of information. To overcome this problem, both batch and online multitask learning algorithms learn task relationships. However, due to the unknown nature of the future tasks, learning the task relationships is also difficult in lifelong multitask learning. In this paper, we propose learning functions to model the task relationships, as this is computationally cheaper in an online setting. More specifically, we learn partition functions in the task space to divide the tasks into clusters. Our major contribution is a global formulation to learn both the task partitions and the parameters. We provide a supervised learning framework to estimate both the partition function and the model. The proposed method has been implemented and compared against other leading lifelong learning algorithms using several real-world datasets, and we show that it achieves superior performance.


Archive | 2013

Predicting Toxicity of Chemicals Computationally

Meenakshi Mishra; Jun Huan; Brian Potetz

Collaboration


Top Co-Authors

Jun Huan

University of Kansas

Said Bleik

New Jersey Institute of Technology
