Oksana Yakhnenko | Researchain

Publication

Featured research published by Oksana Yakhnenko.

knowledge discovery and data mining | 2008

Annotating images and image objects using a hierarchical Dirichlet process model

Oksana Yakhnenko; Vasant G. Honavar

Many applications call for learning to label individual objects in an image when the only information available to the learner is a dataset of images with their associated captions, i.e., words that describe the image content without specifically labeling the individual objects. We address this problem using a multi-modal hierarchical Dirichlet process model (MoM-HDP), a nonparametric Bayesian model that generalizes the multi-modal latent Dirichlet allocation model (MoM-LDA) used for similar problems in the past. We apply this model to predict labels of objects in images containing multiple objects. During training, the model has access to an unsegmented image and its caption, but not to the labels for each object in the image. The trained model is used to predict the label for each region of interest in a segmented image. MoM-HDP generalizes MoM-LDA in that it allows the number of components of the mixture model to adapt to the data. The model parameters are efficiently estimated using variational inference. Our experiments show that MoM-HDP performs as well as or better than the MoM-LDA model, regardless of the choice of the number of clusters in the MoM-LDA model.

international conference on data mining | 2005

Discriminatively trained Markov model for sequence classification

Oksana Yakhnenko; Adrian Silvescu; Vasant G. Honavar

In this paper, we propose a discriminative counterpart of the directed Markov models of order k - 1, or MM(k - 1), for sequence classification. MM(k - 1) models capture dependencies among neighboring elements of a sequence. The parameters of the classifiers are initialized based on the maximum likelihood estimates for their generative counterparts. We derive gradient-based update equations for the parameters of the sequence classifiers in order to maximize the conditional likelihood function. Results of our experiments with data sets drawn from biological sequence classification (specifically protein function and subcellular localization) and text classification applications show that the discriminatively trained sequence classifiers outperform their generative counterparts, confirming the benefits of discriminative training when the primary objective is classification. Our experiments also show that the discriminatively trained MM(k - 1) sequence classifiers are competitive with the computationally much more expensive Support Vector Machines trained using k-gram representations of sequences.
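
As a rough illustration of the generative MM(k - 1) baseline that this abstract starts from, the sketch below fits one Markov model per class from k-gram counts and classifies a sequence by the highest log-likelihood. The function names, the add-alpha smoothing, the equal class priors, and the choice to skip the first k - 1 symbols are all assumptions made for illustration; none of them is taken from the paper, and the discriminative gradient updates the abstract describes are not implemented here.

```python
import math
from collections import Counter, defaultdict


def train_markov(sequences, k, alphabet, alpha=1.0):
    """Fit P(symbol | previous k-1 symbols) from k-gram counts.

    alpha is add-alpha smoothing, an assumption made here so that
    unseen k-grams do not get zero probability.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k - 1, len(seq)):
            counts[seq[i - k + 1:i]][seq[i]] += 1

    def log_prob(context, symbol):
        ctx = counts.get(context, Counter())
        total = sum(ctx.values()) + alpha * len(alphabet)
        return math.log((ctx[symbol] + alpha) / total)

    return log_prob


def score(log_prob, seq, k):
    """Log-likelihood of seq, ignoring the first k-1 symbols (a simplification)."""
    return sum(log_prob(seq[i - k + 1:i], seq[i]) for i in range(k - 1, len(seq)))


def classify(models, seq, k):
    """Pick the class whose model scores seq highest (equal priors assumed)."""
    return max(models, key=lambda label: score(models[label], seq, k))


# Hypothetical toy data: two classes of strings over the alphabet "ab".
models = {
    "alternating": train_markov(["ababab", "abab"], k=2, alphabet="ab"),
    "paired": train_markov(["aabbaabb", "bbaabb"], k=2, alphabet="ab"),
}
print(classify(models, "ababab", k=2))
```

A discriminatively trained variant would start from these estimates and then adjust the parameters to increase the conditional likelihood of the class labels, as the abstract outlines.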

computer vision and pattern recognition | 2009

Multiple label prediction for image annotation with multiple Kernel correlation models

Oksana Yakhnenko; Vasant G. Honavar

Image annotation is a challenging task that requires correlating text keywords with an image. In this paper we address the problem of image annotation using a Kernel Multiple Linear Regression (KMLR) model. The Multiple Linear Regression (MLR) model reconstructs an image caption from an image by performing a linear transformation of the image into a semantic space, and then recovers the caption by performing another linear transformation from the semantic space into the label space. The model is trained so that its parameters directly minimize the reconstruction error. This model is related to Canonical Correlation Analysis (CCA), which maps both images and captions into the semantic space so as to minimize the distance between their mappings in that space. The kernel trick is then applied to MLR, resulting in the KMLR model. The solution to KMLR is a solution to a generalized eigenvalue problem, related to Kernel Canonical Correlation Analysis (KCCA). We then extend the KMLR and KCCA models to the multiple kernel setting to allow various representations of images and captions. We present results for image annotation using Multiple Kernel Learning CCA and MLR on the Oliva and Torralba (2001) scene recognition dataset that show kernel selection behaviour.
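
One way to read the contrast between MLR and CCA drawn in this abstract is through the two objectives below. The notation is assumed for illustration and may differ from the paper: X holds image features, Y holds caption indicators, A maps images into the semantic space, B maps the semantic space into the label space, and C maps captions into the semantic space.

```latex
% Assumed notation: X (n x d) image features, Y (n x m) caption indicators.
% MLR: reconstruct the caption through the semantic space.
\min_{A,\,B} \; \lVert Y - X A B \rVert_F^2

% CCA-style: bring both mappings close in the semantic space
% (normalization constraints on the projections are omitted here).
\min_{A,\,C} \; \lVert X A - Y C \rVert_F^2
```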

british machine vision conference | 2011

Multi-Instance Multi-Label Learning for Image Classification with Large Vocabularies

Oksana Yakhnenko; Vasant G. Honavar

The Multiple Instance Multiple Label learning problem has received much attention in the machine learning and computer vision literature due to its applications in image classification and object detection. However, the current state-of-the-art solutions to this problem lack scalability and cannot be applied to datasets with a large number of instances and a large number of labels. In this paper we present a novel learning algorithm for Multiple Instance Multiple Label learning that scales to large datasets and performs comparably to the state-of-the-art algorithms. The proposed algorithm trains a set of discriminative multiple instance classifiers (one for each label in the vocabulary of all possible labels) and models the correlations among labels by finding a low-rank weight matrix, thus forcing the classifiers to share weights. The algorithm is a linear model, unlike the state-of-the-art kernel methods, which need to compute the kernel matrix. The model parameters are efficiently learned by solving an unconstrained optimization problem for which Stochastic Gradient Descent can be used to avoid storing all the data in memory.
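
The idea of per-label linear classifiers that share a low-rank weight matrix and are trained one bag at a time can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the factorization W = U V, the max over instances as the bag score, the logistic loss, and the names `bag_scores` and `sgd_step` are all choices made here for the example.

```python
import numpy as np


def bag_scores(bag, U, V):
    """Score a bag (n_instances x d) for every label.

    The weight matrix is factored as W = U @ V (d x r times r x n_labels),
    so all label classifiers share the r-dimensional projection U.
    The bag score for a label is the max over its instances (an assumption).
    """
    inst_scores = bag @ U @ V                      # (n_instances, n_labels)
    best = inst_scores.argmax(axis=0)              # top instance per label
    return inst_scores[best, np.arange(V.shape[1])], best


def sgd_step(bag, y, U, V, lr=0.1):
    """One in-place SGD update on a single bag with labels y in {-1, +1}.

    Uses the logistic loss log(1 + exp(-y * score)) per label (assumed).
    """
    scores, best = bag_scores(bag, U, V)
    g = -y / (1.0 + np.exp(y * scores))            # dLoss/dscore per label
    X_best = bag[best]                             # (n_labels, d)
    grad_U = X_best.T @ (g[:, None] * V.T)         # (d, r)
    grad_V = (X_best @ U).T * g                    # (r, n_labels)
    U -= lr * grad_U
    V -= lr * grad_V


# Hypothetical toy bag: two instances, two features, rank 1, one label.
U = np.array([[0.5], [0.1]])
V = np.array([[1.0]])
bag = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0])
before, _ = bag_scores(bag, U, V)
sgd_step(bag, y, U, V, lr=0.1)
after, _ = bag_scores(bag, U, V)
```

Because each update touches only one bag, the full dataset never has to be held in memory, which is the property the abstract attributes to Stochastic Gradient Descent.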

SIGKDD Explorations | 2008

KDD cup 2008 and the workshop on mining medical data

R. Bharat Rao; Oksana Yakhnenko; Balaji Krishnapuram

In this report we summarize the KDD Cup 2008 task, which addressed the problem of early breast cancer detection. We describe the data, the challenges, and the results, and summarize the algorithms used by the winning teams. We also summarize the workshop on Mining Medical Data, held in conjunction with SIGKDD on August 24, 2008 in Las Vegas, NV, which brought together researchers working on various aspects of applying machine learning and data mining to challenging tasks in medical and health care domains.

neural information processing systems | 2013