Sergey Feldman | Researchain

Publication

Featured research published by Sergey Feldman.

IEEE Transactions on Knowledge and Data Engineering | 2010

Completely Lazy Learning

Eric K. Garcia; Sergey Feldman; Maya R. Gupta; Santosh Srivastava

Local classifiers are sometimes called lazy learners because they do not train a classifier until presented with a test sample. However, such methods are generally not completely lazy because the neighborhood size k (or other locality parameter) is usually chosen by cross validation on the training set, which can require significant preprocessing and risks overfitting. We propose a simple alternative to cross validation of the neighborhood size that requires no preprocessing: instead of committing to one neighborhood size, average the discriminants for multiple neighborhoods. We show that this forms an expected estimated posterior that minimizes the expected Bregman loss with respect to the uncertainty about the neighborhood choice. We analyze this approach for six standard and state-of-the-art local classifiers, including discriminative adaptive metric kNN (DANN), a local support vector machine (SVM-KNN), hyperplane distance nearest neighbor (HKNN), and a new local Bayesian quadratic discriminant analysis (local BDA). The empirical effectiveness of this technique versus cross validation is confirmed with experiments on seven benchmark data sets, showing that similar classification performance can be attained without any training.
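The core idea in this abstract, averaging over several neighborhood sizes instead of cross-validating a single k, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a plain kNN posterior (the fraction of each label among the k nearest points), Euclidean distance, and uniform weights over the candidate sizes. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_posterior(train, query, k, classes):
    """Posterior estimate: fraction of each class among the k nearest points."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    counts = Counter(label for _, label in neighbors)
    return {c: counts[c] / k for c in classes}


def averaged_posterior(train, query, ks):
    """Average the kNN posteriors over several neighborhood sizes.

    Assumption: every k in `ks` gets the same weight (a uniform prior
    over the neighborhood choice).
    """
    classes = sorted({label for _, label in train})
    totals = {c: 0.0 for c in classes}
    for k in ks:
        posterior = knn_posterior(train, query, k, classes)
        for c in classes:
            totals[c] += posterior[c] / len(ks)
    return totals


def classify(train, query, ks):
    """Pick the class with the largest averaged posterior."""
    posterior = averaged_posterior(train, query, ks)
    return max(posterior, key=posterior.get)


# Toy data (hypothetical): two well-separated clusters in the plane.
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]
query = (0.15, 0.15)
posterior = averaged_posterior(train, query, ks=[1, 3, 5])
label = classify(train, query, ks=[1, 3, 5])
```

No neighborhood size is selected in advance, so nothing has to be computed before a query arrives; the cost is one neighbor search per candidate k at test time.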

International Conference on Acoustics, Speech, and Signal Processing | 2009

Part-of-speech histograms for genre classification of text

Sergey Feldman; Marius A. Marin; Mari Ostendorf; Maya R. Gupta

This work addresses the problem of classifying the genre of text, which is useful for a variety of language processing problems. We propose statistics of POS histograms as classification features, coupled with a quadratic discriminant classifier. In experiments on six different text and speech genres, we demonstrate enhanced performance compared to standard techniques using word frequency count features and POS trigram features. Experiments on genres that were not seen in training show intuitive overlaps with the training classes.
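One way to picture "statistics of POS histograms" as features is sketched below. The abstract does not say which statistics or windowing are used, so the choices here are assumptions made for illustration: non-overlapping fixed-size windows, a normalized tag histogram per window, and the per-tag mean and standard deviation across windows. The tag set, the function names, and the pre-tagged input are hypothetical, and a real pipeline would get the tags from a POS tagger.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical, deliberately tiny tag set; a real tagger emits many more tags.
TAGSET = ["NOUN", "VERB", "ADJ", "PRON"]


def window_histograms(tags, window):
    """Normalized tag histogram for each non-overlapping window of tags."""
    histograms = []
    for start in range(0, len(tags) - window + 1, window):
        counts = Counter(tags[start:start + window])
        histograms.append([counts[tag] / window for tag in TAGSET])
    return histograms


def histogram_features(tags, window=4):
    """Feature vector: per-tag mean, then per-tag standard deviation.

    Assumption: mean and standard deviation are the summary statistics;
    the abstract does not specify which statistics are used.
    """
    histograms = window_histograms(tags, window)
    columns = list(zip(*histograms))  # one column of values per tag
    return [mean(col) for col in columns] + [pstdev(col) for col in columns]


# Pre-tagged toy document (hypothetical).
doc = ["PRON", "VERB", "ADJ", "NOUN", "NOUN", "VERB", "NOUN", "NOUN"]
features = histogram_features(doc, window=4)
```

A feature vector of this kind would then be passed to a classifier; the abstract names a quadratic discriminant classifier for that step.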

International Conference on Acoustics, Speech, and Signal Processing | 2009

Filtering web text to match target genres

Marius A. Marin; Sergey Feldman; Mari Ostendorf; Maya R. Gupta

In language modeling for speech recognition, both the amount of training data and the match to the target task impact the goodness of the model, with the trade-off usually favoring more data. For conversational speech, having some genre-matched text is particularly important, but also hard to obtain. This paper proposes a new approach for genre detection and compares different alternatives for filtering web text for genre to improve language models for use in automatic transcription of broadcast conversations (talk shows).

Journal of Proteome Research | 2010

Precursor Charge State Prediction for Electron Transfer Dissociation Tandem Mass Spectra

Vagisha Sharma; Jimmy K. Eng; Sergey Feldman; Priska D. von Haller; Michael J. MacCoss; William Stafford Noble

Electron-transfer dissociation (ETD) induces fragmentation along the peptide backbone by transferring an electron from a radical anion to a protonated peptide. In contrast with collision-induced dissociation, side chains and modifications such as phosphorylation are left intact through the ETD process. Because the precursor charge state is an important input to MS/MS sequence database search tools, the ability to accurately determine the precursor charge is helpful for the identification process. Furthermore, because ETD can be applied to large, highly charged peptides, the need for accurate precursor charge state determination is magnified. Otherwise, each spectrum must be searched repeatedly using a large range of possible precursor charge states. To address this problem, we have developed an ETD charge state prediction tool based on support vector machine classifiers that is demonstrated to exhibit superior classification accuracy while minimizing the overall number of predicted charge states. The tool is freely available, open source, cross-platform compatible, and demonstrated to perform well when compared with an existing charge state prediction tool. The program is available from http://code.google.com/p/etdz/.

North American Chapter of the Association for Computational Linguistics | 2009

Classifying Factored Genres with Part-of-Speech Histograms

Sergey Feldman; Marius A. Marin; Julie Medero; Mari Ostendorf

This work addresses the problem of genre classification of text and speech transcripts, with the goal of handling genres not seen in training. Two frameworks employing different statistics on word/POS histograms with a PCA transform are examined: a single model for each genre and a factored representation of genre. The impact of the two frameworks on the classification of training-matched and new genres is discussed. Results show that the factored models allow for a finer-grained representation of genre and can more accurately characterize genres not seen in training.

SIMBAD'11: Proceedings of the First International Conference on Similarity-Based Pattern Recognition | 2011

Multi-task regularization of generative similarity models

Luca Cazzanti; Sergey Feldman; Maya R. Gupta; Michael Gabbay

We investigate a multi-task approach to similarity discriminant analysis, where we propose treating the estimation of the different class-conditional distributions of the pairwise similarities as multiple tasks. We show that regularizing these estimates together using a least-squares regularization weighted by a task-relatedness matrix can reduce the resulting maximum a posteriori classification errors. Results are given for benchmark data sets spanning a range of applications. In addition, we present a new application of similarity-based learning to analyzing the rhetoric of multiple insurgent groups in Iraq. We show how to produce the necessary task-relatedness information from standard given training data, as well as how to derive task-relatedness information if given side information about the class relatedness.

Neural Information Processing Systems | 2012