Shajith Ikbal | Researchain



Publication

Featured research published by Shajith Ikbal.

international symposium on neural networks | 1999

Analysis of autoassociative mapping neural networks

Shajith Ikbal; Hemant Misra; B. Yegnanarayana

In this paper we analyse the mapping behavior of an autoassociative neural network (AANN). The mapping in an AANN is achieved by a dimension reduction followed by a dimension expansion. One of the major results of the analysis is that the network performs better autoassociation as its size increases. This is because a network of a given size can deal with only a certain level of nonlinearity. Performance of autoassociative mapping is illustrated with 2D examples. We have shown the utility of the mapping feature of an AANN for speaker verification.

international conference on data engineering | 2009

Business Intelligence from Voice of Customer

L. Venkata Subramaniam; Tanveer A. Faruquie; Shajith Ikbal; Shantanu Godbole; Mukesh K. Mohania

In this paper, we present a first-of-a-kind system, called Business Intelligence from Voice of Customer (BIVoC), that can: 1) combine unstructured information and structured information in an information-intensive enterprise and 2) derive richer business insights from the combined data. Unstructured information, in this paper, refers to Voice of Customer (VoC) obtained from the interaction of customers with the enterprise, namely conversations with call-center agents, email, and SMS. Structured databases reflect only those business variables that are static over (a longer window of) time, such as educational qualification, age group, and employment details. In contrast, a combination of unstructured and structured data provides access to business variables that reflect the up-to-date, dynamic requirements of the customers and, more importantly, indicate trends that are difficult to derive from a larger population of customers through any other means. For example, some of the variables reflected in unstructured data are a problem with or interest in a certain product, expression of dissatisfaction with the business provided, and some unexplored category of people showing a certain interest or problem. This gives the BIVoC system the ability to derive business insights that are richer, more valuable, and more crucial to enterprises than those of traditional business intelligence systems, which utilize only structured information. We demonstrate the effectiveness of the BIVoC system through one of our real-life engagements, where the problem is to determine how to improve agent productivity in a call center scenario. We also highlight major challenges faced while dealing with unstructured information, such as handling noise and linking with structured data.

knowledge discovery and data mining | 2014

Predicting student risks through longitudinal analysis

Ashay Tamhane; Shajith Ikbal; Bikram Sengupta; Mayuri Duggirala; James Appleton

Poor academic performance in K-12 is often a precursor to unsatisfactory educational outcomes such as dropout, which are associated with significant personal and social costs. Hence, it is important to be able to predict students at risk of poor performance, so that the right personalized intervention plans can be initiated. In this paper, we report on a large-scale study to identify students at risk of not meeting acceptable levels of performance in one state-level and one national standardized assessment in Grade 8 of a major US school district. An important highlight of our study is its scale in terms of the number of students included, the number of years, and the number of features, which provides a very solid grounding to the research. We report on our experience with handling the scale and complexity of data, and on the relative performance of various machine learning techniques we used for building predictive models. Our results demonstrate that it is possible to predict students at risk of poor assessment performance with a high degree of accuracy, and to do so well in advance. These insights can be used to proactively initiate personalized intervention programs and improve the chances of student success.

ieee automatic speech recognition and understanding workshop | 2003

Nonlinear spectral transformations for robust speech recognition

Shajith Ikbal; Hynek Hermansky

Recently, a nonlinear transformation of autocorrelation coefficients named phase autocorrelation (PAC) coefficients has been considered for feature extraction. PAC based features show improved robustness to additive noise as a result of two operations, performed during the computation of PAC, namely energy normalization and inverse cosine transformation. In spite of the improved robustness achieved for noisy speech, these two operations lead to some degradation in recognition performance for clean speech. In this paper, we try to alleviate this problem, first by introducing the energy information back into the PAC based features, and second by studying alternatives to the inverse cosine function. Simply appending the frame energy as an additional coefficient in the PAC features has resulted in noticeable improvement in the performance for clean speech. Study of alternatives to the inverse cosine transformation leads to a conclusion that a linear transformation is the best for clean speech, while nonlinear functions help to improve robustness in noise.

conference on information and knowledge management | 2011

Privacy protected knowledge management in services with emphasis on quality data

Debapriyo Majumdar; Rose Catherine; Shajith Ikbal; Karthik Visweswariah

Improving the productivity of practitioners through effective knowledge management and delivering high-quality service in the Application Management Services (AMS) domain are key focus areas for all IT services organizations. One source of historical knowledge in AMS is the large amount of resolved problem ticket data, which is often confidential and immensely valuable, although most of it is of very poor quality. In this paper, we present a knowledge management tool that detects the quality of information present in problem tickets and enables effective knowledge search in tickets by prioritizing quality data in the search ranking. The tool facilitates leveraging of knowledge across different AMS accounts, while preserving data privacy, by masking client confidential information. It also extracts several relevant entities contained in the noisy unstructured text entered in the tickets and presents them to the users. We present several experimental evaluations and a pilot study conducted with an AMS account, which show that our tool is effective and leads to substantial improvement in the productivity of the practitioners.

Speech Communication | 2012

Phase AutoCorrelation (PAC) features for noise robust speech recognition

Shajith Ikbal; Hemant Misra; Hynek Hermansky; Mathew Magimai-Doss

In this paper, we introduce a new class of noise robust features derived from an alternative measure of autocorrelation representing the phase variation of a speech signal frame over time. These features, referred to as Phase AutoCorrelation (PAC) features, include PAC-spectrum and PAC-MFCC, among others. In traditional autocorrelation, the correlation between two time-delayed signal vectors is computed as their dot product. In PAC, by contrast, the angle between the vectors in the signal vector space is used to compute the correlation. PAC features are more noise robust because the angle is typically less affected by noise than the dot product. However, the use of the angle as the correlation estimate makes the PAC features inferior in clean speech. In this paper, we circumvent this problem by introducing another set of features in which complementary information from the PAC features and the traditional features is combined adaptively to retain the best of both. An entropy based feature combination method in a multi-layer perceptron (MLP) based multi-stream framework is used to derive an adaptively combined representation of the component feature streams. An evaluation of the combined features using the OGI Numbers95 database and the Aurora-2 database under various noise conditions and noise levels shows significant improvements in recognition accuracies in clean as well as noisy conditions.
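
The contrast between the dot-product and angle-based correlation measures described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the circular shift used to build the delayed vector, the function names, and the zero-frame convention are assumptions made for illustration only.

```python
import math


def autocorrelation(frame, lag):
    """Dot product of the frame with a copy of itself delayed by `lag`.

    Assumption: the delayed copy is a circular shift, so both vectors
    have the same length and the same energy.
    """
    n = len(frame)
    return sum(frame[i] * frame[(i + lag) % n] for i in range(n))


def phase_autocorrelation(frame, lag):
    """Angle in radians between the frame and its delayed copy.

    The dot product is divided by the frame energy (energy normalization)
    and passed through the inverse cosine.
    """
    energy = autocorrelation(frame, 0)
    if energy == 0.0:
        return 0.0  # assumed convention for an all-zero frame
    cos_theta = autocorrelation(frame, lag) / energy
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.acos(cos_theta)


# A 64-sample sinusoid with 4 full cycles, and the same frame three times louder.
frame = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
louder = [3.0 * s for s in frame]

# The dot product grows with signal energy; the angle does not.
r_quiet, r_loud = autocorrelation(frame, 5), autocorrelation(louder, 5)
p_quiet, p_loud = phase_autocorrelation(frame, 5), phase_autocorrelation(louder, 5)
```

In this sketch, scaling the frame changes the traditional autocorrelation by the square of the gain while leaving the angle unchanged, which is the energy-normalization property the abstract refers to.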

international conference on acoustics, speech, and signal processing | 2004

Phase autocorrelation (PAC) features in entropy based multi-stream for robust speech recognition

Shajith Ikbal; Hemant Misra; Hynek Hermansky

Methods to improve noise robustness of speech recognition systems often result in degradation of recognition performance for clean speech. Recently proposed phase autocorrelation (PAC) based features (S. Ikbal et al., Proc. ICASSP-03, p.II-133-6, 2003; Proc. IEEE ASRU 2003 Workshop, 2003), while showing noticeable improvement in noise robustness, also suffer from this drawback. We try to alleviate this problem by using the PAC based features along with regular speech features in a multi-stream framework. The multi-stream system uses the entropy of the posterior probability distribution, computed during recognition, as a confidence measure to combine evidence from different feature streams adaptively (Misra, H. et al., Proc. ICASSP-03, p.II-741-4, 2003). Experimental results obtained on OGI Numbers95 database and Noisex92 noise database show that such a system yields the best possible recognition performance in all conditions. Actually, the combination always performs better than the best performing stream for all the conditions.
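
One way to use posterior entropy as a confidence measure when combining streams is sketched below. The abstract does not give the combination rule, so the inverse-entropy weighting used here, the function names, and the example posteriors are all assumptions for illustration.

```python
import math


def entropy(posterior):
    """Shannon entropy (in nats) of a discrete posterior distribution."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def inverse_entropy_weights(posteriors, floor=1e-6):
    """Weight each stream by the inverse of its posterior entropy.

    Assumption: a low-entropy (peaked) posterior indicates a confident
    stream. `floor` avoids division by zero for a one-hot posterior.
    """
    inverse = [1.0 / max(entropy(p), floor) for p in posteriors]
    total = sum(inverse)
    return [value / total for value in inverse]


def combine_streams(posteriors):
    """Weighted sum of the per-stream posteriors for one frame."""
    weights = inverse_entropy_weights(posteriors)
    n_classes = len(posteriors[0])
    return [
        sum(w * p[c] for w, p in zip(weights, posteriors))
        for c in range(n_classes)
    ]


# Hypothetical per-frame posteriors from two feature streams.
confident = [0.90, 0.05, 0.05]
uncertain = [0.34, 0.33, 0.33]

weights = inverse_entropy_weights([confident, uncertain])
combined = combine_streams([confident, uncertain])
```

Because the weights are recomputed for every frame, the stream that is more certain at that moment dominates the combined posterior, which is what makes the combination adaptive.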

international conference on multimedia and expo | 2008

HMM based event detection in audio conversation

Shajith Ikbal; Tanveer A. Faruquie

In this paper, we address the problem of detecting sensitive events in speech signals, such as the exchange of credit card information. Although close in nature to the word spotting problem, variability in the linguistic content constituting an event and its composition makes event detection a harder task, especially in the context where it is applied, such as call-center interaction. In this work we extend the hidden Markov model (HMM) based framework used in word spotting to event detection, by constructing a network composed of HMM based acoustic models for event and garbage (non-event). Vocabularies specific to the event and non-event are used respectively to build the event and garbage models, along with length constraints based on prior knowledge. The effectiveness of this approach is demonstrated by applying it to the problem of detecting credit card transaction events in real-life conversations between agents and customers in a call center. Our approach yields a false alarm rate of 17.0% and a false miss rate of 12.5%.

ieee workshop on neural networks for signal processing | 2002

Speaker normalization using HMM2

Shajith Ikbal; Katrin Weber

We present an HMM2 based method for speaker normalization. Introduced as an extension of the hidden Markov model (HMM), HMM2 differentiates itself from the regular HMM in terms of the emission density modeling, which is done by a set of state-dependent HMMs working in the feature vector space. The emission modeling HMM aims at maximizing the likelihood through optimal alignment of its states across the feature components. This property makes it potentially useful for speaker normalization when applied to the spectrum. With the alignment information we get, it is possible to normalize the speaker related variations through piecewise linear warping of the frequency axis of the spectrum. In our case, (emission modeling) HMM based spectral warping is employed in the feature extraction block of the regular HMM framework for normalizing the speaker related variabilities. After a brief description of HMM2, we present the general approach towards HMM2-based speaker normalization and show, through preliminary experiments, the pertinence of the approach.
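
Piecewise linear warping of a frequency axis can be sketched as interpolation between matched anchor frequencies. The abstract does not specify how the anchors are obtained from the HMM2 alignment, so the anchor values, the function name, and the clamping behavior below are hypothetical.

```python
def piecewise_linear_warp(freq, source_points, target_points):
    """Map `freq` from a speaker's axis to a reference axis.

    `source_points` and `target_points` are matched, strictly increasing
    anchor frequencies. Assumption: frequencies outside the anchor range
    are clamped to the first or last target anchor.
    """
    if freq <= source_points[0]:
        return target_points[0]
    segments = zip(source_points, source_points[1:],
                   target_points, target_points[1:])
    for s0, s1, t0, t1 in segments:
        if s0 <= freq <= s1:
            # Linear interpolation inside the segment that contains freq.
            return t0 + (freq - s0) * (t1 - t0) / (s1 - s0)
    return target_points[-1]


# Hypothetical anchors in Hz: the speaker's axis and the reference axis.
speaker_anchors = [0.0, 1000.0, 3000.0, 4000.0]
reference_anchors = [0.0, 900.0, 3200.0, 4000.0]

warped_low = piecewise_linear_warp(500.0, speaker_anchors, reference_anchors)
warped_mid = piecewise_linear_warp(2000.0, speaker_anchors, reference_anchors)
```

Each segment between adjacent anchors is stretched or compressed independently, so the warp can follow speaker differences that vary across the spectrum while keeping the band edges fixed.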

international conference on acoustics, speech, and signal processing | 2013

Intent focused summarization of caller-agent conversations

Shajith Ikbal; Ashish Verma; Kenneth Church; Jeffrey N. Marcus

In this paper, we propose a conditional random field (CRF) based approach to identify segments within call center conversations that convey caller intent. A distinguishing aspect of our approach is the use of context information of the intent bearing segments to predict the presence or absence of intents within various segments. The context is represented through a set of phrase features that are frequently present in and around the intent bearing segments. These phrases, identified in a data-driven manner, are used along with conventional word features in a CRF based sequence labeling framework to assign intent/non-intent labels to each utterance in a conversation. Another distinguishing aspect of our approach is that instead of using the 1-best label alignment, we extract N-best label alignments at the output of the CRF and combine evidence from them to rank the utterances according to their intent bearing potential, so that top ranked utterances can be chosen as the intent summary. To demonstrate the effectiveness of our approach and to evaluate the influence of automatic speech recognition (ASR) errors, we evaluated our approach using manually transcribed and ASR transcribed conversations. Experimental results show improved summarization accuracy using our approach. Specifically, in 92% of the manually transcribed conversations, accurate summaries of just one utterance in length can be extracted using the proposed approach.
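
The idea of ranking utterances by pooling N-best label alignments can be sketched as follows. The abstract does not say how the evidence is combined, so the score-weighted voting, the label names, and the example alignments here are assumptions rather than the paper's method.

```python
def rank_utterances(nbest):
    """Rank utterances by pooled intent evidence across N-best alignments.

    `nbest` is a list of (score, labels) pairs, where `labels` holds one
    "INTENT" or "OTHER" label per utterance and `score` is a positive
    alignment score. Assumption: each alignment votes for the utterances
    it labels "INTENT", with a vote proportional to its score.
    """
    total = sum(score for score, _ in nbest)
    n_utterances = len(nbest[0][1])
    evidence = [0.0] * n_utterances
    for score, labels in nbest:
        for index, label in enumerate(labels):
            if label == "INTENT":
                evidence[index] += score / total
    ranking = sorted(range(n_utterances),
                     key=lambda index: evidence[index], reverse=True)
    return ranking, evidence


# Hypothetical 3-best alignments for a four-utterance conversation.
nbest = [
    (0.5, ["OTHER", "INTENT", "OTHER", "OTHER"]),
    (0.3, ["OTHER", "INTENT", "INTENT", "OTHER"]),
    (0.2, ["INTENT", "OTHER", "OTHER", "OTHER"]),
]

ranking, evidence = rank_utterances(nbest)
summary = ranking[:1]  # a one-utterance summary, as in the abstract
```

Pooling over several alignments lets an utterance that is labeled as intent only in lower-ranked hypotheses still receive some credit, instead of being discarded by the 1-best decision.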
