Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Mittul Singh is active.

Publication


Featured research published by Mittul Singh.


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2013

Comparing RNNs and log-linear interpolation of improved skip-model on four Babel languages: Cantonese, Pashto, Tagalog, Turkish

Mittul Singh; Dietrich Klakow

Recurrent neural networks (RNNs) are a recent technique for modeling long range dependencies in natural languages. They have clearly outperformed trigrams and other more advanced language modeling techniques by modeling long range dependencies non-linearly. An alternative is log-linear interpolation of skip models (i.e. skip bigrams and skip trigrams). The method as such has been published earlier. In this paper we investigate the impact of different smoothing techniques on the skip models as a measure of their overall performance. One option is to use automatically trained distance clusters (both hard and soft) to increase robustness and to combat sparseness in the skip models. We also investigate alternative smoothing techniques on the word level. For skip bigrams, Kneser-Ney (KN) smoothing is advantageous when only a small number of words is skipped; when a larger number of words is skipped, Dirichlet smoothing performs better. To exploit the advantages of both KN and Dirichlet smoothing, we propose a new unified smoothing technique. Experiments are performed on four Babel languages: Cantonese, Pashto, Tagalog and Turkish. RNNs and log-linearly interpolated skip models are on par if the skip models are trained with standard smoothing techniques. Using the improved smoothing of the skip models along with distance clusters, we clearly outperform RNNs by about 8-11% in perplexity across all four languages.
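As a rough illustration of the log-linear interpolation of skip models described above, here is a minimal Python sketch. The component models (e.g. a regular bigram plus skip bigrams/trigrams at various distances, each with its own smoothing) and the interpolation weights are stand-ins supplied by the caller; the paper's distance clusters and unified KN/Dirichlet smoothing are not reproduced here.

import math

def log_linear_score(word, history, component_models, weights):
    # Log-linear combination of component models:
    #   score(w | h) = sum_i lambda_i * log P_i(w | h)
    return sum(lam * math.log(max(model(word, history), 1e-12))
               for lam, model in zip(weights, component_models))

def interpolated_distribution(history, vocab, component_models, weights):
    # Normalize the log-linear scores over the vocabulary to obtain P(w | h).
    scores = {w: log_linear_score(w, history, component_models, weights) for w in vocab}
    m = max(scores.values())
    z = sum(math.exp(s - m) for s in scores.values())
    return {w: math.exp(s - m) / z for w, s in scores.items()}

In practice the interpolation weights would be tuned on held-out data, one per component model.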


Conference of the International Speech Communication Association (INTERSPEECH) | 2016

Sequential Recurrent Neural Networks for Language Modeling.

Youssef Oualil; Clayton Greenberg; Mittul Singh; Dietrich Klakow

Feedforward Neural Network (FNN)-based language models estimate the probability of the next word based on the history of the last N words, whereas Recurrent Neural Networks (RNNs) perform the same task based only on the last word and some context information that cycles in the network. This paper presents a novel approach which bridges the gap between these two categories of networks. In particular, we propose an architecture which takes advantage of the explicit, sequential enumeration of the word history in the FNN structure while enhancing each word representation at the projection layer through recurrent context information that evolves in the network. The context integration is performed using an additional word-dependent weight matrix that is also learned during training. Extensive experiments conducted on the Penn Treebank (PTB) and the Large Text Compression Benchmark (LTCB) corpus showed a significant reduction in perplexity compared to state-of-the-art feedforward as well as recurrent neural network architectures.
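A minimal numpy sketch of the idea as described in the abstract, with toy sizes and a simple tanh update; the word-dependent matrices R and the placement of the hidden and output layers are illustrative assumptions, not the published architecture.

import numpy as np

rng = np.random.default_rng(0)
V, D, N = 1000, 32, 4                      # toy vocabulary, embedding size, history length (assumed)
E = rng.normal(scale=0.1, size=(V, D))     # projection-layer word embeddings
R = rng.normal(scale=0.1, size=(V, D, D))  # word-dependent weight matrices for context integration
W_out = rng.normal(scale=0.1, size=(N * D, V))

def forward(history_ids, context):
    # Enhance each of the last N word representations with recurrent context that
    # evolves as we move through the history, then concatenate them as in an FNN LM.
    enhanced = []
    for w in history_ids:                  # oldest to newest
        context = np.tanh(E[w] + R[w] @ context)
        enhanced.append(context)
    logits = np.concatenate(enhanced) @ W_out
    probs = np.exp(logits - logits.max())
    return probs / probs.sum(), context    # next-word distribution and updated context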


Conference on Empirical Methods in Natural Language Processing (EMNLP) | 2016

Long-Short Range Context Neural Networks for Language Modeling.

Youssef Oualil; Mittul Singh; Clayton Greenberg; Dietrich Klakow

The goal of language modeling techniques is to capture the statistical and structural properties of natural languages from training corpora. This task typically involves learning short range dependencies, which generally model the syntactic properties of a language, and/or long range dependencies, which are semantic in nature. In this paper we propose a new multi-span architecture, which models the short and long context information separately while dynamically merging them to perform the language modeling task. This is done through a novel recurrent Long-Short Range Context (LSRC) network, which explicitly models the local (short) and global (long) context using two separate hidden states that evolve in time. The new architecture is an adaptation of the Long Short-Term Memory (LSTM) network that takes these linguistic properties into account. Extensive experiments conducted on the Penn Treebank (PTB) and the Large Text Compression Benchmark (LTCB) corpus showed a significant reduction in perplexity compared to state-of-the-art language modeling techniques.
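The numpy sketch below only illustrates the "two hidden states, one merge" idea under assumed toy update equations; the actual LSRC update is an adaptation of the LSTM gating equations and is not reproduced here.

import numpy as np

rng = np.random.default_rng(1)
V, D, H = 1000, 32, 64                                   # toy sizes (assumed)
E = rng.normal(scale=0.1, size=(V, D))
W_l = rng.normal(scale=0.1, size=(H, D)); U_l = rng.normal(scale=0.1, size=(H, H))
W_g = rng.normal(scale=0.1, size=(H, D)); U_g = rng.normal(scale=0.1, size=(H, H))
G = rng.normal(scale=0.1, size=(H, H))
W_out = rng.normal(scale=0.1, size=(V, 2 * H))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lsrc_step(word_id, h_local, h_global):
    x = E[word_id]
    # local (short-range) state: driven mostly by the current word
    h_local = np.tanh(W_l @ x + U_l @ h_local)
    # global (long-range) state: gated, slowly evolving summary of the history
    gate = sigmoid(G @ h_local)
    h_global = gate * h_global + (1.0 - gate) * np.tanh(W_g @ x + U_g @ h_global)
    # merge both states to predict the next word
    logits = W_out @ np.concatenate([h_local, h_global])
    probs = np.exp(logits - logits.max())
    return probs / probs.sum(), h_local, h_global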


European Conference on Information Retrieval (ECIR) | 2018

Long-Span Language Models for Query-Focused Unsupervised Extractive Text Summarization

Mittul Singh; Arunav Mishra; Youssef Oualil; Klaus Berberich; Dietrich Klakow

Effective unsupervised query-focused extractive summarization systems use query-specific features along with short-range language models (LMs) in the sentence ranking and selection subtasks. We hypothesize that applying long-span n-gram-based and neural LMs, which better capture larger context, can help improve these subtasks. Hence, we outline the first attempt to apply long-span models to a query-focused summarization task in an unsupervised setting. We also propose Across Sentence Boundary LSTM-based LMs, ASB LSTM and biASB LSTM, which are geared towards the query-focused summarization subtasks. Intrinsic and extrinsic experiments on a real-world corpus with 100 Wikipedia event descriptions as queries show that the long-span models, applied in an integer linear programming (ILP) formulation of the MMR criterion, are the most effective against several state-of-the-art baseline methods from the literature.
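As a hedged sketch of the sentence-selection step, the ILP below encodes an MMR-style trade-off between query relevance (e.g. a precomputed long-span LM score, assumed given) and pairwise redundancy. It uses the PuLP library purely for illustration; the exact objective and constraints in the paper may differ.

from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary

def mmr_ilp_summary(relevance, similarity, lengths, budget, lam=0.5):
    # relevance[i]: query relevance of sentence i (e.g. a long-span LM score)
    # similarity[i][j]: redundancy between sentences i and j
    # lengths[i], budget: sentence lengths and summary length limit
    n = len(relevance)
    prob = LpProblem("mmr_summary", LpMaximize)
    x = [LpVariable(f"x_{i}", cat=LpBinary) for i in range(n)]              # sentence i selected
    y = {(i, j): LpVariable(f"y_{i}_{j}", cat=LpBinary)
         for i in range(n) for j in range(i + 1, n)}                        # both i and j selected
    prob += (lpSum(relevance[i] * x[i] for i in range(n))
             - lam * lpSum(similarity[i][j] * y[i, j] for (i, j) in y))
    for (i, j), yij in y.items():                                           # y_ij = x_i AND x_j
        prob += yij >= x[i] + x[j] - 1
        prob += yij <= x[i]
        prob += yij <= x[j]
    prob += lpSum(lengths[i] * x[i] for i in range(n)) <= budget            # length budget
    prob.solve()
    return [i for i in range(n) if x[i].value() > 0.5]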


International Conference on Text, Speech and Dialogue (TSD) | 2016

The Custom Decay Language Model for Long Range Dependencies

Mittul Singh; Clayton Greenberg; Dietrich Klakow

Significant correlations between words can be observed over long distances, but contemporary language models such as N-grams, skip grams, and recurrent neural network language models (RNNLMs) require a large number of parameters to capture these dependencies, if they can do so at all. In this paper, we propose the Custom Decay Language Model (CDLM), which captures long range correlations while maintaining a sub-linear increase in parameters with vocabulary size. This model has a robust and stable training procedure (unlike RNNLMs), a more powerful modeling scheme than the Skip models, and a customizable representation. In perplexity experiments, CDLMs outperform the Skip models using fewer parameters. A CDLM also nominally outperformed a similar-sized RNNLM, meaning that it learned as much as the RNNLM but without recurrence.
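A minimal sketch of the decay idea, assuming a given set of distance-dependent weights and per-distance skip components; the CDLM's actual parameterization, representation sharing, and training procedure are not reproduced here.

def cdlm_probability(history, candidate, unigram, skip_component, decay, max_dist=10):
    # Mix evidence from words at increasing distances with customizable decay weights,
    # backing off to the unigram distribution for the remaining probability mass.
    # skip_component(d, w_past, w): smoothed probability of w given the word d positions back
    # decay[d-1]: weight assigned to distance d (the "custom decay")
    weights, scores = [], []
    for d in range(1, min(max_dist, len(history)) + 1):
        weights.append(decay[d - 1])
        scores.append(skip_component(d, history[-d], candidate))
    weights.append(max(0.0, 1.0 - sum(weights)))
    scores.append(unigram(candidate))
    return sum(w * s for w, s in zip(weights, scores))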


Text Analysis Conference (TAC) | 2013

Effective Slot Filling Based on Shallow Distant Supervision Methods

Benjamin Roth; Tassilo Barth; Michael Wiegand; Mittul Singh; Dietrich Klakow


Text Analysis Conference (TAC) | 2012

Generalizing from Freebase and Patterns using Cluster-Based Distant Supervision for TAC KBP Slotfilling 2012.

Benjamin Roth; Grzegorz Chrupała; Michael Wiegand; Mittul Singh; Dietrich Klakow


Conference of the International Speech Communication Association (INTERSPEECH) | 2017

Estimation of Gap Between Current Language Models and Human Performance.

Xiaoyu Shen; Youssef Oualil; Clayton Greenberg; Mittul Singh; Dietrich Klakow


Conference of the International Speech Communication Association (INTERSPEECH) | 2017

Approximated and Domain-Adapted LSTM Language Models for First-Pass Decoding in Speech Recognition.

Mittul Singh; Youssef Oualil; Dietrich Klakow


Conference of the International Speech Communication Association (INTERSPEECH) | 2018

Iterative Learning of Speech Recognition Models for Air Traffic Control.

Ajay Srinivasamurthy; Petr Motlicek; Mittul Singh; Youssef Oualil; Matthias Kleinert; Heiko Ehr; Hartmut Helmke

Collaboration


Dive into Mittul Singh's collaborations.

Top Co-Authors

Heiko Ehr (German Aerospace Center)