Niladri Chatterjee

Indian Institute of Technology Delhi

Publication

Featured research published by Niladri Chatterjee.

International Conference on Tools with Artificial Intelligence | 2007

Extraction-Based Single-Document Summarization Using Random Indexing

Niladri Chatterjee; Shiwali Mohan

This paper presents a summarization technique for text documents that exploits the semantic similarity between sentences to remove redundancy from the text. Semantic similarity scores are computed by mapping the sentences onto a semantic space using random indexing. Random indexing, in comparison with other semantic space algorithms, presents a computationally efficient way of implicit dimensionality reduction. It involves inexpensive vector computations such as addition. It thus provides an efficient way to compute similarities between words, sentences and documents. Random indexing has been used to compute the semantic similarity scores of sentences, and graph-based ranking algorithms have been employed to produce an extract of the given text.
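
As a rough illustration of the idea described in this abstract, the sketch below builds random-indexing context vectors using only vector addition and compares sentences by cosine similarity. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the dimension, the number of nonzero entries, the window size and the choice of cosine similarity are all hypothetical choices made for this example, and the graph-based ranking step is not shown.

```python
import math
import random
from collections import defaultdict


def index_vector(dim, nonzeros, rng):
    """Sparse ternary index vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * dim
    positions = rng.sample(range(dim), nonzeros)
    for i, pos in enumerate(positions):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def build_context_vectors(sentences, dim=300, nonzeros=6, window=2, seed=0):
    """Accumulate, for each word, the index vectors of its neighbours."""
    rng = random.Random(seed)
    index = {}
    for sent in sentences:
        for word in sent:
            if word not in index:
                index[word] = index_vector(dim, nonzeros, rng)
    context = defaultdict(lambda: [0] * dim)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                ctx = context[word]
                # Only addition is needed to update a context vector.
                for k, value in enumerate(index[sent[j]]):
                    ctx[k] += value
    return dict(context)


def sentence_vector(sentence, context, dim):
    """Represent a sentence as the sum of its words' context vectors."""
    vec = [0] * dim
    for word in sentence:
        for k, value in enumerate(context.get(word, [0] * dim)):
            vec[k] += value
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

In a summarizer of the kind the abstract describes, pairwise sentence similarities like these would serve as edge weights for the ranking graph.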

International Conference on Computing Theory and Applications | 2007

Some Improvements over the BLEU Metric for Measuring Translation Quality for Hindi

Niladri Chatterjee; Anish Johnson; Madhav Krishna

The BLEU translation quality evaluation metric is a modified n-gram precision measure that uses a number of reference translations of a given candidate text to evaluate its translation quality. In this paper, we propose some modifications to this metric so that it suits Indian languages, and Hindi in particular. Using BLEU for Hindi presents two difficulties: the non-availability of multiple references and the prevalence of free word order in sentence construction. It is established that the validity of BLEU scores generally increases with the number of reference translations used. Further, since Hindi is a free word order language, naive n-gram matching as adopted by BLEU does not accurately predict the quality of a translated text. In our approach we have modified BLEU to address the above-mentioned shortcomings. Our proposed metric has obtained a closer correlation with human judgment while using just one reference translation.
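
For readers unfamiliar with the baseline, the sketch below computes the standard clipped ("modified") n-gram precision that BLEU is built on. It illustrates only the original metric that the abstract starts from, not the Hindi-specific modifications proposed in the paper, which the abstract does not spell out. The function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against its references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram is credited at most as often as it occurs
    # in the most generous reference.
    clipped = sum(min(count, max_ref[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())
```

Because the match is on exact contiguous n-grams, a correct translation that merely reorders words scores poorly for n > 1, which is the free word order problem the abstract raises.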

Accident Analysis & Prevention | 2013

Impact of grade separator on pedestrian risk taking behavior

Mariya Khatoon; Geetam Tiwari; Niladri Chatterjee

Pedestrians on Delhi roads are often exposed to high risks. This is because the basic needs of pedestrians are not recognized as a part of the urban transport infrastructure improvement projects in Delhi. Rather, an ever increasing number of cars and motorized two-wheelers encourages the construction of large numbers of flyovers/grade separators to facilitate signal-free movement for motorized vehicles, exposing pedestrians to greater risk. This paper describes the statistical analysis of pedestrian risk taking behavior while crossing the road, before and after the construction of a grade separator at an intersection in Delhi. A significant number of pedestrians are willing to take risks in both the before and after situations. The results indicate that the absence of signals makes pedestrians behave independently, leading to increased variability in their risk taking behavior. Variability in the speeds of all categories of vehicles has increased after the construction of the grade separator. After the construction of the grade separator, the waiting time of pedestrians at the starting point of crossing has increased, and the correlation between waiting times and gaps accepted by pedestrians shows that after a certain waiting time, pedestrians become impatient and accept smaller gaps to cross the road. A logistic regression model is fitted by assuming that the probability of road crossing by pedestrians depends on the gap size (in s) between pedestrian and conflicting vehicles, sex, age, type of pedestrian (single or in a group) and type of conflicting vehicle. The results of the logistic regression indicate that before the construction of the grade separator the probability of road crossing by the pedestrian depends only on the gap size parameter; however, after the construction of the grade separator, other parameters become significant in determining pedestrian risk taking behavior.
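
The logistic model described in this abstract can be written in the usual form shown below. This is a generic sketch: the symbols are assumptions introduced here (g for gap size in seconds, s for sex, a for age, p for pedestrian type, v for conflicting vehicle type), the abstract reports no coefficient estimates, and a categorical predictor with more than two levels would need one indicator term per level rather than the single term shown.

```latex
P(\text{cross} \mid g, s, a, p, v)
  = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 g + \beta_2 s + \beta_3 a + \beta_4 p + \beta_5 v)\bigr)}
```

In this notation, the reported "before" result corresponds to only \beta_1 being statistically significant, while in the "after" situation some of the remaining coefficients become significant as well.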

EWCBR '93 Selected papers from the First European Workshop on Topics in Case-Based Reasoning | 1993

Adaptation through Interpolation for Time-Critical Case-Based Reasoning

Niladri Chatterjee; John A. Campbell

The paper introduces and examines the relevance of the notion of “interpolation” between case features, to facilitate fast adaptation of existing cases to a current situation. When this situation is time-critical there is not enough time for exhaustive comparison of various aspects of all the stored cases, so it may not be possible to retrieve a high-quality match for a current problem within a specified time-limit. Viewing imperfect adaptation as a process of interpolation (or a set of possible processes with different qualities of interpolation) then gives a robust and novel perspective for time-critical reasoning, as well as being equally relevant for case-based reasoning (CBR) in general. Although interpolation-like adaptation techniques have been used in some existing CBR systems, they have not previously been treated explicitly from this perspective.

International Conference on Computational Linguistics | 2008

Discovering word senses from text using random indexing

Niladri Chatterjee; Shiwali Mohan

Random Indexing is a novel technique for dimensionality reduction while creating a Word Space model from a given text. This paper explores the possible application of Random Indexing in discovering word senses from text. The words appearing in the text are plotted onto a multi-dimensional Word Space using Random Indexing. The geometric distance between words is used as an indicator of their semantic similarity. The soft Clustering by Committee (CBC) algorithm has been used to group similar words. The present work shows that the Word Space model can be used effectively to determine the similarity index required for clustering. The approach does not require parsers, lexicons or any other resources that are traditionally used in word sense disambiguation. The proposed approach has been applied to the TASA corpus and encouraging results have been obtained.

2013 International Symposium on Computational and Business Intelligence | 2013

Personality Traits Identification Using Rough Sets Based Machine Learning

Umang Gupta; Niladri Chatterjee

Prediction of a person's behavior from his or her traits has long been sought by cognitive scientists. Human traits are often embedded in one's writings. Although some work has been done on identification of traits from essays, very little work can be found on extracting personality traits from written texts. Psychological studies suggest that extraction and prediction of rules from data have long been pursued, and several methods have been proposed. In the present work we used rough sets to extract the rules for prediction of personality traits. Rough set theory is a comparatively recent method that has been effective in fields such as medicine and geology, and in other fields where intelligent decision making is required. Our experiments with rough sets in predicting personality traits produced encouraging results.
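
The core rough-set construction behind rule extraction is the pair of lower and upper approximations of a target class. The sketch below shows that textbook construction only. It is not the paper's pipeline, and the attribute names and labels used in the usage note are invented for illustration.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Lower and upper approximations of `target` under `attributes`.

    objects: dict mapping an object name to a dict of attribute values.
    target:  set of object names belonging to the class of interest.
    """
    # Objects with identical values on the chosen attributes are
    # indiscernible and fall into the same equivalence class.
    classes = defaultdict(set)
    for name, values in objects.items():
        key = tuple(values[attr] for attr in attributes)
        classes[key].add(name)
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the target
            lower |= members
        if members & target:    # class overlaps the target
            upper |= members
    return lower, upper
```

For example, with four hypothetical texts described by two made-up features, `approximations(texts, ["long_sentences", "first_person"], {"t1", "t3"})` returns the texts that certainly carry the trait (lower) and those that possibly do (upper). Certain rules are read off the lower approximation and possible rules off the upper one.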

Computer Speech & Language | 2015

Random Indexing and Modified Random Indexing based approach for extractive text summarization

Niladri Chatterjee; Pramod Kumar Sahoo

Random Indexing based extractive text summarization has already been proposed in the literature. This paper looks at the above technique in detail and proposes several improvements. The improvements concern both the formation of index (word) vectors of the document and the construction of context vectors by using convolution instead of addition on the index vectors. Experiments have been conducted using both angular and linear distances as metrics for proximity. As a consequence, three improved versions of the algorithm, viz. RISUM, RISUM+ and MRISUM, were obtained. These algorithms have been applied to DUC 2002 documents, and their comparative performance has been studied. Different ROUGE metrics have been used for performance evaluation. While RISUM and RISUM+ perform almost at par, MRISUM is found to outperform both RISUM and RISUM+ significantly. MRISUM also outperforms the LSA+TRM based summarization approach. The study reveals that all three Random Indexing based techniques proposed in this study produce consistent results when linear distance is used for measuring proximity.

International Conference on Information Technology | 2014

Discrete Differential Evolution for Text Summarization

Shweta Karwa; Niladri Chatterjee

The paper proposes a modified version of the Differential Evolution (DE) algorithm and an optimization criterion function for extractive text summarization applications. The cosine similarity measure has been used to cluster similar sentences based on a proposed criterion function designed for the text summarization problem, and important sentences from each cluster are selected to generate a summary of the document. The modified Differential Evolution model ensures integer state values and hence expedites the optimization as compared to the conventional DE approach. Experiments showed a 95.5% improvement in time in the Discrete DE approach over the conventional DE approach, while the precision and recall of extracted summaries remained comparable in all cases.
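
One simple way to keep DE state values integral is to round and clip the usual DE/rand/1 mutant, as sketched below. This is an assumption made for illustration: the abstract does not say how the paper enforces integer states, and the encoding (one cluster label per sentence) and the function name are hypothetical.

```python
import random


def discrete_mutation(population, target_idx, f, low, high, rng):
    """DE/rand/1 mutation with rounding and clipping to integers in [low, high].

    population: list of equal-length lists of integer cluster labels.
    f:          the DE scale factor applied to the difference vector.
    """
    # Pick three distinct individuals other than the target.
    candidates = [i for i in range(len(population)) if i != target_idx]
    a, b, c = (population[i] for i in rng.sample(candidates, 3))
    mutant = []
    for x, y, z in zip(a, b, c):
        value = round(x + f * (y - z))            # back to an integer state
        mutant.append(min(high, max(low, value)))  # stay within valid labels
    return mutant
```

Keeping every component a valid cluster label means each mutant can be evaluated directly by the criterion function without a separate decoding step.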

International Conference on Emerging Applications of Information Technology | 2012

Single document extractive text summarization using Genetic Algorithms

Niladri Chatterjee; Amol Mittal; Shubham Goyal

This paper presents an extraction-based single-document text summarization technique using Genetic Algorithms. A given document is represented as a weighted Directed Acyclic Graph. A fitness function is defined to mathematically express the quality of a summary in terms of some desired properties, such as topic relation, cohesion and readability. A Genetic Algorithm is designed to maximize this fitness function and obtain the corresponding summary by extracting the most important sentences. Results are compared with a couple of other existing text summarization methods, keeping the DUC2002 data as the benchmark and using the precision-recall evaluation technique. The initial results obtained seem promising and encouraging for future work in this area.

International Conference on Computing Theory and Applications | 2007

Semantic Integration of Heterogeneous Databases on the Web

Niladri Chatterjee; Madhav Krishna

The Web is replete with databases, many of which are modeled on the relational paradigm. Currently, the federated database technique is used extensively for simultaneously querying data from multiple databases. However, the effectiveness of such a technique is suspect when it comes to querying heterogeneous databases. Therefore, it becomes imperative to develop an efficient methodology for the semantic integration of heterogeneous online databases. This may be realized by defining a mapping from a relational database to a description that utilises the Resource Description Framework (RDF). Such a representation would be machine processable and would make the semantics expressed by databases more explicit, thereby facilitating their integration.
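
To make the idea of a relational-to-RDF mapping concrete, the sketch below turns one table row into N-Triples-style statements. It is a naive illustration and not the mapping defined in the paper: the URI scheme, the `http://example.org/` base and the function name are all assumptions, and every value is emitted as a plain string literal.

```python
def row_to_triples(table, primary_key, row, base="http://example.org/"):
    """Map one relational row to a list of N-Triples-style strings.

    The row becomes the subject, each non-key column a predicate,
    and each non-null cell value a string literal.
    """
    subject = f"<{base}{table}/{row[primary_key]}>"
    triples = []
    for column, value in row.items():
        if column == primary_key or value is None:
            continue  # the key is encoded in the subject; NULLs are skipped
        predicate = f"<{base}{table}#{column}>"
        # Escape backslashes and quotes so the literal stays well-formed.
        text = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} {predicate} "{text}" .')
    return triples
```

Once rows from different databases are expressed as triples, integration reduces to aligning the predicates and subjects that denote the same things, which is where the semantic work lies.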
