Harshinder Singh | Researchain

Publication

Featured researches published by Harshinder Singh.

Probability in the Engineering and Informational Sciences | 1998

The Reversed Hazard Rate Function

Henry W. Block; Thomas H. Savits; Harshinder Singh

In this paper we discuss some properties of the reversed hazard rate function. This function has been shown to be useful in the analysis of data in the presence of left censored observations. It is also natural in discussing lifetimes with reversed time scale. In fact, ordinary hazard rate functions are most useful for lifetimes, and reverse hazard rates are natural if the time scale is reversed. Mixing up these concepts can often, although not always, lead to anomalies. For example, one result gives that if the reversed hazard rate function is increasing, its interval of support must be (−∞, b) where b is finite. Consequently nonnegative random variables cannot have increasing reversed hazard rates. Because of this result some existing results in the literature on the reversed hazard rate ordering require modification. Reversed hazard rates are also important in the study of systems. Hazard rates have an affinity to series systems; reversed hazard rates seem more appropriate for studying parallel systems. Several results are given that demonstrate this. In studying systems, one problem is to relate derivatives of hazard rate functions and reversed hazard rate functions of systems to similar quantities for components. We give some results that address this. Finally, we carry out comparisons for k-out-of-n systems with respect to the reversed hazard rate ordering.
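
As a small numerical illustration of the affinity between reversed hazard rates and parallel systems: the reversed hazard rate of a lifetime with density f and distribution function F is f(t)/F(t). For a parallel system of independent components, the system distribution function is the product of the component distribution functions, so the system reversed hazard rate is the sum of the component ones. The sketch below checks this for two exponential components. The rates, the time point, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

def exp_cdf(t, rate):
    """Distribution function of an exponential lifetime."""
    return 1.0 - math.exp(-rate * t)

def exp_pdf(t, rate):
    """Density of an exponential lifetime."""
    return rate * math.exp(-rate * t)

def exp_reversed_hazard(t, rate):
    """Reversed hazard rate f(t) / F(t) of an exponential lifetime."""
    return exp_pdf(t, rate) / exp_cdf(t, rate)

def parallel_reversed_hazard(rates, t, h=1e-6):
    """Reversed hazard of a parallel system, as a numerical d/dt of log F_sys(t)."""
    def log_system_cdf(s):
        # F_sys(s) is the product of component CDFs (independent components).
        return sum(math.log(exp_cdf(s, r)) for r in rates)
    return (log_system_cdf(t + h) - log_system_cdf(t - h)) / (2.0 * h)

rates = [0.5, 2.0]  # illustrative component failure rates
t = 1.3
component_sum = sum(exp_reversed_hazard(t, r) for r in rates)
system_value = parallel_reversed_hazard(rates, t)
```

Here `component_sum` and `system_value` agree up to the error of the finite difference. The exponential reversed hazard rate is also decreasing in t, in line with the statement that nonnegative random variables cannot have increasing reversed hazard rates.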

International Symposium on Software Reliability Engineering | 2004

Robust prediction of fault-proneness by random forests

Lan Guo; Yan Ma; Bojan Cukic; Harshinder Singh

Accurate prediction of fault prone modules (a module is equivalent to a C function or a C++ method) in the software development process enables effective detection and identification of defects. Such prediction models are especially beneficial for large-scale systems, where verification experts need to focus their attention and resources on problem areas in the system under development. This paper presents a novel methodology for predicting fault prone modules, based on random forests. Random forests are an extension of decision tree learning. Instead of generating one decision tree, this methodology generates hundreds or even thousands of trees using subsets of the training data. The classification decision is obtained by voting. We applied random forests in five case studies based on NASA data sets. The prediction accuracy of the proposed methodology is generally higher than that achieved by logistic regression, discriminant analysis and the algorithms in two machine learning software packages, WEKA [I. H. Witten et al. (1999)] and See5. The difference in performance between the proposed methodology and the other methods is statistically significant. Further, the advantage of random forests in classification accuracy over the other methods is more pronounced on larger data sets.

Econometric Theory | 1994

Testing for Second-Order Stochastic Dominance of Two Distributions

Amarjot Kaur; B. L. S. Prakasa Rao; Harshinder Singh

A distribution function F is said to stochastically dominate another distribution function G in the second-order sense if $\int_{-\infty}^{x} F(u)\,du \le \int_{-\infty}^{x} G(u)\,du$ for all x. Second-order stochastic dominance plays an important role in economics, finance, and accounting. Here a statistical test has been constructed for the null hypothesis that the integrated distribution functions fail to be ordered at some x ∈ [a, b], against the alternative that they are strictly ordered for all x ∈ [a, b], where a and b are any two real numbers. The test has been shown to be consistent and has an upper bound α on the asymptotic size. The test is expected to be useful for comparing random prospects for risk averters.
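
For readers who want to see the dominance condition in computational form, the sketch below uses the standard definition: F dominates G in the second-order sense when the integral of F up to x is no larger than the integral of G up to x, for every x. For an empirical distribution function this integral equals the sample mean of max(x − X_i, 0). The code is only a descriptive check on a grid over an assumed interval [a, b]; it is not the test statistic of the paper, which has to account for sampling variability. The data and names are made up for illustration.

```python
def integrated_ecdf(sample, x):
    """Integral of the empirical CDF from -inf to x: mean of max(x - X_i, 0)."""
    return sum(max(x - v, 0.0) for v in sample) / len(sample)

def dominates_second_order(sample_f, sample_g, grid):
    """True if the F-sample's integrated ECDF is <= the G-sample's at every grid point."""
    return all(
        integrated_ecdf(sample_f, x) <= integrated_ecdf(sample_g, x)
        for x in grid
    )

# Illustrative data: the G-sample is a mean-preserving spread of the F-sample.
sample_f = [4.0, 5.0, 6.0]
sample_g = [1.0, 5.0, 9.0]

a, b = 0.0, 10.0  # assumed comparison interval [a, b]
grid = [a + 0.5 * i for i in range(int((b - a) / 0.5) + 1)]

f_over_g = dominates_second_order(sample_f, sample_g, grid)
g_over_f = dominates_second_order(sample_g, sample_f, grid)
```

With these samples the less dispersed prospect dominates the more dispersed one and not the other way round, which is the ordering a risk averter would be expected to prefer.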

International Symposium on Software Reliability Engineering | 2001

A Bayesian approach to reliability prediction and assessment of component based systems

Harshinder Singh; Vittorio Cortellessa; Bojan Cukic; Erdogan Gunel; Vijayanand Bharadwaj

It is generally believed that component-based software development leads to improved application quality, maintainability and reliability. However, most software reliability techniques model integrated systems. These models disregard the system's internal structure, taking into account only the failure data and interactions with the environment. We propose a novel approach to reliability analysis of component-based systems. The reliability prediction algorithm allows system architects to analyze the reliability of the system before it is built, taking into account component reliability estimates and their anticipated usage. Fully integrated with the UML, this step can guide the process of identifying critical components and analyzing the effect of replacing them with more or less reliable ones. The reliability assessment algorithm, applicable in the system test phase, uses these reliability predictions as prior probabilities. In the Bayesian estimation framework, the posterior probability of failure is calculated from the priors and test failure data.
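
To make the step from prior to posterior concrete, here is a minimal sketch that assumes a Beta prior on the probability of failure per test execution and independent pass/fail test outcomes. The abstract does not state the prior family or the likelihood, so the conjugate Beta-binomial model, the function names, and the numbers below are all assumptions made for illustration.

```python
def beta_posterior(prior_alpha, prior_beta, failures, tests):
    """Conjugate update: Beta(a, b) prior + binomial data -> Beta(a + f, b + n - f)."""
    if not 0 <= failures <= tests:
        raise ValueError("failures must be between 0 and tests")
    return prior_alpha + failures, prior_beta + (tests - failures)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Hypothetical prior with mean failure probability 0.01,
# standing in for a design-time reliability prediction.
prior = (1.0, 99.0)

# Hypothetical test campaign: 400 executions, no failures observed.
posterior = beta_posterior(*prior, failures=0, tests=400)

prior_mean = beta_mean(*prior)
posterior_mean = beta_mean(*posterior)
```

Under these assumptions, failure-free testing moves the expected probability of failure from 0.01 down to 0.002, and the prior acts like 100 earlier test executions with one failure.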

American Journal of Mathematical and Management Sciences | 2003

Nearest Neighbor Estimates of Entropy

Harshinder Singh; Neeraj Misra; Vladimir Hnizdo; Adam Fedorowicz; Eugene Demchuk

Motivated by the problems in molecular sciences, we introduce new nonparametric estimators of entropy which are based on the kth nearest neighbor distances between the n sample points, where k (< n – 1) is a fixed positive integer. These provide competing estimators to an estimator proposed by Kozachenko and Leonenko (1987), which is based on the first nearest neighbor distances of the sample points. These estimators are helpful in the evaluation of entropies of random vectors. We establish the asymptotic unbiasedness and consistency of the proposed estimators. For some standard distributions, we also investigate their performance for finite sample sizes using Monte Carlo simulations. The proposed estimators are applied to estimate the entropy of internal rotation in the methanol molecule, which can be characterized by a one-dimensional random vector, and of diethyl ether, which is described by a four-dimensional random vector.
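
The sketch below shows one common form of a kth nearest neighbor entropy estimator: the average log distance to the kth neighbor, scaled by the dimension d, plus the log volume of the unit d-ball, plus ln n, minus the digamma function at k. This is the form usually quoted for this family of estimators; the exact constants used in the paper (for example ln n versus ln(n − 1)) should be checked against the paper itself. The brute-force neighbor search, the sample size, and the choice k = 4 are illustrative assumptions.

```python
import math
import random

def knn_entropy(points, k=1):
    """k-th nearest-neighbor entropy estimate (in nats) for a list of d-dimensional tuples."""
    n = len(points)
    d = len(points[0])
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    # Log volume of the unit ball in d dimensions: pi^(d/2) / Gamma(d/2 + 1).
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    # Digamma at a positive integer: psi(k) = -gamma + sum_{j=1}^{k-1} 1/j.
    euler_gamma = 0.5772156649015329
    psi_k = -euler_gamma + sum(1.0 / j for j in range(1, k))
    log_dist_sum = 0.0
    for i, p in enumerate(points):
        # Brute-force distances from point i to every other point.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # Duplicate points would give a zero distance and log(0); assumed absent here.
        log_dist_sum += math.log(dists[k - 1])
    return d * log_dist_sum / n + log_unit_ball + math.log(n) - psi_k

rng = random.Random(0)
sample = [(rng.gauss(0.0, 1.0),) for _ in range(500)]  # one-dimensional N(0, 1) sample

estimate = knn_entropy(sample, k=4)
exact = 0.5 * math.log(2.0 * math.pi * math.e)  # entropy of N(0, 1) in nats
```

For a sample of this size the estimate should land close to the exact Gaussian entropy. The brute-force search costs O(n²) distance evaluations, so a spatial index would be the natural replacement for larger samples or higher dimensions.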

Journal of Computational Chemistry | 2007

Nearest-neighbor nonparametric method for estimating the configurational entropy of complex molecules

Vladimir Hnizdo; Eva Darian; Adam Fedorowicz; Eugene Demchuk; Shengqiao Li; Harshinder Singh

A method for estimating the configurational (i.e., non‐kinetic) part of the entropy of internal motion in complex molecules is introduced that does not assume any particular parametric form for the underlying probability density function. It is based on the nearest‐neighbor (NN) distances of the points of a sample of internal molecular coordinates obtained by a computer simulation of a given molecule. As the method does not make any assumptions about the underlying potential energy function, it accounts fully for any anharmonicity of internal molecular motion. It provides an asymptotically unbiased and consistent estimate of the configurational part of the entropy of the internal degrees of freedom of the molecule. The NN method is illustrated by estimating the configurational entropy of internal rotation of capsaicin and two stereoisomers of tartaric acid, and by providing a much closer upper bound on the configurational entropy of internal rotation of a pentapeptide molecule than that obtained by the standard quasi‐harmonic method. As a measure of dependence between any two internal molecular coordinates, a general coefficient of association based on the information‐theoretic quantity of mutual information is proposed. Using NN estimates of this measure, statistical clustering procedures can be employed to group the coordinates into clusters of manageable dimensions and characterized by minimal dependence between coordinates belonging to different clusters.

IEEE Transactions on Information Forensics and Security | 2006

Performance analysis of iris-based identification system at the matching score level

Natalia A. Schmid; Manasi V. Ketkar; Harshinder Singh; Bojan Cukic

Practical iris-based identification systems are easily accessible for data collection at the matching score level. In a typical setting, a video camera is used to collect a single frontal view image of good quality. The image is then preprocessed, encoded, and compared with all entries in the biometric database, resulting in a single highest matching score. In this paper, we assume that multiple scans from the same iris are available and design the decision rules based on this assumption. We consider the cases where vectors of matching scores may be described by a Gaussian model with dependent components under both genuine and imposter hypotheses. Two test statistics are designed: the plug-in log-likelihood ratio and the average Hamming distance. We further analyze the performance of filter-based iris recognition systems. The model fit is verified using the Shapiro-Wilk test for normality. We show that the log-likelihood ratio with well-estimated maximum-likelihood parameters often outperforms the average Hamming distance statistic. The problem of identification with M iris classes is further stated as an (M+1)-ary hypothesis testing problem. We use an empirical approach, the Chernoff bound, and a large deviations approach to predict the performance of the iris-based identification system. The bound on the probability of error is evaluated as a function of the number of classes and the number of iris scans per class.

Automated Software Engineering | 2003

Predicting fault prone modules by the Dempster-Shafer belief networks

Lan Guo; Bojan Cukic; Harshinder Singh

This paper describes a novel methodology for predicting fault prone modules. The methodology is based on Dempster-Shafer (D-S) belief networks. Our approach consists of three steps: first, building the D-S network by the induction algorithm; second, selecting the predictors (attributes) by the logistic procedure; third, feeding the predictors describing the modules of the current project into the induced D-S network and identifying fault prone modules. We applied this methodology to a NASA dataset. The prediction accuracy of our methodology is higher than that achieved by logistic regression or discriminant analysis on the same dataset.

Journal of Chemical Information and Modeling | 2005

Application of the random forest method in studies of local lymph node assay based skin sensitization data

Shengqiao Li; Adam Fedorowicz; Harshinder Singh; Sidney Soderholm

The random forest and classification tree modeling methods are used to build predictive models of the skin sensitization activity of a chemical. A new two-stage backward elimination algorithm for descriptor selection in the random forest method is introduced. The predictive performance of the random forest model was maximized by tuning voting thresholds to reflect the unbalanced size of classification groups in available data. Our results show that the random forest with the proposed backward elimination procedure outperforms a single classification tree and the standard random forest method in predicting Local Lymph Node Assay based skin sensitization activity. The proximity measure obtained from the random forest is a natural similarity measure that can be used for clustering of chemicals. Based on this measure, the clustering analysis partitioned the chemicals into several groups sharing similar molecular patterns. The improved random forest method demonstrates the potential for future QSAR studies based on a large number of descriptors or when the number of available data points is limited.
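
The idea of tuning a voting threshold for unbalanced classes can be sketched as follows. A forest normally assigns the class that wins the majority of tree votes, which corresponds to a cutoff of 0.5 on the fraction of trees voting for the positive class; lowering the cutoff favors a rare positive class. The vote fractions, labels, and the cutoff 0.35 below are hypothetical values chosen for illustration and do not come from the paper.

```python
def classify_by_votes(vote_fractions, threshold=0.5):
    """Label a sample positive when the fraction of trees voting positive reaches the threshold."""
    return [1 if frac >= threshold else 0 for frac in vote_fractions]

def sensitivity(labels, predictions):
    """Fraction of truly positive samples that are predicted positive."""
    hits = [p for y, p in zip(labels, predictions) if y == 1]
    return sum(hits) / len(hits)

# Hypothetical vote fractions for 8 chemicals; the positive class (label 1)
# is taken to be the minority.
votes = [0.10, 0.20, 0.15, 0.30, 0.05, 0.25, 0.40, 0.60]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

default_pred = classify_by_votes(votes)                # plain majority vote
tuned_pred = classify_by_votes(votes, threshold=0.35)  # lower cutoff for the rare class

default_sensitivity = sensitivity(labels, default_pred)
tuned_sensitivity = sensitivity(labels, tuned_pred)
```

In this toy data the majority vote misses one of the two positives, while the lowered cutoff recovers it without adding false positives. In practice the cutoff would be chosen on held-out or out-of-bag predictions, since a lower threshold generally trades specificity for sensitivity.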

Communications in Statistics - Theory and Methods | 1989

Relations for reliability measures of weighted distributions

Kanchan Jain; Harshinder Singh; Isha Bagai

Reliability measures of the weighted distribution of a life distribution have been derived. Sufficient conditions on the weight function have been obtained for the weighted distribution of an IFR distribution to be IFR. Length-biased and equilibrium distributions have been discussed as weighted distributions in the reliability context.
