Iftekhar Naim
University of Rochester
Publication
Featured research published by Iftekhar Naim.
Cytometry Part A | 2014
Iftekhar Naim; Suprakash Datta; Jonathan Rebhahn; James S. Cavenaugh; Tim R. Mosmann; Gaurav Sharma
We present a model‐based clustering method, SWIFT (Scalable Weighted Iterative Flow‐clustering Technique), for digesting high‐dimensional large‐sized datasets obtained via modern flow cytometry into more compact representations that are well‐suited for further automated or manual analysis. Key attributes of the method include the following: (a) the analysis is conducted in the multidimensional space retaining the semantics of the data, (b) an iterative weighted sampling procedure is utilized to maintain modest computational complexity and to retain discrimination of extremely small subpopulations (hundreds of cells from datasets containing tens of millions), and (c) a splitting and merging procedure is incorporated in the algorithm to preserve distinguishability between biologically distinct populations, while still providing a significant compaction relative to the original data. This article presents a detailed algorithmic description of SWIFT, outlining the application‐driven motivations for the different design choices, a discussion of computational complexity of the different steps, and results obtained with SWIFT for synthetic data and relatively simple experimental data that allow validation of the desirable attributes. A companion paper (Part 2) highlights the use of SWIFT, in combination with additional computational tools, for more challenging biological problems.
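The mixture-modeling core of SWIFT can be illustrated with a plain expectation-maximization fit of a one-dimensional Gaussian mixture. This is a minimal sketch for intuition only, not SWIFT's weighted-sampling implementation, and the function names are hypothetical:

```python
import math

def pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def em_gmm_1d(xs, mus, sigmas, pis, n_iter=40):
    """Fit a 1-D Gaussian mixture by expectation-maximization."""
    k = len(mus)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in xs:
            ps = [pis[j] * pdf(x, mus[j], sigmas[j]) for j in range(k)]
            s = sum(ps)
            resp.append([p / s for p in ps])
        # M-step: re-estimate parameters from the responsibilities
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)
            pis[j] = nj / len(xs)
    return mus, sigmas, pis
```

SWIFT avoids running EM over the full dataset at once; its iterative weighted sampling instead extracts components successively on weighted subsamples, which is what keeps the complexity modest on tens of millions of events.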
Cytometry Part A | 2014
Tim R. Mosmann; Iftekhar Naim; Jonathan Rebhahn; Suprakash Datta; James S. Cavenaugh; Jason M. Weaver; Gaurav Sharma
A multistage clustering and data processing method, SWIFT (detailed in a companion manuscript), has been developed to detect rare subpopulations in large, high‐dimensional flow cytometry datasets. An iterative sampling procedure initially fits the data to multidimensional Gaussian distributions, then splitting and merging stages use a criterion of unimodality to optimize the detection of rare subpopulations, to converge on a consistent cluster number, and to describe non‐Gaussian distributions. Probabilistic assignment of cells to clusters, visualization, and manipulation of clusters by their cluster medians, facilitate application of expert knowledge using standard flow cytometry programs. The dual problems of rigorously comparing similar complex samples, and enumerating absent or very rare cell subpopulations in negative controls, were solved by assigning cells in multiple samples to a cluster template derived from a single or combined sample. Comparison of antigen‐stimulated and control human peripheral blood cell samples demonstrated that SWIFT could identify biologically significant subpopulations, such as rare cytokine‐producing influenza‐specific T cells. A sensitivity of better than one part per million was attained in very large samples. Results were highly consistent on biological replicates, yet the analysis was sensitive enough to show that multiple samples from the same subject were more similar than samples from different subjects. A companion manuscript (Part 1) details the algorithmic development of SWIFT.
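The cluster-template idea, assigning cells from multiple samples to a fixed set of cluster medians so that similar samples can be compared rigorously, can be sketched as a nearest-median assignment. A simplified illustration, not the SWIFT implementation:

```python
def assign_to_template(events, medians):
    """Assign each event to its nearest cluster median (squared Euclidean)."""
    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(range(len(medians)), key=lambda k: sqdist(e, medians[k]))
            for e in events]

def cluster_counts(events, medians):
    """Per-cluster event counts for one sample, for cross-sample comparison."""
    counts = [0] * len(medians)
    for k in assign_to_template(events, medians):
        counts[k] += 1
    return counts
```

Because every sample is scored against the same template, a cluster that is populated in a stimulated sample but empty in the negative control can be enumerated directly, which is how absent or very rare subpopulations become comparable.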
IEEE International Conference on Automatic Face & Gesture Recognition | 2015
Iftekhar Naim; M. Iftekhar Tanveer; Daniel Gildea; Mohammed E. Hoque
Ever wondered why you have been rejected from a job despite being a qualified candidate? What went wrong? In this paper, we provide a computational framework to quantify human behavior in the context of job interviews. We build a model by analyzing 138 recorded interview videos (total duration of 10.5 hours) of 69 internship-seeking students from the Massachusetts Institute of Technology (MIT) as they spoke with professional career counselors. Our automated analysis includes facial expressions (e.g., smiles, head gestures), language (e.g., word counts, topic modeling), and prosodic information (e.g., pitch, intonation, pauses) of the interviewees. We derive the ground truth labels by averaging over the ratings of 9 independent judges. Our framework automatically predicts the ratings for interview traits such as excitement, friendliness, and engagement with correlation coefficients of 0.73 or higher, and quantifies the relative importance of prosody, language, and facial expressions. According to our framework, candidates should speak more fluently, use fewer filler words, speak as “we” (vs. “I”), use more unique words, and smile more.
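The evaluation pipeline boils down to two simple computations: the ground-truth label is the mean of the judges' ratings, and the reported numbers are correlation coefficients between predicted and judged ratings. A minimal sketch, with hypothetical function names:

```python
import math

def consensus_rating(judge_scores):
    """Ground-truth label for one trait: the mean of the judges' ratings."""
    return sum(judge_scores) / len(judge_scores)

def pearson_r(preds, truths):
    """Pearson correlation between predicted and judged ratings."""
    n = len(preds)
    mp, mt = sum(preds) / n, sum(truths) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(preds, truths))
    sp = math.sqrt(sum((p - mp) ** 2 for p in preds))
    st = math.sqrt(sum((t - mt) ** 2 for t in truths))
    return cov / (sp * st)
```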
North American Chapter of the Association for Computational Linguistics | 2015
Iftekhar Naim; Young Chol Song; Qiguang Liu; Liang Huang; Henry A. Kautz; Jiebo Luo; Daniel Gildea
We address the problem of automatically aligning natural language sentences with corresponding video segments without any direct supervision. Most existing algorithms for integrating language with videos rely on hand-aligned parallel data, where each natural language sentence is manually aligned with its corresponding image or video segment. Recently, fully unsupervised alignment of text with video has been shown to be feasible using hierarchical generative models. In contrast to the previous generative models, we propose three latent-variable discriminative models for the unsupervised alignment task. The proposed discriminative models are capable of incorporating domain knowledge, by adding diverse and overlapping features. The results show that discriminative models outperform the generative models in terms of alignment accuracy.
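The appeal of the discriminative formulation is that arbitrary overlapping indicator features can be combined in a single weighted score for each sentence-segment pair. A toy sketch of such feature-based scoring (not the paper's latent-variable training, which learns the weights without supervision; all names and data here are made up):

```python
def score(sentence_feats, segment_feats, weights):
    """Log-linear score: weighted sum over the features the pair shares."""
    shared = set(sentence_feats) & set(segment_feats)
    return sum(weights.get(f, 0.0) for f in shared)

def align(sentences, segments, weights):
    """Latent alignment: each sentence picks its highest-scoring segment."""
    return [max(range(len(segments)),
                key=lambda j: score(s, segments[j], weights))
            for s in sentences]
```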
International Conference on Acoustics, Speech, and Signal Processing | 2010
Iftekhar Naim; Suprakash Datta; Gaurav Sharma; James S. Cavenaugh; Tim R. Mosmann
Flow cytometry (FC) is a powerful technology for rapid multivariate analysis and functional discrimination of cells. Current FC platforms generate large, high-dimensional datasets which pose a significant challenge for traditional manual bivariate analysis. Automated multivariate clustering, though highly desirable, is also stymied by the critical requirement of identifying rare populations that form rather small clusters, in addition to the computational challenges posed by the large size and dimensionality of the datasets. In this paper, we address these twin challenges by developing a two-stage scalable multivariate parametric clustering algorithm. In the first stage, we model the data as a mixture of Gaussians and use an iterative weighted sampling technique to estimate the mixture components successively in order of decreasing size. In the second stage, we apply a graph-based hierarchical merging technique to combine Gaussian components with significant overlaps into the final number of desired clusters. The resulting algorithm offers a reduction in complexity over conventional mixture modeling while simultaneously allowing for better detection of small populations. We demonstrate the effectiveness of our method both on simulated data and actual flow cytometry datasets.
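The second-stage merging can be sketched with a union-find pass over one-dimensional components, merging any pair whose means lie within a multiple of their combined spreads. This is a deliberately crude overlap criterion for illustration; the paper's graph-based merge uses a more principled overlap measure:

```python
def merge_overlapping(components, c=1.0):
    """Union-find merge of 1-D Gaussian components (mu, sigma) whose means
    lie within c * (sigma_i + sigma_j) of each other."""
    parent = list(range(len(components)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, (mi, si) in enumerate(components):
        for j, (mj, sj) in enumerate(components):
            if i < j and abs(mi - mj) < c * (si + sj):
                parent[find(i)] = find(j)

    # group component indices by their union-find root
    groups = {}
    for i in range(len(components)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```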
Communications of the ACM | 2017
Walter S. Lasecki; Christopher D. Miller; Iftekhar Naim; Raja S. Kushalnagar; Adam Sadilek; Daniel Gildea; Jeffrey P. Bigham
Quickly converting speech to text allows deaf and hard of hearing people to interactively follow along with live speech. Doing so reliably requires a combination of perception, understanding, and speed that neither humans nor machines possess alone. In this article, we discuss how our Scribe system combines human labor and machine intelligence in real time to reliably convert speech to text with less than 4s latency. To achieve this speed while maintaining high accuracy, Scribe integrates automated assistance in two ways. First, its user interface directs workers to different portions of the audio stream, slows down the portion they are asked to type, and adaptively determines segment length based on typing speed. Second, it automatically merges the partial input of multiple workers into a single transcript using a custom version of multiple-sequence alignment. Scribe illustrates the broad potential for deeply interleaving human labor and machine intelligence to provide intelligent interactive services that neither can currently achieve alone.
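Merging the partial input of multiple workers resembles classic sequence alignment. A toy merge of two partial transcripts via a longest-common-subsequence alignment illustrates the idea; Scribe's custom multiple-sequence alignment is more general and merges more than two streams in real time:

```python
def lcs_table(a, b):
    """Suffix LCS lengths: T[i][j] = LCS length of a[i:] and b[j:]."""
    n, m = len(a), len(b)
    T = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            T[i][j] = T[i + 1][j + 1] + 1 if a[i] == b[j] \
                else max(T[i + 1][j], T[i][j + 1])
    return T

def merge_transcripts(a, b):
    """Merge two partial transcripts: keep shared words once, in order,
    and interleave the words unique to each worker."""
    T, out, i, j = lcs_table(a, b), [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif T[i + 1][j] >= T[i][j + 1]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]
```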
IEEE Transactions on Affective Computing | 2018
Iftekhar Naim; Mohammad Iftekhar Tanveer; Daniel Gildea; Mohammed E. Hoque
We present a computational framework for automatically quantifying verbal and nonverbal behaviors in the context of job interviews. The proposed framework is trained by analyzing the videos of 138 interview sessions with 69 internship-seeking undergraduates at the Massachusetts Institute of Technology (MIT). Our automated analysis includes facial expressions (e.g., smiles, head gestures, facial tracking points), language (e.g., word counts, topic modeling), and prosodic information (e.g., pitch, intonation, and pauses) of the interviewees. The ground truth labels are derived by taking a weighted average over the ratings of nine independent judges. Our framework can automatically predict the ratings for interview traits such as excitement, friendliness, and engagement with correlation coefficients of 0.70 or higher, and can quantify the relative importance of prosody, language, and facial expressions. By analyzing the relative feature weights learned by the regression models, our framework recommends to speak more fluently, use fewer filler words, speak as “we” (versus “I”), use more unique words, and smile more. We also find that the students who were rated highly while answering the first interview question were also rated highly overall (i.e., first impression matters). Finally, our MIT Interview dataset is available to other researchers to further validate and expand our findings.
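Two pieces of this pipeline are easy to sketch: the weighted-average ground truth over the nine judges, and a linear regression whose learned weights indicate relative feature importance. A minimal illustration under simplifying assumptions (the judge weights and features below are made up, and the paper's regression models are more sophisticated):

```python
def weighted_consensus(ratings, judge_weights):
    """Ground truth: weighted average over the independent judges' ratings."""
    total = sum(judge_weights)
    return sum(r * w for r, w in zip(ratings, judge_weights)) / total

def fit_linear(X, y, lr=0.05, epochs=2000):
    """Least-squares regression by stochastic gradient descent. With features
    on comparable scales, the magnitude of each learned weight hints at that
    feature's relative importance."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sum(wk * xk for wk, xk in zip(w, xi)) - yi
            w = [wk - lr * err * xk for wk, xk in zip(w, xi)]
    return w
```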
Ubiquitous Computing | 2016
Mohammad Rafayet Ali; Facundo Ciancio; Ru Zhao; Iftekhar Naim; Mohammed E. Hoque
We present an automated interface, ROC Comment, for generating natural language comments on behavioral videos. We focus on the domain of public speaking, which many people consider their greatest fear. We collect a dataset of 196 public speaking videos from 49 individuals and gather 12,173 comments, generated by more than 500 independent human judges. We then train a k-Nearest-Neighbor (k-NN) based model by extracting prosodic (e.g., volume) and facial (e.g., smiles) features. Given a new video, we extract features and select the closest comments using the k-NN model. We further filter the comments by clustering them with DBSCAN and eliminating the outliers. An evaluation of our system with 30 participants concludes that while the generated comments are helpful, there is room for improvement in further personalizing them. Our model has been deployed online, allowing individuals to upload their videos and receive open-ended and interpretative comments. Our system is available at http://tinyurl.com/roccomment.
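The retrieval step can be sketched as a k-NN lookup over (feature vector, comments) pairs; the DBSCAN-based outlier filtering is omitted here, and the names and data are hypothetical:

```python
def knn_comments(query_feats, videos, k=3):
    """Return the comments attached to the k training videos whose prosodic
    and facial feature vectors are nearest to the query video's features.
    `videos` is a list of (feature_vector, comments) pairs."""
    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    ranked = sorted(videos, key=lambda fv: sqdist(query_feats, fv[0]))
    out = []
    for _, comments in ranked[:k]:
        out.extend(comments)
    return out
```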
Computational Linguistics | 2018
Iftekhar Naim; Parker Riley; Daniel Gildea
Orthographic similarities across languages provide a strong signal for unsupervised probabilistic transduction (decipherment) for closely related language pairs. The existing decipherment models, however, are not well suited for exploiting these orthographic similarities. We propose a log-linear model with latent variables that incorporates orthographic similarity features. Maximum likelihood training is computationally expensive for the proposed log-linear model. To address this challenge, we perform approximate inference via Markov chain Monte Carlo sampling and contrastive divergence. Our results show that the proposed log-linear model with contrastive divergence outperforms the existing generative decipherment models by exploiting the orthographic features. The model both scales to large vocabularies and preserves accuracy in low- and no-resource contexts.
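An orthographic similarity feature of the kind the model exploits can be as simple as a normalized edit distance between a source word and a candidate translation. A sketch; the paper's feature set is richer than this single score:

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a single rolling row."""
    m = len(b)
    d = list(range(m + 1))
    for i in range(1, len(a) + 1):
        prev, d[0] = d[0], i
        for j in range(1, m + 1):
            prev, d[j] = d[j], min(d[j] + 1,       # deletion
                                   d[j - 1] + 1,   # insertion
                                   prev + (a[i - 1] != b[j - 1]))  # substitution
    return d[m]

def orthographic_feature(src, tgt):
    """Similarity in [0, 1]: 1 for identical spellings, lower as they diverge."""
    return 1.0 - edit_distance(src, tgt) / max(len(src), len(tgt))
```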
International Conference on Pattern Recognition | 2016
Iftekhar Naim; Abdullah Al Mamun; Young Chol Song; Jiebo Luo; Henry A. Kautz; Daniel Gildea
Scripts provide rich textual annotation of movies, including dialogs, character names, and other situational descriptions. Exploiting such rich annotations requires aligning the sentences in the scripts with the corresponding video frames. Previous work on aligning movies with scripts predominantly relies on time-aligned closed-captions or subtitles, which are not always available. In this paper, we focus on automatically aligning faces in movies with their corresponding character names in scripts without requiring closed-captions/subtitles. We utilize the intuition that faces in a movie generally appear in the same sequential order as their names are mentioned in the script. We first apply standard techniques for face detection and tracking, and cluster similar face tracks together. Next, we apply a generative Hidden Markov Model (HMM) and a discriminative Latent Conditional Random Field (LCRF) to align the clusters of face tracks with the corresponding character names. Our alignment models (especially LCRF) significantly outperform the previous state-of-the-art on two different movie datasets and for a wide range of face clustering algorithms.
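The core intuition, that faces appear on screen in roughly the same order as their names are mentioned in the script, can be captured by a simple monotone-alignment dynamic program over a track-to-name similarity matrix. This is a simplified sketch, not the paper's HMM or LCRF:

```python
def monotone_align(sim):
    """sim[i][j] scores matching the i-th face track (in screen order) to the
    j-th name mention (in script order). Returns the highest-scoring
    assignment whose name indices never decrease across tracks."""
    n, m = len(sim), len(sim[0])
    NEG = float("-inf")
    best = [[NEG] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for j in range(m):
        best[0][j] = sim[0][j]
    for i in range(1, n):
        for j in range(m):
            for pj in range(j + 1):  # previous track used an earlier or same name
                cand = best[i - 1][pj] + sim[i][j]
                if cand > best[i][j]:
                    best[i][j], back[i][j] = cand, pj
    # trace back from the best final cell
    j = max(range(m), key=lambda j: best[n - 1][j])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    return path[::-1]
```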