Brian S. Helfer
Massachusetts Institute of Technology
Publications
Featured research published by Brian S. Helfer.
ACM Multimedia | 2013
James R. Williamson; Thomas F. Quatieri; Brian S. Helfer; Rachelle Horwitz; Bea Yu; Daryush D. Mehta
In Major Depressive Disorder (MDD), neurophysiologic changes can alter motor control [1, 2] and therefore alter speech production by influencing the characteristics of the vocal source, tract, and prosodics. Clinically, many of these characteristics are associated with psychomotor retardation, where a patient shows sluggishness and motor disorder in vocal articulation, affecting coordination across multiple aspects of production [3, 4]. In this paper, we exploit such effects by selecting features that reflect changes in coordination of vocal tract motion associated with MDD. Specifically, we investigate changes in correlation that occur at different time scales across formant frequencies and also across channels of the delta-mel-cepstrum. Both feature domains provide measures of coordination in vocal tract articulation while reducing effects of a slowly-varying linear channel, which can be introduced by time-varying microphone placements. With these two complementary feature sets, using the AVEC 2013 depression dataset, we design a novel Gaussian mixture model (GMM)-based multivariate regression scheme, referred to as Gaussian Staircase Regression, that provides a root-mean-squared-error (RMSE) of 7.42 and a mean-absolute-error (MAE) of 5.75 on the standard Beck depression rating scale. We are currently exploring coordination measures of other aspects of speech production, derived from both audio and video signals.
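The channel-delay correlation idea behind these features can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name, the delay scales, and the choice of four delayed copies per channel are all assumptions made for the example.

```python
import numpy as np

def delay_correlation_features(tracks, delays=(1, 3, 7, 15)):
    """Multi-scale correlation features from parallel signal channels.

    tracks: (n_channels, n_frames) array, e.g. formant-frequency tracks
    or delta-mel-cepstral channels.
    For each delay scale, stack time-delayed copies of every channel,
    compute their correlation matrix, and summarize it by its sorted
    eigenvalue spectrum.
    """
    feats = []
    n_ch, n_fr = tracks.shape
    n_copies = 4  # delayed copies per channel at each scale (assumed)
    for d in delays:
        span = d * (n_copies - 1)
        # Embedded matrix: (n_ch * n_copies, n_fr - span)
        emb = np.vstack([tracks[:, i * d : n_fr - span + i * d]
                         for i in range(n_copies)])
        corr = np.corrcoef(emb)
        feats.append(np.sort(np.linalg.eigvalsh(corr))[::-1])
    return np.concatenate(feats)
```

The eigenvalue spectra characterize how tightly the channels co-vary at each time scale; changes in articulatory coordination redistribute mass across this spectrum, which is what the downstream regression exploits.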
Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge | 2014
James R. Williamson; Thomas F. Quatieri; Brian S. Helfer; Gregory Ciccarelli; Daryush D. Mehta
In individuals with major depressive disorder, neurophysiological changes often alter motor control and thus affect the mechanisms controlling speech production and facial expression. These changes are typically associated with psychomotor retardation, a condition marked by slowed neuromotor output that is behaviorally manifested as altered coordination and timing across multiple motor-based properties. Changes in motor outputs can be inferred from vocal acoustics and facial movements as individuals speak. We derive novel multi-scale correlation structure and timing feature sets from audio-based vocal features and video-based facial action units from recordings provided by the 4th International Audio/Video Emotion Challenge (AVEC). The feature sets enable detection of changes in coordination, movement, and timing of vocal and facial gestures that are potentially symptomatic of depression. Combining complementary features in Gaussian mixture model and extreme learning machine classifiers, our multivariate regression scheme predicts Beck depression inventory ratings on the AVEC test set with a root-mean-square error of 8.12 and mean absolute error of 6.31. Future work calls for continued study into detection of neurological disorders based on altered coordination and timing across audio and video modalities.
Wearable and Implantable Body Sensor Networks | 2013
Rachelle Horwitz; Thomas F. Quatieri; Brian S. Helfer; Bea Yu; James R. Williamson; James C. Mundt
In Major Depressive Disorder (MDD), neurophysiologic changes can alter motor control [1, 2] and therefore alter speech production by influencing vocal fold motion (source), the vocal tract (system), and melody (prosody). In this paper, we use a database of voice recordings from 28 depressed subjects treated over a 6-week period [3] to compare correlations between features from each of the three speech-production components and clinical assessments of MDD. Toward biomarkers for audio-based continuous monitoring of depression severity, we explore the contextual dependence of these correlations with free-response and read speech, and show tradeoffs across categories of features in these two example contexts. We also investigate the context and speech-component dependence of correlations between our vocal features and assessments of individual symptoms of MDD (e.g., depressed mood, agitation, energy). Finally, motivated by our initial findings, we describe how context may be useful in “on-body” monitoring of MDD to facilitate identification of depression and evaluation of its treatment.
bioRxiv | 2018
Brian S. Helfer; Darrell O. Ricke
High-throughput sequencing (HTS) of single nucleotide polymorphisms (SNPs) enables additional applications for DNA forensics, including identification, mixture analysis, kinship prediction, and biogeographic ancestry prediction. Public repositories of human genetic data are being rapidly generated and released, but the majority of these samples are de-identified to protect privacy and have little or no individual metadata such as appearance (photos), ethnicity, or relatives. A reference in silico dataset has been generated to enable development and testing of new DNA forensics algorithms. This dataset provides 11 million SNP profiles for individuals with defined ethnicities and family relationships spanning eight generations with admixture, on a panel of 39,108 SNPs.
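The core of such a pedigree simulation is Mendelian transmission at independent biallelic loci, which can be sketched as below. This is an illustrative sketch only; the function, the allele-frequency range, and the founder setup are assumptions, not the generator actually used for the dataset.

```python
import numpy as np

def simulate_child(mother, father, rng):
    """Mendelian inheritance at independent biallelic SNPs.

    Genotypes are coded as minor-allele counts (0, 1, or 2).
    Each parent transmits one allele per locus; a heterozygous locus
    (genotype 1) transmits the minor allele with probability 1/2.
    """
    def gamete(genotype):
        # P(transmit minor allele) = genotype / 2 for codes 0, 1, 2
        return rng.random(genotype.shape) < genotype / 2.0
    return gamete(mother).astype(int) + gamete(father).astype(int)

rng = np.random.default_rng(7)
n_snps = 39_108                        # panel size from the paper
maf = rng.uniform(0.05, 0.5, n_snps)   # hypothetical minor-allele freqs
founders = rng.binomial(2, maf, size=(2, n_snps))  # two founder profiles
child = simulate_child(founders[0], founders[1], rng)
```

Iterating this step over a family tree yields multi-generation pedigrees; admixture can be modeled by drawing founder allele frequencies from population-specific distributions.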
Computer Speech & Language | 2018
James R. Williamson; Diana Young; Andrew A. Nierenberg; James Niemi; Brian S. Helfer; Thomas F. Quatieri
The ability to track depression severity over time using passive sensing of speech would enable frequent and inexpensive monitoring, allowing rapid assessment of treatment efficacy as well as improved long-term care of individuals at high risk for depression. In this paper an algorithm is proposed that estimates the articulatory coordination of speech from audio and video signals, and uses these coordination features to learn a prediction model that tracks depression severity with treatment. In addition, the algorithm can adapt its prediction model to an individual’s baseline data in order to improve tracking accuracy. The algorithm is evaluated on two data sets. The first is the Wyss Institute Biomarkers for Depression (WIBD) multi-modal data set, which includes audio and video speech recordings. The second data set was collected by Mundt et al. (2007) and contains audio speech recordings only. The data sets comprise patients undergoing treatment for depression as well as control subjects. In its within-subject tracking of clinical Hamilton depression (HAM-D) ratings, the algorithm achieves a root mean squared error (RMSE) of 5.49 with a Spearman correlation of r = 0.63 on the WIBD data set, and RMSE = 5.99 with r = 0.48 on the Mundt data set.
IEEE High Performance Extreme Computing Conference | 2017
Siddharth Samsi; Brian S. Helfer; Jeremy Kepner; Albert Reuther; Darrell O. Ricke
Analysis of DNA samples is an important tool in forensics, and the speed of analysis can impact investigations. Comparison of DNA sequences is based on the analysis of short tandem repeats (STRs), which are short DNA sequences of 2–5 base pairs. Current forensic approaches use 20 STR loci for analysis. The use of single nucleotide polymorphisms (SNPs) has utility for analysis of complex DNA mixtures, but the use of tens of thousands of SNP loci poses significant computational challenges because the forensic analysis scales with the product of the locus count and the number of DNA samples to be analyzed. In this paper, we discuss the implementation of a DNA sequence comparison algorithm by re-casting the algorithm in terms of linear algebra primitives. By developing an overloaded matrix multiplication approach to DNA comparisons, we can leverage advances in GPU hardware and algorithms for dense matrix multiplication (DGEMM) to speed up DNA sample comparisons. We show that it is possible to compare 2048 unknown DNA samples with 20 million known samples in under 6 seconds using an NVIDIA K80 GPU.
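The re-casting can be illustrated in a few lines, assuming 0/1 allele-presence encodings of the profiles. `match_counts` is a hypothetical name for this sketch; the paper's GPU pipeline involves considerably more machinery.

```python
import numpy as np

def match_counts(unknown, known):
    """Shared-allele counts between every unknown/known sample pair.

    unknown: (m, n_loci) 0/1 matrix, 1 where a sample carries the allele.
    known:   (n, n_loci) 0/1 matrix of reference profiles.
    The all-pairs comparison loop becomes a single dense matrix product,
    which maps directly onto tuned GEMM routines and their GPU equivalents.
    """
    return unknown.astype(np.float64) @ known.astype(np.float64).T
```

Entry (i, j) of the result is the number of loci at which unknown sample i and known sample j both carry the allele, so the comparison of 2048 unknowns against millions of references reduces to a sequence of dense matrix multiplies over tiles of the reference set.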
bioRxiv | 2017
Brian S. Helfer; Philip Fremont-Smith; Darrell O. Ricke
Accurate kinship prediction using DNA forensic samples is currently limited to first-degree relatives. High-throughput sequencing of single nucleotide polymorphisms (SNPs) and short tandem repeats (STRs) can be used to expand DNA forensics kinship prediction capabilities. Current kinship identification approaches either fit STR size profiles to statistical models that do not adequately depict genetic inheritance beyond the first degree, or rely on machine learning algorithms that are prone to over-optimization and require similar training data. This work presents an alternative approach: a computational framework that models the inheritance of SNPs between specific relationships (patent pending) [1]. The impact of SNP panel size on predictions is visualized in terms of the distribution of allelic differences between individuals. Prediction confidence is quantified by calculating log likelihood ratios. With a panel of 39,108 SNPs evaluated on an in silico dataset, this method can resolve parents from siblings and distinguish first-, second-, third-, and fourth-degree relatives from each other and from unrelated individuals.
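A log likelihood ratio of the kind described can be sketched with a simple binomial model of per-locus allelic differences. This is an illustrative stand-in, not the paper's inheritance model, and the per-locus difference rates used below are made-up numbers.

```python
import math

def binom_logpmf(k, n, p):
    # log C(n, k) + k*log(p) + (n - k)*log(1 - p)
    logc = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return logc + k * math.log(p) + (n - k) * math.log(1 - p)

def kinship_llr(n_diff, n_loci, p_related, p_unrelated):
    """Log likelihood ratio: related (at some degree) vs. unrelated.

    n_diff: observed count of loci showing an allelic difference.
    p_related / p_unrelated: hypothetical per-locus difference rates
    under the relationship being tested and under no relationship.
    Positive values favor the tested relationship.
    """
    return (binom_logpmf(n_diff, n_loci, p_related)
            - binom_logpmf(n_diff, n_loci, p_unrelated))
```

Each candidate relationship degree contributes its own expected difference rate, so the same profile pair yields a vector of LLRs, and the panel size (here 39,108 loci) controls how sharply the distributions for adjacent degrees separate.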
Conference of the International Speech Communication Association | 2013
Brian S. Helfer; Thomas F. Quatieri; James R. Williamson; Daryush D. Mehta; Rachelle Horwitz; Bea Yu
Conference of the International Speech Communication Association | 2014
Brian S. Helfer; Thomas F. Quatieri; James R. Williamson; Laurel Keyes; Benjamin Evans; W. Nicholas Greene; Trina Vian; Joseph Lacirignola; Trey E. Shenk; Thomas M. Talavage; Jeff Palmer; Kristin Heaton
Conference of the International Speech Communication Association | 2015
James R. Williamson; Thomas F. Quatieri; Brian S. Helfer; Joseph Perricone; Satrajit S. Ghosh; Gregory Ciccarelli; Daryush D. Mehta