Andrew R. Plummer
Ohio State University
Publications
Featured research published by Andrew R. Plummer.
IEEE Automatic Speech Recognition and Understanding Workshop | 2015
Deblin Bagchi; Michael I. Mandel; Zhong-Qiu Wang; Yanzhang He; Andrew R. Plummer; Eric Fosler-Lussier
Automatic Speech Recognition systems suffer severe performance degradation in the presence of complicating factors such as noise, reverberation, multiple speech sources, and multiple recording devices. Previous challenges have sparked much innovation in the design of systems capable of handling these complications. In this spirit, the CHiME-3 challenge presents system builders with the task of recognizing speech in a real-world noisy setting in which speakers talk to an array of six microphones mounted on a tablet. To address these issues, we explore the effectiveness of first applying a model-based source separation mask to the output of a beamformer that combines the source signals recorded by each microphone, followed by a DNN-based front-end spectral mapper that predicts clean filterbank features. The source separation algorithm MESSL (Model-based EM Source Separation and Localization) has been extended from two channels to multiple channels to meet the demands of the challenge. We report on interactions between the two systems, cross-cut by the use of a robust beamforming algorithm called BeamformIt. Evaluations of different system settings reveal that combining MESSL and the spectral mapper on top of the baseline beamformer substantially boosts performance.
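The masking step described in this abstract can be sketched in a few lines. This is a minimal illustration with invented array shapes and a random stand-in for the MESSL soft mask; the actual MESSL estimator and the DNN spectral mapper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Complex STFT of the beamformer output: frequency bins x time frames
# (sizes are assumptions for illustration only).
n_freq, n_frames = 257, 100
beamformed = (rng.standard_normal((n_freq, n_frames))
              + 1j * rng.standard_normal((n_freq, n_frames)))

# Stand-in for a MESSL-style soft mask: values in [0, 1] per T-F cell.
mask = rng.uniform(0.0, 1.0, (n_freq, n_frames))

# Element-wise masking attenuates time-frequency cells dominated by noise.
masked = mask * beamformed

# The masked magnitudes would next be converted to filterbank features and
# passed to the DNN spectral mapper; that stage is omitted here.
assert masked.shape == beamformed.shape
assert np.all(np.abs(masked) <= np.abs(beamformed) + 1e-12)
```

Because the mask lies in [0, 1], masking can only attenuate each cell, never amplify it, which the final assertion checks.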
Logical Aspects of Computational Linguistics | 2012
Andrew R. Plummer; Carl Pollard
Working within standard classical higher-order logic, we propose a possible worlds semantics (PWS) which combines the simplicity of the familiar Montague semantics (MS), in which propositions are sets of worlds, with the fine-grainedness of the older but less well-known tractarian semantics (TS) of Wittgenstein and C.I. Lewis, wherein worlds are maximal consistent sets of propositions. The proposed agnostic PWS makes neither montagovian nor tractarian ontological commitments, but is consistent with (and easily extensible to) either alternative (among many others). It is technically straightforward and, we believe, capable of everything linguists need PWS to do, such as interfacing with a logical grammar and serving as a basis for dynamic semantics.
Computer Speech & Language | 2017
Mary E. Beckman; Andrew R. Plummer; Benjamin Munson; Patrick Reidy
Methods from automatic speech recognition (ASR), such as segmentation and forced alignment, have facilitated the rapid annotation and analysis of very large adult speech databases and databases of caregiver-infant interaction, enabling advances in speech science that were unimaginable just a few decades ago. This paper centers on two main problems that must be addressed in order to have analogous resources for developing and exploiting databases of young children's speech. The first problem is to understand and appreciate the differences between adult and child speech that cause ASR models developed for adult speech to fail when applied to child speech. These differences include the fact that children's vocal tracts are smaller than those of adult males and also changing rapidly in size and shape over the course of development, leading to between-talker variability across age groups that dwarfs the between-talker differences between adult men and women. Moreover, children do not achieve fully adult-like speech motor control until they are young adults, and their vocabularies and phonological proficiency are developing as well, leading to considerably more within-talker variability as well as more between-talker variability. The second problem then is to determine what annotation schemas and analysis techniques can most usefully capture relevant aspects of this variability. Indeed, standard acoustic characterizations applied to child speech reveal that adult-centered annotation schemas fail to capture phenomena such as the emergence of covert contrasts in children's developing phonological systems, while also revealing children's nonuniform progression toward community speech norms as they acquire the phonological systems of their native languages.
Both problems point to the need for more basic research into the growth and development of the articulatory system (as well as of the lexicon and phonological system) that is oriented explicitly toward the construction of age-appropriate computational models.
Discussiones Mathematicae Graph Theory | 2009
Johannes H. Hattingh; Ernst J. Joubert; Marc Loizeaux; Andrew R. Plummer; Lucas C. van der Merwe
Let G = (V, E) be a graph. A set S ⊆ V is a restrained dominating set if every vertex in V − S is adjacent to a vertex in S and to a vertex in V − S. The restrained domination number of G, denoted by γr(G), is the minimum cardinality of a restrained dominating set of G. A unicyclic graph is a connected graph that contains precisely one cycle. We show that if U is a unicyclic graph of order n, then γr(U) ≥ ⌈n/3⌉, and provide a characterization of graphs achieving this bound.
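The definitions above can be checked directly on small graphs. The following brute-force sketch (exponential-time, suitable only for tiny examples; helper names are invented) verifies the ⌈n/3⌉ bound on the 6-cycle, which is a unicyclic graph.

```python
from itertools import combinations
from math import ceil

def is_restrained_dominating(vertices, edges, s):
    # Every vertex outside S must have a neighbor in S and a neighbor outside S.
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    s = set(s)
    outside = set(vertices) - s
    return all(adj[v] & s and adj[v] & outside for v in outside)

def restrained_domination_number(vertices, edges):
    # Search all subsets, smallest first; the first hit is the minimum.
    vertices = list(vertices)
    for k in range(1, len(vertices) + 1):
        for s in combinations(vertices, k):
            if is_restrained_dominating(vertices, edges, s):
                return k

# C6: the cycle on 6 vertices (connected, exactly one cycle, hence unicyclic).
n = 6
c6 = [(i, (i + 1) % n) for i in range(n)]
gamma_r = restrained_domination_number(range(n), c6)
print(gamma_r, ceil(n / 3))  # 2 2 — the bound γr(U) ≥ ⌈n/3⌉ holds with equality
```

For C6, no single vertex suffices (its antipodal vertex has no neighbor in S), while {0, 3} works, so γr(C6) = 2 = ⌈6/3⌉.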
Journal of the Acoustical Society of America | 2018
Michael L. Wilson; Lisa R. O'Bryan; Andrew R. Plummer; Mary E. Beckman; Benjamin Munson
Humans differ strikingly from other primates in the capacity for vocal learning. How and why such vocal flexibility evolved remains puzzling. Evidence of geographic variation in the pant-hoot calls of chimpanzees (Pan troglodytes) suggests that chimpanzees have some capacity for vocal learning, which is intriguing given the close phylogenetic relationship between humans and chimpanzees. Many questions remain, however, about the various factors that might contribute to variation in acoustic structure of pant-hoot calls, including body size, health, genetic relatedness, and within-individual variation. We are currently examining these factors in a study of longitudinal recordings from individuals in two neighboring chimpanzee communities in Gombe National Park, Tanzania. As in studies of sound change in human social groups, we need to understand the articulatory mechanisms that produce the vocalizations, to interpret acoustic variation against the backdrop of factors including body size, sex, and age. The s...
Conference of the International Speech Communication Association | 2016
Andrew R. Plummer; Mary E. Beckman
Parametric speech synthesis has played an integral role in speech research since the 1950s. However, software sharing is unwieldy, making replication of experiments difficult, creating obstacles to communication between laboratories, and hindering entry into research. This paper describes our use of the Speech Recognition Virtual Kitchen environment (www.speechkitchen.org) to develop an infrastructure for sharing synthesis software for research and education. We tested the infrastructure by using it in teaching a seminar on “the speech science of speech synthesis” to students from several of the graduate programs in linguistics at the Ohio State University. Using the virtual machines that we developed for Klatt’s formant synthesis program and Kawahara’s STRAIGHT speech analysis, modification, and synthesis system enabled the students to advance much further in their understanding of the basic principles underlying these acoustic-domain models by comparison to the students enrolled in a similar seminar that we taught previously without the virtual machines. At the same time, implementing these and two other virtual machines did not fully live up to our expectations for the course, in ways that highlight the need to adapt both the Speech Kitchen environment and the synthesis software systems to the needs of low-tech, low-resource users.
Journal of the Acoustical Society of America | 2015
Andrew R. Plummer
Computational models and methods of analysis have become a mainstay in speech research over the last seventy years, but the means for sharing software systems is often left to personal communication between model developers. As a result, the sharing of systems is typically complicated, error-prone, or simply not done at all, making it difficult (in some cases impossible) to verify model performance, engage with developers directly using their own models, or bridge gaps between research communities that have diverged over time. Moreover, the learning curve for new students or tech consumers entering these communities is quite steep, limiting the use of the models to those initiated in a given area. Over the last few years a number of computing infrastructures have taken shape that aim to address the difficulties encountered in the exchange of large software systems (e.g., the Berkeley Computational Environment, http://collaboratool.berkeley.edu/). We present the Speech Recognition Virtual Kitchen (www.spee...
Journal of the Acoustical Society of America | 2014
Andrew R. Plummer
Vowel normalization is a computation that is meant to account for the differences in the absolute direct (physical or psychophysical) representations of qualitatively equivalent vowel productions that arise due to differences in speaker properties such as body size, age, gender, and other socially interpreted categories that are based on natural variation in vocal tract size and shape. We present a virtual environment for vocal learning which provides the means to model the acquisition of vowel normalization, along with other aspects of vocal learning. The environment consists of models of caretaker agents representing five different language communities—American English, Cantonese, Greek, Japanese, and Korean—derived from vowel category perception experiments (Munson et al., 2010, Plummer et al., 2013) and models of infant agents (Plummer, 2012, 2013) that “vocally interact” with their caretakers. Moreover, we develop a model of caretaker social and vocal signaling in response to infant vowel produ...
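As a concrete point of reference for the normalization problem described above, one classic extrinsic scheme is Lobanov-style z-scoring of formant values per talker. This is an illustration of the general idea of removing talker-dependent scale and offset, not the model developed in the abstract; the formant values are invented.

```python
import statistics

# Invented F1 values (Hz) for three vowels from one hypothetical talker.
talker_f1 = {"i": 300.0, "a": 750.0, "u": 320.0}

# Lobanov normalization: z-score each formant within the talker's own
# distribution, so talkers with different vocal tract sizes become comparable.
mean = statistics.mean(talker_f1.values())
sd = statistics.stdev(talker_f1.values())
normalized = {vowel: (f1 - mean) / sd for vowel, f1 in talker_f1.items()}
```

After normalization the talker's values are centered at zero with unit variance, so the same vowel produced by a small child and an adult male maps to comparable coordinates despite very different absolute frequencies.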
Journal of the Acoustical Society of America | 2013
Andrew R. Plummer
Results of decades of research on vowels support the conclusion that perception and production of language-specific vowel categories cannot be based on invariant targets that are represented directly in either the auditory domain or the articulatory (sensorimotor) domain. This raises a number of questions about how an infant can acquire the cognitive representations relevant for learning the vowels of the ambient language. Some models of the acquisition process assume a fixed auditory transform to normalize for talker vocal tract size (e.g., Callan et al., 2000), ignoring evidence that normalization must be culture-specific (e.g., Johnson, 2005). Others assume that learning can be based on statistical regularities solely within the auditory domain (e.g., Assmann and Nearey, 2008), ignoring evidence that articulatory experience also shapes vowel category learning (e.g., Kamen and Watson, 1991). This paper outlines an alternative approach that models cross-modal learning. The approach aligns graph structure...
Applied Mathematics Letters | 2008
Huaming Xing; Johannes H. Hattingh; Andrew R. Plummer
Let G = (V, E) be a simple graph. A set D ⊆ V is a dominating set of G if every vertex of V − D is adjacent to a vertex of D. The domination number of G, denoted by γ(G), is the minimum cardinality of a dominating set of G. We prove that if G is a Hamiltonian graph of order n with minimum degree at least six, then γ(G) ≤ 6n/17.
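The definition can be checked by brute force on a small example. The complete graph K7 is Hamiltonian with minimum degree six, so the theorem applies to it; the helper below (an invented name, exponential-time, for tiny graphs only) computes γ(G) from closed neighborhoods.

```python
from itertools import combinations

def domination_number(n, edges):
    # Closed neighborhoods N[v]; S dominates G iff the union of N[v]
    # over v in S covers the whole vertex set.
    closed = {v: {v} for v in range(n)}
    for u, w in edges:
        closed[u].add(w)
        closed[w].add(u)
    everything = set(range(n))
    for k in range(1, n + 1):
        for s in combinations(range(n), k):
            if set().union(*(closed[v] for v in s)) == everything:
                return k

# K7: complete graph on 7 vertices — Hamiltonian, minimum degree 6.
n = 7
k7 = [(u, w) for u in range(n) for w in range(u + 1, n)]
gamma = domination_number(n, k7)
print(gamma)  # 1 — any single vertex dominates K7, within the 6n/17 ≈ 2.47 bound
```

Here γ(K7) = 1 because every vertex of a complete graph is adjacent to all others, comfortably satisfying γ(G) ≤ 6n/17.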