
Publication


Featured research published by Michael R. Brent.


Cognition | 1996

Distributional Regularity and Phonotactic Constraints Are Useful for Segmentation.

Michael R. Brent; Timothy A. Cartwright

In order to acquire a lexicon, young children must segment speech into words, even though most words are unfamiliar to them. This is a non-trivial task because speech lacks any acoustic analog of the blank spaces between printed words. Two sources of information that might be useful for this task are distributional regularity and phonotactic constraints. Informally, distributional regularity refers to the intuition that sound sequences that occur frequently and in a variety of contexts are better candidates for the lexicon than those that occur rarely or in few contexts. We express that intuition formally by a class of functions called DR functions. We then put forth three hypotheses: First, that children segment using DR functions. Second, that they exploit phonotactic constraints on the possible pronunciations of words in their language. Specifically, they exploit both the requirement that every word must have a vowel and the constraints that languages impose on word-initial and word-final consonant clusters. Third, that children learn which word-boundary clusters are permitted in their language by assuming that all permissible word-boundary clusters will eventually occur at utterance boundaries. Using computational simulation, we investigate the effectiveness of these strategies for segmenting broad phonetic transcripts of child-directed English. The results show that DR functions and phonotactic constraints can be used to significantly improve segmentation. Further, the contributions of DR functions and phonotactic constraints are largely independent, so using both yields better segmentation than using either one alone. Finally, learning the permissible word-boundary clusters from utterance boundaries does not degrade segmentation performance.
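The distributional-regularity intuition above can be made concrete with a toy score: rank candidate substrings of unsegmented input by frequency times context diversity. This is an illustrative sketch only, not the DR functions the paper defines; the corpus, the flanking-character notion of context, and the product score are all assumptions for the example.

```python
from collections import defaultdict

def dr_scores(utterances, max_len=5):
    """Score candidate substrings by the DR intuition: sequences that are
    frequent AND appear in many distinct contexts score higher.
    (Illustrative score, not the paper's DR functions.)"""
    freq = defaultdict(int)
    contexts = defaultdict(set)
    for utt in utterances:
        for i in range(len(utt)):
            for j in range(i + 1, min(i + 1 + max_len, len(utt) + 1)):
                cand = utt[i:j]
                freq[cand] += 1
                # context = the symbols flanking the candidate ('#' = edge)
                left = utt[i - 1] if i > 0 else "#"
                right = utt[j] if j < len(utt) else "#"
                contexts[cand].add((left, right))
    return {c: freq[c] * len(contexts[c]) for c in freq}

# tiny unsegmented corpus (hypothetical)
corpus = ["thedog", "thecat", "adog", "acat"]
scores = dr_scores(corpus)
# 'the' recurs in multiple contexts, so it outscores spurious chunks like 'hed'
```

A phonotactic filter (e.g., rejecting vowel-less candidates) would be layered on top of such a score in the spirit of the paper's second hypothesis.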


Machine Learning | 1999

An Efficient, Probabilistically Sound Algorithm for Segmentation and Word Discovery

Michael R. Brent

This paper presents a model-based, unsupervised algorithm for recovering word boundaries in a natural-language text from which they have been deleted. The algorithm is derived from a probability model of the source that generated the text. The fundamental structure of the model is specified abstractly so that the detailed component models of phonology, word-order, and word frequency can be replaced in a modular fashion. The model yields a language-independent, prior probability distribution on all possible sequences of all possible words over a given alphabet, based on the assumption that the input was generated by concatenating words from a fixed but unknown lexicon. The model is unusual in that it treats the generation of a complete corpus, regardless of length, as a single event in the probability space. Accordingly, the algorithm does not estimate a probability distribution on words; instead, it attempts to calculate the prior probabilities of various word sequences that could underlie the observed text. Experiments on phonemic transcripts of spontaneous speech by parents to young children suggest that our algorithm is more effective than other proposed algorithms, at least when utterance boundaries are given and the text includes a substantial number of short utterances.
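The core search step of such a model-based approach, finding the most probable segmentation of an utterance under the current lexicon, can be sketched with dynamic programming. This is a minimal sketch of the search only: the word probabilities, the flat penalty for novel words, and the example lexicon are assumptions, and the full model's incremental lexicon-building and prior over novel words are omitted.

```python
import math

def best_segmentation(utterance, word_logprob, unseen_penalty=-20.0):
    """Viterbi-style search for the highest-probability segmentation of an
    unsegmented string, given log-probabilities for known words.
    (Search step only; the full model also scores novel words via a prior
    and updates the lexicon as it processes the corpus.)"""
    n = len(utterance)
    best = [(-math.inf, None)] * (n + 1)  # (score, backpointer) per position
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(end):
            word = utterance[start:end]
            lp = word_logprob.get(word, unseen_penalty)
            score = best[start][0] + lp
            if score > best[end][0]:
                best[end] = (score, start)
    # backtrack from the end to recover the word sequence
    words, pos = [], n
    while pos > 0:
        start = best[pos][1]
        words.append(utterance[start:pos])
        pos = start
    return list(reversed(words))

# hypothetical lexicon with log-probabilities
lex = {"the": math.log(0.3), "dog": math.log(0.2), "cat": math.log(0.2)}
seg = best_segmentation("thedogthecat", lex)
# → ['the', 'dog', 'the', 'cat']
```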


Cognition | 1997

Syntactic categorization in early language acquisition: formalizing the role of distributional analysis

Timothy A. Cartwright; Michael R. Brent

We propose an explicit, incremental strategy by which children could group words with similar syntactic privileges into discrete, unlabeled categories. This strategy, which can discover lexical ambiguity, is based in part on a generalization of the idea of sentential minimal pairs. As a result, it makes minimal assumptions about the availability of syntactic knowledge at the onset of categorization. Although the proposed strategy is distributional, it can make use of categorization cues from other domains, including semantics and phonology. Computer simulations show that this strategy is effective at categorizing words in both artificial-language samples and transcripts of naturally-occurring, child-directed speech. Further, the simulations show that the proposed strategy performs even better when supplied with semantic information about concrete nouns. Implications for theories of categorization are discussed.
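The generalized-minimal-pair idea, that words sharing sentential contexts are candidates for the same category, can be illustrated by grouping words that occur in identical (left, right) frames. This is a crude stand-in for exposition only; the frame definition and example sentences are assumptions, and the paper's actual strategy is an incremental MDL search over template grammars, not raw frame matching.

```python
from collections import defaultdict

def group_by_frames(sentences):
    """Group words that occur in the same (left-word, right-word) frame,
    a rough version of generalized sentential minimal pairs.
    (Illustration only; not the paper's incremental MDL strategy.)"""
    frames = defaultdict(set)
    for sent in sentences:
        words = sent.split()
        for i, w in enumerate(words):
            left = words[i - 1] if i > 0 else "<s>"
            right = words[i + 1] if i + 1 < len(words) else "</s>"
            frames[(left, right)].add(w)
    # keep only frames that actually group two or more word types
    return {f: ws for f, ws in frames.items() if len(ws) > 1}

# hypothetical child-directed-style input
sents = ["the dog runs", "the cat runs", "a dog sleeps"]
groups = group_by_frames(sents)
# the frame ('the', 'runs') groups 'dog' and 'cat' as category-mates
```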


Trends in Cognitive Sciences | 1999

Speech segmentation and word discovery: a computational perspective

Michael R. Brent

The segmentation and word discovery problem arises because speech does not contain any reliable acoustic analog of the blank spaces between words of printed English. As a result, children must segment the utterances they hear in order to discover the sound patterns of individual words in their language. A number of computational models have been proposed to explain how children segment speech and discover words, including ten new models in the last five years. This paper reviews all proposed models and organizes them according to their fundamental segmentation strategies, their processing characteristics, and the ways in which they use memory. All proposed models are found to use one of three fundamental strategies: the utterance-boundary strategy, the predictability strategy, or the word-recognition strategy. Selected predictions of the models are explained, their performance in computer simulations is summarized, and behavioral evidence bearing on them is discussed. Finally, ideas about how these diverse models might be synthesized into one comprehensive model are offered.


Journal of Experimental Psychology: General | 1999

On the discovery of novel wordlike units from utterances: an artificial-language study with implications for native-language acquisition

Delphine Dahan; Michael R. Brent

In 4 experiments, adults were familiarized with utterances from an artificial language. Short utterances occurred both in isolation and as part of a longer utterance, either at the edge or in the middle of the longer utterance. After familiarization, participants' recognition memory for fragments of the long utterance was tested. Recognition was greatest for the remainder of the longer utterance after extraction of the short utterance, but only when the short utterance was located at the edge of the long utterance. These results support the incremental distributional regularity optimization (INCDROP) model of speech segmentation and word discovery, which asserts that people segment utterances into familiar and new wordlike units in such a way as to minimize the burden of processing new units. INCDROP suggests that segmentation and word discovery during native-language acquisition may be driven by recognition of familiar units from the start, with no need for transient bootstrapping mechanisms.


Cognition | 1996

Advances in the Computational Study of Language Acquisition.

Michael R. Brent

This paper provides a tutorial introduction to computational studies of how children learn their native languages. Its aim is to make recent advances accessible to the broader research community, and to place them in the context of current theoretical issues. The first section locates computational studies and behavioral studies within a common theoretical framework. The next two sections review two papers that appear in this volume: one on learning the meanings of words and one on learning the sounds of words. The following section highlights an idea which emerges independently in these two papers and which I have dubbed autonomous bootstrapping. Classical bootstrapping hypotheses propose that children begin to get a toehold in a particular linguistic domain, such as syntax, by exploiting information from another domain, such as semantics. Autonomous bootstrapping complements the cross-domain acquisition strategies of classical bootstrapping with strategies that apply within a single domain. Autonomous bootstrapping strategies work by representing partial and/or uncertain linguistic knowledge and using it to analyze the input. The next two sections review two more contributions to this special issue: one on learning word meanings via selectional preferences and one on algorithms for setting grammatical parameters. The final section suggests directions for future research.


Journal of Psycholinguistic Research | 1997

Toward a Unified Model of Lexical Acquisition and Lexical Access

Michael R. Brent

Much effort has gone into constructing models of how children segment speech and thereby discover the words of their language. Much effort has also gone into constructing models of how adults access their mental lexicons and thereby segment speech into words. In this paper, I explore the possibility of a model that could account for both word discovery by children and on-line segmentation by adults. In particular, I discuss extensions to the distributional regularity (DR) model of Brent and Cartwright (1996) that could yield an account of on-line segmentation as well as word discovery.


international colloquium on grammatical inference | 1996

Lexical categorization: fitting template grammars by incremental MDL optimization

Michael R. Brent; Timothy A. Cartwright

2006 – The Henry Edwin Sever Professor of Engineering, Washington University; joint in Biomedical Engineering and Genetics
2004 – Professor in Computer Science, Washington University; joint in Biomedical Engineering and Genetics
1999 – 2004 Associate Professor in Computer Science, Washington University; joint in Biomedical Engineering and Genetics
1997 – 1999 Associate Professor in Cognitive Science, Johns Hopkins University
1991 – 1997 Assistant Professor in Cognitive Science, Johns Hopkins University


Computational Linguistics | 1993

From grammar to lexicon: unsupervised learning of lexical syntax

Michael R. Brent


arXiv: Computation and Language | 1994

Segmenting Speech Without a Lexicon: The Roles of Phonotactics and Speech Source

Timothy A. Cartwright; Michael R. Brent

Collaboration


Dive into Michael R. Brent's collaborations.

Top Co-Authors


Delphine Dahan

University of Pennsylvania
