Publication


Featured research published by Marc Brysbaert.


Behavior Research Methods | 2009

Moving beyond Kučera and Francis: a critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English.

Marc Brysbaert; Boris New

Word frequency is the most important variable in research on word processing and memory. Yet, the main criterion for selecting word frequency norms has been the availability of the measure, rather than its quality. As a result, much research is still based on the old Kučera and Francis frequency norms. By using the lexical decision times of recently published megastudies, we show how bad this measure is and what must be done to improve it. In particular, we investigated the size of the corpus, the language register on which the corpus is based, and the definition of the frequency measure. We observed that corpus size is of practical importance for small sizes (depending on the frequency of the word), but not for sizes above 16–30 million words. As for the language register, we found that frequencies based on television and film subtitles are better than frequencies based on written sources, certainly for the monosyllabic and bisyllabic words used in psycholinguistic research. Finally, we found that lemma frequencies are not superior to word form frequencies in English and that a measure of contextual diversity is better than a measure based on raw frequency of occurrence. Part of the superiority of the latter is due to the words that are frequently used as names. Assembling a new frequency norm on the basis of these considerations turned out to predict word processing times much better than did the existing norms (including Kučera & Francis and Celex). The new SUBTL frequency norms from the SUBTLEX-US corpus are freely available for research purposes from http://brm.psychonomic-journals.org/content/supplemental, as well as from the University of Ghent and Lexique Web sites.
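The contrast between raw frequency and contextual diversity in the abstract above can be illustrated with a minimal sketch. It assumes that contextual diversity is counted as the number of documents (for example, subtitle files) in which a word appears at least once; the function name `frequency_and_diversity` and the toy documents are hypothetical and are not taken from the SUBTLEX-US materials.

```python
from collections import Counter


def frequency_and_diversity(documents):
    """Return (raw_frequency, contextual_diversity) counters.

    documents: iterable of token lists, one list per document.
    Assumption: contextual diversity = number of documents in which
    the word occurs at least once.
    """
    raw_frequency = Counter()
    contextual_diversity = Counter()
    for tokens in documents:
        raw_frequency.update(tokens)            # every occurrence counts
        contextual_diversity.update(set(tokens))  # each document counts once
    return raw_frequency, contextual_diversity


# Hypothetical toy corpus of three "subtitle files".
toy_documents = [
    ["the", "cat", "the"],
    ["the", "dog"],
    ["cat", "cat", "cat"],
]
raw, diversity = frequency_and_diversity(toy_documents)
print(raw["cat"], diversity["cat"])
```

In this toy corpus "cat" has a higher raw count than "the" only because it is repeated within one document, while both words occur in the same number of documents. Repetition within a single source inflates raw frequency but not contextual diversity.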


Behavior Research Methods Instruments & Computers | 2004

Lexique 2: A new French lexical database

Boris New; Christophe Pallier; Marc Brysbaert; Ludovic Ferrand

In this article, we present a new lexical database for French: Lexique. In addition to classical word information such as gender, number, and grammatical category, Lexique includes a series of interesting new characteristics. First, word frequencies are based on two cues: a contemporary corpus of texts and the number of Web pages containing the word. Second, the database is split into a graphemic table with all the relevant frequencies, a table structured around lemmas (particularly interesting for the study of the inflectional family), and a table about surface frequency cues. Third, Lexique is distributed under a GNU-like license, allowing people to contribute to it. Finally, a metasearch engine, Open Lexique, has been developed so that new databases can be added very easily to the existing ones. Lexique can either be downloaded or interrogated freely from http://www.lexique.org.


Behavior Research Methods | 2013

Norms of valence, arousal, and dominance for 13,915 English lemmas

Amy Beth Warriner; Victor Kuperman; Marc Brysbaert

Information about the affective meanings of words is used by researchers working on emotions and moods, word recognition and memory, and text-based sentiment analysis. Three components of emotions are traditionally distinguished: valence (the pleasantness of a stimulus), arousal (the intensity of emotion provoked by a stimulus), and dominance (the degree of control exerted by a stimulus). Thus far, nearly all research has been based on the ANEW norms collected by Bradley and Lang (1999) for 1,034 words. We extended that database to nearly 14,000 English lemmas, providing researchers with a much richer source of information, including gender, age, and educational differences in emotion norms. As an example of the new possibilities, we included stimuli from nearly all of the category norms (e.g., types of diseases, occupations, and taboo words) collected by Van Overschelde, Rawson, and Dunlosky (Journal of Memory and Language 50:289-335, 2004), making it possible to include affect in studies of semantic memory.


Behavior Research Methods | 2012

Age-of-acquisition ratings for 30,000 English words

Victor Kuperman; Hans Stadthagen-Gonzalez; Marc Brysbaert

We present age-of-acquisition (AoA) ratings for 30,121 English content words (nouns, verbs, and adjectives). For data collection, this megastudy used the Web-based crowdsourcing technology offered by the Amazon Mechanical Turk. Our data indicate that the ratings collected in this way are as valid and reliable as those collected in laboratory conditions (the correlation between our ratings and those collected in the lab from U.S. students reached .93 for a subsample of 2,500 monosyllabic words). We also show that our AoA ratings explain a substantial percentage of the variance in the lexical-decision data of the English Lexicon Project, over and above the effects of log frequency, word length, and similarity to other words. This is true not only for the lemmas used in our rating study, but also for their inflected forms. We further discuss the relationships of AoA with other predictors of word recognition and illustrate the utility of AoA ratings for research on vocabulary growth.


Journal of Psycholinguistic Research | 1995

Exposure-based models of human parsing: Evidence for the use of coarse-grained (nonlexical) statistical records

Donald Mitchell; Fernando Cuetos; Martin Corley; Marc Brysbaert

Several current models of human parsing maintain that initial structural decisions are influenced (or tuned) by the listener's or reader's prior contact with language. The precise workings of these models depend upon the “grain,” or level of detail, at which previous exposures to language are analyzed and used to influence parsing decisions. Some models are premised upon the use of fine-grained records (such as lexical co-occurrence statistics). Others use coarser measures. The present paper considers the viability of models based exclusively on the use of fine-grained lexical records. The results of several studies are reviewed and the evidence suggests that, if they are to account for the data, experience-based parsers must draw upon records or representations that capture statistical regularities beyond the lexical level. This poses problems for several parsing models in the literature.


Quarterly Journal of Experimental Psychology | 2014

SUBTLEX-UK: A new and improved word frequency database for British English

Walter J. B. van Heuven; Paweł Mandera; Emmanuel Keuleers; Marc Brysbaert

We present word frequencies based on subtitles of British television programmes. We show that the SUBTLEX-UK word frequencies explain more of the variance in the lexical decision times of the British Lexicon Project than the word frequencies based on the British National Corpus and the SUBTLEX-US frequencies. In addition to the word form frequencies, we also present measures of contextual diversity, part-of-speech specific word frequencies, word frequencies in children's programmes, and word bigram frequencies, giving researchers of British English access to the full range of norms recently made available for other languages. Finally, we introduce a new measure of word frequency, the Zipf scale, which we hope will stop the current misunderstandings of the word frequency effect.
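The abstract above names the Zipf scale without defining it. A minimal sketch follows, assuming the commonly described simplified definition: the base-10 logarithm of a word's frequency per billion words, which equals log10 of its frequency per million words plus 3. The function name `zipf_value` is hypothetical, and any smoothing for words absent from the corpus that the published norms may apply is deliberately left out.

```python
import math


def zipf_value(count: int, corpus_size_words: int) -> float:
    """Zipf value under the simplified definition assumed here:
    log10 of the word's frequency per billion words.

    No smoothing is applied, so count must be positive.
    """
    if count <= 0 or corpus_size_words <= 0:
        raise ValueError("count and corpus_size_words must be positive")
    per_billion = count * 1e9 / corpus_size_words
    return math.log10(per_billion)


# A word seen once per million words sits at 3 on the scale;
# a word seen 100 times per million words sits at 5.
low = zipf_value(1, 1_000_000)
high = zipf_value(100, 1_000_000)
print(round(low, 2), round(high, 2))
```

Under this definition, each step of 1 on the scale corresponds to a tenfold change in frequency, so values stay positive for all but the very rarest words in a large corpus.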


Behavior Research Methods | 2010

Wuggy: a multilingual pseudoword generator

Emmanuel Keuleers; Marc Brysbaert

Pseudowords play an important role in psycholinguistic experiments, either because they are required for performing tasks, such as lexical decision, or because they are the main focus of interest, such as in nonword reading and nonce-inflection studies. We present a pseudoword generator that improves on current methods. It allows for the generation of written polysyllabic pseudowords that obey a given language’s phonotactic constraints. Given a word or nonword template, the algorithm can quickly generate pseudowords that match the template in subsyllabic structure and transition frequencies without having to search through a list with all possible candidates. Currently, the program is available for Dutch, English, German, French, Spanish, Serbian, and Basque, and, with little effort, it can be expanded to other languages.


PLOS ONE | 2010

SUBTLEX-CH: Chinese Word and Character Frequencies Based on Film Subtitles

Qing Cai; Marc Brysbaert

Background: Word frequency is the most important variable in language research. However, despite the growing interest in the Chinese language, there are only a few sources of word frequency measures available to researchers, and the quality is less than what researchers in other languages are used to. Methodology: Following recent work by New, Brysbaert, and colleagues in English, French and Dutch, we assembled a database of word and character frequencies based on a corpus of film and television subtitles (46.8 million characters, 33.5 million words). In line with what has been found in the other languages, the new word and character frequencies explain significantly more of the variance in Chinese word naming and lexical decision performance than measures based on written texts. Conclusions: Our results confirm that word frequencies based on subtitles are a good estimate of daily language exposure and capture much of the variance in word processing efficiency. In addition, our database is the first to include information about the contextual diversity of the words and to provide good frequency estimates for multi-character words and the different syntactic roles in which the words are used. The word frequencies are freely available for research purposes.


Eye Guidance in Reading and Scene Perception | 1998

Word skipping: implications for theories of eye movement control in reading

Marc Brysbaert; Françoise Vitu

This chapter provides a meta-analysis of the factors that govern word skipping in reading. It is concluded that the primary predictor is the length of the word to be skipped. A much smaller effect is due to the processing ease of the word (e.g., the frequency of the word and its predictability in the sentence).


Behavior Research Methods Instruments & Computers | 2004

WordGen: A tool for word selection and nonword generation in Dutch, English, German, and French

Wouter Duyck; Timothy Desmet; Lieven Verbeke; Marc Brysbaert

WordGen is an easy-to-use program that uses the CELEX and Lexique lexical databases for word selection and nonword generation in Dutch, English, German, and French. Items can be generated in these four languages, specifying any combination of seven linguistic constraints: number of letters, neighborhood size, frequency, summated position-nonspecific bigram frequency, minimum position-nonspecific bigram frequency, position-specific frequency of the initial and final bigram, and orthographic relatedness. The program also has a module to calculate the respective values of these variables for items that have already been constructed, either with the program or taken from earlier studies. Stimulus queries can be entered through WordGen’s graphical user interface or by means of batch files. WordGen is especially useful for (1) Dutch and German item generation, because no such stimulus-selection tool exists for these languages, (2) the generation of nonwords for all four languages, because our program has some important advantages over previous nonword generation approaches, and (3) psycholinguistic experiments on bilingualism, because the possibility of using the same tool for different languages increases the cross-linguistic comparability of the generated item lists. WordGen is free and available at http://expsy.ugent.be/wordgen.htm.

Collaboration


Dive into Marc Brysbaert's collaborations.

Top Co-Authors

Boris New

Paris Descartes University


Denis Drieghe

University of Southampton
