Kyle Mahowald
Massachusetts Institute of Technology
Publications
Featured research published by Kyle Mahowald.
Proceedings of the National Academy of Sciences of the United States of America | 2015
Richard Futrell; Kyle Mahowald; Edward Gibson
Significance: We provide the first large-scale, quantitative, cross-linguistic evidence for a universal syntactic property of languages: that dependency lengths are shorter than chance. Our work supports long-standing ideas that speakers prefer word orders with short dependency lengths and that languages do not enforce word orders with long dependency lengths. Dependency length minimization is well motivated because it allows for more efficient parsing and generation of natural language. Over the last 20 y, the hypothesis of a pressure to minimize dependency length has been invoked to explain many of the most striking recurring properties of languages. Our broad-coverage findings support those explanations.

Explaining the variation between human languages and the constraints on that variation is a core goal of linguistics. In the last 20 y, it has been claimed that many striking universals of cross-linguistic variation follow from a hypothetical principle that dependency length—the distance between syntactically related words in a sentence—is minimized. Various models of human sentence production and comprehension predict that long dependencies are difficult or inefficient to process; minimizing dependency length thus enables effective communication without incurring processing difficulty. However, despite widespread application of this idea in theoretical, empirical, and practical work, there is not yet large-scale evidence that dependency length is actually minimized in real utterances across many languages; previous work has focused either on a small number of languages or on limited kinds of data about each language. Here, using parsed corpora of 37 diverse languages, we show that overall dependency lengths for all languages are shorter than conservative random baselines. The results strongly suggest that dependency length minimization is a universal quantitative property of human languages and support explanations of linguistic variation in terms of general properties of human information processing.
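The quantity at stake is easy to state: the sum of linear distances between each word and its syntactic head, compared against what that sum would be under alternative word orders. The following is a minimal sketch of that comparison on a toy dependency parse; the sentence, head indices, and the free-permutation baseline are illustrative only, and the paper's actual baselines are more conservative than a free shuffle.

```python
import random

def total_dependency_length(heads):
    """Sum of |dependent - head| over all words; heads[i] is the
    index of word i's head, or None for the root."""
    return sum(abs(i - h) for i, h in enumerate(heads) if h is not None)

def random_baseline(heads, samples=1000, seed=0):
    """Average dependency length when word order is randomly permuted
    while the dependency tree (who depends on whom) is kept fixed."""
    rng = random.Random(seed)
    n = len(heads)
    totals = []
    for _ in range(samples):
        perm = list(range(n))
        rng.shuffle(perm)                        # new linear order of the words
        pos = {old: new for new, old in enumerate(perm)}
        totals.append(sum(abs(pos[i] - pos[h])
                          for i, h in enumerate(heads) if h is not None))
    return sum(totals) / len(totals)

# Toy parse of "the dog chased the cat":
# the->dog, dog->chased, chased = root, the->cat, cat->chased
heads = [1, 2, None, 4, 2]
print("observed length :", total_dependency_length(heads))
print("random baseline :", random_baseline(heads))
```

If the observed value sits reliably below the baseline across sentences and languages, dependency lengths are shorter than chance, which is the paper's central finding.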
Cognition | 2013
Kyle Mahowald; Evelina Fedorenko; Steven T. Piantadosi; Edward Gibson
A major open question in natural language research is the role of communicative efficiency in the origin and on-line processing of language structures. Here, we use word pairs like chimp/chimpanzee, which differ in length but have nearly identical meanings, to investigate the communicative properties of lexical systems and the communicative pressures on language users. If language is designed to be information-theoretically optimal, then shorter words should convey less information than their longer counterparts, when controlling for meaning. First, consistent with this prediction, a corpus analysis revealed that the short form of our meaning-matched pairs occurs in more predictive contexts than the longer form. Second, a behavioral study showed that language users choose the short form more often in predictive contexts, suggesting that tendencies to be information-theoretically efficient manifest in explicit behavioral choices. Our findings, which demonstrate the prominent role of communicative efficiency in the structure of the lexicon, complement and extend the results of Piantadosi, Tily, and Gibson (2011), who showed that word length is better correlated with Shannon information content than with frequency. Crucially, we show that this effect arises at least in part from active speaker choice.
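The underlying quantity is a word's information content in context, i.e. its negative log probability given the preceding material. Below is a toy sketch under a bigram approximation; the miniature corpus and add-one smoothing are invented for illustration, whereas the study estimated predictability from large corpora.

```python
import math
from collections import Counter

# Toy corpus; in the study, predictability was estimated from large corpora.
corpus = ("we saw a chimp at the zoo . we saw a chimp eat . "
          "the chimpanzee genome was sequenced .").split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def surprisal(word, context):
    """-log2 P(word | context) under a bigram model with add-one smoothing."""
    vocab = len(unigrams)
    p = (bigrams[(context, word)] + 1) / (unigrams[context] + vocab)
    return -math.log2(p)

# The short form is cheaper in a context where it is predictable.
print(surprisal("chimp", "a"))        # lower: "a chimp" occurs twice in the toy corpus
print(surprisal("chimpanzee", "a"))   # higher: never seen in this context
```

The paper's prediction is that, holding meaning constant, speakers reach for the short form exactly in contexts where its surprisal is low.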
NeuroImage | 2016
Idan Blank; Zuzanna Balewski; Kyle Mahowald; Evelina Fedorenko
Language comprehension recruits an extended set of regions in the human brain. Is syntactic processing localized to a particular region or regions within this system, or is it distributed across the entire ensemble of brain regions that support high-level linguistic processing? Evidence from aphasic patients is more consistent with the latter possibility: damage to many different language regions and to white-matter tracts connecting them has been shown to lead to similar syntactic comprehension deficits. However, brain imaging investigations of syntactic processing continue to focus on particular regions within the language system, often parts of Broca's area and regions in the posterior temporal cortex. We hypothesized that, whereas the entire language system is in fact sensitive to syntactic complexity, the effects in some regions may be difficult to detect because of the overall lower response to language stimuli. Using an individual-subjects approach to localizing the language system, shown in prior work to be more sensitive than traditional group analyses, we indeed find responses to syntactic complexity throughout this system, consistent with the findings from the neuropsychological patient literature. We speculate that the distributed nature of syntactic processing could imply that syntax is inseparable from other aspects of language comprehension (e.g., lexico-semantic processing), in line with current linguistic and psycholinguistic theories and evidence. Neuroimaging investigations of syntactic processing thus need to expand their scope to include the entire system of high-level language processing regions in order to fully understand how syntax is instantiated in the human brain.
Psychological Science | 2017
Edward Gibson; Caitlin Tan; Richard Futrell; Kyle Mahowald; Lars Konieczny; Barbara Hemforth; Evelina Fedorenko
Being a nonnative speaker of a language poses challenges. Individuals often feel embarrassed by the errors they make when talking in their second language. However, here we report an advantage of being a nonnative speaker: Native speakers give foreign-accented speakers the benefit of the doubt when interpreting their utterances; as a result, apparently implausible utterances are more likely to be interpreted in a plausible way when delivered in a foreign than in a native accent. Across three replicated experiments, we demonstrated that native English speakers are more likely to interpret implausible utterances, such as “the mother gave the candle the daughter,” as similar plausible utterances (“the mother gave the candle to the daughter”) when the speaker has a foreign accent. This result follows from the general model of language interpretation in a noisy channel, under the hypothesis that listeners assume a higher error rate in foreign-accented than in nonaccented speech.
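The noisy-channel account amounts to a few lines of Bayes' rule: the listener weighs the prior plausibility of an intended sentence against the probability that noise turned it into what was heard, with a higher assumed noise rate for accented speech. Here is a toy sketch of that idea; the priors and error rates are invented for illustration, and the paper states the model at this level of generality rather than as code.

```python
# P(intended | heard) ∝ P(intended) * P(heard | intended)
# Heard: "the mother gave the candle the daughter" (literally implausible).
# Candidate intended sentences with toy plausibility priors:
priors = {
    "the mother gave the candle the daughter": 0.01,     # implausible event
    "the mother gave the candle to the daughter": 0.99,  # plausible event
}

def likelihood(intended_matches_heard, error_rate):
    """P(heard | intended): faithful transmission vs. one word ('to') lost to noise."""
    return (1 - error_rate) if intended_matches_heard else error_rate

def posterior(error_rate):
    scores = {s: priors[s] * likelihood(" to " not in s, error_rate) for s in priors}
    z = sum(scores.values())
    return {s: round(v / z, 2) for s, v in scores.items()}

# Listeners are assumed to expect more noise in foreign-accented speech:
print("native accent :", posterior(error_rate=0.001))  # literal reading wins
print("foreign accent:", posterior(error_rate=0.10))   # plausible reinterpretation wins
```

Raising only the assumed error rate flips the preferred interpretation toward the plausible paraphrase, which is the qualitative pattern the experiments report.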
Cognition | 2017
Isabelle Dautriche; Kyle Mahowald; Edward Gibson; Anne Christophe; Steven T. Piantadosi
Recent evidence suggests that cognitive pressures associated with language acquisition and use could affect the organization of the lexicon. On one hand, consistent with noisy channel models of language (e.g., Levy, 2008), the phonological distance between wordforms should be maximized to avoid perceptual confusability (a pressure for dispersion). On the other hand, a lexicon with high phonological regularity would be simpler to learn, remember and produce (e.g., Monaghan et al., 2011) (a pressure for clumpiness). Here we investigate wordform similarity in the lexicon, using measures of word distance (e.g., phonological neighborhood density) to ask whether there is evidence for dispersion or clumpiness of wordforms in the lexicon. We develop a novel method to compare lexicons to phonotactically-controlled baselines that provide a null hypothesis for how clumpy or sparse wordforms would be as the result of only phonotactics. Results for four languages, Dutch, English, German and French, show that the space of monomorphemic wordforms is clumpier than what would be expected by the best chance model according to a wide variety of measures: minimal pairs, average Levenshtein distance and several network properties. This suggests a fundamental drive for regularity in the lexicon that conflicts with the pressure for words to be as phonologically distinct as possible.
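The measures involved are simple string statistics over the lexicon. The sketch below computes two of them, minimal pairs and average Levenshtein distance, for a toy word list and compares them against length-matched random strings; that baseline is a crude stand-in for the paper's phonotactically controlled baselines, and the word list is invented.

```python
import itertools
import random
import string

def levenshtein(a, b):
    """Standard edit distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def lexicon_stats(words):
    """Number of minimal pairs (same length, one substitution apart)
    and average pairwise edit distance."""
    pairs = list(itertools.combinations(words, 2))
    dists = [levenshtein(a, b) for a, b in pairs]
    minimal_pairs = sum(1 for (a, b), d in zip(pairs, dists)
                        if d == 1 and len(a) == len(b))
    return minimal_pairs, sum(dists) / len(dists)

real = ["cat", "bat", "hat", "can", "cap", "dog", "dig", "den"]
rng = random.Random(0)
fake = ["".join(rng.choice(string.ascii_lowercase) for _ in w) for w in real]

print("real lexicon    :", lexicon_stats(real))
print("random baseline :", lexicon_stats(fake))
```

A "clumpy" lexicon shows more minimal pairs and smaller average distances than its matched baseline, which is the direction of the effect reported for Dutch, English, German, and French.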
Journal of Semantics | 2015
Edward Gibson; Pauline Jacobson; Peter Graff; Kyle Mahowald; Evelina Fedorenko; Steven T. Piantadosi
Hackl, Koster-Hale & Varvoutis (2012; HKV) provide data that suggest that in a null context, antecedent-contained-deletion (ACD) relative clause structures modifying a quantified object noun phrase (NP; such as every doctor) are easier to process than those modifying a definite object NP (such as the doctor). HKV argue that this pattern of results supports a ‘quantifier-raising’ (QR) analysis of both ACD structures and quantified NPs in object position: under the account they advocate, both ACD resolution and quantified NPs in object position require movement of the object NP to a higher syntactic position. The processing advantage for quantified object NPs in ACD is hypothesized to derive from the fact that—at the point where ACD resolution must take place—the quantified NP has already undergone QR, whereas this is not the case for definite NPs. Although other work shows that HKV’s reading time analyses are flawed, such that the critical effects are not significant (Gibson et al. submitted), the effect in HKV’s acceptability ratings is robust. But HKV’s interpretation is problematic. We present five experiments that provide evidence for an alternative, pragmatic explanation for HKV’s observation. In particular, we argue that the low acceptability of the the/ACD condition is largely due to a strong pressure in the null context to use a competing form, by adding “also” or “same”. This pressure does not exist with quantified NPs.
Cognitive Science | 2017
Isabelle Dautriche; Kyle Mahowald; Edward Gibson; Steven T. Piantadosi
Although the mapping between form and meaning is often regarded as arbitrary, there are in fact well-known constraints on words which are the result of functional pressures associated with language use and its acquisition. In particular, languages have been shown to encode meaning distinctions in their sound properties, which may be important for language learning. Here, we investigate the relationship between semantic distance and phonological distance in the large-scale structure of the lexicon. We show evidence in 100 languages from a diverse array of language families that more semantically similar word pairs are also more phonologically similar. This suggests that there is an important statistical trend for lexicons to have semantically similar words be phonologically similar as well, possibly for functional reasons associated with language learning.
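The core analysis is a correlation over word pairs between two distances: phonological distance between forms and semantic distance between meanings. The sketch below shows the shape of that comparison; the hand-made feature vectors and spelled forms stand in for real distributional vectors and phonemic transcriptions, and the paper additionally uses permutation tests over whole lexicons.

```python
import difflib
import itertools
import math

# Toy "lexicon": word -> hand-made semantic feature vector; phonological form
# is approximated here by spelling.
lexicon = {
    "cat":   [1, 1, 0, 0],
    "bat":   [1, 0, 0, 0],
    "rat":   [1, 0, 0, 0],
    "sofa":  [0, 0, 1, 1],
    "couch": [0, 0, 1, 1],
    "lamp":  [0, 0, 0, 1],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

sem_sim, phon_sim = [], []
for a, b in itertools.combinations(lexicon, 2):
    sem_sim.append(cosine(lexicon[a], lexicon[b]))
    phon_sim.append(difflib.SequenceMatcher(None, a, b).ratio())

# A positive correlation means semantically similar pairs also tend to be
# phonologically similar, the pattern reported across the 100 languages.
print("correlation:", round(pearson(sem_sim, phon_sim), 3))
```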
Neuropsychologia | 2018
Zachary Mineroff; Idan Blank; Kyle Mahowald; Evelina Fedorenko
Complex cognitive processes, including language, rely on multiple mental operations that are carried out by several large-scale functional networks in the frontal, temporal, and parietal association cortices of the human brain. The central division of cognitive labor is between two fronto-parietal bilateral networks: (a) the multiple demand (MD) network, which supports executive processes, such as working memory and cognitive control, and is engaged by diverse task domains, including language, especially when comprehension gets difficult; and (b) the default mode network (DMN), which supports introspective processes, such as mind wandering, and is active when we are not engaged in processing external stimuli. These two networks are strongly dissociated in both their functional profiles and their patterns of activity fluctuations during naturalistic cognition. Here, we focus on the functional relationship between these two networks and a third network: (c) the fronto-temporal left-lateralized “core” language network, which is selectively recruited by linguistic processing. Is the language network distinct and dissociated from both the MD network and the DMN, or is it synchronized and integrated with one or both of them? Recent work has provided evidence for a dissociation between the language network and the MD network. However, the relationship between the language network and the DMN is less clear, with some evidence for coordinated activity patterns and similar response profiles, perhaps due to the role of both in semantic processing. Here we use a novel fMRI approach to examine the relationship among the three networks: we measure the strength of activations in different language, MD, and DMN regions to functional contrasts typically used to identify each network, and then test which regions co-vary in their contrast effect sizes across 60 individuals. We find that effect sizes correlate strongly within each network (e.g., one language region and another language region, or one DMN region and another DMN region), but show little or no correlation for region pairs across networks (e.g., a language region and a DMN region). Thus, using our novel method, we replicate the language/MD network dissociation discovered previously with other approaches, and also show that the language network is robustly dissociated from the DMN, overall suggesting that these three networks contribute to high-level cognition in different ways and, perhaps, support distinct computations. Inter-individual differences in effect sizes therefore do not simply reflect general differences in vascularization or attention, but exhibit sensitivity to the functional architecture of the brain. The strength of activation in each network can thus be probed separately in studies that attempt to link neural variability to behavioral or genetic variability.

Highlights: Is the language network dissociable from the multiple-demand and default mode networks? Novel test: do individual differences in effect size (ES) correlate across regions? Individual differences co-vary within networks much more than between networks. Data-driven support for a triple language/multiple-demand/default mode dissociation. Individual differences in regional ES respect the brain's functional organization.
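The test itself is simple in form: for every pair of regions, correlate per-participant contrast effect sizes across participants, then compare within-network and between-network pairs. The following schematic sketch runs that comparison on simulated data; the region labels, the simulated effect sizes, and the group structure are invented, whereas the real analysis uses 60 participants' fMRI contrast estimates.

```python
import math
import random

rng = random.Random(1)
n_subjects = 60

# Simulated per-subject effect sizes: regions in the same network share a
# subject-level factor, so their effect sizes co-vary across subjects.
networks = {"language": ["LIFG", "LMFG", "LAntTemp"],
            "DMN": ["PCC", "mPFC", "TPJ"]}
subject_factor = {net: [rng.gauss(0, 1) for _ in range(n_subjects)]
                  for net in networks}
effects = {region: [subject_factor[net][s] + rng.gauss(0, 0.5)
                    for s in range(n_subjects)]
           for net, regs in networks.items() for region in regs}

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

within, between = [], []
regions = [(net, r) for net, rs in networks.items() for r in rs]
for i, (net_a, a) in enumerate(regions):
    for net_b, b in regions[i + 1:]:
        r = pearson(effects[a], effects[b])
        (within if net_a == net_b else between).append(r)

print("mean within-network r :", round(sum(within) / len(within), 2))
print("mean between-network r:", round(sum(between) / len(between), 2))
```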
Cognitive Science | 2018
Kyle Mahowald; Isabelle Dautriche; Edward Gibson; Steven T. Piantadosi
Zipf famously stated that, if natural language lexicons are structured for efficient communication, the words that are used the most frequently should require the least effort. This observation explains the famous finding that the most frequent words in a language tend to be short. A related prediction is that, even within words of the same length, the most frequent word forms should be the ones that are easiest to produce and understand. Using orthography as a proxy for phonetics, we test this hypothesis using corpora of 96 languages from Wikipedia. We find that, across a variety of languages and language families and controlling for length, the most frequent forms in a language tend to be more orthographically well-formed and have more orthographic neighbors than less frequent forms. We interpret this result as evidence that lexicons are structured by language usage pressures to facilitate efficient communication.
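One of the measures here, orthographic neighborhood size, is the number of words reachable from a given form by a single-letter substitution. A minimal sketch for a toy, length-controlled word list appears below; the words and frequency counts are invented, and the study additionally uses an orthotactic well-formedness score and corpora from 96 languages.

```python
# Toy frequency list of length-3 word forms (counts are invented).
freqs = {"cat": 900, "can": 800, "car": 700, "bat": 400,
         "cap": 300, "fog": 90, "elk": 40, "ilk": 10}

def neighbors(word, lexicon):
    """Orthographic neighbors: same-length words differing by exactly one letter."""
    return [w for w in lexicon
            if w != word and len(w) == len(word)
            and sum(a != b for a, b in zip(w, word)) == 1]

# The prediction: holding length constant, higher-frequency forms sit in
# denser neighborhoods than lower-frequency forms.
for word in sorted(freqs, key=freqs.get, reverse=True):
    print(word, freqs[word], neighbors(word, freqs))
```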
Proceedings of the National Academy of Sciences of the United States of America | 2013
Kyle Mahowald; Edward Gibson
An important question in historical linguistics is whether deep genetic relationships exist across language families. Although specific families can be reconstructed back to around 6,000 y ago, Pagel et al. (1) claim that seven Eurasian families arose from a common ancestor 15,000 y ago. Pagel et al. develop a phylogenetic model, starting with a subset of the Swadesh basic word list for seven language families in the Languages of the World Etymological Database, which lists reconstructed proto-words and cognates. Because these reconstructions are potentially unreliable, Pagel et al. treat each reconstructed cognate pair as a binary random variable. They find a robust correlation between the size of the cognate class and the word replacement rate (i.e., how fast the word is likely to be replaced in the vocabulary), which is closely related to frequency. As predicted, words with a slower replacement rate show deeper relationships across language families, which they take as evidence that there are deep relationships among the seven families.