John Nerbonne | Researchain

Publication

Featured research published by John Nerbonne.

Computational Linguistics | 1999

The MIT encyclopedia of the cognitive sciences

John Nerbonne

The MIT Encyclopedia of the Cognitive Sciences (MITECS) brings together 471 brief articles on a very wide range of topics within cognitive science. The general editors worked with advisory editors in six contributing fields, including Gennaro Chierchia on Linguistics and Language and Michael I. Jordan and Stuart Russell on Computational Intelligence. MITECS opens with excellent overview articles by each of the advisory editors on their fields. The general quality of the contributors and their contributions is outstanding. The editors secured the cooperation of leading scientists in every area including computational linguistics. Tables 1 and 2 suggest articles that are of particular interest in computational linguistics. The division into sections is that of the book’s companion Web site; in the printed volume, the articles are arranged in a single alphabetical sequence. There are many other articles of immediate interest, including several on grammar models popular in CL (Mark Steedman on Categorial Grammar, Georgia Green on Head-Driven Phrase Structure Grammar, and Mary Dalrymple on Lexical Functional Grammar) and several that treat computational simulations of psycholinguistic phenomena. Dennis Norris’s general article on computational psycholinguistics focuses nicely on interdisciplinary issues; it motivates why a range of computational models remain interesting within cognitive science when language is the subject of investigation.
This is a fully-searchable, complete text of the MIT Encyclopedia of the Cognitive Sciences (MITECS) on a dual-platform CD-ROM. Since the 1970s the cognitive sciences have offered multidisciplinary ways of understanding the mind and cognition. The MIT Encyclopedia of the Cognitive Sciences (MITECS) is a landmark, comprehensive reference work that represents the methodological and theoretical diversity of this changing field.
At the core of the encyclopedia are 471 concise entries, from Acquisition and Adaptationism to Wundt and X-bar Theory. Each article, written by a leading researcher in the field, provides an accessible introduction to an important concept in the cognitive sciences, as well as references or further readings. Six extended essays, which collectively serve as a roadmap to the articles, provide overviews of each of six major areas of cognitive science: Philosophy; Psychology; Neurosciences; Computational Intelligence; Linguistics and Language; and Culture, Cognition, and Evolution. For both students and researchers, MITECS will be an indispensable guide to the current state of the cognitive sciences. System requirements: Compatible with Windows 95, Windows NT (16MB of RAM available to Acrobat Reader; 10MB hard-disk space); Windows 3.1 and 3.11 for Workgroups (12MB hard-disk space); Macintosh and Power Macintosh (8MB of RAM available to Acrobat Reader, Apple System Software version 7.1.2 or later, and 12.5MB hard-disk space).

Inheritance, defaults and the lexicon | 1994

Feature-based inheritance networks for computational lexicons

Hans-Ulrich Krieger; John Nerbonne

The virtues of viewing the lexicon as an inheritance network are its succinctness and its tendency to highlight significant clusters of linguistic properties. From its succinctness follow two practical advantages, namely its ease of maintenance and modification. In this paper we present a feature-based foundation for lexical inheritance. We argue that the feature-based foundation is both more economical and expressively more powerful than non-feature-based systems. It is more economical because it employs only mechanisms already assumed to be present elsewhere in the grammar (viz., in the feature system), and it is more expressive because feature systems are more expressive than other mechanisms used in expressing lexical inheritance (cf. DATR). The lexicon furthermore allows the use of default unification, based on the ideas defined by Bouma. These claims are buttressed in sections sketching the opportunities for lexical description in feature-based lexicons in two central lexical topics, inflection and derivation. Briefly, we argue that the central notion of paradigm may be defined in feature structures, and that it may be more satisfactorily (in fact, immediately) linked to the syntactic information in this fashion. Our discussion of derivation is more programmatic; but here, too, we argue that feature structures of a suitably rich sort provide a foundation for the definition of lexical rules. We illustrate theoretical claims in application to German lexis. This work is currently under implementation in a natural language understanding effort (DISCO) at the German Artificial Intelligence Center (Deutsches Forschungszentrum für Künstliche Intelligenz).

Language and Linguistics Compass | 2009

Data-Driven Dialectology

John Nerbonne

Most studies of language variation proceed from the geographic or social distribution of single elements (features), and find it difficult to proceed further. Data-driven dialectology, and more generally, data-driven variationist studies, begin instead from an aggregate view of language variation and reap immediate benefits in dealing with well-known exceptions in the distributions of single features and in avoiding the need to select which features to use as the basis of characterizations. But the major advance is the opportunity to characterize general tendencies in linguistic variation.

Language Variation and Change | 2001

Dialect Areas and Dialect Continua

Wilbert Heeringa; John Nerbonne

The organising concept behind dialect variation is still seen predominantly as realized by the areas within which similar varieties are spoken. The opposing view, that dialects are organised in a continuum without sharp boundaries, is likewise popular. This paper introduces a new element into this traditional discussion, the opportunity to view dialectal differences in the aggregate. We employ a dialectometric technique which provides an additive measure of pronunciation difference, the (aggregate) pronunciation distance. This allows us to determine how much of the linguistic variation we find is accounted for by geography – between 65% and 81% in our sample of 27 Dutch towns and villages, a fact which lends credence to the continuum view. The borders of well-established dialect areas nonetheless show large deviations from the expected aggregate pronunciation distance. We pay particular attention to a puzzle about the subjective perception of continua introduced by Chambers and Trudgill, who consider a traveller walking in a straight line and noticing successive small changes as he walks from village to village, but seldom, if ever, large differences. This sounds like a justification of the continuum view, but there is an added twist: might the traveller be misled by the perspective of most recent memory? We shall use the Chambers-Trudgill puzzle to organise this paper at several points.
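
One way to read "how much of the linguistic variation is accounted for by geography" is as the r² of a simple regression of pronunciation distance on geographic distance over pairs of sites. The sketch below shows that reading only; it is not the authors' procedure, the function name variance_explained is hypothetical, and any transformation of the geographic distances (for example a logarithm) is left to the caller.

```python
def variance_explained(geo_dist, ling_dist):
    """Return r^2 of a simple linear regression of ling_dist on geo_dist.

    Both arguments are equal-length sequences with one value per pair of
    sites. This is a generic textbook statistic, assumed here for
    illustration rather than taken from the paper.
    """
    n = len(geo_dist)
    if n != len(ling_dist) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_g = sum(geo_dist) / n
    mean_d = sum(ling_dist) / n
    # Sums of cross-products and squares around the means.
    sxy = sum((g - mean_g) * (d - mean_d) for g, d in zip(geo_dist, ling_dist))
    sxx = sum((g - mean_g) ** 2 for g in geo_dist)
    syy = sum((d - mean_d) ** 2 for d in ling_dist)
    if sxx == 0 or syy == 0:
        raise ValueError("distances must not all be identical")
    # Squared Pearson correlation equals r^2 for a one-predictor regression.
    return (sxy * sxy) / (sxx * syy)
```

Called on per-pair distances from a real sample, a value near 1 would indicate that geography accounts for most of the aggregate variation, which is the pattern the continuum view predicts.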

PLOS ONE | 2011

Quantitative Social Dialectology: Explaining Linguistic Variation Geographically and Socially

Martijn Wieling; John Nerbonne; R. Harald Baayen

In this study we examine linguistic variation and its dependence on both social and geographic factors. We follow dialectometry in applying a quantitative methodology and focusing on dialect distances, and social dialectology in the choice of factors we examine in building a model to predict word pronunciation distances from the standard Dutch language to 424 Dutch dialects. We combine linear mixed-effects regression modeling with generalized additive modeling to predict the pronunciation distance of 559 words. Although geographical position is the dominant predictor, several other factors emerged as significant. The model predicts a greater distance from the standard for smaller communities, for communities with a higher average age, for nouns (as contrasted with verbs and adjectives), for more frequent words, and for words with relatively many vowels. The impact of the demographic variables, however, varied from word to word. For a majority of words, larger, richer and younger communities are moving towards the standard. For a smaller minority of words, larger, richer and younger communities emerge as driving a change away from the standard. Similarly, the strength of the effects of word frequency and word category varied geographically. The peripheral areas of the Netherlands showed a greater distance from the standard for nouns (as opposed to verbs and adjectives) as well as for high-frequency words, compared to the more central areas. Our findings indicate that changes in pronunciation have been spreading (in particular for low-frequency words) from the Hollandic center of economic power to the peripheral areas of the country, meeting resistance that is stronger wherever, for well-documented historical reasons, the political influence of Holland was reduced. Our results are also consistent with the theory of lexical diffusion, in that distances from the Hollandic norm vary systematically and predictably on a word by word basis.

Proceedings of the Workshop on Linguistic Distances | 2006

Evaluation of String Distance Algorithms for Dialectology

Wilbert Heeringa; Peter Kleiweg; Charlotte Gooskens; John Nerbonne

We examine various string distance measures for suitability in modeling dialect distance, especially its perception. We find measures superior which do not normalize for word length, but which are sensitive to order. We likewise find evidence for the superiority of measures which incorporate a sensitivity to phonological context, realized in the form of n-grams, although we cannot identify which form of context (bigram, trigram, etc.) is best. However, we find no clear benefit in using gradual as opposed to binary segmental difference when calculating sequence distances.
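
To make the contrasts in this abstract concrete, the sketch below shows a plain Levenshtein distance with binary segment differences, a length-normalized variant, and a bigram representation that adds context sensitivity. It is a generic textbook formulation, not the authors' implementation; the function names, the "#" boundary symbol, and normalization by the longer string are all assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance with unit costs; a and b may be strings or lists."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1  # binary segment difference
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_levenshtein(a, b):
    """Edit distance divided by the longer length (one possible normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def bigrams(s):
    """Overlapping bigrams of s, padded with '#' at both word boundaries."""
    padded = "#" + s + "#"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]
```

Because levenshtein accepts any sequence, the context-sensitive variant is simply levenshtein(bigrams(x), bigrams(y)): a single changed segment then affects two bigrams instead of one unit.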

Journal of Quantitative Linguistics | 2007

Toward a Dialectological Yardstick

John Nerbonne; Peter Kleiweg

Dialectometry measures the differences between dialects in ways which may involve many independently varying parameters which must be specified in combination in order to arrive at measures of difference. The existence of many parameters of measurement and their possible interaction introduces the problem of how to choose parameter values and combinations of them intelligently. This paper proceeds from the assumption that dialectology proper must reveal geographic coherence in language variation in order to propose a yardstick with which to compare measurements made using various parameter settings, and it presents some results of its application.

Computational Linguistics | 1992

Inheritance and complementation: a case study of easy adjectives and related nouns

Daniel Flickinger; John Nerbonne

Mechanisms for representing lexically the bulk of syntactic and semantic information for a language have been under active development, as is evident in the recent studies contained in this volume. Our study serves to highlight some of the most useful tools available for structured lexical representation, in particular (multiple) inheritance, default specification, and lexical rules. It then illustrates the value of these mechanisms in illuminating one corner of the lexicon involving an unusual kind of complementation among a group of adjectives exemplified by easy. The virtues of the structured lexicon are its succinctness and its tendency to highlight significant clusters of linguistic properties. From its succinctness follow two practical advantages, namely its ease of maintenance and modification. In order to suggest how important these may be practically, we extend the analysis of adjectival complementation in several directions. These further illustrate how the use of inheritance in lexical representation permits exact and explicit characterizations of phenomena in the language under study. We demonstrate how the use of the mechanisms employed in the analysis of easy enables us to give a unified account of related phenomena featuring nouns such as pleasure, and even the adverbs (adjectival specifiers) too and enough. Along the way we motivate some elaborations of the HPSG (head-driven phrase structure grammar) framework in which we couch our analysis, and offer several avenues for further study of this part of the English lexicon.

Philosophical Transactions of the Royal Society B | 2010

Measuring the diffusion of linguistic change

John Nerbonne

We examine situations in which linguistic changes have probably been propagated via normal contact as opposed to via conquest, recent settlement and large-scale migration. We proceed then from two simplifying assumptions: first, that all linguistic variation is the result of either diffusion or independent innovation, and, second, that we may operationalize social contact as geographical distance. It is clear that both of these assumptions are imperfect, but they allow us to examine diffusion via the distribution of linguistic variation as a function of geographical distance. Several studies in quantitative linguistics have examined this relation, starting with Séguy (Séguy 1971 Rev. Linguist. Romane 35, 335–357), and virtually all report a sublinear growth in aggregate linguistic variation as a function of geographical distance. The literature from dialectology and historical linguistics has mostly traced the diffusion of individual features, however, so that it is sensible to ask what sort of dynamic in the diffusion of individual features is compatible with Séguy's curve. We examine some simulations of diffusion in an effort to shed light on this question.
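
A simple check for "sublinear growth" is to fit an exponent b in d ≈ a · g^b by least squares on logarithms and see whether b falls below 1. The sketch below does only that. The power-law form is an illustrative assumption (the studies cited may use other functional forms, such as a logarithmic curve), and the function name growth_exponent is hypothetical.

```python
import math


def growth_exponent(geo_dist, ling_dist):
    """Estimate b in ling_dist ~ a * geo_dist**b via a log-log least-squares fit.

    All distances must be strictly positive. An exponent below 1 indicates
    sublinear growth under the assumed power-law form.
    """
    if len(geo_dist) != len(ling_dist) or len(geo_dist) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    log_g = [math.log(g) for g in geo_dist]
    log_d = [math.log(d) for d in ling_dist]
    mean_g = sum(log_g) / len(log_g)
    mean_d = sum(log_d) / len(log_d)
    # Slope of the regression line in log-log space is the exponent b.
    num = sum((x - mean_g) * (y - mean_d) for x, y in zip(log_g, log_d))
    den = sum((x - mean_g) ** 2 for x in log_g)
    if den == 0:
        raise ValueError("geographic distances must not all be identical")
    return num / den
```

For distances that grow like the square root of geographic distance, the fitted exponent comes out close to 0.5, well inside the sublinear range.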

Computers and The Humanities | 2003

Lexical Distance in LAMSAS

John Nerbonne; Peter Kleiweg

The Linguistic Atlas of the Middle and South Atlantic States (LAMSAS) is admirably accessible for reanalysis (see http://hyde.park.uga.edu/lamsas/, Kretzschmar, 1994). The present paper applies a lexical distance measure to assess the lexical relatedness of LAMSAS's sites, a popular focus of investigation in the past (Kurath, 1949; Carver, 1989; McDavid, 1994). Several conclusions are noteworthy: First, and least controversially, we note that LAMSAS is dialectometrically challenging at least due to the range of fieldworkers and questionnaires employed. Second, on the issue of which areas ought to be recognized, we note that our investigations tend to support a three-way North/South/Midlands division rather than a two-way North/South division, i.e. they tend to support Kurath and McDavid rather than Carver, but this tendency is not conclusive. Third, we extend dialectometric technique in suggesting means of dealing with alternate forms and multiple responses.
