Reinhard Köhler
University of Trier
Publications
Featured research published by Reinhard Köhler.
Journal of Quantitative Linguistics | 1994
Gejza Wimmer; Reinhard Köhler; Rüdiger Grotjahn; Gabriel Altmann
A method for modeling word length distributions and different models are presented. The compound Poisson and Ord family of distributions seems to be adequate. The relationship of word length to other language phenomena is discussed.
Glottotheory | 2009
Gabriel Altmann; Peter Grzybek; Bijapur D. Jayaram; Reinhard Köhler; Viktor Krupa; Ján Macutek; Regina Pustet; Ludmilla Uhlirová; Matummal N. Vidya; Ioan-Iovitz Popescu
Word frequency plays a prominent role in many scientific and applicational fields. The book presents innovative methods in research and new results important for language and text characterization. Based on a general theory, surprising interrelations are shown between word frequency and other linguistic properties. Interrelations between previously known methods and new characteristics such as the h-point and other measures developed in the book are investigated. Furthermore, new statistical tests are introduced.
Archive | 2012
Reinhard Köhler; Gabriel Altmann
This is the first book to bring together the fields of theoretical and empirical studies in syntax on the one hand and the methodology of quantitative linguistics on the other hand. An introduction to the aims and methods of the quantitative approach to linguistics in general and to syntax in particular prepares the reader for the following chapters, which cover measurement and data acquisition methods and the most common mathematical models for the analysis of syntactic and syntagmatic material. Various examples illustrate how these models are applied and show the corresponding results.
Journal of Quantitative Linguistics | 2000
Reinhard Köhler; Gabriel Altmann
In Köhler (1999), an attempt was made to set up a basic functional-analytic model of a syntactic subsystem in the framework of synergetic linguistics. In that paper, functional dependencies among selected properties, viz. frequency, complexity, length, depth of embedding, and information, and the quantities polyfunctionality and synfunctionality are postulated, derived, and empirically tested on data from the Susanne corpus (Sampson, 1995). The analysis of the probability distributions of the quantities under consideration was postponed and will be tackled in the present study. It will be shown that the properties of syntactic constructions are lawfully distributed according to only a few distributions which belong to a common family of probability distributions, and that hypotheses can be set up from which the corresponding distributions can be derived, thus explaining the empirical findings. The empirical database is extended by another language, viz. by data from the German Negra-Korpus (Brants, 1999, p. 102). The empirical tests yield results which are compatible with the hypotheses. Syntactic constructions and categories were considered as basic units. In the case of the Susanne corpus, the clause, phrase, and word class tags were evaluated as operationalizations of these units, and in the case of the Negra-Korpus, all node tags and word tags.
Archive | 2007
Peter Grzybek; Reinhard Köhler
The collection contains more than 60 original papers and reflects current research topics in linguistics and text analysis. Most of the papers present recent results of empirical quantitative investigations; others focus on methodological issues, whereas some of them are of a more theoretical, systems-theoretical/semiotic character. Finally, a number of contributions form typical integrative deductive-inductive studies. The volume is a valuable source of information about the current state-of-the-art in quantitative linguistic research, presented by renowned representatives of the field.
Journal of Quantitative Linguistics | 1997
Reinhard Köhler
There is an increasing interest in questions of determining units in linguistics and musicology which can be used for measuring quantitative properties such as length, frequency, and complexity. Up to now, however, their discussion has been restricted to problems of the definition of basic elements in the particular field of research, of their features, of methods for their identification and segmentation in texts, etc. The present contribution considers the question of linguistic units of measurement. Is the kind of formula as used in quantitative linguistics incomplete and incorrect just as a large part of the equations in physics would be if the units (e.g., meter, kilogram, hour, newton, or farad) were omitted? If not, all linguistic quantities introduced so far would be dimensionless, and hence every dependence investigated and every law would be. We will show that this is the case and consider the consequences of this result for relations between elements on different levels of linguistic a...
Journal of Quantitative Linguistics | 2009
Relja Vulanović; Reinhard Köhler
Parts-of-speech systems, as defined by Hengeveld et al. in their 2004 article on parts-of-speech systems and word order, form the framework for this article, in which we investigate how the distribution of languages with respect to certain linguistic features is related to the number of propositional functions and the number of lexeme classes. The linguistic features are the presence or absence of fixed word order and markers, which may disambiguate between different propositional functions. We show that the relation can be modelled by a three-dimensional generalization of the sigmoid, for which we provide theoretical justification.
Physics of Life Reviews | 2014
Reinhard Köhler
We have long been used to the domination of qualitative methods in modern linguistics. Indeed, qualitative methods have advantages such as ease of use and wide applicability to many types of linguistic phenomena. However, this shall not overshadow the fact that a great part of human language is amenable to quantification. Moreover, qualitative methods may lead to over-simplification by employing the rigid yes/no scale. When variability and vagueness of human language must be taken into account, qualitative methods will prove inadequate and give way to quantitative methods [1, p. 11]. In addition to such advantages as exactness and precision, quantitative concepts and methods make it possible to find laws of human language which are just like those in natural sciences. These laws are fundamental elements of linguistic theories in the spirit of the philosophy of science [2,3]. Theorization effort of this type is what quantitative linguistics [1,4,5] is devoted to. The review of Cong and Liu [6] has provided an informative and insightful survey of linguistic complex networks as a young field of quantitative linguistics, including the basic concepts and measures, the major lines of research with linguistic motivation, and suggestions for future research. Complex linguistic networks provide a new approach to linguistic quantification, with different models and data types from those applied in traditional quantitative linguistics. With traditional methods of quantitative linguistics, we deal with plain texts of human language or those annotated with particular purposes (such as syntactic treebanks). The complex network approach to the actual use of human language, on the other hand, examines “networked” texts, with linguistic units in them as vertices and their relations of a particular type as edges. Data types often analyzed in quantitative linguistics are attribute data [7, p. 2], i.e., those concerning properties of individual linguistic units such as frequency, length and degree of polysemy. As far as this kind of properties of individual units is concerned, mathematical methods applied in quantitative linguistics belong to the apparatus of probability theory and statistics in the form of theoretical probability and frequency distributions. Relational data describe at least two properties of units, e.g. the relation between frequency and length or the relation between polysemy and polytextuality of units. The data appropriate to network analysis of human language, on the other hand, are generally relational data [7, p. 3] in a highly condensed form. The network models and relational data aim at probing into the collective
Archive | 2015
Gabriel Altmann; Reinhard Köhler
The volume presents objective methods to detect and analyse various forms of repetitions. Repetition of textual elements is more than a superficial phenomenon. It may even be considered as constitutive for units and relations in a text: on a primary level when no other way exists to establish a unit, and on a secondary, artistic level, where repetition is a consequence of the transfer of the equivalence principle.
Archive | 2013
Reinhard Köhler
The notion of comparable corpora implies the notion of comparability. The present paper aims at explicating this notion with respect to statistical methods because statistical comparison requires the use of statistical tests, which again require certain properties of the data under analysis. Linguistic data, however, do not automatically meet these requirements. In corpus linguistics and other linguistic fields, statistical methods are often applied without any previous check of their applicability. The paper will give some warnings and show some examples of corresponding test procedures. A number of other frequently used terms and concepts, such as representativeness, homogeneity, and balanced corpora, play a central role in corpus-linguistic argumentations and will be analysed in the paper, too, as they concern compilation and use of comparable corpora.