Publication


Featured research published by Tom Ruette.


American Speech | 2013

Site-restricted web searches for data collection in regional dialectology

Jack Grieve; Costanza Asnaghi; Tom Ruette

This article presents a new method for data collection in regional dialectology based on site-restricted web searches. The method measures the usage and determines the distribution of lexical variants across a region of interest using common web search engines, such as Google or Bing. The method involves estimating the proportions of the variants of a lexical alternation variable over a series of cities by counting the number of webpages that contain the variants on newspaper websites originating from these cities through site-restricted web searches. The method is evaluated by mapping the 26 variants of 10 lexical variables with known distributions in American English. In almost all cases, the maps based on site-restricted web searches align closely with traditional dialect maps based on data gathered through questionnaires, demonstrating the accuracy of this method for the observation of regional linguistic variation. However, unlike collecting dialect data using traditional methods, which is a relatively slow process, the use of site-restricted web searches allows for dialect data to be collected from across a region as large as the United States in a matter of days.
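The proportion estimate described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `build_query` and `variant_proportions`, the newspaper domain, the city names, and all page counts are hypothetical, and no search engine is actually queried. The only real feature relied on is the `site:` operator supported by common web search engines.

```python
from typing import Dict


def build_query(variant: str, newspaper_domain: str) -> str:
    """Compose a site-restricted query string using the `site:` operator."""
    return f'"{variant}" site:{newspaper_domain}'


def variant_proportions(hit_counts: Dict[str, int]) -> Dict[str, float]:
    """Turn page counts per variant into proportions for one city."""
    total = sum(hit_counts.values())
    if total == 0:
        # No pages found for any variant: report zero for each of them.
        return {variant: 0.0 for variant in hit_counts}
    return {variant: count / total for variant, count in hit_counts.items()}


# Hypothetical page counts for one lexical alternation variable in two cities.
hits_by_city = {
    "City A": {"soda": 120, "pop": 30},
    "City B": {"soda": 20, "pop": 180},
}

# One proportion per variant per city; these values are what would be mapped.
proportions_by_city = {
    city: variant_proportions(hits) for city, hits in hits_by_city.items()
}
```

In a real study the counts would come from running each query against newspaper websites based in each city, and the resulting proportions would then be compared with traditional dialect maps.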


Archive | 2014

Lexical variation in aggregate perspective

Tom Ruette; Dirk Speelman; Dirk Geeraerts

The current paper shows how a sociolectometric approach is needed to disentangle the multidimensional structure of the varieties in a pluricentric language. There are different sociolectometric approaches, i.e. corpus-based methods, perception experiments, or attitude questionnaires. Although the focus of a sociolectometric approach is on the varieties, the choice of the variables under analysis is crucial; we focus on lexical variation. Furthermore, in this paper we compare two quantitative corpus-based methods, which differ in their conceptual control of lexical variables: on the one hand, we take a method that ignores the conceptual relationship between the lexemes in the variable set; on the other hand, there is a method that incorporates knowledge about conceptual identity between lexemes. The importance and difficulties of conceptual control when studying variation in the lexicon as a whole are shown by means of a case-study on the pluricentric language Dutch. The pluricentric character of Dutch is now widely accepted: Dutch is used both in Belgium and in the Netherlands, but each nation has its own norm-generating center (cf. Clyne, 1992). This is different from the imposed situation in earlier years, especially the sixties, where Dutch in Belgium was supposed to be exogenically modeled on the norms of the Netherlands. Recently, by means of empirical work of e.g. Geeraerts et al. (1999) and experimental work of e.g. Impe et al. (2008), this historical view had to be adjusted to the current view, as described in Auer (2005). Rather than providing further empirical proof of the pluricentric character of the Dutch lexicon, the case-study aims to show the pertinence of a sociolectometric methodology that can aggregate patterns of non-categorical lexical variation while incorporating an appropriate amount of conceptual control, in contrast to a methodology that discards any conceptual knowledge.
As such, the study touches upon two general issues in the broader field of variationist linguistics: on the level of words, we look at the problematic status of lexical variation and the difficulty of delineating word meaning; on the level of structure, we run


Archive | 2016

Frequency effects in lexical sociolectometry are insubstantial

Tom Ruette; Katharina Ehret; Benedikt Szmrecsanyi

This contribution investigates frequency effects in lexical sociolectometry, and explores by way of a case study variation in written English as sampled in the well-known Brown family of corpora. Lexical sociolectometry is a productive research paradigm that is concerned with studying aggregate lexical distances between varieties of a language. Lexical distance quantifies the extent to which different varieties use different labels to describe the same concept. If different labels are used in different varieties, then this will increase the lexical distance between the varieties. We aggregate over many different concepts, in order to make generalizable claims about the distance between varieties, independently of a specific concept. Our central question is, “When generalizing across concepts, does concept frequency play a role in the aggregation?” To answer this question, we examine three types of frequency weighting: (i) boosting low-frequency concepts, (ii) boosting high-frequency concepts, and (iii) no frequency weighting at all, and investigate whether they have an effect on the aggregation. We find no such frequency effect, and discuss reasons for this absence in lexical sociolectometry.
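The three weighting schemes named in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the per-concept distance is assumed here to be half the city-block distance between the two varieties' label profiles, the weights are assumed to be the raw concept frequency (boosting high-frequency concepts) or its reciprocal (boosting low-frequency concepts), and the function names and toy data are hypothetical.

```python
from typing import Dict, List, Tuple

Profile = Dict[str, float]  # label -> relative frequency within one concept
Concept = Tuple[float, Profile, Profile]  # (concept frequency, variety A, variety B)


def profile_distance(p: Profile, q: Profile) -> float:
    """Assumed per-concept distance: half the city-block distance, in [0, 1]."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in labels)


def aggregate_distance(concepts: List[Concept], weighting: str = "none") -> float:
    """Weighted mean of per-concept distances under one weighting scheme."""
    weighted_sum = 0.0
    weight_total = 0.0
    for frequency, profile_a, profile_b in concepts:
        if weighting == "high":    # boost high-frequency concepts
            weight = frequency
        elif weighting == "low":   # boost low-frequency concepts
            weight = 1.0 / frequency
        elif weighting == "none":  # every concept counts equally
            weight = 1.0
        else:
            raise ValueError(f"unknown weighting: {weighting}")
        weighted_sum += weight * profile_distance(profile_a, profile_b)
        weight_total += weight
    return weighted_sum / weight_total


# Toy data (hypothetical): one frequent and one rare concept.
toy_concepts: List[Concept] = [
    (100.0, {"a": 1.0}, {"a": 0.5, "b": 0.5}),  # per-concept distance 0.5
    (10.0, {"c": 1.0}, {"d": 1.0}),             # per-concept distance 1.0
]
results = {w: aggregate_distance(toy_concepts, w) for w in ("low", "high", "none")}
```

The toy values are chosen only to show the mechanics of each scheme; they say nothing about the study's finding, which is that frequency weighting has no substantial effect on the aggregated distances.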


Archive | 2014

Semantic weighting mechanisms in scalable lexical sociolectometry

Tom Ruette; Yves Peirsman; Dirk Speelman; Dirk Geeraerts


ri: variação linguística e dimensões sociocognitivas, 2011, ISBN 978-972-697-201-3, pp. 541-554 | 2011

Measuring the lexical distance between registers in national varieties of Dutch

Dirk Speelman; Dirk Geeraerts; Tom Ruette


Archive | 2013

Register analysis in blogs: Correlation between professional sector and functional dimensions

Jocelyne Daems; Dirk Speelman; Tom Ruette


International Journal of Corpus Linguistics | 2016

A lectometric analysis of aggregated lexical variation in written Standard English with Semantic Vector Space models

Tom Ruette; Katharina Ehret; Benedikt Szmrecsanyi


Proceedings of the 11th International Conference on Textual Data Statistical Analysis | 2012

Applying individual differences scaling to measurements of lexical convergence between Netherlandic and Belgian Dutch

Tom Ruette; Dirk Speelman


Archive | 2017

Social functional linguistic variation in conversational Dutch

Jack Grieve; Tom Ruette; Dirk Speelman; Dirk Geeraerts


Archive | 2016

Interpreting aggregated distances. The case of Old German

Tom Ruette; Dirk Speelman

Collaboration


Dive into Tom Ruette's collaborations.

Top Co-Authors

Dirk Speelman

Katholieke Universiteit Leuven

Dirk Geeraerts

Katholieke Universiteit Leuven

Costanza Asnaghi

Katholieke Universiteit Leuven

Benedikt Szmrecsanyi

Katholieke Universiteit Leuven

Yves Peirsman

Katholieke Universiteit Leuven

Jocelyne Daems

Katholieke Universiteit Leuven
