F.A. Grootjen | Researchain

Publication

Featured research published by F.A. Grootjen.

Data and Knowledge Engineering | 2006

Conceptual query expansion

F.A. Grootjen

This article presents a new, hybrid approach that projects an initial query result onto global information, yielding a local conceptual overview. Concepts found this way are candidates for query refinement. We show that the conceptual structure resulting from a typical short query of two terms contains refinements that perform just as well as a most accurate query formulation. Subsequently, we illustrate that query by navigation is an effective mechanism which in most cases finds the optimal concept in a small number of steps. When an optimal concept is not found, the navigation process still finds an acceptable sub-optimum.

Bilingualism: Language and Cognition | 2012

Distributions of cognates in Europe as based on Levenshtein distance

Job Schepens; Ton Dijkstra; F.A. Grootjen

Researchers on bilingual processing can benefit from computational tools developed in artificial intelligence. We show that a normalized Levenshtein distance function can efficiently and reliably simulate bilingual orthographic similarity ratings. Orthographic similarity distributions of cognates and non-cognates were identified across pairs of six European languages: English, German, French, Spanish, Italian, and Dutch. Semantic equivalence was determined using the conceptual structure of a translation database. By using a similarity threshold, large numbers of cognates could be selected that nearly completely included the stimulus materials of experimental studies. The identified numbers of form-similar and identical cognates correlated highly with branch lengths of phylogenetic language family trees, supporting the usefulness of the new measure for cross-language comparison. The normalized Levenshtein distance function can be considered as a new formal model of cross-language orthographic similarity.
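As a rough illustration of the idea in this abstract, the sketch below computes a length-normalized Levenshtein similarity between two word forms. The normalization (one minus the edit distance divided by the length of the longer word), the function names, and the threshold value of 0.5 are assumptions made for this example; the abstract does not state the exact formula or threshold the authors used.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; assumed normalization by the longer word."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(a, b) / longest


def is_cognate_candidate(a: str, b: str, threshold: float = 0.5) -> bool:
    """Hypothetical threshold test on translation-equivalent word pairs."""
    return normalized_similarity(a.lower(), b.lower()) >= threshold
```

In this sketch, a pair such as 'Flamme' and 'flamme' scores 1.0 after lowercasing, while 'flamme' and 'vlam' needs three edits over six characters and scores 0.5. Semantic equivalence is not checked here; in the abstract it comes from a translation database.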

PLOS ONE | 2013

Cross-language distributions of high frequency and phonetically similar cognates

Job Schepens; Ton Dijkstra; F.A. Grootjen; Walter J. B. van Heuven

The coinciding form and meaning similarity of cognates, e.g. ‘flamme’ (French), ‘Flamme’ (German), ‘vlam’ (Dutch), meaning ‘flame’ in English, facilitates learning of additional languages. The cross-language frequency and similarity distributions of cognates vary according to evolutionary change and language contact. We compare frequency and orthographic (O), phonetic (P), and semantic similarity of cognates, automatically identified in semi-complete lexicons of six widely spoken languages. Comparisons of P and O similarity reveal inconsistent mappings in language pairs with deep orthographies. The frequency distributions show that cognate frequency is reduced in less closely related language pairs as compared to more closely related languages (e.g., French-English vs. German-English). These frequency and similarity patterns may support a better understanding of cognate processing in natural and experimental settings. The automatically identified cognates are available in the supplementary materials, including the frequency and similarity measurements.

Multimedia Tools and Applications | 2014

Requirements for multimedia metadata schemes in surveillance applications for security

Jeroen van Rest; F.A. Grootjen; Marc Grootjen; Remco Wijn; Olav Aarts; M.L. Roelofs; Gertjan J. Burghouts; Henri Bouma; Lejla Alic; Wessel Kraaij

Surveillance for security requires communication between systems and humans, involves behavioural and multimedia research, and demands an objective benchmarking for the performance of system components. Metadata representation schemes are extremely important to facilitate (system) interoperability and to define ground truth annotations for surveillance research and benchmarks. Surveillance places specific requirements on these metadata representation schemes. This paper offers a clear and coherent terminology, and uses this to present these requirements and to evaluate them in three ways: their fitness in breadth for surveillance design patterns, their fitness in depth for a specific surveillance scenario, and their realism on the basis of existing schemes. It is also validated that no existing metadata representation scheme fulfils all requirements. Guidelines are offered to those who wish to select or create a metadata scheme for surveillance for security.

Systems, Man and Cybernetics | 2002

Conceptual relevance feedback

F.A. Grootjen; Th.P. van der Weide

Formulating a query is not an easy task. Web search engines observe users spending large amounts of time reformulating their queries to accomplish effective retrieval [1]. Precise query formulation is difficult. Do I know what I am looking for? This often neglected aspect of information retrieval can be best explained by the fact that an information need is created by a knowledge gap. This gap can range from being fairly specific to very broad. During the search process users may learn things about their knowledge gap and may even discover aspects of this gap they were initially not aware of [6]. Search methods like Query By Navigation [3] may help users to find out what they need. How do I formulate what I am looking for? As in human dialogs, the participants must know each other's language and somehow predict the impact of the words they use. The same holds for query formulation. Good query formulation requires that a user can somehow predict which terms appear in documents relevant to the information need. Accurate term prediction requires extensive knowledge about the document collection. Such knowledge may be hard to obtain, especially in large document collections. Experiments show that users usually submit short (one or two word) queries that result in large, inaccurate document sets, apparently preferring recall over precision. Relevance feedback, introduced over 30 years ago, is a well-known approach to deal with this problem. This method treats the user's first query as an initial attempt: a rough representation of the user's information need, hopefully covering (part of) the knowledge gap. The documents resulting from this initial query (the initial set) may be analyzed for relevance, to get an impression of the document collection, and used to formulate a new, improved query. Usually query reformulation methods are grouped in three categories:
1. User feedback approaches. A drawback of this approach is that users are not inclined to provide this feedback. There is no point in blaming the user for this: providing feedback might not be cost-effective.
2. Local approaches, based on information obtained from the initial set of documents.
3. Global approaches that incorporate knowledge of the document collection.
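To make the local category concrete, here is a minimal sketch of query reformulation from the initial set. It is not the conceptual method of the paper: it simply adds the terms that occur in the most documents of the initial set, a plain frequency heuristic chosen for illustration. The function name `expand_query`, the whitespace tokenization, and the `extra_terms` parameter are all assumptions of this example.

```python
from collections import Counter


def expand_query(query_terms, initial_set, extra_terms=2):
    """Append the most widespread new terms from the initial result set.

    query_terms: list of terms in the user's first query.
    initial_set: list of document texts returned for that query.
    extra_terms: how many expansion terms to add (illustrative default).
    """
    document_frequency = Counter()
    for document in initial_set:
        # Count each term once per document (naive whitespace tokens).
        document_frequency.update(set(document.lower().split()))

    known = {term.lower() for term in query_terms}
    candidates = [
        (term, count)
        for term, count in document_frequency.items()
        if term not in known
    ]
    # Highest document frequency first; ties broken alphabetically.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return list(query_terms) + [term for term, _ in candidates[:extra_terms]]
```

A real system would at least remove stop words and weight terms against the whole collection, which is where the global category comes in.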

Applications of Natural Language to Data Bases | 2004

Effectiveness of index expressions

F.A. Grootjen; T. P. van der Weide

The quest for improving retrieval performance has led to the deployment of larger syntactical units than just plain words. This article presents a retrieval experiment that compares the effectiveness of two unsupervised language models which generate terms that exceed the word boundary. In particular, this article tries to show that index expressions provide, beside their navigational properties, a good way to capture the semantics of inter-word relations and by doing so, form an adequate base for information retrieval applications.

International Conference on Frontiers in Handwriting Recognition | 2008