Jussi Karlgren

Swedish Institute of Computer Science

Publication

Featured research published by Jussi Karlgren.

International Conference on Computational Linguistics | 1994

Recognizing text genres with simple metrics using discriminant analysis

Jussi Karlgren; Douglas R. Cutting

A simple method for categorizing texts into pre-determined text genre categories using the standard statistical technique of discriminant analysis is demonstrated with application to the Brown corpus. Discriminant analysis makes it possible to use a large number of parameters that may be specific to a certain corpus or information stream, and to combine them into a small number of functions, with the parameters weighted on the basis of how useful they are for discriminating text genres. An application to information retrieval is discussed.

Lecture Notes in Computer Science | 2007

ENSM-SE at CLEF 2006: Fuzzy Proximity Method with an Adhoc Influence Function, in Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Alicante, Spain

Carol Peters; Paul D. Clough; Fredric C. Gey; Jussi Karlgren; Bernardo Magnini; Douglas W. Oard; Maarten de Rijke; Maximilian Stempfhuber

This book constitutes the thoroughly refereed postproceedings of the 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, held in Alicante, Spain, September 2006. The revised papers presented together with an introduction were carefully reviewed and selected for inclusion in the book. The papers are organized in topical sections on Multilingual Textual Document Retrieval, Domain-Specific Information Retrieval, i-CLEF, QA@CLEF, ImageCLEF, CLSR, WebCLEF and GeoCLEF.

We experiment with a new influence function in our information retrieval method that uses the degree of fuzzy proximity of key terms in a document to compute the relevance of the document to the query. The model is based on the idea that the closer the query terms in a document are to each other, the more relevant the document. Our model handles Boolean queries but, contrary to the traditional extensions of the basic Boolean information retrieval model, does not use a proximity operator explicitly. A single parameter makes it possible to control the proximity degree required. To improve our system we use a stemming algorithm before indexing, we take a specific influence function and we merge fuzzy proximity result lists built with different widths of the influence function. We explain how we construct the queries and report the results of our experiments in the ad-hoc monolingual French task of the CLEF 2006 evaluation campaign.
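The fuzzy proximity idea in the abstract above can be illustrated with a minimal Python sketch. It is not the ENSM-SE system: the triangular influence function, the use of a pointwise minimum for a conjunctive query, and the names `term_influence`, `fuzzy_and_score`, and `k` are all assumptions made for illustration. The abstract only states that a single parameter controls the required proximity and that closer query terms should yield a higher score.

```python
def term_influence(positions, length, k):
    """Fuzzy proximity of every document position to the nearest term occurrence.

    Assumption: a triangular influence function of half-width k, equal to 1 at an
    occurrence and falling linearly to 0 at distance k.
    """
    return [
        max((max(0.0, (k - abs(x - p)) / k) for p in positions), default=0.0)
        for x in range(length)
    ]


def fuzzy_and_score(doc_tokens, query_terms, k):
    """Score a document for a conjunctive (AND) query of single terms."""
    n = len(doc_tokens)
    influences = []
    for term in query_terms:
        positions = [i for i, tok in enumerate(doc_tokens) if tok == term]
        influences.append(term_influence(positions, n, k))
    # Assumption: AND is the pointwise minimum of the term influences,
    # and the document score is the sum of that minimum over all positions.
    return sum(min(values) for values in zip(*influences))


# Hypothetical toy documents: the query terms are adjacent in one, far apart in the other.
near = "fuzzy proximity ranks documents by term closeness".split()
far = "fuzzy logic is old but ranking by term proximity is newer".split()

score_near = fuzzy_and_score(near, ["fuzzy", "proximity"], k=3)
score_far = fuzzy_and_score(far, ["fuzzy", "proximity"], k=3)
```

With `k=3` the second document scores zero because the influence regions of the two terms never overlap. A larger `k` relaxes the proximity requirement, which is the role the abstract assigns to its single parameter.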

User Interface Software and Technology | 1999

WEST: a Web browser for small terminals

Staffan Björk; Lars Erik Holmquist; Johan Redström; Ivan Bretan; Rolf Danielsson; Jussi Karlgren; Kristofer Franzén

We describe WEST, a WEb browser for Small Terminals, that aims to solve some of the problems associated with accessing web pages on hand-held devices. Through a novel combination of text reduction and focus+context visualization, users can access web pages from a very limited display environment, since the system will provide an overview of the contents of a web page even when it is too large to be displayed in its entirety. To make maximum use of the limited resources available on a typical hand-held terminal, much of the most demanding work is done by a proxy server, allowing the terminal to concentrate on the task of providing responsive user interaction. The system makes use of some interaction concepts reminiscent of those defined in the Wireless Application Protocol (WAP), making it possible to utilize the techniques described here for WAP-compliant devices and services that may become available in the near future.

User Modeling and User-adapted Interaction | 1996

A glass box approach to adaptive hypermedia

Kristina Höök; Jussi Karlgren; Annika Waern; Nils Dahlbäck; Carl Gustaf Jansson; Klas Karlgren; Benoît Lemaire

Utilising adaptive interface techniques in the design of systems introduces certain risks. An adaptive interface is not static, but will actively adapt to the perceived needs of the user. Unless carefully designed, these changes may lead to an unpredictable, obscure and uncontrollable interface. Therefore the design of adaptive interfaces must ensure that users can inspect the adaptivity mechanisms, and control their results. One way to do this is to rely on the user's understanding of the application and the domain, and relate the adaptivity mechanisms to domain-specific concepts. We present an example of an adaptive hypertext help system, POP, which is being built according to these principles, and discuss the design considerations and empirical findings that led to this design.

Archive | 1999

Stylistic experiments for information retrieval

Jussi Karlgren

A discussion of various experiments that use stylistic variation among texts for information retrieval purposes.

Natural Language Engineering | 2005

Automatic bilingual lexicon acquisition using random indexing of parallel corpora

Magnus Sahlgren; Jussi Karlgren

This paper presents a very simple and effective approach to using parallel corpora for automatic bilingual lexicon acquisition. The approach, which uses the Random Indexing vector space methodology, is based on finding correlations between terms based on their distributional characteristics. The approach requires a minimum of preprocessing and linguistic knowledge, and is efficient, fast and scalable. In this paper, we explain how our approach differs from traditional cooccurrence-based word alignment algorithms, and we demonstrate how to extract bilingual lexica using the Random Indexing approach applied to aligned parallel data. The acquired lexica are evaluated by comparing them to manually compiled gold standards, and we report overlap of around 60%. We also discuss methodological problems with evaluating lexical resources of this kind.
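As a rough illustration of the approach described above, the following Python sketch applies a Random Indexing style procedure to a toy aligned corpus. It is a sketch under stated assumptions, not the authors' implementation: the dimensionality, the number of non-zero elements, the toy English-Swedish sentence pairs, and the helper names (`index_vector`, `context_vectors`, `best_translation`) are all hypothetical. Each aligned sentence pair receives one sparse random index vector, every word accumulates the index vectors of the pairs it occurs in, and translation candidates are ranked by cosine similarity.

```python
import math
import random
from collections import defaultdict


def index_vector(dim, nonzeros, rng):
    """Sparse ternary random vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def context_vectors(sentences, index_vectors, dim):
    """Sum, for each word, the index vectors of the aligned pairs it occurs in."""
    vectors = defaultdict(lambda: [0] * dim)
    for sentence, ivec in zip(sentences, index_vectors):
        for word in set(sentence.split()):
            vec = vectors[word]
            for i, value in enumerate(ivec):
                vec[i] += value
    return vectors


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_translation(word, src_vecs, tgt_vecs):
    """Target word whose context vector is most similar to the source word's."""
    return max(tgt_vecs, key=lambda t: cosine(src_vecs[word], tgt_vecs[t]))


# Hypothetical toy data: four aligned English-Swedish sentence pairs.
pairs = [
    ("the cat sleeps", "katten sover"),
    ("the dog sleeps", "hunden sover"),
    ("the cat eats", "katten äter"),
    ("the dog eats", "hunden äter"),
]
DIM, NONZEROS = 64, 4  # assumed values; realistic settings use far more dimensions
rng = random.Random(0)
ivecs = [index_vector(DIM, NONZEROS, rng) for _ in pairs]

src_vecs = context_vectors([s for s, _ in pairs], ivecs, DIM)
tgt_vecs = context_vectors([t for _, t in pairs], ivecs, DIM)
lexicon = {w: best_translation(w, src_vecs, tgt_vecs) for w in ("cat", "dog", "sleeps", "eats")}
```

In this toy corpus each source word and its translation occur in exactly the same aligned pairs, so their context vectors coincide. On real parallel data the match is only approximate, which is why the abstract evaluates the acquired lexica against manually compiled gold standards.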

International Conference on Computational Linguistics | 2001

Automatic Keyword Extraction Using Domain Knowledge

Anette Hulth; Jussi Karlgren; Anna Jonsson; Henrik Boström; Lars Asker

Documents can be assigned keywords by frequency analysis of the terms found in the document text, which arguably is the primary source of knowledge about the document itself. By including a hierarchically organised domain-specific thesaurus as a second knowledge source, the quality of such keywords was improved considerably, as measured by match to previously manually assigned keywords. In the presented experiment, the evidence from frequency analysis and the hierarchically organised thesaurus was combined using inductive logic programming.

Computer Communications | 1996

Discussion: Issues when designing filters in messaging systems

Jacob Palme; Jussi Karlgren; Daniel Pargman

The increasing size of messaging communities increases the risk of information overload, especially when group communication tools like mailing lists or asynchronous conferencing systems (like Usenet News) are used. Future messaging systems will require more capable filters to aid users in the selection of what to read. The increasing use of networks by non-computer professionals requires filters that are easier to use and manage than most filtering software today. Filters might use evaluations of messages made by certain users as an aid to filtering these messages for other users.

Human Language Technology | 1993

A speech to speech translation system built from standard components

Manny Rayner; Hiyan Alshawi; Ivan Bretan; David M. Carter; Vassilios Digalakis; Björn Gambäck; Jaan Kaja; Jussi Karlgren; Bertil Lyberg; Stephen Pulman; Patti Price; Christer Samuelsson

This paper describes a speech to speech translation system using standard components and a suite of generalizable customization techniques. The system currently translates air travel planning queries from English to Swedish. The modular architecture is designed to be easy to port to new domains and languages, and consists of a pipelined series of processing phases. The output of each phase consists of multiple hypotheses; statistical preference mechanisms, the data for which is derived from automatic processing of domain corpora, are used between each pair of phases to filter hypotheses. Linguistic knowledge is represented throughout the system in declarative form. We summarize the architectures of the component systems and the interfaces between them, and present initial performance results.

Conference on Information and Knowledge Management | 2009

Terminology mining in social media

Magnus Sahlgren; Jussi Karlgren

The highly variable and dynamic word usage in social media presents serious challenges for both research and those commercial applications that are geared towards blogs or other user-generated non-editorial texts. This paper discusses and exemplifies a terminology mining approach for dealing with the productive character of the textual environment in social media. We explore the challenges of practically acquiring new terminology, and of modeling similarity and relatedness of terms from observing realistic amounts of data. We also discuss semantic evolution and density, and investigate novel measures for characterizing the preconditions for terminology mining.