
Publication


Featured research published by Markus Koskela.


International Conference on Machine Learning | 2005

The 2005 PASCAL visual object classes challenge

Mark Everingham; Andrew Zisserman; Christopher K. I. Williams; Luc Van Gool; Moray Allan; Christopher M. Bishop; Olivier Chapelle; Navneet Dalal; Thomas Deselaers; Gyuri Dorkó; Stefan Duffner; Jan Eichhorn; Jason Farquhar; Mario Fritz; Christophe Garcia; Thomas L. Griffiths; Frédéric Jurie; Daniel Keysers; Markus Koskela; Jorma Laaksonen; Diane Larlus; Bastian Leibe; Hongying Meng; Hermann Ney; Bernt Schiele; Cordelia Schmid; Edgar Seemann; John Shawe-Taylor; Amos J. Storkey; Sandor Szedmak

The PASCAL Visual Object Classes Challenge ran from February to March 2005. The goal of the challenge was to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). Four object classes were selected: motorbikes, bicycles, cars and people. Twelve teams entered the challenge. In this chapter we provide details of the datasets, algorithms used by the teams, evaluation criteria, and results achieved.


Scandinavian Conference on Image Analysis | 2000

PicSOM—content-based image retrieval with self-organizing maps

Jorma Laaksonen; Markus Koskela; Sami Laakso; Erkki Oja

We have developed a novel system for content-based image retrieval in large, unannotated databases. The system is called PicSOM, and it is based on tree-structured self-organizing maps (TS-SOMs). Given a set of reference images, PicSOM is able to retrieve another set of images which are similar to the given ones. Each TS-SOM is formed with a different image feature representation like color, texture, or shape. A new technique introduced in PicSOM facilitates automatic combination of responses from multiple TS-SOMs and their hierarchical levels. This mechanism adapts to the user’s preferences in selecting which images resemble each other. Thus, the mechanism implements a relevance feedback technique on content-based image retrieval. The image queries are performed through the World Wide Web and the queries are iteratively refined as the system exposes more images to the user.
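For illustration, a minimal sketch of the map-combination idea, assuming plain (non-hierarchical) SOM codebooks and a uniform sum over maps; the map sizes, feature names, and scoring below are stand-ins, not the paper's configuration:

```python
# Score a database with several per-feature SOMs, in the spirit of
# PicSOM's multi-map combination. Relevant images mark their
# best-matching units, and every database image inherits the score
# of its own unit, summed over the feature maps.
import numpy as np

def bmu(codebook, x):
    """Index of the best-matching unit (BMU) for feature vector x."""
    return int(np.argmin(((codebook - x) ** 2).sum(axis=1)))

def score_database(codebooks, db_features, relevant_ids):
    n_images = len(next(iter(db_features.values())))
    total = np.zeros(n_images)
    for name, codebook in codebooks.items():
        hits = np.zeros(len(codebook))
        for i in relevant_ids:                    # mark units of relevant images
            hits[bmu(codebook, db_features[name][i])] += 1.0
        unit_of = np.array([bmu(codebook, x) for x in db_features[name]])
        total += hits[unit_of]                    # combine responses across maps
    return total

rng = np.random.default_rng(0)
db = {"color": rng.normal(size=(200, 8)), "texture": rng.normal(size=(200, 16))}
maps = {k: rng.normal(size=(64, v.shape[1])) for k, v in db.items()}  # untrained stand-ins
print(np.argsort(-score_database(maps, db, relevant_ids=[0, 1, 2]))[:5])
```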


International Symposium on Neural Networks | 1999

PicSOM: self-organizing maps for content-based image retrieval

Jorma Laaksonen; Markus Koskela; Erkki Oja

Content-based image retrieval is an important approach to the problem of processing the increasing amount of visual data. It is based on automatically extracted features from the content of the images, such as color, texture, shape and structure. We have started a project to study methods for content-based image retrieval using the self-organizing map (SOM) as the image similarity scoring method. Our image retrieval system, named PicSOM, can be seen as a SOM-based approach to relevance feedback, which is a form of supervised learning to adjust the subsequent queries based on the user's responses during the information retrieval session. In PicSOM, a separate tree-structured SOM (TS-SOM) is trained for each feature vector type in use. The system then adapts to the user's preferences by returning her more images from those SOMs where her responses have been most densely mapped.


Pattern Analysis and Applications | 2001

Self-Organising Maps as a relevance feedback technique in Content-Based image retrieval

Jorma Laaksonen; Markus Koskela; Sami Laakso; Erkki Oja

Self-Organising Maps (SOMs) can be used in implementing a powerful relevance feedback mechanism for Content-Based Image Retrieval (CBIR). This paper introduces the PicSOM CBIR system, and describes the use of SOMs as a relevance feedback technique in it. The technique is based on the SOM’s inherent property of topology-preserving mapping from a high-dimensional feature space to a two-dimensional grid of artificial neurons. On this grid similar images are mapped in nearby locations. As image similarity must, in unannotated databases, be based on low-level visual features, the similarity of images is dependent on the feature extraction scheme used. Therefore, in PicSOM there exists a separate tree-structured SOM for each different feature type. The incorporation of the relevance feedback and the combination of the outputs from the SOMs are performed as two successive processing steps. The proposed relevance feedback technique is described, analysed qualitatively, and visualised in the paper. Also, its performance is compared with a reference method.
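A small sketch of the described relevance-feedback step on a single map, assuming a 16x16 grid, a 3x3 box filter for the spreading, and +/-1 weighting; the kernel and grid sizes actually used in PicSOM differ:

```python
# Relevant images deposit positive values and non-relevant images
# negative values on their best-matching units; the values are then
# spread to neighboring units, so that map areas dense in relevant
# images score high. The low-pass filtering exploits the SOM's
# topology preservation: neighboring units hold similar images.
import numpy as np
from scipy.ndimage import uniform_filter

GRID = (16, 16)

def relevance_surface(bmu_rc, relevant, nonrelevant):
    """bmu_rc: (row, col) of each image's BMU on the grid."""
    surface = np.zeros(GRID)
    for i in relevant:
        surface[bmu_rc[i]] += 1.0
    for i in nonrelevant:
        surface[bmu_rc[i]] -= 1.0
    return uniform_filter(surface, size=3, mode="constant")

rng = np.random.default_rng(1)
bmu_rc = [tuple(rng.integers(0, 16, size=2)) for _ in range(100)]
s = relevance_surface(bmu_rc, relevant=[0, 1, 2], nonrelevant=[3, 4])
scores = np.array([s[rc] for rc in bmu_rc])   # score of each database image
print(np.argsort(-scores)[:5])                # next images to show the user
```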


Virtual Reality | 2011

An augmented reality interface to contextual information

Antti Ajanki; Mark Billinghurst; Hannes Gamper; Toni Järvenpää; Melih Kandemir; Samuel Kaski; Markus Koskela; Mikko Kurimo; Jorma Laaksonen; Kai Puolamäki; Teemu Ruokolainen; Timo Tossavainen

In this paper, we report on a prototype augmented reality (AR) platform for accessing abstract information in real-world pervasive computing environments. Using this platform, objects, people, and the environment serve as contextual channels to more information. The user’s interest with respect to the environment is inferred from eye movement patterns, speech, and other implicit feedback signals, and these data are used for information filtering. The results of proactive context-sensitive information retrieval are augmented onto the view of a handheld or head-mounted display or uttered as synthetic speech. The augmented information becomes part of the user’s context, and if the user shows interest in the AR content, the system detects this and provides progressively more information. In this paper, we describe the first use of the platform to develop a pilot application, Virtual Laboratory Guide, and early evaluation results of this application.


IEEE Transactions on Multimedia | 2007

Measuring Concept Similarities in Multimedia Ontologies: Analysis and Evaluations

Markus Koskela; Alan F. Smeaton; Jorma Laaksonen

The recent development of large-scale multimedia concept ontologies has provided a new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing.
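As a toy illustration of one family of inter-concept similarity measures covered by such studies, below is the Jaccard index over annotation co-occurrence; the concept names and shot sets are made up, and the paper evaluates several measures over real ontologies and video annotations:

```python
# Concept similarity from annotation co-occurrence: two concepts are
# similar if they tend to be annotated on the same video shots.
def jaccard(shots_a, shots_b):
    """Fraction of shots annotated with either concept that carry both."""
    a, b = set(shots_a), set(shots_b)
    return len(a & b) / len(a | b) if a | b else 0.0

annotations = {
    "outdoor": {1, 2, 3, 5, 8},
    "sky":     {1, 2, 5, 9},
    "indoor":  {4, 6, 7},
}
print(jaccard(annotations["outdoor"], annotations["sky"]))     # high overlap
print(jaccard(annotations["outdoor"], annotations["indoor"]))  # disjoint -> 0.0
```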


ACM Multimedia | 2014

Convolutional Network Features for Scene Recognition

Markus Koskela; Jorma Laaksonen

Convolutional neural networks have recently been used to obtain record-breaking results in many vision benchmarks. In addition, the intermediate layer activations of a trained network when exposed to new data sources have been shown to perform very well as generic image features, even when there are substantial differences between the original training data of the network and the new domain. In this paper, we focus on scene recognition and show that convolutional networks trained on mostly object recognition data can successfully be used for feature extraction in this task as well. We train a total of four networks with different training data and architectures, and show that the proposed method combining multiple scales and multiple features obtains state-of-the-art performance on four standard scene datasets.
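A minimal modern re-creation of the recipe, using resnet18 from torchvision and a linear SVM as illustrative stand-ins; the paper predates these tools and trained its own networks on object-recognition data:

```python
# Take activations from a network trained for object recognition and
# feed them to a linear classifier for scene labels.
import torch
from torchvision.models import resnet18, ResNet18_Weights
from sklearn.svm import LinearSVC

weights = ResNet18_Weights.DEFAULT
cnn = resnet18(weights=weights).eval()
cnn.fc = torch.nn.Identity()          # expose the 512-d penultimate activations

def extract(batch):                   # batch: float tensor, N x 3 x 224 x 224
    with torch.no_grad():
        return cnn(weights.transforms()(batch)).numpy()

# toy stand-in for a labeled scene dataset
images = torch.rand(8, 3, 224, 224)
labels = [0, 0, 0, 0, 1, 1, 1, 1]
clf = LinearSVC().fit(extract(images), labels)
print(clf.predict(extract(images[:2])))
```

The multi-scale part of the method would amount to extracting such features at several image scales and concatenating or fusing them before the classifier; the single-scale version above keeps the sketch short.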


International Conference on Multimodal Interfaces | 2013

Online RGB-D gesture recognition with extreme learning machines

Xi Chen; Markus Koskela

Gesture recognition is needed in many applications such as human-computer interaction and sign language recognition. The challenges of building an actual recognition system do not lie only in reaching an acceptable recognition accuracy but also with requirements for fast online processing. In this paper, we propose a method for online gesture recognition using RGB-D data from a Kinect sensor. Frame-level features are extracted from RGB frames and the skeletal model obtained from the depth data, and then classified by multiple extreme learning machines. The outputs from the classifiers are aggregated to provide the final classification results for the gestures. We test our method on the ChaLearn multi-modal gesture challenge data. The results of the experiments demonstrate that the method can perform effective multi-class gesture recognition in real-time.
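The extreme learning machine itself is compact enough to sketch: a random, untrained hidden layer whose output weights are solved in closed form. The hidden-layer size and ridge regularization below are assumptions, and the RGB-D feature extraction is out of scope here:

```python
# ELM classifier: random input weights, nonlinear hidden layer, and
# output weights from a regularized least-squares solve. Training is a
# single linear solve, which is what makes ELMs fast enough for the
# online setting described in the paper.
import numpy as np

class ELM:
    def __init__(self, n_hidden=256, reg=1e-3, seed=0):
        self.n_hidden, self.reg = n_hidden, reg
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        d, k = X.shape[1], int(y.max()) + 1
        self.W = self.rng.normal(size=(d, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)     # random hidden layer
        T = np.eye(k)[y]                     # one-hot targets
        # ridge-regularized least squares for the output weights
        self.beta = np.linalg.solve(
            H.T @ H + self.reg * np.eye(self.n_hidden), H.T @ T)
        return self

    def decision(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta  # per-class scores

rng = np.random.default_rng(1)
X, y = rng.normal(size=(300, 20)), rng.integers(0, 5, size=300)
scores = ELM().fit(X, y).decision(X)
print(scores.shape, (scores.argmax(axis=1) == y).mean())  # (300, 5), train acc.
```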


Conference on Image and Video Retrieval | 2004

Use of Image Subset Features in Image Retrieval with Self-Organizing Maps

Markus Koskela; Jorma Laaksonen; Erkki Oja

In content-based image retrieval (CBIR), the images in a database are indexed on the basis of low-level statistical features that can be automatically derived from the images. Due to the semantic gap, the performance of CBIR systems often remains quite modest especially on broad image domains. One method for improving the results is to incorporate automatic image classification methods to the CBIR system. The resulting subsets can be indexed separately with features suitable for those particular images or used to limit an image query only to certain promising image subsets. In this paper, a method for supporting different types of image subsets within a generic framework based on multiple parallel Self-Organizing Maps and binary clusterings is presented.
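A small sketch of the query-restriction idea: precomputed binary clusterings become boolean masks that limit which images a query scores over. The subset names and random data below are illustrative assumptions:

```python
# Each automatic classifier yields a binary clustering of the database;
# a query can then be limited to the intersection of promising subsets.
import numpy as np

n_images = 1000
rng = np.random.default_rng(2)
subsets = {                        # one binary clustering per classifier
    "portraits": rng.random(n_images) < 0.2,
    "outdoor":   rng.random(n_images) < 0.5,
}

def query(scores, active_subsets):
    """Keep only images inside every active subset, then rank by score."""
    mask = np.ones(n_images, dtype=bool)
    for name in active_subsets:
        mask &= subsets[name]
    ranked = np.argsort(-np.where(mask, scores, -np.inf))
    return ranked[: mask.sum()]    # drop the masked-out tail

print(query(rng.random(n_images), ["portraits", "outdoor"])[:5])
```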


Neurocomputing | 2015

Skeleton-based action recognition with extreme learning machines

Xi Chen; Markus Koskela

Action and gesture recognition from motion capture and RGB-D camera sequences has recently emerged as a renowned and challenging research topic. The current methods can usually be applied only to small datasets with a dozen or so different actions, and the systems often require large amounts of time to train the models and to classify new sequences. In this paper, we first extract simple but effective frame-level features from the skeletal data and build a recognition system based on the extreme learning machine. We then propose three modeling methods for post-processing the classification outputs to obtain the recognition results on the action sequence level. We test the proposed method on three public datasets ranging from 11 to 40 action classes. For all datasets, the method can classify the sequences with accuracies reaching 96–99% and with the average classification time for one sequence on a single computer core around 4 ms. Fast training and testing and the high accuracy make the proposed method readily applicable for online recognition applications.
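One simple way to lift frame-level classifier outputs to a sequence-level decision is to sum the per-frame class scores and take the argmax. The sketch below shows this baseline as an illustrative simplification, not necessarily one of the paper's three proposed post-processing models:

```python
# Aggregate per-frame, per-class scores over a sequence into one label.
import numpy as np

def classify_sequence(frame_scores):
    """frame_scores: T x K array of per-frame, per-class scores."""
    return int(frame_scores.sum(axis=0).argmax())

rng = np.random.default_rng(3)
T, K = 50, 11                        # e.g. an 11-class dataset
frame_scores = rng.normal(size=(T, K))
frame_scores[:, 4] += 0.5            # make class 4 slightly favored per frame
print(classify_sequence(frame_scores))  # -> 4 with high probability
```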

Collaboration


Dive into Markus Koskela's collaborations.

Top Co-Authors

Erkki Oja
Helsinki University of Technology

Ville Viitaniemi
Helsinki University of Technology

Sami Laakso
Helsinki University of Technology

Tommi Jantunen
University of Jyväskylä