Mårten Björkman
Royal Institute of Technology
Publications
Featured research published by Mårten Björkman.
International Conference on Robotics and Automation | 2006
Patric Jensfelt; Danica Kragic; John Folkesson; Mårten Björkman
This paper presents a framework for 3D vision-based bearing-only SLAM using a single camera, an interesting setup for many real applications due to its low cost. The focus is on the management of the features to achieve real-time performance in extraction, matching and loop detection. For matching image features to map landmarks, a modified, rotationally variant SIFT descriptor is used in combination with a Harris-Laplace detector. To reduce the complexity of map estimation while maintaining matching performance, only a few high-quality image features are used as map landmarks; the remaining features are used for matching. The framework has been combined with an EKF implementation for SLAM. Experiments performed in indoor environments are presented. These experiments demonstrate the validity and effectiveness of the approach. In particular, they show how the robot is able to successfully match current image features to the map when revisiting an area.
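To illustrate the estimation machinery such a framework rests on, here is a minimal bearing-only EKF-SLAM sketch in Python. It is not the paper's implementation: it is restricted to 2D, assumes known data association and already-initialized landmarks, and all noise parameters are illustrative placeholders.

```python
import numpy as np

def wrap(a):
    """Wrap an angle to [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

class BearingOnlyEKF:
    """Toy 2D bearing-only EKF-SLAM.

    State vector: [x, y, theta, lx_1, ly_1, ..., lx_n, ly_n].
    Landmark initialization (a hard problem in bearing-only SLAM)
    is assumed to have happened already.
    """

    def __init__(self, pose, pose_cov):
        self.x = np.asarray(pose, dtype=float)      # [x, y, theta]
        self.P = np.asarray(pose_cov, dtype=float)  # 3x3 covariance

    def add_landmark(self, lx, ly, cov=1e2):
        """Append a landmark with a large (illustrative) initial uncertainty."""
        self.x = np.append(self.x, [lx, ly])
        n = len(self.x)
        P = np.eye(n) * cov
        P[:n - 2, :n - 2] = self.P
        self.P = P

    def predict(self, v, w, dt, q=1e-3):
        """Propagate the robot pose with a unicycle motion model."""
        th = self.x[2]
        self.x[0] += v * dt * np.cos(th)
        self.x[1] += v * dt * np.sin(th)
        self.x[2] = wrap(th + w * dt)
        F = np.eye(len(self.x))
        F[0, 2] = -v * dt * np.sin(th)
        F[1, 2] = v * dt * np.cos(th)
        Q = np.zeros_like(self.P)
        Q[:3, :3] = np.eye(3) * q  # process noise on the pose only
        self.P = F @ self.P @ F.T + Q

    def update_bearing(self, i, z, r=np.deg2rad(1.0) ** 2):
        """Fuse one bearing measurement z (rad) to landmark i (0-based)."""
        j = 3 + 2 * i
        dx = self.x[j] - self.x[0]
        dy = self.x[j + 1] - self.x[1]
        q = dx * dx + dy * dy
        zhat = wrap(np.arctan2(dy, dx) - self.x[2])
        # Jacobian of the bearing w.r.t. robot pose and landmark position.
        H = np.zeros((1, len(self.x)))
        H[0, 0], H[0, 1], H[0, 2] = dy / q, -dx / q, -1.0
        H[0, j], H[0, j + 1] = -dy / q, dx / q
        S = H @ self.P @ H.T + r
        K = self.P @ H.T / S
        self.x = self.x + (K * wrap(z - zhat)).ravel()
        self.x[2] = wrap(self.x[2])
        self.P = (np.eye(len(self.x)) - K @ H) @ self.P
```

A single camera yields exactly this kind of bearing-only measurement, which is why landmark depth only becomes observable after the robot has moved.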
Robotics and Autonomous Systems | 2005
Danica Kragic; Mårten Björkman; Henrik I. Christensen; Jan-Olof Eklundh
In this paper, we present a vision system for robotic object manipulation tasks in natural, domestic environments. Given complex fetch-and-carry robot tasks, the issues related to the whole detect-approach-grasp loop are considered. Our vision system integrates a number of algorithms using monocular and binocular cues to achieve robustness in realistic settings. The cues are considered and used in connection with both foveal and peripheral vision to provide depth information, segmentation of the object(s) of interest, object recognition, tracking and pose estimation. One important property of the system is that the step from object recognition to pose estimation is completely automatic, combining both appearance and geometric models. Experimental evaluation is performed in a realistic indoor environment with occlusions, clutter, changing lighting and background conditions.
The International Journal of Robotics Research | 2010
Babak Rasolzadeh; Mårten Björkman; Kai Huebner; Danica Kragic
The ability to autonomously acquire new knowledge through interaction with the environment is an important research topic in the field of robotics. The knowledge can only be acquired if suitable perception-action capabilities are present: a robotic system has to be able to detect, attend to and manipulate objects in its surroundings. In this paper, we present the results of our long-term work in the area of vision-based sensing and control. We study finding, attending to, recognizing and manipulating objects in domestic environments. We present a stereo-based vision system framework where aspects of top-down, bottom-up and foveated attention are put into focus, and demonstrate how the system can be utilized for robotic object grasping.
International Conference on Robotics and Automation | 2010
Mårten Björkman; Danica Kragic
We present an active vision system for segmentation of visual scenes based on the integration of several cues. The system serves as a visual front end for the generation of object hypotheses for new, previously unseen objects in natural scenes. The system combines a set of foveal and peripheral cameras where, through a stereo-based fixation process, object hypotheses are generated. In addition to considering the segmentation process in 3D, the main contribution of the paper is the integration of different cues in a temporal framework and the improvement of initial hypotheses over time.
Intelligent Robots and Systems | 2013
Mårten Björkman; Yasemin Bekiroglu; Virgile Högman; Danica Kragic
Object shape information is an important parameter in robot grasping tasks. However, it may be difficult to obtain accurate models of novel objects due to incomplete and noisy sensory measurements. In addition, object shape may change due to frequent interaction with the object (e.g., cereal boxes). In this paper, we present a probabilistic approach for learning object models based on visual and tactile perception through physical interaction with an object. Our robot explores unknown objects by touching them strategically at parts whose shape is uncertain. The robot starts by using only visual features to form an initial hypothesis about the object shape, then gradually adds tactile measurements to refine the object model. Our experiments involve ten objects of varying shapes and sizes in a real setup. The results show that our method is capable of choosing a small number of touches to construct object models similar to real object shapes and to determine similarities among acquired models.
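The core loop — model the shape probabilistically, then touch where the model is least certain — can be illustrated with a toy Gaussian-process implicit-surface sketch. The kernel, anchor points and 2D toy object below are assumptions for illustration, not the paper's actual model.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical visual data: noisy points on the camera-facing half of a
# unit circle (a 2D stand-in for a partially observed 3D object).
rng = np.random.default_rng(0)
angles = rng.uniform(0, np.pi, 30)
X_vis = np.c_[np.cos(angles), np.sin(angles)] + rng.normal(0, 0.02, (30, 2))
y_vis = np.zeros(30)  # implicit surface: f(x) = 0 on the surface

# Interior/exterior anchor points keep the implicit function well-posed.
X = np.vstack([X_vis, [[0.0, 0.0]], [[2.0, 2.0]]])
y = np.hstack([y_vis, [-1.0], [1.0]])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-3)
gp.fit(X, y)

# Candidate touch locations: a ring of probe points around the object.
t = np.linspace(0, 2 * np.pi, 72)
cand = np.c_[np.cos(t), np.sin(t)]
_, std = gp.predict(cand, return_std=True)

# Touch where the shape estimate is most uncertain (here: the unseen half).
next_touch = cand[np.argmax(std)]
print("most uncertain candidate:", next_touch)
```

After each touch, the contact point would be appended to the training set with target 0 and the GP refit, shrinking the predictive variance around the probed region.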
International Conference on Robotics and Automation | 2004
Mårten Björkman; Danica Kragic
In this paper, we present a real-time vision system that integrates a number of algorithms using monocular and binocular cues to achieve robustness in realistic settings, for tasks such as object recognition, tracking and pose estimation. The system consists of two sets of binocular cameras: a peripheral set for disparity-based attention and a foveal one for higher-level processes. Thus the conflicting requirements of a wide field of view and high resolution can be overcome. One important property of the system is that the step from task specification through object recognition to pose estimation is completely automatic, combining both appearance and geometric models. Experimental evaluation is performed in a realistic indoor environment with occlusions, clutter, changing lighting and background conditions.
Intelligent Robots and Systems | 2010
Matthew Johnson-Roberson; Jeannette Bohg; Mårten Björkman; Danica Kragic
In this paper we present a framework for the segmentation of multiple objects from a 3D point cloud. We extend traditional image segmentation techniques into a full 3D representation. The proposed technique relies on a state-of-the-art min-cut framework to perform a fully 3D global multi-class labeling in a principled manner. Thereby, we extend our previous work in which a single object was actively segmented from the background. We also examine several seeding methods to bootstrap the graphical-model-based energy minimization, and these methods are compared over challenging scenes. All results are generated on real-world data gathered with an active vision robotic head. We present quantitative results over aggregate sets as well as visual results on specific examples.
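A simplified, binary (foreground/background) version of this graph-cut idea can be sketched with SciPy's max-flow solver over a k-NN graph of the points. The paper's framework is multi-class and uses richer data terms, so the weights, quantization and seeding below are purely illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow
from scipy.spatial import cKDTree

def segment_points(pts, fg_seeds, bg_seeds, k=6, scale=0.05):
    """Binary foreground/background min-cut over a k-NN graph of 3D points.

    pts: (n, 3) array; fg_seeds/bg_seeds: index lists of seeded points.
    Capacities are quantized to integers for SciPy's solver.
    """
    n = len(pts)
    src, snk = n, n + 1
    rows, cols, caps = [], [], []

    # Smoothness term: close neighbors are expensive to separate.
    dists, idxs = cKDTree(pts).query(pts, k=k + 1)
    for i in range(n):
        for d, j in zip(dists[i, 1:], idxs[i, 1:]):  # skip the self-match
            w = int(1000 * np.exp(-(d / scale) ** 2))
            rows += [i, j]; cols += [j, i]; caps += [w, w]

    # Data term: seeds are tied to the source (fg) or sink (bg) terminal.
    for i in fg_seeds:
        rows += [src]; cols += [i]; caps += [10**6]
    for i in bg_seeds:
        rows += [i]; cols += [snk]; caps += [10**6]

    g = csr_matrix((caps, (rows, cols)), shape=(n + 2, n + 2), dtype=np.int32)
    flow = maximum_flow(g, src, snk).flow

    # Min-cut partition: nodes reachable from the source in the residual graph.
    residual = g - flow
    fg = np.zeros(n + 2, dtype=bool)
    stack = [src]
    while stack:
        u = stack.pop()
        if fg[u]:
            continue
        fg[u] = True
        lo, hi = residual.indptr[u], residual.indptr[u + 1]
        nbrs = residual.indices[lo:hi][residual.data[lo:hi] > 0]
        stack.extend(int(v) for v in nbrs if not fg[v])
    return fg[:n]  # True = foreground
```

The multi-class labeling in the paper generalizes this: instead of one source/sink pair, each object hypothesis competes for points, typically via move-making schemes such as alpha-expansion built on the same max-flow primitive.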
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2002
Mårten Björkman; Jan-Olof Eklundh
Stereo is an important cue for visually guided robots. While moving around in the world, such a robot can use dynamic fixation to overcome limitations in image resolution and field of view. In this paper, a binocular stereo system capable of dynamic fixation is presented. The external calibration is performed continuously, taking temporal consistency into consideration, which greatly simplifies the process. The essential matrix, which is estimated in real time, is used to describe the epipolar geometry. It is shown how outliers can be identified and excluded from the calculations. An iterative approach based on a differential model of the optical flow, commonly used in structure from motion, is also presented and tested against the essential-matrix approach. The iterative method is shown to be superior in terms of both computational speed and robustness when the vergence angles are less than about 15°. For larger angles, the differential model is insufficient and the essential matrix is preferably used instead.
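For the essential-matrix side of the comparison, a modern equivalent of real-time estimation with outlier rejection is OpenCV's RANSAC-based estimator. The intrinsics and thresholds below are placeholders, not values from the paper.

```python
import cv2
import numpy as np

# Hypothetical calibrated intrinsics for one camera of the binocular head.
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

def relative_pose(pts_left, pts_right):
    """Estimate the epipolar geometry between the two views with outlier rejection.

    pts_left, pts_right: Nx2 float arrays of matched image points.
    Returns rotation R, unit translation direction t, and a boolean inlier mask.
    """
    E, inliers = cv2.findEssentialMat(pts_left, pts_right, K,
                                      method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    # recoverPose resolves the fourfold R/t ambiguity via cheirality checks.
    _, R, t, mask = cv2.recoverPose(E, pts_left, pts_right, K, mask=inliers)
    return R, t, mask.ravel().astype(bool)
```

Note that the translation recovered from an essential matrix is only defined up to scale, which is why a verging stereo head with a known baseline can use it for continuous external calibration.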
Intelligent Robots and Systems | 2011
Niklas Bergström; Mårten Björkman; Danica Kragic
We propose a method for interactive modeling of objects and object relations based on real-time segmentation of video sequences. In interaction with a human, the robot can perform multi-object segmentation through principled modeling of physical constraints. The key contribution is an efficient multi-labeling framework that allows object modeling and disambiguation in natural scenes. Object modeling and labeling are done in a real-time segmentation system, to which hypotheses and constraints denoting relations between objects can be added incrementally. Through instructions such as key presses or spoken words, a scene can be segmented into regions corresponding to multiple physical objects. The approach solves some of the difficult problems related to the disambiguation of objects merged due to their direct physical contact. Results show that even a limited set of simple interactions with a human operator can substantially improve segmentation results.
International Journal of Imaging Systems and Technology | 2006
Mårten Björkman; Jan-Olof Eklundh
In this paper, we discuss the notion of a "seeing" system that uses vision to interact with its environment. The requirements on such a system depend on the tasks it is involved in and should be evaluated with these in mind. Here we consider the task of finding and recognizing objects in the real world. After a discussion of the needed functionalities and issues about the design, we present an integrated real-time vision system capable of finding, attending and recognizing objects in real settings. The system is based on a dual set of cameras: a wide-field set for attention and a foveal one for recognition. The continuously running attentional process uses top-down object characteristics in terms of hue and 3D size. Recognition is performed with objects of interest foveated and segmented from the background. We describe the system structure as well as the different components in detail and present experimental evaluations of its overall performance.
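The top-down, hue-driven part of such an attentional process can be approximated with classic histogram backprojection. The sketch below uses OpenCV and omits the 3D-size cue (which requires stereo disparity), so it illustrates the idea rather than the system's actual implementation.

```python
import cv2
import numpy as np

def hue_saliency(image_bgr, target_bgr_patch):
    """Top-down hue saliency via histogram backprojection (illustrative sketch).

    target_bgr_patch: a small BGR image of the sought object, used to
    build the hue model that biases attention toward similar regions.
    """
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    patch = cv2.cvtColor(target_bgr_patch, cv2.COLOR_BGR2HSV)

    # Hue histogram of the target, masked to reasonably saturated pixels
    # so that gray and very dark pixels do not pollute the model.
    mask = cv2.inRange(patch, (0, 60, 32), (180, 255, 255))
    hist = cv2.calcHist([patch], [0], mask, [32], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

    # Backprojection scores each scene pixel by how target-like its hue is.
    sal = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    sal = cv2.GaussianBlur(sal, (21, 21), 0)  # pool into blob-like regions
    return sal

# The saliency peak is a candidate fixation point for the foveal camera pair:
# y, x = np.unravel_index(np.argmax(sal), sal.shape)
```

In the full system, a size cue from the wide-field stereo pair would further suppress salient blobs whose 3D extent does not match the sought object before the head fixates on the winning region.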