Akihito Seki | Researchain

Publication

Featured research published by Akihito Seki.

IEEE Intelligent Vehicles Symposium | 2006

Robust Obstacle Detection in General Road Environment Based on Road Extraction and Pose Estimation

Akihito Seki; Masatoshi Okutomi

Understanding the general road environment is a vital task for obstacle detection in complicated situations. That task is easier to perform for highway environments than for general roads because road environments are well established on highways and obstacle classes are limited. On the other hand, general roads are not always well established, and various small obstacles, as well as larger ones, must be detected. For the purpose of discerning obstacles and road patterns, it is important to determine the relative positions of the camera and the road surface. This paper presents an efficient solution using a stereo-vision-based obstacle detection method for general roads. The relative position is estimated dynamically even without any clear lane markings. Additionally, obstacles are detected without applying explicit models. We present experimental results to demonstrate the effectiveness of our proposed method under various conditions.

British Machine Vision Conference | 2016

Patch Based Confidence Prediction for Dense Disparity Map

Akihito Seki; Marc Pollefeys

In this paper, we propose a novel method to predict the correctness of stereo correspondences, which we call confidence, and a confidence fusion method for dense disparity estimation. The input of our method consists of a two-channel local window (disparity patch), which is designed by taking into account ideas from conventional confidence features. The first channel comes from the idea that neighboring pixels with consistent disparities are more likely to be correct matches. In the second channel, a disparity from the other image is considered, such that the matches from the left to the right image should be consistent with those from the right to the left. The disparity patches are used as inputs to Convolutional Neural Networks so that the features and classifiers are trained simultaneously, unlike in existing methods. Moreover, the confidence is incorporated into Semi-Global Matching (SGM) by adjusting its parameters directly. We show the strong performance of both confidence prediction and dense disparity estimation on the KITTI datasets, which consist of real-world scenes.
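The left-right consistency idea behind the second channel can be illustrated with the conventional hand-crafted check. This is a minimal sketch, not the CNN-based confidence predictor the abstract describes. The function name `left_right_consistent`, the tolerance `tol`, the integer disparities, and the convention that left pixel x matches right pixel x - d are all assumptions made for the example.

```python
def left_right_consistent(disp_left, disp_right, tol=1):
    """Flag consistent matches along one rectified scanline.

    Assumed convention: disp_left[x] is the disparity of left-image pixel x,
    so its match in the right image lies at column x - disp_left[x].
    A match is kept when the right image's disparity at that column agrees
    with the left disparity to within `tol` pixels.
    """
    width = len(disp_left)
    flags = []
    for x, d in enumerate(disp_left):
        xr = x - d  # column of the matched pixel in the right image
        if 0 <= xr < width:
            flags.append(abs(disp_right[xr] - d) <= tol)
        else:
            # The match falls outside the right image, so it cannot be verified.
            flags.append(False)
    return flags


if __name__ == "__main__":
    left = [0, 1, 1, 3, 2]
    right = [1, 1, 2, 5, 0]
    print(left_right_consistent(left, right, tol=0))
```

A hard threshold like this yields a binary decision per pixel, whereas the abstract describes learning a confidence from patches of such disparities.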

British Machine Vision Conference | 2009

Stereo-based Pedestrian Detection using Multiple Patterns

Hiroshi Hattori; Akihito Seki; Manabu Nishiyama; Tomoki Watanabe

Detecting pedestrians from a moving vehicle is a challenging problem since the essence of the task is to search for non-rigid moving objects with various appearances in a dynamic outdoor environment. In order to alleviate these difficulties, we propose a new human detection framework which makes the most of stereo vision. While conventional stereo-based detection methods initially generate regions of interest (ROIs) on one of the stereo images, the proposed one defines the ROIs on both the left and right images. This paper presents two different ways of utilizing the stereo ROIs. The first one is to classify the stereo ROIs individually and integrate the classification scores to obtain the final decision. The second one is to extend gradient-based local descriptors [1, 14] to multiple views and present new feature descriptors which we call Stereo HOG and Stereo CoHOG. Through experiments we show that both methods significantly reduce the false alarm rate while keeping the detection rate, compared with monocular-based methods.

Computer Vision and Pattern Recognition | 2017

SGM-Nets: Semi-Global Matching with Neural Networks

Akihito Seki; Marc Pollefeys

This paper deals with deep neural networks for predicting an accurate dense disparity map with Semi-Global Matching (SGM). SGM is a widely used regularization method for real scenes because of its high accuracy and fast computation speed. Even though SGM can obtain accurate results, tuning SGM's penalty parameters, which control the smoothness and discontinuity of a disparity map, is difficult, and empirical methods have been proposed. We propose a learning-based penalty estimation method, which we call SGM-Nets, consisting of Convolutional Neural Networks. A small image patch and its position are input into SGM-Nets to predict the penalties for the 3D object structures. In order to train the networks, we introduce a novel loss function which is able to use sparsely annotated disparity maps such as those captured by a LiDAR sensor in real environments. Moreover, we propose a novel SGM parameterization, which deploys different penalties depending on either positive or negative disparity changes in order to represent the object structures more discriminatively. Our SGM-Nets outperformed state-of-the-art accuracy on the KITTI benchmark datasets.
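To make the role of the penalty parameters concrete, the standard SGM cost aggregation along a single left-to-right path can be sketched as below. This is a minimal sketch of the conventional recurrence with two fixed penalties, conventionally named P1 (disparity change of one) and P2 (larger changes). It is not the SGM-Nets method, which predicts the penalties per pixel. The function name `aggregate_scanline` and the `cost[x][d]` layout are assumptions made for the example.

```python
def aggregate_scanline(cost, p1, p2):
    """Aggregate matching costs along one left-to-right path.

    cost[x][d] is the matching cost of pixel x at disparity d.
    Recurrence (standard SGM form):
        L(x, d) = C(x, d)
                  + min(L(x-1, d),
                        L(x-1, d-1) + p1,
                        L(x-1, d+1) + p1,
                        min_k L(x-1, k) + p2)
                  - min_k L(x-1, k)
    """
    n_disp = len(cost[0])
    agg = [list(cost[0])]  # the first pixel has no predecessor on the path
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            best = prev[d]  # same disparity: no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)  # change of one: small penalty
            if d < n_disp - 1:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)  # larger jump: large penalty
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(cost[x][d] + best - prev_min)
        agg.append(row)
    return agg


if __name__ == "__main__":
    costs = [[0, 5, 5], [5, 0, 5], [5, 5, 0]]
    for row in aggregate_scanline(costs, p1=1, p2=4):
        print(row)
```

Raising P1 and P2 favors smoother disparity along the path, while lowering them lets the result follow the raw matching costs more closely, which is why their tuning matters.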

International Conference on Robotics and Automation | 2006

Ego-motion estimation by matching dewarped road regions using stereo images

Akihito Seki; Masatoshi Okutomi

This paper proposes a method for vehicle ego-motion estimation using vehicle-mounted stereo cameras. Estimating ego-motion using cameras requires extraction of static regions from the images. We first use stereo images to estimate regions which correspond to the road plane and can be considered as static areas. Subsequently, we propose a virtual projection plane (VPP) image that is equivalent to the top view of the road scene. The vehicle's ego-motion is obtained by matching the sequential VPP images of the road patterns in the extracted regions. We use a vehicle-motion model and consider a matching method to easily and accurately determine the ego-motion. Finally, we present experimental results obtained using our method.

Computer Vision and Pattern Recognition | 2017

Quad-Networks: Unsupervised Learning to Rank for Interest Point Detection

Nikolay Savinov; Akihito Seki; Lubor Ladicky; Torsten Sattler; Marc Pollefeys

Several machine learning tasks require representing the data using only a sparse set of interest points. An ideal detector is able to find the corresponding interest points even if the data undergo a transformation typical for a given domain. Since the task is of high practical interest in computer vision, many hand-crafted solutions have been proposed. In this paper, we ask a fundamental question: can we learn such detectors from scratch? Since it is often unclear what points are interesting, human labelling cannot be used to find a truly unbiased solution. Therefore, the task requires an unsupervised formulation. We are the first to propose such a formulation: training a neural network to rank points in a transformation-invariant manner. Interest points are then extracted from the top/bottom quantiles of this ranking. We validate our approach on two tasks: standard RGB image interest point detection and challenging cross-modal interest point detection between RGB and depth images. We quantitatively show that our unsupervised method performs better than or on par with baselines.
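The final extraction step, taking points from the top and bottom quantiles of a ranking, can be sketched as follows. This is a minimal sketch that assumes the ranking scores are already available. The function name `select_interest_points`, the dictionary input, and the `quantile` parameter are assumptions made for the example, and the network that produces the scores is not shown.

```python
def select_interest_points(scores, quantile=0.1):
    """Pick points whose ranking score lies in the bottom or top quantile.

    scores maps a point (for example an (x, y) tuple) to its ranking score.
    Returns (bottom_points, top_points), each holding at least one point.
    """
    ranked = sorted(scores, key=scores.get)  # ascending by score
    k = max(1, int(len(ranked) * quantile))  # number of points per tail
    return ranked[:k], ranked[-k:]


if __name__ == "__main__":
    example_scores = {(0, 0): 0.5, (0, 1): -2.0, (1, 0): 3.0, (1, 1): 0.1}
    bottom, top = select_interest_points(example_scores, quantile=0.25)
    print(bottom, top)
```

Both tails are kept because a ranking that is stable under transformations makes its extremes repeatable at either end, which is how the abstract describes the extraction.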

International Conference on Image Processing | 2011

Co-occurrence flow for pedestrian detection

Atsuto Maki; Akihito Seki; Tomoki Watanabe; Roberto Cipolla

The last few years have seen considerable progress in pedestrian detection. Recent work has established a combination of oriented gradients and optic flow as effective features, although the detection rates are still unsatisfactory for practical use. This paper introduces a new type of motion feature, the co-occurrence flow (CoF). The advance is to capture relative movements of different parts of the entire body, unlike existing motion features which extract internal motion in a local fashion. Through evaluations on the TUD-Brussels pedestrian dataset, we show that our motion feature based on co-occurrence flow helps boost the performance of existing methods.

International Conference on 3D Vision | 2014

Reconstructing Fukushima: A Case Study

Akihito Seki; Oliver J. Woodford; Satoshi Ito; Björn Stenger; Makoto Hatakeyama; Junichi Shimamura

We present the application of 3D reconstruction technology to the inspection and decommissioning work at the damaged Fukushima Daiichi nuclear power station in Japan. We discuss the challenges of this project, such as the difficult image capture conditions (including under water), required use of limited imaging hardware, and capture by personnel inexperienced in 3D reconstruction. We present an overview of the system developed for this project, a real-time reconstruction pipeline with robust camera pose estimation, low-latency probabilistic dense depth estimation and a novel descriptor for point cloud alignment - the Co-occurrence Histogram of Angle and Distance (CHAD). We discuss the modifications required to standard algorithms in order to perform reliably in such a scenario. As well as quantitative evaluations of these components on existing datasets, we show qualitative 3D reconstruction results of debris from the damaged plant and its spent fuel pool. Such results have enabled planning of the critical process of debris removal, without the harmful requirement of extensive human presence on site.

Computer Vision and Pattern Recognition | 2007

Simultaneous Optimization of Structure and Motion in Dynamic Scenes Using Unsynchronized Stereo Cameras

Akihito Seki; Masatoshi Okutomi

In this paper, we propose a method for simultaneously estimating structure and motion in dynamic scenes. Usual methods for obtaining structure and motion using stereo cameras require two kinds of operations: stereo correspondence and tracking. Therefore, we must separately determine the correspondence between stereo images and between sequential images. This necessity complicates the algorithm and increases the possibility of mismatches because of the objects' motion and visibility changes in the images. Our proposed method makes two contributions. The first contribution is a method for establishing correspondence among all stereo images and sequential images at once. Therefore, we can obtain the structure and motion simultaneously and more accurately. On the other hand, most stereo correspondence algorithms are limited to use with synchronized cameras. In a stereo rig using unsynchronized cameras, as are most commercially available cameras, the structure cannot be obtained by stereo correspondence and triangulation because of the unknown time offset between cameras. Therefore, our second contribution is a method of estimating structure, motion, and time offset simultaneously using unsynchronized stereo cameras. This latter task is accomplished by taking advantage of the scheme of the first contribution. Additionally, our method requires no preprocessing such as motion segmentation for separating identical-motion objects or advance calibration of the time offset. Finally, we present experimental results using both synthetic and real images.

Workshop on Applications of Computer Vision | 2009

Temporal integration for on-board stereo-based pedestrian detection

Akihito Seki; Hiroshi Hattori; Manabu Nishiyama; Tomoki Watanabe

Pedestrian detection is a difficult task for the following reasons. Firstly, pedestrians vary widely in size, pose, clothing, and motion. Secondly, the background tends to be crowded. Thirdly, many pedestrian-like patterns are observed in real environments. As a result, even if a state-of-the-art pattern recognition method is used, frame-by-frame detection produces many false positives and negatives. This article presents a method of temporal integration for stereo-based pedestrian detection that improves the detection performance. In our method, a pedestrian is detected by evaluating the consistency of the extracted pedestrian candidate over a short period of time. In order to obtain the consistency, the pedestrian candidate at each frame is combined with temporally corresponding ones through a hypothesis selection process. We demonstrate the effectiveness of our method by measuring recall and the number of false positives on 20,000 frames recorded in complex urban environments and on public data sets. The proposed method reduces the number of false detections to one hundredth while maintaining recall.
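The general idea of requiring consistency over a short period before accepting a detection can be illustrated with a simple sliding-window vote. This is a deliberately simplified stand-in, not the hypothesis selection process the abstract describes. The function name `confirm_detections` and the parameters `window` and `min_hits` are assumptions made for the example, and the candidate is assumed to have been associated across frames already.

```python
from collections import deque


def confirm_detections(frame_hits, window=5, min_hits=3):
    """Confirm a candidate only when it was seen often enough recently.

    frame_hits holds one truthy or falsy value per frame, saying whether the
    (already associated) candidate was detected in that frame.
    Returns one boolean per frame: True when at least `min_hits` of the last
    `window` frames, including the current one, contained the candidate.
    """
    recent = deque(maxlen=window)  # keeps only the most recent `window` frames
    confirmed = []
    for hit in frame_hits:
        recent.append(bool(hit))
        confirmed.append(sum(recent) >= min_hits)
    return confirmed


if __name__ == "__main__":
    hits = [1, 0, 1, 1, 0, 0, 0, 1]
    print(confirm_detections(hits, window=3, min_hits=2))
```

An isolated single-frame hit never reaches the threshold, which is the mechanism by which temporal voting suppresses sporadic false positives, while a brief miss inside a run of hits does not drop the detection.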
