Hirokatsu Kataoka
National Institute of Advanced Industrial Science and Technology
Publications
Featured research published by Hirokatsu Kataoka.
european conference on computer vision | 2016
Yun He; Soma Shirakabe; Yutaka Satoh; Hirokatsu Kataoka
The objective of this paper is to evaluate “human action recognition without human”. Motion representation is frequently discussed in human action recognition. We have examined several sophisticated options, such as dense trajectories (DT) and the two-stream convolutional neural network (CNN). However, as some recent studies on human action recognition have shown, features from the background can be surprisingly strong. Therefore, we investigate whether a background sequence alone can classify human actions in current large-scale action datasets (e.g., UCF101).
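The experiment implied here can be sketched minimally as follows: human regions are blanked out so that a classifier sees only the background sequence. The detector boxes, clip shape, and masking-by-zeroing are illustrative assumptions, not details from the paper.

```python
import numpy as np

def suppress_humans(frames, boxes_per_frame):
    """Zero out detected human regions so that only the background remains.

    frames: (T, H, W, 3) uint8 video clip
    boxes_per_frame: list of T lists of (x1, y1, x2, y2) human boxes
    """
    masked = frames.copy()
    for t, boxes in enumerate(boxes_per_frame):
        for (x1, y1, x2, y2) in boxes:
            masked[t, y1:y2, x1:x2, :] = 0  # remove the actor, keep the scene
    return masked

# Toy usage: a 4-frame clip with one hypothetical "person" box per frame.
clip = np.random.randint(0, 255, size=(4, 120, 160, 3), dtype=np.uint8)
boxes = [[(60, 20, 100, 110)]] * 4
background_only = suppress_humans(clip, boxes)  # feed this to the classifier
```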
Sensors | 2018
Hirokatsu Kataoka; Yutaka Satoh; Yoshimitsu Aoki; Shoko Oikawa; Yasuhiro Matsui
The paper presents the emerging issue of fine-grained pedestrian action recognition, which enables advanced pre-crash safety systems to estimate a pedestrian's intention in advance. Fine-grained pedestrian actions involve visually slight differences (e.g., walking straight versus crossing) that are difficult to distinguish from each other. Fine-grained action recognition is believed to enable pedestrian intention estimation for helpful advanced driver-assistance systems (ADAS). The following difficulties must be addressed to achieve fine-grained and accurate pedestrian action recognition: (i) to analyze the fine-grained motion of a pedestrian appearing in vehicle-mounted drive-recorder footage, a method is needed that describes subtle changes in motion characteristics occurring within a short time; (ii) even when the background moves greatly due to the driving of the vehicle, subtle changes in the pedestrian's motion must still be detected; (iii) collecting large-scale fine-grained actions is very difficult, so a relatively small database must suffice. We investigate how to learn an effective recognition model with only a small-scale database, thoroughly evaluating several configurations to explore an effective approach to fine-grained pedestrian action recognition without a large-scale database. Moreover, two different datasets were collected in order to raise the issue. Finally, our proposal attained 91.01% on the National Traffic Science and Environment Laboratory (NTSEL) database and 53.23% on the near-miss driving recorder database (NDRDB), improvements of +8.28% and +6.53% over the baseline two-stream fusion ConvNets.
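As a point of reference for the baseline mentioned above, here is a minimal sketch of two-stream late fusion: per-class scores from an appearance (spatial) stream and a motion (temporal) stream are averaged. The fusion weight and the three-class toy scores are illustrative assumptions.

```python
import numpy as np

def late_fusion(spatial_scores, temporal_scores, w=0.5):
    """Weighted average of per-class scores from the two streams."""
    return w * spatial_scores + (1.0 - w) * temporal_scores

spatial = np.array([0.2, 0.5, 0.3])   # appearance stream favors class 1
temporal = np.array([0.1, 0.3, 0.6])  # motion stream favors class 2
pred = int(np.argmax(late_fusion(spatial, temporal)))
```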
european conference on computer vision | 2016
Hirokatsu Kataoka; Yun He; Soma Shirakabe; Yutaka Satoh
Temporal differentiation is an extremely important cue for motion representation. We have applied first-order differential velocity computed from positional information; moreover, we believe that second-order differential acceleration is also a significant feature for motion representation. However, an acceleration image based on a typical optical flow contains motion noise, and acceleration images have not previously been employed because the noise is too strong to capture an effective motion feature in an image sequence. On the other hand, recent convolutional neural networks (CNNs) are robust against input noise.
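A minimal sketch of one plausible construction of an acceleration image, assuming (the abstract does not specify the exact procedure) that acceleration is approximated by differencing two consecutive dense optical-flow fields; OpenCV's Farneback flow stands in for whichever flow estimator the paper uses.

```python
import cv2
import numpy as np

def acceleration_image(f0, f1, f2):
    """Approximate per-pixel acceleration as the temporal difference
    of two consecutive optical-flow (velocity) fields."""
    g0, g1, g2 = (cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in (f0, f1, f2))
    flow01 = cv2.calcOpticalFlowFarneback(g0, g1, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    flow12 = cv2.calcOpticalFlowFarneback(g1, g2, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    return flow12 - flow01  # (H, W, 2) acceleration field, noisy by nature

# Toy usage with three random frames.
frames = [np.random.randint(0, 255, (120, 160, 3), np.uint8) for _ in range(3)]
acc = acceleration_image(*frames)
```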
society of instrument and control engineers of japan | 2015
Yudai Miyashita; Hirokatsu Kataoka; Akio Nakamura
The purpose of this paper is to evaluate the proficiency of manual micro-operation skill from appearance-based information, considering the effects of individual habits. First, we extract trajectory features from the micro-operation using Dense Trajectories. Second, we compute a histogram from the features using Bag of Features. Then, the common elements of the histograms corresponding to experts are identified using Random Forests, in order to remove individual habits. Finally, we calculate the similarity of the histograms as the proficiency score. We have experimentally verified that the proposed methodology evaluates proficiency while removing individual habits.
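A minimal sketch of the pipeline's shape with random stand-in data: bag-of-features histograms over a k-means codebook, random-forest feature importances to pick habit-free common codewords, and histogram-intersection similarity as the proficiency score. The codebook size, importance threshold, and similarity measure are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(
    rng.normal(size=(500, 30)))            # stand-in trajectory descriptors

def bof_histogram(descriptors):
    words = codebook.predict(descriptors)
    h = np.bincount(words, minlength=16).astype(float)
    return h / h.sum()

# Histograms for 40 clips plus expert/novice labels.
X = np.stack([bof_histogram(rng.normal(size=(100, 30))) for _ in range(40)])
y = rng.integers(0, 2, size=40)            # 1 = expert, 0 = novice
rf = RandomForestClassifier(random_state=0).fit(X, y)
common = rf.feature_importances_ > rf.feature_importances_.mean()

def proficiency(h, expert_mean):
    # Histogram intersection over the habit-free (common) codewords only.
    return np.minimum(h[common], expert_mean[common]).sum()

print(proficiency(X[0], X[y == 1].mean(axis=0)))
```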
international conference on machine vision | 2017
Yuta Matsuzaki; Kazushige Okayasu; Takaaki Imanari; Naomichi Kobayashi; Yoshihiro Kanehara; Ryousuke Takasawa; Akio Nakamura; Hirokatsu Kataoka
In this paper, we aim to estimate the winner of worldwide film festivals from the exhibited movie poster. The task is extremely challenging because the estimation must be done from the exhibited movie poster alone, without any film ratings or box-office takings. To tackle this problem, we created a new database consisting of all movie posters included in the four biggest film festivals. The movie poster database (MPDB) contains historic movies spanning over 80 years that were nominated for a movie award in each year. We apply several feature types, namely hand-crafted, mid-level, and deep features, to extract various kinds of information from a movie poster. Our experiments yielded suggestive findings; for example, Academy Award estimation achieves a better rate with a color feature, and a facial emotion feature generally performs well on the MPDB. The paper suggests the possibility of modeling human taste for movie recommendation.
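A minimal sketch of one hand-crafted descriptor of the kind listed above: a joint RGB color histogram over the poster. The bin count and poster size are illustrative assumptions, and the downstream award classifier is omitted.

```python
import numpy as np

def color_histogram(poster, bins=8):
    """L1-normalized joint RGB histogram as a simple poster color feature."""
    h, _ = np.histogramdd(poster.reshape(-1, 3),
                          bins=(bins,) * 3, range=((0, 256),) * 3)
    h = h.ravel()
    return h / h.sum()

poster = np.random.randint(0, 255, size=(268, 182, 3), dtype=np.uint8)
feature = color_histogram(poster)  # 512-D vector for an award classifier
```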
international conference on computer vision theory and applications | 2016
Hirokatsu Kataoka; Yoshimitsu Aoki; Kenji Iwata; Yutaka Satoh
We present a technique to address the new challenge of activity prediction in the computer vision field. In activity prediction, we infer the next human activity from “classified activities” and “activity data analysis”. Moreover, the prediction should be processed in real time to avoid dangerous or anomalous activities. The combination of space–time convolutional neural networks (ST-CNN) and improved dense trajectories (iDT) is able to effectively understand human activities in image sequences. After categorizing human activities, we insert activity tags into an activity database in order to sample a distribution of human activity. A naive Bayes classifier allows us to achieve real-time activity prediction because only three elements are needed for parameter estimation. The contributions of this paper are: (i) activity prediction within a Bayesian framework and (ii) ST-CNN and iDT features for activity recognition. Moreover, human activity prediction in real scenes is achieved with 81.0% accuracy.
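The Bayesian prediction step can be sketched minimally as follows, with counts of observed activity-tag transitions and Laplace smoothing standing in for the paper's exact three-element parameter estimation, which the abstract does not spell out. The tag sequences are toy data.

```python
from collections import Counter, defaultdict

# Toy activity-tag sequences, as if sampled from the activity database.
sequences = [
    ["walk", "sit", "read"],
    ["walk", "sit", "drink"],
    ["walk", "run", "walk"],
]

transitions = defaultdict(Counter)
for seq in sequences:
    for cur, nxt in zip(seq, seq[1:]):
        transitions[cur][nxt] += 1

def predict_next(current, alpha=1.0):
    """MAP estimate of the next activity with Laplace smoothing."""
    vocab = {a for seq in sequences for a in seq}
    counts = transitions[current]
    total = sum(counts.values()) + alpha * len(vocab)
    posterior = {a: (counts[a] + alpha) / total for a in vocab}
    return max(posterior, key=posterior.get)

print(predict_next("walk"))  # -> "sit" (2 of the 3 observed transitions)
```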
computer vision and pattern recognition | 2016
Hirokatsu Kataoka; Kenji Iwata; Yutaka Satoh; Masaki Hayashi; Yoshimitsu Aoki; Slobodan Ilic
In this paper, we propose a framework for recognizing human activities that uses only in-topic dominant codewords and a mixture of inter-topic vectors. Latent Dirichlet allocation (LDA) is used to develop approximations of human motion primitives; these are mid-level representations, and they adaptively integrate dominant vectors when classifying human activities. In LDA topic modeling, action videos (documents) are represented by a bag-of-words (input from a dictionary) based on improved dense trajectories [18]. The output topics correspond to human motion primitives, such as a finger moving or a subtle leg motion. We eliminate the impurities, such as missed tracking or changing lighting conditions, in each motion primitive. The assembled vector of motion primitives is an improved representation of the action. We demonstrate our method on four different datasets.
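A minimal sketch of the LDA step with random stand-in counts: each action video is a document over iDT codewords, and the fitted topics play the role of motion primitives. The corpus size, vocabulary size, and number of topics are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
# Rows: action videos (documents); columns: iDT codeword counts (words).
doc_word = rng.integers(0, 20, size=(50, 200))

lda = LatentDirichletAllocation(n_components=10, random_state=0)
theta = lda.fit_transform(doc_word)   # per-video mixture over motion primitives
dominant = np.argsort(lda.components_, axis=1)[:, -5:]  # top codewords per topic
```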
systems, man and cybernetics | 2015
Tomoaki Yamabe; Yudai Miyashita; Shin'ichi Sato; Yudai Yamamoto; Akio Nakamura; Hirokatsu Kataoka
We investigated effective features for human detection. The histogram of oriented gradients (HOG), proposed by N. Dalal, is an important representation that accumulates edge magnitudes into a quantized histogram, and several effective features similar to HOG have since been proposed. We ask which feature is the most effective, and thus evaluate several features on three datasets of pedestrians, faces, and vehicles. In addition to HOG, we select the scale-invariant feature transform (SIFT), local binary patterns (LBP), higher-order local auto-correlation (HLAC), co-occurrence HOG (CoHOG), and extended CoHOG, all of which have been adopted as effective features in related work. The features are applied to detection on each dataset using a Real AdaBoost classifier. A comparison of the classification results reveals that the combination of HLAC and CoHOG is an effective feature for human detection.
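A minimal sketch of the winning combination's shape: HLAC and CoHOG feature vectors are concatenated and fed to a boosted-stump classifier. The feature dimensions and data are random stand-ins, and scikit-learn's AdaBoost (decision stumps by default) stands in for the paper's Real AdaBoost.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
hlac = rng.normal(size=(200, 35))    # stand-in HLAC features
cohog = rng.normal(size=(200, 64))   # stand-in CoHOG features
X = np.hstack([hlac, cohog])         # simple concatenation of the two features
y = rng.integers(0, 2, size=200)     # 1 = human, 0 = background

clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X[:5]))
```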
international symposium on visual computing | 2015
Hirokatsu Kataoka; Yoshimitsu Aoki; Kenji Iwata; Yutaka Satoh
Activity recognition has been an active research topic in computer vision. Recently, the most successful approaches have used dense trajectories, which extract a large number of trajectories and aggregate features along the trajectories into codewords. In this paper, we evaluate various features within the dense-trajectories framework on several types of datasets. We implement 13 features in total, covering five different types of descriptor, namely motion-, shape-, texture-, trajectory-, and co-occurrence-based feature descriptors. The experimental results show the relationship between feature descriptors and the performance rate on each dataset. Different scenes of traffic, surgery, daily living, and sports are used to analyze the feature characteristics. Moreover, we test how much the performance rate of concatenated vectors depends on the descriptor type, comparing the top-ranked descriptors from the experiments against all 13 feature descriptors on fine-grained datasets. Feature evaluation is beneficial not only for the activity recognition problem but also for other domains of spatio-temporal recognition.
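The evaluation protocol can be sketched minimally as follows: cross-validate each descriptor type on its own, then the concatenated vector. The descriptor names, dimensions, and linear-SVM classifier are illustrative assumptions on random stand-in data.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n = 120
descriptors = {                     # stand-ins for motion/shape/texture vectors
    "motion": rng.normal(size=(n, 96)),
    "shape": rng.normal(size=(n, 64)),
    "texture": rng.normal(size=(n, 58)),
}
y = rng.integers(0, 4, size=n)      # four toy activity classes

for name, X in descriptors.items():
    print(name, cross_val_score(LinearSVC(), X, y, cv=3).mean())

X_all = np.hstack(list(descriptors.values()))   # concatenated descriptor vector
print("concat", cross_val_score(LinearSVC(), X_all, y, cv=3).mean())
```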
computer vision and pattern recognition | 2018
Kensho Hara; Hirokatsu Kataoka; Yutaka Satoh