Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Vijay John is active.

Publication


Featured research published by Vijay John.


International Conference on Intelligent Transportation Systems | 2014

Traffic light recognition in varying illumination using deep learning and saliency map

Vijay John; Keisuke Yoneda; Bin Qi; Zheng Liu; Seiichi Mita

The accurate detection and recognition of traffic lights are important for autonomous vehicle navigation and advanced driver assistance systems. In this paper, we present a traffic light recognition algorithm for varying illumination conditions using computer vision and machine learning. More specifically, a convolutional neural network is used to extract features and detect traffic lights in visual camera images. To improve the recognition accuracy, an on-board GPS sensor is employed to identify the region-of-interest in the visual image that contains the traffic light. In addition, a saliency map containing the traffic light location is generated from recognition results under normal illumination and used to assist recognition under low illumination conditions. The proposed algorithm was evaluated on our datasets acquired in a variety of real-world environments and compared with the performance of a baseline traffic signal recognition algorithm. The experimental results demonstrate the high recognition accuracy of the proposed algorithm in varied illumination conditions.
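
A minimal sketch of the two-stage idea described above, under the assumption of a GPS-indexed region-of-interest and a small CNN classifier; the network, class set, and ROI below are illustrative stand-ins, not the authors' implementation:

```python
# Hedged sketch: crop a GPS-derived ROI from the camera frame, then classify the
# crop with a toy CNN. All names and parameters are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

class TinyTrafficLightNet(nn.Module):
    """Toy CNN standing in for the recognition network."""
    def __init__(self, num_classes=4):  # e.g. red / yellow / green / background
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def recognize(frame: np.ndarray, roi: tuple, net: nn.Module) -> int:
    """Crop the GPS-derived ROI (x, y, w, h) and return the predicted class index."""
    x, y, w, h = roi
    crop = frame[y:y + h, x:x + w]                       # HxWx3 uint8 crop
    t = torch.from_numpy(crop).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        return int(net(t.unsqueeze(0)).argmax(dim=1))

# Usage with synthetic data; a real system would look the ROI up from the GPS pose.
frame = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
print(recognize(frame, roi=(300, 100, 64, 128), net=TinyTrafficLightNet()))
```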


IEEE Transactions on Computational Imaging | 2015

Saliency Map Generation by the Convolutional Neural Network for Real-Time Traffic Light Detection Using Template Matching

Vijay John; Keisuke Yoneda; Zheng Liu; Seiichi Mita

A critical issue in autonomous vehicle navigation and advanced driver assistance systems (ADAS) is the accurate real-time detection of traffic lights. Typically, vision-based sensors are used to detect the traffic light. However, the detection of traffic lights using computer vision, image processing, and learning algorithms is not trivial. The challenges include appearance variations, illumination variations, and reduced appearance information in low illumination conditions. To address these challenges, we present a visual camera-based real-time traffic light detection algorithm, where we identify the spatially constrained region-of-interest in the image containing the traffic light. Given the identified region-of-interest, we achieve high traffic light detection accuracy with few false positives, even in adverse environments. To perform robust traffic light detection in varying conditions with few false positives, the proposed algorithm consists of two steps: offline saliency map generation and real-time traffic light detection. In the offline step, a convolutional neural network, i.e., a deep learning framework, detects and recognizes the traffic lights in the image using region-of-interest information provided by an onboard GPS sensor. The detected traffic light information is then used to generate the saliency maps with a modified multidimensional density-based spatial clustering of applications with noise (M-DBSCAN) algorithm. The generated saliency maps are indexed using the vehicle GPS information. In the real-time step, traffic lights are detected by retrieving relevant saliency maps and performing template matching using color information. The proposed algorithm is validated with datasets acquired in varying conditions and different countries, e.g., the USA, Japan, and France. The experimental results report a high detection accuracy with negligible false positives under varied illumination conditions. More importantly, an average computational time of 10 ms/frame is achieved. A detailed parameter analysis is conducted and the observations are summarized and reported in this paper.
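
The modified M-DBSCAN itself is not specified in this abstract; purely as a hedged illustration of the offline step, the sketch below clusters offline detection centers with plain DBSCAN from scikit-learn and marks the clustered regions in a binary saliency mask. Parameters and margins are assumptions:

```python
# Hedged sketch of offline saliency-map generation from clustered detections.
import numpy as np
from sklearn.cluster import DBSCAN

def build_saliency_map(detections_xy, image_shape, eps=15.0, min_samples=5, margin=20):
    """detections_xy: (N, 2) pixel centers of offline detections for one GPS cell."""
    saliency = np.zeros(image_shape, dtype=np.uint8)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(detections_xy)
    for label in set(labels) - {-1}:                    # -1 marks noise points
        cluster = detections_xy[labels == label]
        x0, y0 = (cluster.min(axis=0) - margin).astype(int)
        x1, y1 = (cluster.max(axis=0) + margin).astype(int)
        saliency[max(y0, 0):y1, max(x0, 0):x1] = 1      # region to template-match later
    return saliency

# Toy usage: detections clustered around one traffic light location.
pts = np.random.normal(loc=[320, 90], scale=4.0, size=(50, 2))
print(build_saliency_map(pts, (480, 640)).sum())
```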


Information Fusion | 2017

Statistical comparison of image fusion algorithms: Recommendations

Zheng Liu; Erik Blasch; Vijay John

Pixel-level image fusion has been applied in a variety of applications, including multi-modal medical imaging, remote sensing, industrial inspection, video surveillance, and night vision. Various algorithms have been proposed for numerous applications, which requires a comprehensive method of assessment to discern which methods provide decision support. Currently, the validation or assessment of newly proposed algorithms is done either subjectively or objectively. A subjective assessment is costly and affected by a number of factors that are difficult to control. On the other hand, an objective assessment is carried out with a fusion performance metric, which is defined to evaluate the effectiveness and/or efficiency of the fusion operation. There are a number of fusion metrics proposed for fusion processes, taking different perspectives. Most image fusion research presents a comparison of the proposed and existing fusion algorithms with selected fusion metric(s) over multiple image data sets. The advantage of the proposed algorithm is then justified by its relative difference from the best or better metric values. However, the statistical significance of such differences is unknown, leading to a misperception of the quantitative differences between methods. This paper proposes the use of non-parametric statistical analysis for comparisons of fusion algorithms, along with the Image fusion Toolbox Employing Significance Testing (ImTEST). Strategies to use different tests in varied scenarios are presented and recommended. Experiments with recently published algorithms demonstrate the necessity of adopting statistical comparison to establish a baseline for image fusion research.
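
To make the recommended methodology concrete, here is an illustrative (not ImTEST) use of the non-parametric tests commonly involved: a Friedman omnibus test across several fusion algorithms scored on the same image set, followed by a pairwise Wilcoxon signed-rank test. The scores are synthetic:

```python
# Hedged example of non-parametric comparison of fusion algorithms' metric scores.
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(0)
n_images = 30
scores = {                                   # one metric value per image per algorithm
    "alg_A": rng.normal(0.70, 0.05, n_images),
    "alg_B": rng.normal(0.72, 0.05, n_images),
    "alg_C": rng.normal(0.71, 0.05, n_images),
}

stat, p = friedmanchisquare(*scores.values())
print(f"Friedman test across all algorithms: p = {p:.4f}")

# Only if the omnibus test rejects, compare a pair of algorithms directly.
stat, p = wilcoxon(scores["alg_A"], scores["alg_B"])
print(f"Wilcoxon signed-rank A vs B: p = {p:.4f}")
```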


International Conference on Machine Vision | 2015

Pedestrian detection in thermal images using adaptive fuzzy C-means clustering and convolutional neural networks

Vijay John; Seiichi Mita; Zheng Liu; Bin Qi

Pedestrian detection is paramount for advanced driver assistance systems (ADAS) and autonomous driving. As a key technology in computer vision, it also finds many other applications, such as security and surveillance. Generally, pedestrian detection is conducted on images in the visible spectrum, which are not suitable for night-time detection. Infrared (IR) or thermal imaging is often adopted at night due to its capability of capturing the energy emitted by pedestrians. The detection process first extracts candidate pedestrians from the captured IR image. Robust feature descriptors are formulated to represent those candidates. A binary classification of the extracted features is then performed with trained classifier models. In this paper, an algorithm for pedestrian detection from IR images is proposed, where adaptive fuzzy C-means clustering and convolutional neural networks are adopted. The adaptive fuzzy C-means clustering is used to segment the IR images and retrieve the candidate pedestrians. The candidate pedestrians are then pruned using human posture characteristics and the second central moments ellipse. The convolutional neural network is used to simultaneously learn relevant features and perform the binary classification. The performance of the proposed algorithm is compared with state-of-the-art algorithms on a publicly available dataset. A better detection accuracy with reduced computational cost is achieved.
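
A minimal, hedged sketch of the fuzzy C-means stage on thermal pixel intensities, separating warm candidate pixels from the background; the paper's adaptive variant, the posture-based pruning, and the CNN classifier are not reproduced here:

```python
# Hedged sketch: plain fuzzy C-means on 1-D thermal intensities as a candidate-pixel
# segmenter. Cluster count, fuzzifier m, and thresholds are assumptions.
import numpy as np

def fuzzy_cmeans_1d(x, n_clusters=2, m=2.0, n_iter=100, eps=1e-6):
    """x: 1-D array of pixel intensities. Returns cluster centers and memberships."""
    rng = np.random.default_rng(0)
    u = rng.dirichlet(np.ones(n_clusters), size=x.size)      # (N, C) memberships
    for _ in range(n_iter):
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)                # membership-weighted means
        dist = np.abs(x[:, None] - centers[None, :]) + eps   # (N, C) distances
        inv = 1.0 / dist ** (2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)         # standard FCM update
        if np.abs(u_new - u).max() < 1e-5:
            u = u_new
            break
        u = u_new
    return centers, u

# Toy thermal image: warm pedestrian pixels on a cooler background.
img = np.concatenate([np.random.normal(60, 5, 9000), np.random.normal(180, 10, 1000)])
centers, u = fuzzy_cmeans_1d(img)
hot = np.argmax(centers)                     # cluster with the highest mean intensity
candidate_mask = u[:, hot] > 0.5             # pixels likely belonging to pedestrians
print(centers, candidate_mask.sum())
```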


International Conference on Intelligent Transportation Systems | 2014

Pedestrian detection from thermal images with a scattered difference of directional gradients feature descriptor

Bin Qi; Vijay John; Zheng Liu; Seiichi Mita

Pedestrian detection is a rapidly evolving research area in computer vision with great impact on the quality of people's daily lives. In pedestrian detection, a robust feature descriptor that discriminates pedestrians from the background is paramount. Generally, pedestrians are detected with features extracted from visible images. However, those features can easily be contaminated by changes in clothing color, illumination, body deformation, and complex backgrounds. These factors present great challenges for designing robust feature descriptors. In this study, we address this issue by proposing a new feature descriptor, namely the scattered difference of directional gradients (SDDG), for thermal images. Unlike visible images, thermal images are insensitive to illumination changes and immune to variations in clothing color as well as the complexity of backgrounds. Compared with other feature descriptors, SDDG captures more detailed local gradient information so that objects can be well described along certain directions. Experimental results demonstrate the comparable performance of the proposed feature descriptor with well-known feature descriptors, e.g., the histogram of oriented gradients (HOG) and Haar wavelets (HWs).
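
The SDDG descriptor itself is only defined in the paper; purely as a loose illustration of the phrase "differences of directional gradients", the sketch below projects image gradients onto a few directions and differences them at scattered pixel offsets. The directions and offsets are arbitrary assumptions, not the published design:

```python
# Loose illustration only, not the SDDG descriptor from the paper.
import numpy as np

def directional_gradient_differences(patch,
                                     angles_deg=(0, 45, 90, 135),
                                     offsets=((0, 3), (3, 0), (2, 2))):
    gy, gx = np.gradient(patch.astype(float))
    feats = []
    for a in np.deg2rad(angles_deg):
        g = gx * np.cos(a) + gy * np.sin(a)           # gradient component along direction a
        for dy, dx in offsets:
            shifted = np.roll(np.roll(g, dy, axis=0), dx, axis=1)
            feats.append((g - shifted).mean())         # difference at a scattered offset
    return np.asarray(feats)

patch = np.random.rand(32, 16)                         # toy thermal pedestrian patch
print(directional_gradient_differences(patch).shape)   # (len(angles) * len(offsets),)
```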


Digital Image Computing: Techniques and Applications | 2016

Deep Learning-Based Fast Hand Gesture Recognition Using Representative Frames

Vijay John; Ali Boyali; Seiichi Mita; Masayuki Imanishi; Norio Sanma

In this paper, we propose a vision-based hand gesture recognition system for intelligent vehicles. Vision-based gesture recognition systems are employed in automotive user interfaces to increase driver comfort without compromising safety. In our algorithm, the long-term recurrent convolution network is used to classify video sequences of hand gestures. In the standard long-term recurrent convolution network-based action classifier, multiple frames sampled from the video sequence are given as input to the network to perform classification. However, the use of multiple frames increases the computational complexity, apart from reducing the classification accuracy of the classifier. We propose to address these issues by extracting a few representative frames from the video sequence and inputting them to the long-term recurrent convolution network. To extract the representative frames, we propose to use novel tiled image patterns and tiled binary patterns within a semantic segmentation-based deep learning framework, the deconvolutional neural network. The novel tiled image patterns contain multiple non-overlapping blocks and represent the entire gesture video sequence within a single tiled image. These image patterns represent the input to the deconvolution network and are generated from the video sequence. The novel tiled binary patterns also contain multiple non-overlapping blocks and represent the representative frames of the video sequence. These binary patterns represent the output of the deconvolution network. The training binary patterns are generated from the training video sequences using a dictionary learning and sparse modeling framework. We validate our proposed algorithm on the public Cambridge gesture recognition dataset. A comparative analysis is performed with baseline algorithms and an improved classification accuracy is observed. We also perform a detailed parametric analysis of the proposed algorithm. We report a gesture classification accuracy of 91% and a near real-time computational complexity of ~ms per video sequence.
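
A hedged sketch of the tiled-image-pattern idea: frames sampled from a gesture clip are placed into non-overlapping blocks of a single image, which could then be fed to a segmentation network. The grid size and uniform sampling are assumptions:

```python
# Hedged sketch: build one tiled image from uniformly sampled grayscale frames.
import numpy as np

def make_tiled_pattern(frames, grid=(3, 3)):
    """frames: list of equally sized HxW frames; returns one tiled image."""
    rows, cols = grid
    n = rows * cols
    idx = np.linspace(0, len(frames) - 1, n).astype(int)     # uniformly sample n frames
    h, w = frames[0].shape
    tiled = np.zeros((rows * h, cols * w), dtype=frames[0].dtype)
    for k, i in enumerate(idx):
        r, c = divmod(k, cols)
        tiled[r * h:(r + 1) * h, c * w:(c + 1) * w] = frames[i]
    return tiled

clip = [np.random.rand(60, 80) for _ in range(40)]           # toy 40-frame gesture clip
print(make_tiled_pattern(clip).shape)                         # (180, 240)
```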


Pacific-Rim Symposium on Image and Video Technology | 2015

Real-Time Lane Estimation Using Deep Features and Extra Trees Regression

Vijay John; Zheng Liu; Chunzhao Guo; Seiichi Mita; Kiyosumi Kidono

In this paper, we present a robust real-time lane estimation algorithm by adopting a learning framework using the convolutional neural network and extra trees. By utilizing the learning framework, the proposed algorithm predicts the ego-lane location in the given image even under conditions of lane marker occlusion or absence. In the algorithm, the convolutional neural network is trained to extract robust features from the road images, while the extra trees regression model is trained to predict the ego-lane location from the extracted road features. The extra trees are trained with input-output pairs of road features and ego-lane image points. The ego-lane image points correspond to Bezier spline control points used to define the left and right lane markers of the ego-lane. We validate our proposed algorithm using the publicly available Caltech dataset and an acquired dataset. A comparative analysis with baseline algorithms shows that our algorithm reports better lane estimation accuracy, besides being robust to the occlusion and absence of lane markers. We report a computational time of 45 ms per frame. Finally, we report a detailed parameter analysis of our proposed algorithm.
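
A hedged sketch of the regression stage: extra trees map a deep feature vector for a road image to the ego-lane's spline control points. The CNN feature extractor is stubbed out with random features, and the feature dimension and number of control points are assumptions:

```python
# Hedged sketch: multi-output extra trees regression from features to control points.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)
n_train, feat_dim, n_ctrl_pts = 500, 256, 4           # 4 control points per lane marker

X = rng.normal(size=(n_train, feat_dim))              # stand-in for CNN road features
Y = rng.uniform(0, 640, size=(n_train, 2 * n_ctrl_pts * 2))  # (x, y) for left + right markers

reg = ExtraTreesRegressor(n_estimators=50, random_state=0).fit(X, Y)
pred = reg.predict(rng.normal(size=(1, feat_dim)))    # control points for a new frame
print(pred.shape)                                      # (1, 16)
```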


Information Fusion | 2018

Fusing synergistic information from multi-sensor images: An overview from implementation to performance assessment

Zheng Liu; Erik Blasch; Gaurav Bhatnagar; Vijay John; Wei Wu; Rick S. Blum

Image fusion is capable of processing multiple heterogeneous images acquired by single or multi-sensor imaging systems for an improved interpretation of the targeted object or scene. A diversity of applications have benefited from the fusion of multi-sensor images through a more reliable and comprehensive fused result. Likewise, numerous approaches to fuse multi-sensor images have been proposed and published in the literature. However, due to a lack of benchmark resources and commonly accepted assessment measures, it is hard to identify the significance of new image fusion algorithms and implementations. This paper reviews and categorizes recent algorithms for image fusion and performance assessment based on reported comparative results. We recommend using non-parametric statistical tests to verify the performance of pixel-level fusion algorithms. Furthermore, a comprehensive evaluation of 40 fusion algorithms from recently published results is conducted to demonstrate the significance of these algorithms in terms of statistical analyses within their respective applications. Although the results of these performance tests are limited by available datasets, baseline algorithms, and selected assessment metrics, this is a critical step for comparative image fusion research. This paper aims to advance image fusion development by creating a complete inventory of state-of-the-art image fusion techniques and advocating statistical comparison tests to avoid unnecessary duplication of development efforts. Establishing a benchmark study for image fusion is critical for performance comparisons of contemporary methods.
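
To fix what is being compared, here is a trivial pixel-level fusion baseline (per-pixel average and per-pixel magnitude-maximum of two co-registered images); the surveyed algorithms are far more sophisticated, and this sketch only illustrates the setting:

```python
# Hedged baseline examples of pixel-level fusion rules on co-registered images.
import numpy as np

def fuse_average(a, b):
    return (a.astype(float) + b.astype(float)) / 2.0

def fuse_max_abs(a, b):
    # Keep, per pixel, the input with the larger magnitude (a common naive rule).
    return np.where(np.abs(a) >= np.abs(b), a, b)

visible = np.random.rand(120, 160)
thermal = np.random.rand(120, 160)
print(fuse_average(visible, thermal).shape, fuse_max_abs(visible, thermal).shape)
```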


IEEE Intelligent Vehicles Symposium | 2017

3D point cloud map based vehicle localization using stereo camera

Yuquan Xu; Vijay John; Seiichi Mita; Hossein Tehrani; Kazuhisa Ishimaru; Sakiko Nishino



Signal, Image and Video Processing | 2017

Fusion of thermal and visible cameras for the application of pedestrian detection

Vijay John; Shogo Tsuchizawa; Zheng Liu; Seiichi Mita


Collaboration


Dive into Vijay John's collaborations.

Top Co-Authors

Seiichi Mita, Toyota Technological Institute
Zheng Liu, University of British Columbia
Ali Boyali, Toyota Technological Institute
Bin Qi, Toyota Technological Institute
Qian Long, Toyota Technological Institute
Yuquan Xu, Toyota Technological Institute