Ferdous Ahmed Sohel
Murdoch University
Publications
Featured research published by Ferdous Ahmed Sohel.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2014
Yulan Guo; Mohammed Bennamoun; Ferdous Ahmed Sohel; Min Lu; Jianwei Wan
3D object recognition in cluttered scenes is a rapidly growing research area. Based on the type of features used, 3D object recognition methods can broadly be divided into two categories: global or local feature based methods. Intensive research has been done on local surface feature based methods as they are more robust to the occlusion and clutter frequently present in real-world scenes. This paper presents a comprehensive survey of existing local surface feature based 3D object recognition methods. These methods generally comprise three phases: 3D keypoint detection, local surface feature description, and surface matching. This paper covers an extensive literature survey of each phase of the process. It also lists a number of popular and contemporary databases together with their relevant attributes.
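The surface-matching phase of this pipeline typically pairs each scene descriptor with its nearest model descriptor. A minimal sketch of one common strategy, nearest-neighbour matching with a ratio test to reject ambiguous matches (the function name and threshold are illustrative; the surveyed methods use a variety of matching schemes):

```python
import numpy as np

def match_descriptors(desc_scene, desc_model, ratio=0.8):
    """Brute-force nearest-neighbour matching of local surface
    descriptors with a ratio test: a match is kept only when the
    nearest model descriptor is clearly closer than the second
    nearest."""
    matches = []
    for i, d in enumerate(desc_scene):
        dists = np.linalg.norm(desc_model - d, axis=1)
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, nearest))
    return matches
```

The surviving matches are then passed to pose hypothesis generation and verification against the scene.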
International Journal of Computer Vision | 2016
Yulan Guo; Mohammed Bennamoun; Ferdous Ahmed Sohel; Min Lu; Jianwei Wan; Ngai Ming Kwok
A number of 3D local feature descriptors have been proposed in the literature. It is, however, unclear which descriptors are more appropriate for a particular application. A good descriptor should be descriptive, compact, and robust to a set of nuisances. This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling. We first evaluate the descriptiveness of these descriptors on eight popular datasets which were acquired using different techniques. We then analyze their compactness using the recall of feature matching per float value in the descriptor. We also test the robustness of the selected descriptors with respect to support radius variations, Gaussian noise, shot noise, varying mesh resolution, distance to the mesh boundary, keypoint localization error, occlusion, clutter, and dataset size. Moreover, we present the performance results of these descriptors when combined with different 3D keypoint detection methods. We finally analyze the computational efficiency for generating each descriptor.
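The compactness measure described above, matching recall per float value stored in the descriptor, can be sketched as follows (function names and the threshold are illustrative, not the paper's exact protocol):

```python
import numpy as np

def matching_recall(dist_matrix, ground_truth, threshold):
    """Recall of nearest-neighbour feature matching: the fraction of
    query descriptors whose nearest neighbour is the ground-truth
    correspondence and lies within `threshold`."""
    nn = dist_matrix.argmin(axis=1)
    nn_dist = dist_matrix.min(axis=1)
    correct = (nn == ground_truth) & (nn_dist < threshold)
    return correct.mean()

def compactness(recall, descriptor_length):
    # recall achieved per float value stored in the descriptor
    return recall / descriptor_length
```

Under this measure, a short descriptor that matches nearly as well as a long one scores as more compact.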
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016
Salman Hameed Khan; Mohammed Bennamoun; Ferdous Ahmed Sohel; Roberto Togneri
We present a framework to automatically detect and remove shadows in real world scenes from a single image. Previous works on shadow detection put a lot of effort into designing shadow-variant and shadow-invariant hand-crafted features. In contrast, our framework automatically learns the most relevant features in a supervised manner using multiple convolutional deep neural networks (ConvNets). The features are learned at the super-pixel level and along the dominant boundaries in the image. The predicted posteriors based on the learned features are fed to a conditional random field model to generate smooth shadow masks. Using the detected shadow masks, we propose a Bayesian formulation to accurately extract the shadow matte and subsequently remove shadows. The Bayesian formulation is based on a novel model which accurately models the shadow generation process in the umbra and penumbra regions. The model parameters are efficiently estimated using an iterative optimization procedure. Our proposed framework consistently performed better than the state-of-the-art on all major shadow databases collected under a variety of conditions.
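Once a shadow matte is available, removal amounts to relighting the attenuated pixels. A minimal sketch of matte-based relighting (this is the generic idea only, not the paper's Bayesian umbra/penumbra model):

```python
import numpy as np

def remove_shadow(image, matte, eps=1e-6):
    """Relight shadowed pixels by dividing by the shadow matte, where
    matte == 1 in fully lit regions and < 1 inside the shadow.
    Intensities are assumed normalised to [0, 1]."""
    return np.clip(image / np.maximum(matte, eps), 0.0, 1.0)
```

In the penumbra the matte varies smoothly between 0 and 1, which is why an accurate per-pixel matte, rather than a binary mask, is needed for artefact-free removal.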
IEEE Transactions on Multimedia | 2014
Yulan Guo; Ferdous Ahmed Sohel; Mohammed Bennamoun; Jianwei Wan; Min Lu
Range image registration is a fundamental research topic for 3D object modeling and recognition. In this paper, we propose an accurate and robust algorithm for pairwise and multi-view range image registration. We first extract a set of Rotational Projection Statistics (RoPS) features from a pair of range images, and perform feature matching between them. The two range images are then registered using a transformation estimation method and a variant of the Iterative Closest Point (ICP) algorithm. Based on the pairwise registration algorithm, we propose a shape growing based multi-view registration algorithm. The seed shape is initialized with a selected range image and then sequentially updated by performing pairwise registration between itself and the input range images. All input range images are iteratively registered during the shape growing process. Extensive experiments were conducted to test the performance of our algorithm. The proposed pairwise registration algorithm is accurate, and robust to small overlaps, noise and varying mesh resolutions. The proposed multi-view registration algorithm is also very accurate. Rigorous comparisons with the state-of-the-art show the superiority of our algorithm.
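Turning feature correspondences into a registration estimate, and the inner step of each ICP iteration, both reduce to the same least-squares problem: find the rigid transform aligning matched point sets. A generic sketch using the SVD-based Kabsch/Umeyama solution (this is the standard estimator, not the paper's exact RoPS-based pipeline):

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) with dst ~ R @ src + t,
    computed via SVD of the cross-covariance of the centred point
    sets (Kabsch method)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t
```

ICP alternates this estimation step with re-computing closest-point correspondences until the alignment converges.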
Computer Vision and Pattern Recognition | 2014
Salman Hameed Khan; Mohammed Bennamoun; Ferdous Ahmed Sohel; Roberto Togneri
We present a practical framework to automatically detect shadows in real world scenes from a single photograph. Previous works on shadow detection put a lot of effort into designing shadow-variant and shadow-invariant hand-crafted features. In contrast, our framework automatically learns the most relevant features in a supervised manner using multiple convolutional deep neural networks (ConvNets). The 7-layer network architecture of each ConvNet consists of alternating convolution and sub-sampling layers. The proposed framework learns features at the super-pixel level and along the object boundaries. In both cases, features are extracted using a context aware window centered at interest points. The predicted posteriors based on the learned features are fed to a conditional random field model to generate smooth shadow contours. Our proposed framework consistently performed better than the state-of-the-art on all major shadow databases collected under a variety of conditions.
Information Sciences | 2015
Yulan Guo; Ferdous Ahmed Sohel; Mohammed Bennamoun; Jianwei Wan; Min Lu
This paper presents a highly distinctive local surface feature called the TriSI feature for recognizing 3D objects in the presence of clutter and occlusion. For a feature point, we first construct a unique and repeatable Local Reference Frame (LRF) using the implicit geometrical information of neighboring triangular faces. We then generate three signatures from the three orthogonal coordinate axes of the LRF. These signatures are concatenated and then compressed into a TriSI feature. Finally, we propose an effective 3D object recognition algorithm based on hierarchical feature matching. We tested our TriSI feature on two popular datasets. Rigorous experimental results show that the TriSI feature was highly descriptive and outperformed existing algorithms under all levels of Gaussian noise, Laplacian noise, shot noise, varying mesh resolutions, occlusion, and clutter. Moreover, we tested our TriSI-based 3D object recognition algorithm on four standard datasets. The experimental results show that our algorithm achieved the best overall recognition results on these datasets.
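The final steps of building a TriSI feature, concatenating the three per-axis signatures and compressing the result, can be sketched as follows. The compression is shown here as a PCA-style projection onto a learned basis; the basis construction and dimensions are illustrative choices, not necessarily the paper's exact settings:

```python
import numpy as np

def pca_basis(training_features, k):
    """Learn a k-dimensional projection basis from rows of
    concatenated training signatures (PCA via SVD)."""
    X = training_features - training_features.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k]

def trisi_compress(sig_x, sig_y, sig_z, basis):
    """Concatenate the signatures generated from the three orthogonal
    LRF axes and project onto the low-dimensional basis."""
    feature = np.concatenate([sig_x, sig_y, sig_z])
    return basis @ feature
```

Compression keeps the concatenated signature descriptive while reducing storage and matching cost.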
European Conference on Computer Vision | 2014
Salman Hameed Khan; Mohammed Bennamoun; Ferdous Ahmed Sohel; Roberto Togneri
We present a discriminative graphical model which integrates geometrical information from RGBD images in its unary, pairwise and higher order components. We propose an improved geometry estimation scheme which is robust to erroneous sensor inputs. At the unary level, we combine appearance based beliefs defined on pixels and planes using a hybrid decision fusion scheme. Our proposed location potential gives an improved representation of the planar classes. At the pairwise level, we learn a balanced combination of various boundaries to consider the spatial discontinuity. Finally, we treat planar regions as higher order cliques and use graph cuts for efficient inference. In our model-based formulation, we use structured learning to fine-tune the model parameters. We test our approach on two RGBD datasets and demonstrate significant improvements over the state-of-the-art scene labeling techniques.
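The model structure described above, unary terms per pixel, pairwise terms across boundaries, and higher-order terms over planar-region cliques, can be illustrated by evaluating the energy of a candidate labelling. This is a toy sketch of the energy decomposition only (all names and the Potts-style penalties are illustrative; inference in the paper is done with graph cuts):

```python
def crf_energy(labels, unary, edges, pairwise_w, cliques, clique_penalty):
    """Energy of a labelling under a CRF with unary, pairwise and
    higher-order terms. Lower energy = better labelling."""
    # unary: per-node cost of its assigned label
    e = sum(unary[i][labels[i]] for i in range(len(labels)))
    # pairwise Potts term: penalise label changes across edges
    e += sum(pairwise_w for i, j in edges if labels[i] != labels[j])
    # higher-order term: penalise cliques that are not label-consistent
    for clique in cliques:
        if len({labels[i] for i in clique}) > 1:
            e += clique_penalty
    return e
```

Encouraging whole planar cliques to share one label is what lets the higher-order term clean up noisy per-pixel predictions on large surfaces.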
Computer Vision and Pattern Recognition | 2017
Qiuhong Ke; Mohammed Bennamoun; Senjian An; Ferdous Ahmed Sohel; Farid Boussaid
This paper presents a new method for 3D action recognition with skeleton sequences (i.e., 3D trajectories of human skeleton joints). The proposed method first transforms each skeleton sequence into three clips, each consisting of several frames, for spatial-temporal feature learning using deep neural networks. Each clip is generated from one channel of the cylindrical coordinates of the skeleton sequence. Each frame of the generated clips represents the temporal information of the entire skeleton sequence, and incorporates one particular spatial relationship between the joints. Together, the clips include multiple frames with different spatial relationships, which provide useful spatial structural information about the human skeleton. We propose to use deep convolutional neural networks to learn long-term temporal information of the skeleton sequence from the frames of the generated clips, and then use a Multi-Task Learning Network (MTLN) to jointly process all frames of the clips in parallel to incorporate spatial structural information for action recognition. Experimental results clearly show the effectiveness of the proposed new representation and feature learning method for 3D action recognition.
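The clip-generation idea, one channel per cylindrical coordinate of the joint trajectories, can be sketched as below. This simplified version produces one T x J array per channel; the paper additionally builds several frames per clip from different joint arrangements:

```python
import numpy as np

def skeleton_to_channels(seq):
    """Map a skeleton sequence (T frames x J joints x 3 Cartesian
    coordinates) to its three cylindrical-coordinate channels
    (rho, phi, z), each a T x J array usable as an image-like input
    to a ConvNet."""
    x, y, z = seq[..., 0], seq[..., 1], seq[..., 2]
    rho = np.hypot(x, y)          # radial distance
    phi = np.arctan2(y, x)        # azimuth
    return rho, phi, z
```

Treating each channel as an image lets standard 2D ConvNets learn over the whole sequence at once, rather than frame by frame.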
IEEE Transactions on Neural Networks | 2018
Salman Hameed Khan; Munawar Hayat; Mohammed Bennamoun; Ferdous Ahmed Sohel; Roberto Togneri
Class imbalance is a common problem in the case of real-world object detection and classification tasks. Data of some classes are abundant, making them an overrepresented majority, and data of other classes are scarce, making them an underrepresented minority. This imbalance makes it challenging for a classifier to appropriately learn the discriminating boundaries of the majority and minority classes. In this paper, we propose a cost-sensitive (CoSen) deep neural network, which can automatically learn robust feature representations for both the majority and minority classes. During training, our learning procedure jointly optimizes the class-dependent costs and the neural network parameters. The proposed approach is applicable to both binary and multiclass problems without any modification. Moreover, as opposed to data-level approaches, we do not alter the original data distribution, which results in a lower computational cost during the training process. We report the results of our experiments on six major image classification data sets and show that the proposed approach significantly outperforms the baseline algorithms. Comparisons with popular data sampling techniques and CoSen classifiers demonstrate the superior performance of our proposed method.
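The core idea of cost-sensitive training can be illustrated with a class-weighted cross-entropy loss: misclassifying a minority class costs more, so the network is pushed to learn its boundary properly. This is a minimal fixed-cost sketch; the paper's CoSen network goes further and learns the class-dependent costs jointly with the network parameters:

```python
import numpy as np

def cost_sensitive_ce(logits, labels, class_costs):
    """Cross-entropy where each sample's loss is scaled by the cost
    of its true class. logits: (N, C); labels: (N,); class_costs: (C,)."""
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]
    return (class_costs[labels] * nll).mean()
```

Unlike over- or under-sampling, this leaves the original data distribution untouched, which is the efficiency argument made above.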
Pattern Recognition Letters | 2006
Ferdous Ahmed Sohel; Laurence S. Dooley; Gour C. Karmakar
This paper addresses a fundamental limitation of the distortion measures embedded in existing vertex-based operational-rate-distortion shape coding techniques. It introduces a new, accurate distortion measurement algorithm based upon the actual distance, rather than either the shortest absolute distance or a distortion band.
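The "actual distance" underlying such a measure is the Euclidean distance from a contour point to the approximating polygon edge, in contrast to an axis-aligned shortest absolute distance or a fixed-width distortion band. A minimal point-to-segment sketch (illustrative only, not the paper's full algorithm):

```python
import numpy as np

def point_to_segment_distance(p, a, b):
    """Euclidean distance from contour point p to the polygon edge
    a-b, clamped to the segment so endpoints are handled correctly."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))
```

Summing (or taking the maximum of) this distance over the contour points assigned to each edge yields a per-edge distortion usable inside a rate-distortion optimisation.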