Publications


Featured research published by Richard M. Jiang.


Book chapter | 2009

Advances in Video Summarization and Skimming

Richard M. Jiang; Abdul H. Sadka; Danny Crookes

This chapter summarizes recent advances in video abstraction for fast content browsing, skimming, transmission, and retrieval of massive video databases, which are in demand in many applications such as web multimedia, mobile multimedia, interactive TV, and emerging 3D TV. Video summarization and skimming aim to provide an abstract of a long video that shortens navigation and browsing of the original. The challenge of video summarization is to effectively extract certain content of the video while preserving its essential messages. In this chapter, preliminaries on video temporal structure analysis are introduced; various video summarization schemes, such as those using low-level features, motion descriptors, and Eigen-features, are described; and case studies on two practical summarization schemes are presented with experimental results.
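
As a concrete illustration of the temporal-structure analysis that summarization builds on, a shot boundary can be flagged wherever the inter-frame histogram dissimilarity spikes. The numpy sketch below uses synthetic frames and a generic L1 histogram distance; it illustrates the general idea, not a specific method from the chapter.

```python
import numpy as np

def frame_histogram(frame, bins=16):
    """Normalized gray-level histogram of one frame."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / hist.sum()

def shot_boundaries(frames, threshold=0.5):
    """Indices where the inter-frame histogram distance exceeds threshold."""
    cuts = []
    for i in range(1, len(frames)):
        h1 = frame_histogram(frames[i - 1])
        h2 = frame_histogram(frames[i])
        d = 0.5 * np.abs(h1 - h2).sum()  # L1 histogram distance in [0, 1]
        if d > threshold:
            cuts.append(i)
    return cuts

# Synthetic video: 10 dark frames followed by 10 bright frames.
rng = np.random.default_rng(0)
dark = [rng.integers(0, 60, (32, 32)) for _ in range(10)]
bright = [rng.integers(180, 255, (32, 32)) for _ in range(10)]
print(shot_boundaries(dark + bright))  # → [10], the single cut
```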


IEEE Transactions on Biomedical Engineering | 2010

Live-Cell Tracking Using SIFT Features in DIC Microscopic Videos

Richard M. Jiang; Danny Crookes; Nie Luo; Michael W. Davidson

In this paper, a novel motion-tracking scheme using scale-invariant features is proposed for automatic cell motility analysis in gray-scale microscopic videos, particularly for live-cell tracking in low-contrast differential interference contrast (DIC) microscopy. In the proposed approach, scale-invariant feature transform (SIFT) points around live cells in the microscopic image are detected, and a structure locality preservation (SLP) scheme using Laplacian Eigenmap is proposed to track the SIFT feature points along successive frames of low-contrast DIC videos. Experiments on low-contrast DIC microscopic videos of various live-cell lines show that, in comparison with principal component analysis (PCA) based SIFT tracking, the proposed Laplacian-SIFT can significantly reduce the error rate of SIFT feature tracking. With this enhancement, further experimental results demonstrate that the proposed scheme is a robust and accurate approach to the challenge of live-cell tracking in DIC microscopy.
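
The Laplacian Eigenmap at the heart of such a locality-preserving scheme can be sketched generically: build a Gaussian affinity graph over the feature descriptors, form the normalized graph Laplacian, and embed points with its smallest nontrivial eigenvectors. The numpy sketch below uses synthetic stand-in descriptors rather than real SIFT output, and is a textbook Laplacian Eigenmap, not the paper's SLP tracker.

```python
import numpy as np

def laplacian_eigenmap(X, n_components=2, sigma=1.0):
    """Embed points X (n, d) into a low-dimensional subspace that
    preserves local neighbourhood structure (Laplacian Eigenmap)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))        # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(1)))
    L_norm = np.eye(len(X)) - d_inv_sqrt @ W @ d_inv_sqrt  # normalized Laplacian
    vals, vecs = np.linalg.eigh(L_norm)
    # Skip the trivial eigenvector at eigenvalue ~0.
    return d_inv_sqrt @ vecs[:, 1:1 + n_components]

# Two well-separated clusters of synthetic 4-D "descriptors".
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (10, 4)), rng.normal(2.0, 0.1, (10, 4))])
Y = laplacian_eigenmap(X, n_components=1)
# The first nontrivial coordinate separates the two clusters by sign.
print((Y[:10, 0] > 0).all(), (Y[10:, 0] > 0).all())
```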


IEEE Transactions on Consumer Electronics | 2009

Hierarchical video summarization in reference subspace

Richard M. Jiang; Abdul H. Sadka; Danny Crookes

In this paper, a hierarchical video structure summarization approach using Laplacian Eigenmap is proposed, where a small set of reference frames is selected from the video sequence to form a reference subspace in which to measure the dissimilarity between two arbitrary frames. In the proposed summarization scheme, the shot-level key frames are first detected from the continuity of inter-frame dissimilarity, and the sub-shot-level and scene-level representative frames are then summarized using k-means clustering. Experiments are carried out on both test videos and movies, and the results show that, in comparison with a similar approach using latent semantic analysis, the proposed approach using Laplacian Eigenmap achieves a better recall rate in keyframe detection and gives an efficient hierarchical summarization at the sub-shot, shot, and scene levels.
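
The clustering stage can be illustrated with a minimal k-means pass over synthetic frame features, picking as each cluster's representative the frame nearest its centroid. This is a generic sketch with a deterministic initialization chosen for reproducibility; the paper's actual pipeline (reference subspace, Laplacian Eigenmap) is not reproduced here.

```python
import numpy as np

def kmeans_representatives(F, k, iters=20):
    """Cluster frame features F (n, d) with k-means and return, per cluster,
    the index of the frame nearest the centroid (the representative frame)."""
    # Deterministic initialization: evenly spaced frames in the sequence.
    centroids = F[np.linspace(0, len(F) - 1, k).astype(int)].copy()
    for _ in range(iters):
        dists = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = F[labels == j].mean(0)
    reps = []
    for j in range(k):
        members = np.where(labels == j)[0]
        d = ((F[members] - centroids[j]) ** 2).sum(1)
        reps.append(int(members[d.argmin()]))
    return sorted(reps)

# Synthetic features: three "scenes" with distinct statistics, 8 frames each.
rng = np.random.default_rng(2)
F = np.vstack([rng.normal(c, 0.1, (8, 5)) for c in (0.0, 3.0, 6.0)])
print(kmeans_representatives(F, k=3))  # one representative frame per scene
```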


IEEE Transactions on Systems, Man, and Cybernetics | 2010

Multimodal Biometric Human Recognition for Perceptual Human–Computer Interaction

Richard M. Jiang; Abdul H. Sadka; Danny Crookes

In this paper, a novel video-based multimodal biometric verification scheme using subspace-based low-level feature fusion of face and speech is developed for specific speaker recognition in perceptual human-computer interaction (HCI). In the proposed scheme, the human face is tracked and face pose is estimated to weight the detected face-like regions in successive frames, where ill-posed faces and false-positive detections are assigned lower credit to enhance accuracy. In the audio modality, mel-frequency cepstral coefficients are extracted for voice-based biometric verification. In the fusion step, features from both modalities are projected into a nonlinear Laplacian Eigenmap subspace for multimodal speaker recognition and combined at low level. The proposed approach is tested on a video database of ten human subjects, and the results show that the proposed scheme attains better accuracy than both conventional multimodal fusion using latent semantic analysis and the single-modality verifications. Experiments in MATLAB show the potential of the proposed scheme to attain real-time performance in perceptual HCI applications.


IEEE Transactions on Information Forensics and Security | 2010

Face Recognition in Global Harmonic Subspace

Richard M. Jiang; Danny Crookes; Nie Luo

In this paper, a novel pattern recognition scheme, global harmonic subspace analysis (GHSA), is developed for face recognition. In the proposed scheme, global harmonic features are extracted at the semantic scale to capture the 2-D semantic spatial structures of a face image. Laplacian Eigenmap is applied to discriminate faces in their global harmonic subspace. Experimental results on the Yale and PIE face databases show that the proposed GHSA scheme achieves an improvement in face recognition accuracy when compared with conventional subspace approaches, and a further investigation shows that the proposed GHSA scheme has impressive robustness to noise.


International Workshop on Content-Based Multimedia Indexing (CBMI) | 2008

Feature extraction for speech and music discrimination

Huiyu Zhou; Abdul H. Sadka; Richard M. Jiang

Driven by the demands of information retrieval, video editing, and human-computer interfaces, in this paper we propose a novel spectral feature for music and speech discrimination. The scheme attempts to simulate a biological model using the averaged cepstrum, where human perception tends to pick up areas of large cepstral change. Cepstrum data far from the mean value is exponentially reduced in magnitude. We conduct music/speech discrimination experiments comparing the classification performance of the proposed feature with that of previously proposed features. Dynamic-time-warping-based classification verifies that the proposed feature gives the best music/speech classification quality on the test database.
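
A rough software model of the averaged-cepstrum idea might look like the sketch below: frame the signal, average the real cepstrum over frames, then exponentially attenuate coefficients far from the mean cepstral value. The exponential weighting here is an assumption standing in for the paper's exact formulation, and the tone/noise pair is only a toy discrimination case.

```python
import numpy as np

def averaged_cepstrum_feature(signal, frame_len=256, n_coeffs=20, alpha=1.0):
    """Average the real cepstrum over frames, then exponentially attenuate
    coefficients far from the mean cepstral value (assumed weighting)."""
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    spectrum = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    cepstrum = np.fft.irfft(np.log(spectrum + 1e-10), axis=1)[:, :n_coeffs]
    avg = cepstrum.mean(0)
    weight = np.exp(-alpha * np.abs(avg - avg.mean()))  # exponential reduction
    return avg * weight

# A pure tone (music-like) versus white noise, as a toy discrimination case.
rng = np.random.default_rng(3)
t = np.arange(8192) / 8000.0
tone_feat = averaged_cepstrum_feature(np.sin(2 * np.pi * 440 * t))
noise_feat = averaged_cepstrum_feature(rng.normal(0.0, 1.0, 8192))
print(tone_feat.shape, float(np.abs(tone_feat - noise_feat).max()))
```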


Pacific-Asia Conference on Circuits, Communications and Systems | 2010

Human silhouette extraction on FPGAs for infrared night vision military surveillance

Iffat Zafar; Usman Zakir; Ilya V. Romanenko; Richard M. Jiang; Eran A. Edirisinghe

Infrared visual surveillance has become an important means of securing military camps, protecting soldiers, and detecting suspected terrorist activity on the battlefield. An intelligent infrared surveillance system aims to provide real-time analysis of the perceived scene and detect human targets instantly, helping soldiers and commanders make the right decision just in time and keep soldiers out of harm's way. To attain this, automatic detection of moving human objects in the scene is an essential step. In this paper, we present an FPGA-based architecture that performs on-chip human silhouette extraction using a parallel architecture with systolic arrays. The architecture is designed in VHDL and simulated with real FLIR videos. Experiments show that the designed FPGA processor can efficiently extract human contours from infrared video in real time, exhibiting great potential to facilitate further analysis of battlefield scenes in military surveillance systems.
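
The on-chip pipeline itself is written in VHDL; in software terms, the per-pixel operations that a systolic array parallelizes reduce to background differencing and thresholding. A minimal numpy model on a synthetic "hot blob" frame (illustrating the operation, not the paper's hardware architecture):

```python
import numpy as np

def silhouette_mask(frame, background, threshold=25):
    """Threshold the per-pixel absolute difference against a background
    model. Every pixel is independent, which is what lets the operation
    map naturally onto a systolic array in hardware."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Synthetic FLIR frame: a warm rectangular "body" on a cooler background.
background = np.full((64, 64), 30, dtype=np.uint8)
frame = background.copy()
frame[20:50, 25:40] = 200          # hot region, 30 x 15 pixels
mask = silhouette_mask(frame, background)
print(int(mask.sum()))  # → 450 silhouette pixels
```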


International Workshop on Content-Based Multimedia Indexing (CBMI) | 2008

Automatic human face detection for content-based image annotation

Richard M. Jiang; Abdul H. Sadka; Huiyu Zhou

In this paper, an automatic human face detection approach using colour analysis is applied to content-based image annotation. In the face detection, probable face regions are first detected by an adaptive boosting algorithm and then combined with a colour-filtering classifier to enhance detection accuracy. Initial benchmark experiments show that the proposed scheme can be efficiently applied to image annotation with high fidelity.
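
The colour-filtering stage can be illustrated with an explicit RGB skin rule (a commonly used rule in the literature, not necessarily the paper's exact classifier) used to score face candidates produced by the boosting detector:

```python
import numpy as np

def skin_mask(rgb):
    """Explicit RGB skin-colour rule (Peer et al. style) over an
    H x W x 3 uint8 image; returns a boolean mask."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    spread = rgb.max(-1).astype(int) - rgb.min(-1).astype(int)
    return ((r > 95) & (g > 40) & (b > 20) & (spread > 15)
            & (np.abs(r - g) > 15) & (r > g) & (r > b))

def skin_ratio(region):
    """Fraction of skin-coloured pixels; a low ratio flags a likely
    false-positive face detection."""
    return float(skin_mask(region).mean())

skin_patch = np.full((10, 10, 3), (200, 140, 120), dtype=np.uint8)
sky_patch = np.full((10, 10, 3), (80, 120, 200), dtype=np.uint8)
print(skin_ratio(skin_patch), skin_ratio(sky_patch))  # → 1.0 0.0
```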


Workshop on Image Analysis for Multimedia Interactive Services | 2008

Speech Enhancement in Noisy Environments for Video Retrieval

Huiyu Zhou; Abdul H. Sadka; Richard M. Jiang

In this paper, we propose a novel spectral subtraction approach for speech enhancement via maximum likelihood estimation (MLE). This scheme attempts to simulate the probability distribution of useful speech signals and hence maximally reduce the noise. To evaluate the quality of speech enhancement, we extract cepstral features from the enhanced signals and then apply them to a dynamic time warping framework for a similarity check between the clean and filtered signals. The performance of the proposed enhancement method is compared with that of other classical techniques. The entire framework assumes no model for the background noise and requires no noise training data.
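
The core subtraction step, before any MLE refinement of the noise model, can be sketched as classical magnitude spectral subtraction with phase reuse. This is the textbook baseline on synthetic signals, not the paper's MLE variant.

```python
import numpy as np

def spectral_subtraction(noisy, noise_est, frame_len=256):
    """Classical magnitude spectral subtraction: subtract an estimated
    noise magnitude spectrum per frame, floor at zero, and resynthesize
    with the noisy phase."""
    n = len(noisy) // frame_len
    frames = noisy[:n * frame_len].reshape(n, frame_len)
    spec = np.fft.rfft(frames, axis=1)
    noise_mag = np.abs(np.fft.rfft(noise_est[:frame_len]))
    clean_mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
    clean = np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)), axis=1)
    return clean.reshape(-1)

rng = np.random.default_rng(4)
t = np.arange(4096) / 8000.0
speech = np.sin(2 * np.pi * 300 * t)     # stand-in for a clean speech signal
noise = rng.normal(0.0, 0.3, 4096)
enhanced = spectral_subtraction(speech + noise, noise)
# Flooring the magnitudes at zero guarantees the output never gains energy.
print(float(np.sum(enhanced ** 2)), float(np.sum((speech + noise) ** 2)))
```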


Workshop on Image Analysis for Multimedia Interactive Services | 2008

3D Inference and Modelling for Video Retrieval

Huiyu Zhou; Abdul H. Sadka; Richard M. Jiang

A new scheme is proposed for extracting planar surfaces from 2D image sequences. We first perform feature correspondence over two neighboring frames, followed by estimation of disparity and depth maps, given a calibrated camera. We then apply iterative random sample consensus (RANSAC) plane fitting to the generated 3D points to find a dominant plane in a maximum likelihood estimation style. Object points on or off this dominant plane are determined by measuring their Euclidean distance to the plane. Experimental work shows that the proposed scheme leads to better plane-fitting results than the classical RANSAC method.
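
The RANSAC plane-fitting step can be sketched directly: repeatedly fit a plane through three random points and keep the hypothesis with the most inliers. The numpy sketch below is the classical baseline on synthetic data, not the paper's iterative maximum-likelihood variant.

```python
import numpy as np

def ransac_plane(points, n_iters=200, tol=0.05, seed=0):
    """Fit a dominant plane to 3D points with RANSAC: fit a plane to three
    random points and keep the hypothesis with the most inliers (points
    whose Euclidean distance to the plane is below tol)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:
            continue  # degenerate (collinear) sample
        normal /= norm
        dist = np.abs((points - p0) @ normal)  # point-to-plane distance
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

# 80 points on the plane z = 0 plus 20 off-plane outliers.
rng = np.random.default_rng(5)
plane_pts = np.column_stack([rng.uniform(-1, 1, (80, 2)), np.zeros(80)])
outliers = rng.uniform(-1, 1, (20, 3)) + np.array([0.0, 0.0, 2.0])
points = np.vstack([plane_pts, outliers])
inliers = ransac_plane(points)
print(bool(inliers[:80].all()), bool(inliers[80:].any()))  # → True False
```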

Collaboration


Top co-authors of Richard M. Jiang:

Abdul H. Sadka (Brunel University London)
Danny Crookes (Queen's University Belfast)
Huiyu Zhou (Brunel University London)
Iffat Zafar (Loughborough University)
Usman Zakir (Loughborough University)