

Publications

Featured research published by Duc Anh Duong.


International Journal of Distributed Sensor Networks | 2014

ERI-MAC: An Energy-Harvested Receiver-Initiated MAC Protocol for Wireless Sensor Networks

Kien Nguyen; Vu Hoang Nguyen; Duy-Dinh Le; Yusheng Ji; Duc Anh Duong; Shigeki Yamada

Energy harvesting technology potentially solves the problem of energy efficiency, the biggest challenge in wireless sensor networks. A sensor node capable of harvesting energy from its surrounding environment can achieve an infinite lifetime. The technology promises to change the fundamental principle behind communication protocols in wireless sensor networks: instead of saving as much energy as possible, a protocol should guarantee that the harvested energy is equal to or greater than the consumed energy, while still operating efficiently and maximizing network performance. In this paper, we propose ERI-MAC, a new receiver-initiated MAC protocol for energy harvesting sensor networks. ERI-MAC leverages the benefits of receiver-initiated communication and packet concatenation to achieve good performance in both latency and energy efficiency. Moreover, ERI-MAC employs a queueing mechanism that adjusts a sensor node's operation to the rate at which it harvests energy from the surrounding environment. We have extensively evaluated ERI-MAC in a large-scale network with a realistic traffic model using the network simulator ns-2. The simulation results show that ERI-MAC achieves good network performance while enabling an infinite sensor network lifetime.
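The energy-neutral queueing idea described above can be sketched in a few lines. This is our own illustration, not ERI-MAC's actual protocol logic: a node buffers packets (enabling concatenation) and transmits only while the harvested budget covers the cost, so consumption never exceeds the harvest. All names and the cost model are hypothetical.

```python
# Hypothetical sketch of an energy-neutral queueing policy: transmit
# queued packets only while the harvested energy budget covers them.
class EnergyNeutralNode:
    def __init__(self, tx_cost_per_packet):
        self.energy = 0.0                  # harvested energy budget (J)
        self.queue = []                    # packets awaiting concatenation
        self.tx_cost = tx_cost_per_packet  # energy to send one packet (J)

    def harvest(self, joules):
        self.energy += joules

    def enqueue(self, packet):
        self.queue.append(packet)

    def transmit_burst(self):
        """Send as many queued packets as the current budget allows."""
        sent = []
        while self.queue and self.energy >= self.tx_cost:
            sent.append(self.queue.pop(0))
            self.energy -= self.tx_cost
        return sent

node = EnergyNeutralNode(tx_cost_per_packet=0.5)
node.harvest(1.2)
for p in ["p1", "p2", "p3"]:
    node.enqueue(p)
burst = node.transmit_burst()  # only two packets fit the 1.2 J budget
```

The point of the policy is that the queue, not the radio, absorbs the mismatch between traffic rate and harvesting rate.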


Conference on Image and Video Retrieval | 2010

An efficient method for face retrieval from large video datasets

Thao Ngoc Nguyen; Thanh Duc Ngo; Duy-Dinh Le; Shin'ichi Satoh; Bac Le; Duc Anh Duong

The human face is one of the most important objects in video: it provides rich information for spotting people of interest, such as government leaders in news footage or the hero of a movie, and is the basis for interpreting facts. Detecting and recognizing faces appearing in video are therefore essential tasks for many video indexing and retrieval applications. Because of large variations in pose, illumination, occlusion, hairstyle, and facial expression, robust face matching is a challenging problem. In addition, when the number of faces in the dataset is huge, e.g. tens of millions, a scalable matching method is needed. To this end, we propose an efficient method for face retrieval in large video datasets. To make retrieval robust, the faces of the same person appearing within a shot are grouped into a single face track using a reliable tracking method. Retrieval is done by computing the similarity between the face tracks in the database and the input face track. For each face track, we select one representative face, and the similarity between two face tracks is the similarity between their representative faces. The representative face is the mean face of a subset selected from the original face track. In this way, we achieve high retrieval accuracy while maintaining low computational cost. For the experiments, we extracted approximately 20 million faces from 370 hours of TRECVID video, a scale never addressed by previous attempts. Results evaluated on a subset of 457,320 manually annotated faces show that the proposed method is effective and scalable.
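The representative-face scheme above reduces track-to-track matching to a single vector comparison. A minimal sketch, under our own simplifications (the subset is just the first few faces, and similarity is cosine similarity over toy 2-D features):

```python
import math

# Sketch of the representative-face idea: each face track is a list of
# feature vectors; the representative is the mean of a selected subset,
# and track similarity is the cosine similarity of the representatives.
def representative(track, subset_size=3):
    subset = track[:subset_size]  # subset selection is simplified here
    n = len(subset)
    return [sum(v[i] for v in subset) / n for i in range(len(subset[0]))]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def track_similarity(track_a, track_b):
    return cosine(representative(track_a), representative(track_b))

# two toy "face tracks" of 2-D feature vectors
track_a = [[1.0, 0.0], [0.9, 0.1]]
track_b = [[0.0, 1.0], [0.1, 0.9]]
```

Whatever the track lengths, each database comparison costs one vector operation, which is what makes the approach scale to millions of faces.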


Multimedia Tools and Applications | 2017

Evaluation of multiple features for violent scenes detection

Vu Lam; Sang Phan; Duy-Dinh Le; Duc Anh Duong; Shin'ichi Satoh

Violent scenes detection (VSD) is a challenging problem because of heterogeneous content, large variations in video quality, and the complex semantic meanings of the concepts involved. In recent years, combining multiple features from multiple modalities has proven to be an effective strategy for general multimedia event detection (MED), but specific detection tasks such as VSD have been comparatively less studied. Here, we evaluated the use of multiple features and their combinations in a violent scenes detection system. We rigorously analyzed a set of low-level features and a deep learning feature capturing the appearance, color, texture, motion, and audio in video. We also evaluated the utility of mid-level visual information obtained by detecting related violent concepts. Experiments were performed on the publicly available MediaEval VSD 2014 dataset. The results showed that visual and motion features outperform audio features. Moreover, the mid-level features performed nearly as well as the low-level visual features. Experiments with a number of fusion methods showed that all single features are complementary and help to improve overall performance. The study also provides an empirical foundation for selecting feature sets capable of dealing with the heterogeneous content of violent scenes in movies.
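One of the simplest fusion methods that such an evaluation would include is average (late) fusion: each feature's classifier scores the videos independently, and the per-feature scores are averaged. A sketch with illustrative feature names and scores (not the paper's actual numbers):

```python
# Average late fusion: combine per-feature detection scores by taking
# their mean, one fused score per video.
def average_fusion(scores_per_feature):
    """scores_per_feature: {feature_name: [score per video]} -> fused scores."""
    n_videos = len(next(iter(scores_per_feature.values())))
    n_features = len(scores_per_feature)
    fused = [0.0] * n_videos
    for scores in scores_per_feature.values():
        for i, s in enumerate(scores):
            fused[i] += s / n_features
    return fused

# toy scores from two modalities over two videos
fused = average_fusion({"motion": [0.9, 0.2], "audio": [0.5, 0.4]})
# fused is approximately [0.7, 0.3]
```

Weighted variants simply replace the uniform `1 / n_features` factor with learned per-feature weights.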


Signal-Image Technology and Internet-Based Systems | 2008

Robust Face Track Finding in Video Using Tracked Points

Thanh Duc Ngo; Duy-Dinh Le; Shin'ichi Satoh; Duc Anh Duong

We present a robust method for detecting face tracks in video, where each face track represents one individual. Such face tracks are important for many applications, including video face recognition, face matching, and face-name association. The basic idea is to use the Kanade-Lucas-Tomasi (KLT) tracker to follow interest points across video frames; a face track is then formed from faces detected in different frames that share a sufficiently large number of tracked points. However, because interest points are sensitive to illumination changes, occlusions, and false face detections, face tracks are often fragmented. Our method maintains the tracked points of faces rather than of shots and re-computes interest points in every frame to avoid these issues. Experimental results on several long video sequences show the effectiveness of our approach.
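The point-sharing linking rule can be sketched as follows. This is our simplification of the criterion, with a hypothetical threshold: a newly detected face joins an existing track when it shares enough tracked point IDs with that track's most recent face, and otherwise starts a new track.

```python
# Sketch of face-track linking by shared tracked points.
def shares_enough_points(points_a, points_b, min_shared=5):
    """Two faces belong together if they share at least min_shared point IDs."""
    return len(set(points_a) & set(points_b)) >= min_shared

def link_faces(tracks, new_face_points, min_shared=5):
    """tracks: list of point-ID sets, one per track's latest face.
    Returns the index of the track the new face was assigned to."""
    for i, pts in enumerate(tracks):
        if shares_enough_points(pts, new_face_points, min_shared):
            tracks[i] = set(new_face_points)  # extend track i
            return i
    tracks.append(set(new_face_points))       # start a new track
    return len(tracks) - 1
```

Re-computing the interest points every frame, as the paper describes, keeps the shared-point sets from decaying over long tracks.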


Signal Processing Systems | 2014

Multimedia Event Detection Using Segment-Based Approach for Motion Feature

Sang Phan; Thanh Duc Ngo; Vu Lam; Son Tran; Duy-Dinh Le; Duc Anh Duong; Shin'ichi Satoh

Multimedia event detection has become a popular research topic due to the explosive growth of video data. The motion features in a video are often used to detect events, because an event may contain specific actions or moving patterns. Raw motion features are extracted from the entire video and then aggregated to form the final video representation. However, this video-based representation is ineffective for realistic videos, because video lengths vary widely and the clues that determine an event may occur in only a small segment of the entire video. In this paper, we propose a segment-based approach to video representation: original videos are divided into segments for feature extraction and classification, while evaluation is still performed at the video level. Experimental results on recent TRECVID Multimedia Event Detection datasets demonstrate the effectiveness of our approach.
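The segment-then-aggregate idea can be sketched in a few lines. The segment length, the toy per-frame "motion energy" scorer, and max-aggregation to the video level are our illustrative assumptions:

```python
# Sketch of segment-based scoring: split a video's frame features into
# fixed-length segments, score each segment, and keep the maximum
# segment score as the video-level decision.
def video_score(frame_features, segment_len, score_segment):
    segments = [frame_features[i:i + segment_len]
                for i in range(0, len(frame_features), segment_len)]
    return max(score_segment(seg) for seg in segments)

# toy example: per-frame "motion energy", scored by segment average;
# the middle segment [0.9, 0.8] carries the event evidence
score = video_score([0.1, 0.2, 0.9, 0.8, 0.1, 0.0], 2,
                    lambda seg: sum(seg) / len(seg))
```

A whole-video average would dilute the strong middle segment across the quiet frames, which is exactly the failure mode the segment-based approach avoids.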


Conference on Multimedia Modeling | 2015

AttRel: An Approach to Person Re-Identification by Exploiting Attribute Relationships

Ngoc-Bao Nguyen; Vu-Hoang Nguyen; Thanh Ngo Duc; Duy-Dinh Le; Duc Anh Duong

Person re-identification refers to recognizing people across cameras with non-overlapping capture areas. To recognize people, their images must be represented by feature vectors for matching. Recent state-of-the-art approaches employ semantic features, also known as attributes (e.g. wearing-bags, jeans, skirt), for representation. However, such representations are sensitive to attribute detection results, which can be unreliable due to noise. In this paper, we propose an approach that exploits relationships between attributes to refine attribute detection results. Experimental results on benchmark datasets (VIPeR and PRID) demonstrate the effectiveness of the proposed approach.
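One simple way to realize "refining attribute scores via relationships", purely as our illustration and not the paper's exact model, is to propagate evidence through a pairwise relationship matrix and blend the result with the raw detector scores:

```python
# Illustrative attribute-score refinement: a relationship weight matrix
# R propagates evidence between attributes; alpha blends raw and
# propagated scores to smooth noisy detector outputs.
def refine(scores, R, alpha=0.5):
    n = len(scores)
    propagated = [
        sum(R[i][j] * scores[j] for j in range(n)) / sum(R[i])
        for i in range(n)
    ]
    return [alpha * s + (1 - alpha) * p for s, p in zip(scores, propagated)]

# attributes: [wearing-bag, jeans]; an identity R leaves scores unchanged,
# while off-diagonal weights let a confident attribute support a noisy one
identity = [[1.0, 0.0], [0.0, 1.0]]
```

With `R = [[1.0, 0.5], [0.5, 1.0]]`, a weak wearing-bag score is pulled toward the evidence from the strongly detected jeans attribute.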


Conference on Multimedia Modeling | 2013

NII-UIT-VBS: A Video Browsing Tool for Known Item Search

Duy-Dinh Le; Vu Lam; Thanh Duc Ngo; Vinh Quang Tran; Vu Hoang Nguyen; Duc Anh Duong; Shin'ichi Satoh

This paper introduces a video browsing tool for the known item search task. The key idea is to reduce the number of segments to investigate further, for example by applying visual filters and skimming representative keyframes. The user interface is designed to minimize unnecessary navigation, and a coarse-to-fine approach is employed to quickly find the target clip.


Multimedia Signal Processing | 2008

A text segmentation based approach to video shot boundary detection

Duy-Dinh Le; Shin'ichi Satoh; Thanh Duc Ngo; Duc Anh Duong

Video shot boundary detection is one of the fundamental tasks in video indexing and retrieval applications. Although many methods have been proposed, finding a general and robust shot boundary method that can handle the various transition types caused by photo flashes, rapid camera movement, and object movement remains challenging. We present a novel approach that casts shot boundary detection as the text segmentation problem from natural language processing: each frame is treated as a word, and shot boundaries correspond to text segment boundaries (e.g. topics), so text segmentation methods from natural language processing can be applied directly. Experimental results on various long video sequences demonstrate the effectiveness of our approach.
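The frame-as-word analogy can be illustrated with a TextTiling-style boundary detector, a standard text segmentation technique (our simplification, not necessarily the paper's exact algorithm): compute the similarity between adjacent frame representations and declare a boundary wherever it dips below a threshold.

```python
# Sketch of shot boundary detection as text segmentation: a boundary is
# declared wherever adjacent-frame similarity falls below a threshold,
# like a topic boundary between dissimilar sentence blocks.
def boundaries(frame_feats, threshold, sim):
    cuts = []
    for i in range(1, len(frame_feats)):
        if sim(frame_feats[i - 1], frame_feats[i]) < threshold:
            cuts.append(i)
    return cuts

# toy one-number "histograms": a sharp change between frames 2 and 3
cuts = boundaries([1.0, 1.0, 0.1, 0.1], 0.5,
                  sim=lambda a, b: 1.0 - abs(a - b))
# cuts == [2]
```

Real systems would use color-histogram similarity per frame and a smoothed, depth-scored dip measure rather than a fixed threshold, but the segmentation analogy is the same.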


Soft Computing and Pattern Recognition | 2013

Re-ranking for person re-identification

Vu Hoang Nguyen; Thanh Duc Ngo; Khang M. T. T. Nguyen; Duc Anh Duong; Kien Nguyen; Duy-Dinh Le

Person re-identification aims at matching people across a network of non-overlapping cameras. When multiple probe people appear concurrently, a human could compare them together to produce a more accurate matching. Existing approaches, however, treat each probe person independently and discard this concurrence information. In this paper, we propose a re-ranking method that utilizes this information to refine the ranked lists produced by any person re-identification method, yielding more precise rankings. Experimental results on the VIPeR dataset show improved performance when our method is applied.
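One way to use concurrence information, sketched here as our own illustration rather than the paper's method: when several probes are ranked against the same gallery, demote a gallery person in one probe's list if a concurrent probe matches that person more strongly.

```python
# Illustrative concurrent-probe re-ranking: a gallery person's score for
# probe p is penalized when another concurrent probe "outbids" probe p
# for that same gallery person.
def rerank(score_matrix):
    """score_matrix[p][g]: similarity of probe p to gallery person g.
    Returns the re-ranked gallery indices for each probe."""
    ranked = []
    for p, row in enumerate(score_matrix):
        adjusted = []
        for g, s in enumerate(row):
            best_other = max((other[g] for q, other in enumerate(score_matrix)
                              if q != p), default=0.0)
            adjusted.append(s - max(0.0, best_other - s))  # penalty if outbid
        ranked.append(sorted(range(len(row)), key=lambda g: -adjusted[g]))
    return ranked

ranked = rerank([[0.6, 0.5], [0.9, 0.1]])
# probe 0's top match flips: probe 1 claims gallery person 0 more strongly
```

The mechanism is method-agnostic: it only post-processes score matrices, so it can sit on top of any underlying re-identification matcher, as the abstract describes.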


International Symposium on Multimedia | 2014

Integrating Spatial Information into Inverted Index for Large-Scale Image Retrieval

Bien-Van Nguyen; Duy Pham; Thanh Duc Ngo; Duy-Dinh Le; Duc Anh Duong

In recent years, large-scale image retrieval has shown remarkable potential in real-life applications. Because a searched database may contain thousands of images, inverted indexing is the basic technique for reducing retrieval time when images are represented by the Bag-of-Words model. However, one major limitation of both the standard inverted index and the Bag-of-Words model is that they ignore the spatial information of visual words in images, which can reduce retrieval accuracy. In this paper, we introduce an approach that integrates spatial information into the inverted index to improve accuracy while maintaining short retrieval time. Experiments conducted on several benchmark datasets (Oxford Building 5K, Paris 6K, and Oxford Building 5K+100K) demonstrate the effectiveness of the proposed approach.
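A minimal sketch of the general idea, under our own simplifications (the paper's actual encoding of spatial information may differ): each posting stores the visual word's position in the image, so candidate images can be scored by rough positional consistency during lookup instead of by word counts alone.

```python
from collections import defaultdict

# Sketch of an inverted index whose postings carry word positions, so
# a posting only votes for an image when the word appears near the
# query word's position (a crude spatial-consistency check).
class SpatialInvertedIndex:
    def __init__(self):
        self.postings = defaultdict(list)  # visual word -> [(image_id, (x, y))]

    def add(self, image_id, words_with_pos):
        for word, pos in words_with_pos:
            self.postings[word].append((image_id, pos))

    def query(self, words_with_pos, max_shift=20):
        votes = defaultdict(int)
        for word, (qx, qy) in words_with_pos:
            for image_id, (x, y) in self.postings[word]:
                if abs(x - qx) <= max_shift and abs(y - qy) <= max_shift:
                    votes[image_id] += 1
        return sorted(votes.items(), key=lambda kv: -kv[1])

index = SpatialInvertedIndex()
index.add("A", [(1, (10, 10)), (2, (40, 40))])
index.add("B", [(1, (200, 200))])
results = index.query([(1, (12, 11))])  # word 1 near (12, 11): only "A" matches
```

The spatial check happens inside the posting-list scan, so the retrieval cost stays close to that of a plain inverted index.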

Collaboration


Dive into Duc Anh Duong's collaboration.

Top Co-Authors

Duy-Dinh Le (National Institute of Informatics)
Shin'ichi Satoh (National Institute of Informatics)
Thanh Duc Ngo (Graduate University for Advanced Studies)
Vu Lam (Ho Chi Minh City University of Science)
Sang Phan (Graduate University for Advanced Studies)
Kien Nguyen (National Institute of Information and Communications Technology)
Chien-Quang Le (Graduate University for Advanced Studies)
Shigeki Yamada (National Institute of Informatics)
Yusheng Ji (National Institute of Informatics)
Bac Le (Information Technology University)