Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Tam V. Nguyen is active.

Publication


Featured research published by Tam V. Nguyen.


European Conference on Computer Vision | 2012

Depth matters: influence of depth cues on visual saliency

Congyan Lang; Tam V. Nguyen; Harish Katti; Karthik Yadati; Mohan S. Kankanhalli; Shuicheng Yan

Most previous studies on visual saliency have only focused on static or dynamic 2D scenes. Since the human visual system has evolved predominantly in natural three dimensional environments, it is important to study whether and how depth information influences visual saliency. In this work, we first collect a large human eye fixation database compiled from a pool of 600 2D-vs-3D image pairs viewed by 80 subjects, where the depth information is directly provided by the Kinect camera and the eye tracking data are captured in both 2D and 3D free-viewing experiments. We then analyze the major discrepancies between 2D and 3D human fixation data of the same scenes, which are further abstracted and modeled as novel depth priors. Finally, we evaluate the performances of state-of-the-art saliency detection models over 3D images, and propose solutions to enhance their performances by integrating the depth priors.
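The core idea, reweighting a 2D saliency map with a prior derived from the Kinect depth map, can be sketched as follows. This is a minimal illustration with hypothetical function names and a Gaussian "nearness" prior as an assumption, not the authors' released code:

```python
import numpy as np

def depth_prior(depth, sigma=0.4):
    """Toy depth prior: closer pixels (small normalized depth) get higher weight.
    The Gaussian falloff is an illustrative choice, not the paper's exact model."""
    d = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)  # normalize to [0, 1]
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

def depth_aware_saliency(saliency_2d, depth, alpha=0.5):
    """Blend a 2D saliency map with the depth prior (multiplicative-plus-linear mix)."""
    prior = depth_prior(depth)
    fused = alpha * saliency_2d * prior + (1 - alpha) * saliency_2d
    return fused / (fused.max() + 1e-8)

# usage with random stand-in maps
sal = np.random.rand(480, 640)
dep = np.random.rand(480, 640)
sal_3d = depth_aware_saliency(sal, dep)
```

The blending weight alpha and the prior shape would in practice be fit to the collected 3D fixation data rather than fixed by hand.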


ACM Multimedia | 2013

Static saliency vs. dynamic saliency: a comparative study

Tam V. Nguyen; Mengdi Xu; Guangyu Gao; Mohan S. Kankanhalli; Qi Tian; Shuicheng Yan

Visual saliency has recently attracted wide attention from researchers in the computer vision and multimedia fields. However, most visual saliency research has been conducted on still images, studying static saliency. In this paper, we present the first comprehensive comparative study of dynamic saliency (video shots) and static saliency (key frames of the corresponding video shots), and obtain two key observations: 1) video saliency is often different from, yet closely related to, image saliency, and 2) camera motions, such as tilting, panning, or zooming, affect dynamic saliency significantly. Motivated by these observations, we propose a novel camera-motion- and image-saliency-aware model for dynamic saliency prediction. Extensive experiments on two static-vs-dynamic saliency datasets collected by us show that our proposed method outperforms the state-of-the-art methods for dynamic saliency prediction. Finally, we also introduce an application of dynamic saliency prediction to dynamic video captioning, helping people with hearing impairments better enjoy videos with only off-screen voices, e.g., documentary films, news videos, and sports videos.
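As a rough illustration of the camera-motion-aware idea, the sketch below subtracts an estimated dominant (camera) motion from dense optical flow and blends the residual motion with a static saliency map. The median-flow camera estimate and the blending weight are assumptions for illustration, not the paper's model:

```python
import cv2
import numpy as np

def dynamic_saliency(prev_gray, cur_gray, static_saliency, beta=0.5):
    """Illustrative sketch: combine a static saliency map with residual motion
    after removing the dominant (camera) motion estimated as the median flow."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    camera = np.median(flow.reshape(-1, 2), axis=0)   # crude camera-motion estimate
    residual = np.linalg.norm(flow - camera, axis=2)  # object motion relative to camera
    motion = residual / (residual.max() + 1e-8)
    fused = beta * motion + (1 - beta) * static_saliency
    return fused / (fused.max() + 1e-8)

# usage with random stand-in frames
prev = (np.random.rand(120, 160) * 255).astype(np.uint8)
cur = (np.random.rand(120, 160) * 255).astype(np.uint8)
static = np.random.rand(120, 160)
print(dynamic_saliency(prev, cur, static).shape)
```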


IEEE Transactions on Circuits and Systems for Video Technology | 2015

STAP: Spatial-Temporal Attention-Aware Pooling for Action Recognition

Tam V. Nguyen; Zheng Song; Shuicheng Yan

Human action recognition is valuable for numerous practical applications, e.g., gaming, video surveillance, and video search. In this paper, we hypothesize that the classification of actions can be boosted by designing a smart feature pooling strategy under the prevalently used bag-of-words-based representation. Founded on automatic video saliency analysis, we propose the spatial-temporal attention-aware pooling scheme for feature pooling. First, video saliencies are predicted using the video saliency model, the localized spatial-temporal features are pooled at different saliency levels, and video-saliency-guided channels are formed. Saliency-aware matching kernels are then derived as the similarity measurement of these channels. Intuitively, the proposed kernels calculate the similarities of the video foreground (salient areas) or background (nonsalient areas) at different levels. Finally, the kernels are fed into popular support vector machines for action classification. Extensive experiments on three popular datasets for action classification validate the effectiveness of our proposed method, which outperforms the state-of-the-art methods: 95.3% on UCF Sports (a 4.0% improvement), 87.9% on the YouTube dataset (a 2.5% improvement), and comparable results on the Hollywood2 dataset.
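A minimal sketch of saliency-aware pooling, assuming a bag-of-words representation, three saliency levels, and a histogram-intersection kernel. The thresholds and kernel choice are illustrative assumptions, not the paper's exact design:

```python
import numpy as np
from sklearn.svm import SVC

def saliency_pooled_histogram(codes, saliency, n_words,
                              levels=(0.0, 0.33, 0.66, 1.01)):
    """Pool visual-word assignments into one bag-of-words histogram per saliency level."""
    channels = []
    for lo, hi in zip(levels[:-1], levels[1:]):
        mask = (saliency >= lo) & (saliency < hi)
        hist = np.bincount(codes[mask], minlength=n_words).astype(float)
        channels.append(hist / (hist.sum() + 1e-8))
    return np.concatenate(channels)          # one vector per video

def intersection_kernel(X, Y):
    """Histogram-intersection similarity between pooled channel vectors."""
    return np.array([[np.minimum(x, y).sum() for y in Y] for x in X])

# usage with random stand-in data: 20 videos, 100-word codebook, 500 local features each
rng = np.random.default_rng(0)
feats = np.stack([saliency_pooled_histogram(rng.integers(0, 100, 500),
                                            rng.random(500), 100) for _ in range(20)])
labels = rng.integers(0, 2, 20)
clf = SVC(kernel=intersection_kernel).fit(feats, labels)
```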


ACM Multimedia | 2012

Sense beauty via face, dressing, and/or voice

Tam V. Nguyen; Si Liu; Bingbing Ni; Jun Tan; Yong Rui; Shuicheng Yan

Discovering the secret of beauty has been the pursuit of artists and philosophers for centuries. Nowadays, computational models for beauty estimation are being actively explored in the computer science community, yet with the focus mainly on facial features. In this work, we perform a comprehensive study of female attractiveness conveyed by single/multiple modalities of cues, i.e., face, dressing, and/or voice, and aim to uncover how different modalities individually and collectively affect the human sense of beauty. To this end, we collect the first Multi-Modality Beauty (M2B) dataset for the study of female attractiveness, which is thoroughly annotated with attractiveness levels converted from manual k-wise ratings and semantic attributes of different modalities. A novel Dual-supervised Feature-Attribute-Task (DFAT) network is proposed to jointly learn the beauty estimation models of single/multiple modalities as well as the attribute estimation models. The DFAT network differentiates itself by its supervision in both attribute and task layers. Several interesting beauty-sense observations over single/multiple modalities are reported, and the extensive experimental evaluations on the collected M2B dataset well demonstrate the effectiveness of the proposed DFAT network for female attractiveness estimation.
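The dual-supervision idea can be sketched with a tiny PyTorch model in which a shared feature layer feeds both a supervised attribute head and an attractiveness head, and the two losses are summed. Layer sizes and loss choices here are assumptions, not the published architecture:

```python
import torch
import torch.nn as nn

class DualSupervisedNet(nn.Module):
    """Minimal sketch in the spirit of DFAT: supervision at both the attribute
    layer and the task (attractiveness) layer."""
    def __init__(self, in_dim, n_attributes):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.attr_head = nn.Linear(256, n_attributes)       # supervised attribute layer
        self.task_head = nn.Linear(256 + n_attributes, 1)   # attractiveness score

    def forward(self, x):
        f = self.feature(x)
        attrs = torch.sigmoid(self.attr_head(f))
        score = self.task_head(torch.cat([f, attrs], dim=1))
        return score.squeeze(1), attrs

def dfat_loss(score, attrs, y_score, y_attrs, lam=0.5):
    """Joint objective: attractiveness regression plus attribute supervision."""
    return nn.functional.mse_loss(score, y_score) + \
           lam * nn.functional.binary_cross_entropy(attrs, y_attrs)
```

In the multi-modality setting, one such branch per modality (face, dressing, voice) would be learned and their features fused before the task layer.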


Wireless and Optical Communications Networks | 2008

Network traffic anomalies detection and identification with flow monitoring

Huy A. Nguyen; Tam V. Nguyen; Dong Il Kim; Deokjai Choi

Network management and security is currently one of the most active research areas, within which detecting and identifying anomalies has attracted considerable interest. Researchers are still searching for an effective and lightweight method for anomaly detection. In this paper, we propose a simple, robust method that detects anomalous network traffic based on flow monitoring. Our method monitors four predefined metrics that capture the flow statistics of the network. To demonstrate the power of the method, we built an application that detects network anomalies using it. The experimental results show that, using these four simple metrics from the flow data, we can not only effectively detect but also identify network traffic anomalies.
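A hedged sketch of the flow-monitoring idea: per-window flow statistics are compared against a baseline learned from anomaly-free traffic, and a window is flagged when any metric deviates strongly. The four metric names below are placeholders; the paper's exact metrics are not reproduced here:

```python
import numpy as np

# Hypothetical flow metrics per time window (placeholders, not the paper's four metrics)
METRICS = ["flows", "pkts_per_flow", "bytes_per_flow", "distinct_dst_ports"]

def fit_baseline(history):
    """history: array of shape (n_windows, 4) collected from anomaly-free traffic."""
    return history.mean(axis=0), history.std(axis=0) + 1e-8

def detect(window, mean, std, k=3.0):
    """Flag the window if any metric deviates more than k standard deviations,
    and report which metrics fired (a coarse hint for identifying the anomaly type)."""
    z = np.abs((window - mean) / std)
    fired = [m for m, zi in zip(METRICS, z) if zi > k]
    return bool(fired), fired

# usage with synthetic stand-in data
baseline = np.random.normal([1000, 12, 800, 50], [50, 2, 60, 5], size=(100, 4))
mean, std = fit_baseline(baseline)
print(detect(np.array([1600, 12, 820, 300]), mean, std))
```

Which metrics fire together is what allows a coarse identification of the anomaly class (e.g., a flow-count spike with many distinct destination ports suggests scanning).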


IEEE Transactions on Multimedia | 2013

Image Re-Attentionizing

Tam V. Nguyen; Bingbing Ni; Hairong Liu; Wei Xia; Jiebo Luo; Mohan S. Kankanhalli; Shuicheng Yan

In this paper, we propose a computational framework, called Image Re-Attentionizing, to endow the target region in an image with the ability to attract human visual attention. In particular, the objective is to recolor the target patches by color transfer with naturalness and smoothness preserved yet visual attention augmented. We propose to approach this objective within the Markov Random Field (MRF) framework, and an extended graph cuts method is developed to pursue the solution. The input image is first over-segmented into patches, and the patches within the target region as well as their neighbors are used to construct the consistency graphs. Within the MRF framework, the unitary potentials are defined to encourage each target patch to match the patches with similar shapes and textures from a large salient patch database, each of which corresponds to a high-saliency region in one image, while the spatial and color coherence is reinforced as pairwise potentials. We evaluate the proposed method on direct human fixation data. The results demonstrate that the target region(s) successfully attract human attention while both spatial and color coherence are well preserved.
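The energy-minimization step can be illustrated as follows. The paper uses an extended graph cut; this sketch substitutes a simple iterated-conditional-modes (ICM) pass with a Potts pairwise term purely for illustration:

```python
import numpy as np

def mrf_labeling(unary, edges, pairwise_weight=1.0, iters=10):
    """Minimal MRF sketch: unary[p, l] is the cost of assigning salient exemplar
    color l to patch p; edges are (p, q) neighbor pairs. A simple ICM pass stands
    in here for the paper's extended graph cut."""
    n_patches, n_labels = unary.shape
    labels = unary.argmin(axis=1)                     # initialize by the unary term
    for _ in range(iters):
        for p in range(n_patches):
            cost = unary[p].copy()
            for (a, b) in edges:
                q = b if a == p else a if b == p else None
                if q is not None:                     # Potts-style pairwise penalty
                    cost += pairwise_weight * (np.arange(n_labels) != labels[q])
            labels[p] = cost.argmin()
    return labels

# usage: 5 patches, 3 candidate exemplar colors, a chain of neighbors
unary = np.random.rand(5, 3)
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(mrf_labeling(unary, edges))
```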


IEEE Transactions on Circuits and Systems for Video Technology | 2014

Audio Matters in Visual Attention

Yanxiang Chen; Tam V. Nguyen; Mohan S. Kankanhalli; Jun Yuan; Shuicheng Yan; Meng Wang

There is a dearth of information on how perceived auditory information guides image-viewing behavior. To investigate auditory-driven visual attention, we first generated a human eye-fixation database from a pool of 200 static images and 400 image-audio pairs viewed by 48 subjects. The eye tracking data for the image-audio pairs were captured while participants viewed images immediately after exposure to coherent/incoherent audio samples. The database was analyzed in terms of time to first fixation, fixation durations on the target object, entropy, AUC, and saliency ratio. It was found that coherent audio information is an important cue for enhancing the feature-specific response to the target object, whereas incoherent audio information attenuates this response. Finally, a system predicting image-viewing behavior under the influence of different audio sources was developed. The top-down module of the system, discussed in detail, is composed of auditory estimation based on a Gaussian mixture model-maximum a posteriori-universal background model (GMM-MAP-UBM) structure, as well as visual estimation based on a conditional random field model and sparse latent variables. The evaluation experiments show that the proposed models in the system exhibit strong consistency with eye fixations.
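The auditory-estimation component rests on GMM-MAP-UBM modeling. A simplified mean-only MAP adaptation against a universal background model can be sketched as follows; the relevance factor, feature dimensionality, and component count are assumptions, not the authors' configuration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def map_adapt_means(ubm, feats, relevance=16.0):
    """Simplified mean-only MAP adaptation of a GMM-UBM: component means move
    toward the target data in proportion to the soft counts they accumulate."""
    resp = ubm.predict_proba(feats)                  # (n_frames, n_components)
    n_c = resp.sum(axis=0) + 1e-8
    x_bar = resp.T @ feats / n_c[:, None]            # per-component data mean
    alpha = (n_c / (n_c + relevance))[:, None]
    return alpha * x_bar + (1 - alpha) * ubm.means_

# usage with random stand-in MFCC-like features
rng = np.random.default_rng(0)
background = rng.normal(size=(5000, 13))             # UBM training frames
ubm = GaussianMixture(n_components=8, covariance_type="diag").fit(background)
target = rng.normal(loc=0.3, size=(300, 13))         # frames of one audio class
adapted_means = map_adapt_means(ubm, target)
```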


ACM Transactions on Multimedia Computing, Communications, and Applications | 2013

Towards decrypting attractiveness via multi-modality cues

Tam V. Nguyen; Si Liu; Bingbing Ni; Jun Tan; Yong Rui; Shuicheng Yan

Decrypting the secret of beauty or attractiveness has been the pursuit of artists and philosophers for centuries. To date, computational models for attractiveness estimation have been actively explored in the computer vision and multimedia communities, yet with the focus mainly on facial features. In this article, we conduct a comprehensive study on female attractiveness conveyed by single/multiple modalities of cues, that is, face, dressing, and/or voice, and aim to discover how different modalities individually and collectively affect the human sense of beauty. To investigate the problem extensively, we collect the Multi-Modality Beauty (M2B) dataset, which is annotated with attractiveness levels converted from manual k-wise ratings and semantic attributes of different modalities. Inspired by the common consensus that middle-level attribute prediction can assist higher-level computer vision tasks, we manually labeled many attributes for each modality. Next, a tri-layer Dual-supervised Feature-Attribute-Task (DFAT) network is proposed to jointly learn the attribute model and attractiveness model of single/multiple modalities. To remedy possible loss of information caused by incomplete manual attributes, we also propose a novel Latent Dual-supervised Feature-Attribute-Task (LDFAT) network, where latent attributes are combined with manual attributes to contribute to the final attractiveness estimation. The extensive experimental evaluations on the collected M2B dataset well demonstrate the effectiveness of the proposed DFAT and LDFAT networks for female attractiveness prediction.
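The latent-attribute extension can be sketched by adding unsupervised latent units next to the supervised manual-attribute units; only the manual units carry an attribute loss. Sizes and activations are illustrative assumptions, not the published LDFAT architecture:

```python
import torch
import torch.nn as nn

class LatentDualSupervisedNet(nn.Module):
    """Sketch of the latent-attribute idea: manual attribute units are supervised,
    extra latent units are learned freely, and both feed the attractiveness head."""
    def __init__(self, in_dim, n_manual, n_latent):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.manual_head = nn.Linear(256, n_manual)     # supervised attributes
        self.latent_head = nn.Linear(256, n_latent)     # unsupervised latent attributes
        self.task_head = nn.Linear(n_manual + n_latent, 1)

    def forward(self, x):
        f = self.feature(x)
        manual = torch.sigmoid(self.manual_head(f))
        latent = torch.relu(self.latent_head(f))
        score = self.task_head(torch.cat([manual, latent], dim=1)).squeeze(1)
        return score, manual   # only `manual` is compared against labeled attributes
```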


IEEE Transactions on Circuits and Systems for Video Technology | 2015

Adaptive Nonparametric Image Parsing

Tam V. Nguyen; Canyi Lu; Jose Sepulveda; Shuicheng Yan

In this paper, we present an adaptive nonparametric solution to the image parsing task, namely, annotating each image pixel with its corresponding category label. For a given test image, a locality-aware retrieval set is first extracted from the training data based on superpixel matching similarities, which are augmented with feature extraction for better differentiation of local superpixels. Then, the category of each superpixel is initialized by the majority vote of the k-nearest-neighbor superpixels in the retrieval set. Instead of fixing k as in traditional nonparametric approaches, we propose a novel adaptive nonparametric approach that determines a sample-specific k for each test image. In particular, k is adaptively set to the smallest number of nearest superpixels with which the images in the retrieval set obtain the best category prediction. Finally, the initial superpixel labels are further refined by contextual smoothing. Extensive experiments on challenging datasets demonstrate the superiority of the new solution over other state-of-the-art nonparametric solutions.
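A minimal sketch of the adaptive-k idea: k is chosen per test image by checking which candidate k best predicts the retrieval set's own superpixel labels, and is then used for majority voting on the test superpixels. The leave-one-out selection below is an illustrative proxy, not the paper's exact criterion:

```python
import numpy as np
from collections import Counter

def majority_label(neighbor_labels):
    return Counter(neighbor_labels).most_common(1)[0][0]

def choose_k(retrieval_feats, retrieval_labels, candidate_ks=(1, 3, 5, 9, 15)):
    """Pick the k that best predicts the retrieval-set superpixels' own labels
    (leave-one-out proxy for the paper's sample-specific k)."""
    best_k, best_acc = candidate_ks[0], -1.0
    dists = np.linalg.norm(retrieval_feats[:, None] - retrieval_feats[None], axis=2)
    np.fill_diagonal(dists, np.inf)                      # exclude self-matches
    order = dists.argsort(axis=1)
    for k in candidate_ks:
        preds = [majority_label(retrieval_labels[order[i, :k]])
                 for i in range(len(dists))]
        acc = np.mean(np.array(preds) == retrieval_labels)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k

def parse_superpixels(test_feats, retrieval_feats, retrieval_labels):
    """Label each test superpixel by majority vote of its k nearest retrieval superpixels."""
    k = choose_k(retrieval_feats, retrieval_labels)
    labels = []
    for f in test_feats:
        d = np.linalg.norm(retrieval_feats - f, axis=1)
        labels.append(majority_label(retrieval_labels[d.argsort()[:k]]))
    return np.array(labels)
```

The contextual smoothing step mentioned in the abstract would then refine these initial labels, e.g., with a pairwise smoothness term over neighboring superpixels.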


Archive | 2013

Fuzzy Online Reputation Analysis Framework

Edy Portmann; Tam V. Nguyen; Jose Sepulveda; Adrian David Cheok

In the Social Semantic Web, an organization, a brand, the name of a high-profile executive, or a particular product can be defined as the hodgepodge of all online conversations taking place around it, and this happens regardless of whether the organization participates in the conversationscape's dialogue. In short, organizations are forced to listen to the Social Web in order to take part in, and thereby improve, their online reputation. To support this intuitively, the FORA framework is conceptualized as a pertinent listening application. The term FORA originates from the plural form of forum, the Latin word for marketplace (Portmann, Nguyen, Sepulveda, & Cheok, 2012). The framework thus allows organizations' communication operatives a fuzzy exploration of reputation in online marketplaces. Listening to, and then increasing engagement within, social media is a hard task: there is a constant flow of information, and many organizations do not know how to harness and gain actionable insights from this rich source of customer conversations. The idea behind the framework is to listen and, in doing so, automatically identify key social media elements around the clock, simplifying online reputation analysis and giving communication operatives insightful information on which they can act. To make this system a reality, a design science approach is pursued.

Collaboration


Dive into Tam V. Nguyen's collaborations.

Top Co-Authors

Deokjai Choi, Chonnam National University
Shuicheng Yan, National University of Singapore
Mohan S. Kankanhalli, National University of Singapore
Huy A. Nguyen, Chonnam National University
Wontaek Lim, Chonnam National University
Bingbing Ni, Shanghai Jiao Tong University
Luoqi Liu, National University of Singapore
Ashkan Yousefpour, University of Texas at Dallas
Bilal Mirza, University of California