
Dinh Viet Sang

Hanoi University of Science and Technology

Publication

Featured research published by Dinh Viet Sang.

international symposium on information and communication technology | 2015

An Efficient Framework for Pixel-wise Building Segmentation from Aerial Images

Nguyen Tien Quang; Nguyen Thi Thuy; Dinh Viet Sang; Huynh Thi Thanh Binh

Detection of buildings in aerial images is an important and challenging task in computer vision and aerial image interpretation. This paper presents an efficient approach that combines a random forest (RF) and a fully connected conditional random field (CRF) on various features for the detection and segmentation of buildings at pixel level. RF allows extremely fast learning on large aerial image data. The unary potentials given by RF are then combined in a fully connected conditional random field model for pixel-wise classification. The use of a high-dimensional Gaussian filter for the pairwise potentials makes inference tractable while obtaining high classification accuracy. Experiments have been conducted on a challenging aerial image dataset from a recent ISPRS Semantic Labeling Contest [9]. We obtained state-of-the-art accuracy with a reasonable computation time.

international symposium on information and communication technology | 2015

A Denoising Method Based on Total Variation

Thanh N. H. Dang; Dvoenko Sergey; Dinh Viet Sang

Today, large numbers of digital images are created by various modern devices such as digital cameras, X-ray scanners, and so on. Noise degrades image quality and the results of subsequent processing. Biomedical images, for example, contain a combination of two types of noise: Gaussian noise and Poisson noise. In this paper, we propose a method to remove this noise. The method is based on the total variation of the image intensity (brightness) function. We combine two well-known denoising models to remove this combination of noise types.
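As a rough illustration of the quantity this abstract builds on, the sketch below computes a discrete total variation of a small grayscale image. The function name `total_variation`, the anisotropic discretization, and the toy pixel values are assumptions made for illustration; the abstract does not state which discretization or denoising models the paper actually uses.

```python
def total_variation(image):
    """Anisotropic discrete total variation of a 2-D grayscale image.

    `image` is a list of equal-length rows of numbers. The result is the
    sum of absolute differences between vertical and horizontal neighbors.
    """
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:  # vertical neighbor
                tv += abs(image[i + 1][j] - image[i][j])
            if j + 1 < cols:  # horizontal neighbor
                tv += abs(image[i][j + 1] - image[i][j])
    return tv


# Hypothetical toy data: a flat patch and the same patch with small perturbations.
clean = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
noisy = [[10, 12, 10], [9, 10, 11], [10, 8, 10]]

print(total_variation(clean), total_variation(noisy))
```

A flat patch has zero total variation, while the perturbed patch has a larger value, which is why total-variation models penalize this quantity when removing noise.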

knowledge and systems engineering | 2017

Facial expression recognition using deep convolutional neural networks

Dinh Viet Sang; Nguyen Van Dat; Do Phan Thuan

Facial expressions convey non-verbal information between humans in face-to-face interactions. Automatic facial expression recognition, which plays a vital role in human-machine interfaces, has attracted increasing attention from researchers since the early nineties. Classical machine learning approaches often require a complex feature extraction process and produce poor results. In this paper, we apply recent advances in deep learning to propose effective deep Convolutional Neural Networks (CNNs) that can accurately interpret the semantic information available in faces in an automated manner, without hand-designed feature descriptors. We also apply different loss functions and training tricks in order to learn CNNs with strong classification power. The experimental results show that our proposed networks outperform state-of-the-art methods on the well-known FERC-2013 dataset provided in the Kaggle facial expression recognition competition. Compared with the winning model of this competition, our proposed networks have far fewer parameters, which speeds up inference and makes them well suited to real-time systems.

knowledge and systems engineering | 2017

Facial smile detection using convolutional neural networks

Dinh Viet Sang; Le Tran Bao Cuong; Do Phan Thuan

Facial expression analysis plays a key role in analyzing emotions and human behaviors. Smile detection is a special task in facial expression analysis with various potential applications such as photo selection, user experience analysis, smiling payment and patient monitoring. Conventional approaches often extract low-level face descriptors and detect smiles with a strong binary classifier. In this paper, we propose an effective Convolutional Neural Network (CNN) architecture that detects smiles in real time with high accuracy. The experimental results show that our proposed network outperforms recent state-of-the-art methods.

symposium on information and communication technology | 2016

Label associated dictionary pair learning for face recognition

Dao Duy Son; Dinh Viet Sang; Huynh Thi Thanh Binh; Nguyen Thi Thuy

Dictionary learning (DL) has been successfully applied to various pattern classification tasks. Sparse coding has played a vital role in the success of such DL-based models. However, the popular sparsity constraints using the l0 or l1-norm often make the training phase time-consuming. Recently, an emerging trend of using the l2-norm has shown advantages in both accuracy and computational speed. However, supervised approaches that exploit label information in the training phase have not been investigated for such l2-norm based methods. In this paper, we propose a novel supervised dictionary learning method that incorporates label information in the objective function. Based on that, we also propose an effective classification scheme. Experiments on three popular face recognition datasets show that our method gives promising results. In particular, our method is extremely fast in the test phase while maintaining competitive accuracy in comparison with other state-of-the-art models.
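One reason l2-norm coding can be fast is that, for a fixed dictionary D, minimizing ||y - Dx||^2 + lam * ||x||^2 has the closed-form solution x = (D^T D + lam * I)^(-1) D^T y, so no iterative sparse solver is needed. The sketch below works this out for a hypothetical two-atom dictionary. The function `ridge_code`, the two-atom restriction, and the toy values are assumptions for illustration only; the paper's own objective, which also includes label information, is not given in the abstract.

```python
def ridge_code(atoms, signal, lam):
    """Code `signal` over exactly two dictionary atoms with an l2 penalty.

    Solves (D^T D + lam * I) x = D^T y in closed form for the 2x2 case,
    where the columns of D are the two atoms and y is the signal.
    """
    d1, d2 = atoms

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Entries of the regularized Gram matrix [[a, b], [b, c]].
    a = dot(d1, d1) + lam
    b = dot(d1, d2)
    c = dot(d2, d2) + lam
    # Right-hand side D^T y.
    r1, r2 = dot(d1, signal), dot(d2, signal)
    det = a * c - b * b  # strictly positive whenever lam > 0
    return [(c * r1 - b * r2) / det, (a * r2 - b * r1) / det]


# Hypothetical example: orthonormal atoms, so the code is just y / (1 + lam).
code = ridge_code([[1.0, 0.0], [0.0, 1.0]], [2.0, 4.0], lam=1.0)
print(code)
```

With orthonormal atoms and lam = 1, each coefficient is the corresponding signal entry divided by 2, which makes the shrinkage effect of the l2 penalty easy to see.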

international symposium on information and communication technology | 2017

Multi-task learning for smile detection, emotion recognition and gender classification

Dinh Viet Sang; Le Tran Bao Cuong; Vu Van Thieu

Facial expression analysis plays a key role in analyzing emotions and human behaviors. Smile detection, emotion recognition and gender classification are special tasks in facial expression analysis with various potential applications. In this paper, we propose an effective Convolutional Neural Network (CNN) architecture that can jointly learn representations for three tasks: smile detection, emotion recognition and gender classification. In addition, this model can be trained from multiple sources of data with different kinds of task-specific class labels. Extensive experiments show that our model achieves superior accuracy over recent state-of-the-art techniques on all three tasks on popular benchmarks. We also show that joint learning helps tasks with less data benefit considerably from tasks with richer data.
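Training one model from datasets that each carry labels for only some tasks is commonly handled by summing the loss over just the tasks a sample is labeled for. The sketch below shows that idea with plain Python. The names `cross_entropy` and `multitask_loss`, the task keys, and the probabilities are hypothetical, and the abstract does not say whether the paper combines its task losses in this way.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


def multitask_loss(predictions, labels):
    """Sum cross-entropy over the tasks for which this sample has a label.

    `predictions` maps task name -> list of class probabilities.
    `labels` maps task name -> true class index; unlabeled tasks are absent.
    """
    total = 0.0
    for task, probs in predictions.items():
        label = labels.get(task)
        if label is not None:  # skip tasks with no label for this sample
            total += cross_entropy(probs, label)
    return total


# Hypothetical sample drawn from a smile-only dataset.
predictions = {
    "smile": [0.2, 0.8],
    "emotion": [0.1, 0.7, 0.2],
    "gender": [0.5, 0.5],
}
smile_only = {"smile": 1}
print(multitask_loss(predictions, smile_only))
```

Because unlabeled tasks contribute nothing to the loss, a sample from a smile-only dataset updates the shared layers and the smile head without pushing the other heads toward arbitrary targets.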

symposium on information and communication technology | 2016

Colour image denoising based on a combined model

Dang N. H. Thanh; Dinh Viet Sang; Dvoenko Sergey

In this paper, we propose a method to remove noise in RGB color images. The method is based on the total variation of the image intensity function. It extends our previous method for removing a linear combination of Gaussian and Poisson noise in grayscale images. The method works well across a wide range of proportions of Poisson and Gaussian noise. We show that the extended version can closely approximate real noise in color raster images.

knowledge and systems engineering | 2015

Uniform Detection in Social Image Streams

Nguyen Quang Manh; Nguyen Duc Tuan; Dinh Viet Sang; Huynh Thi Thanh Binh; Nguyen Thi Thuy

Social media mining on the Internet has become an emerging research topic. The problem is challenging because of the massive volume of content from various sources, especially image data from user uploads. In recent years, dictionary learning based image classification has been widely studied and has achieved significant success. In this paper, we propose a framework for automatic detection of uniforms of interest in image streams from social networks. The system is composed of a powerful feature extraction module based on dense SIFT features and a state-of-the-art discriminative dictionary learning approach. In addition, a parallel implementation of feature extraction is deployed to make the system run in real time. An extensive set of experiments has been conducted on four real-life datasets. The experimental results show that we can obtain a detection rate of up to 100% on some datasets. We also achieve real-time performance, processing image streams at about 40 images per second. The framework can be applied to emerging applications such as uniform detection, automated image tagging, content-based image retrieval or online advertisement based on image content.

knowledge and systems engineering | 2015

A Study on Non-sparse Dictionary Learning for Pattern Classification

Nguyen Duc Tuan; Nguyen Quang Manh; Dinh Viet Sang; Huynh Thi Thanh Binh; Nguyen Thi Thuy

The dictionary learning (DL) approach has been successfully applied to many pattern classification problems. Sparsity has played an important role in the success of DL-based classification models. However, the sparsity constraints make the learning problem expensive. Recently, there has been an emerging trend of relaxing the sparsity constraints by using an L2-norm constraint. The new approach has shown advantages in both accuracy and classification time. However, the relationship between the quality of the data and the dictionary learning issues that affect the performance of the system has not been investigated. In this paper, we present a comparative study on non-sparse coding dictionary learning for pattern classification. We then propose a dictionary learning model with a non-sparsity constraint on the representation coefficients using the L2-norm. Our experimental results on three popular benchmark datasets for image classification show that our proposed model can outperform state-of-the-art models and is a promising approach for dictionary learning based classification.

symposium on information and communication technology | 2014

Improving semantic texton forests with a Markov random field for image segmentation

Dinh Viet Sang; Mai Dinh Loi; Nguyen Tien Quang; Huynh Thi Thanh Binh; Nguyen Thi Thuy

Semantic image segmentation is a major and challenging problem in computer vision that has been widely researched for decades. Recent approaches attempt to exploit contextual information at different levels to improve the segmentation results. In this paper, we propose a new approach that combines semantic texton forests (STFs) and Markov random fields (MRFs) to improve segmentation. STFs allow fast computation of texton codebooks for powerful low-level image feature description. MRFs, with the most effective algorithm in message passing for training, smooth out the segmentation results of STFs using pairwise coherence information between neighboring pixels. We evaluate the performance of the proposed method on two well-known benchmark datasets: the 21-class MSRC dataset and the VOC 2007 dataset. The experimental results show that our method substantially improves the segmentation results of STFs. In particular, our method correctly recognizes many challenging image regions that STFs fail to label.
