Publication


Featured research published by Vishnu Monn Baskaran.


Journal of Network and Computer Applications | 2013

Software-based serverless endpoint video combiner architecture for high-definition multiparty video conferencing

Vishnu Monn Baskaran; Yoong Choon Chang; KokSheik Wong

This paper proposes an endpoint video combiner architecture in a multipoint control unit (MCU) system for high-definition multiparty video conferencing. The proposed architecture addresses the reliability, computational, and quality drawbacks of a conventional centralized video combiner architecture. This is achieved by redesigning the MCU video bridge to move the video combiner away from the bridge and into the client endpoints. The proposed architecture therefore represents a serverless system and is able to scale to a large number of clients at high resolutions in a multipoint video conferencing session. To realize this design, the paper also proposes a robust custom session management protocol that supports dynamic multi-port management between the MCU video bridge and the client endpoints, and includes a recommendation for a session protection structure. Experimental results suggest that the proposed architecture exhibits computational frame rate gains of up to 762.95% over the conventional centralized video combiner architecture, based on a series of four- and eight-party high-definition combined-video assessments. Reliability analysis further suggests that the proposed architecture consistently sustains a high frame rate throughout a long-duration high-definition multipoint video conferencing session.
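The abstract does not detail the custom session management protocol, so the following is only a minimal sketch of how dynamic multi-port management between an MCU bridge and client endpoints might look; the message types, fields, and port-allocation policy are hypothetical and are not taken from the paper.

```python
# Hypothetical session-management exchange between an MCU bridge and endpoints.
# Message types, fields, and the port-allocation policy are illustrative only.
from dataclasses import dataclass, field
from itertools import count

@dataclass
class SessionMessage:
    msg_type: str          # e.g. "JOIN", "PORT_ASSIGN", "LEAVE", "ACK"
    client_id: int
    payload: dict = field(default_factory=dict)

class McuSessionManager:
    """Tracks endpoints and hands each one a dedicated RTP/RTCP port pair."""
    def __init__(self, base_port: int = 50000):
        self._ports = count(base_port, step=2)   # RTP on even port, RTCP on odd
        self.clients = {}

    def handle(self, msg: SessionMessage) -> SessionMessage:
        if msg.msg_type == "JOIN":
            rtp_port = next(self._ports)
            self.clients[msg.client_id] = rtp_port
            # Endpoints combine video locally, so the bridge only forwards streams.
            return SessionMessage("PORT_ASSIGN", msg.client_id,
                                  {"rtp": rtp_port, "rtcp": rtp_port + 1})
        if msg.msg_type == "LEAVE":
            self.clients.pop(msg.client_id, None)
        return SessionMessage("ACK", msg.client_id)

mcu = McuSessionManager()
print(mcu.handle(SessionMessage("JOIN", client_id=1)))
```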


IEEE Conference on Open Systems | 2013

Active participant identification and tracking using depth sensing technology for video conferencing

Yee-Hui Oh; Cheng Yew Tan; Vishnu Monn Baskaran

Video conferencing represents an effective method of point-to-point or multipoint real-time communication between two or more participants. However, persistent manual adjustment of the video capture device to keep an active participant in focus is a challenge, especially when the participant moves out of the video capture window. This paper therefore proposes an active-participant identification and tracking system, which continuously tracks and automatically adjusts the video capture device to maintain focus on the active conference participant. The proposed system first applies a Haar-cascade face detection algorithm to register and store a set of facial images of the active participant. Leveraging the depth-sensing technology of the Microsoft Kinect, the system then captures skeletal head-position images of participants within the Kinect camera viewpoint and compares them against the stored face images using the principal component analysis (PCA) face recognition algorithm. The recognized user is then continuously tracked as a skeletal object via a custom-designed vertical and horizontal servo-controlled motorized system. The custom motorized system sits under the Kinect sensor and achieves 180 degrees of horizontal panning and 22.7 degrees of vertical tilting to track the movement of the active conference participant.
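As a rough illustration of the detection and recognition stage only, here is a sketch combining OpenCV's Haar-cascade face detector with a simple PCA (eigenface) match in NumPy. The Kinect skeletal tracking and servo control described in the paper are omitted, and the image size, number of components, and match threshold are placeholder assumptions.

```python
# Sketch of Haar-cascade detection + PCA (eigenface) matching for the active
# participant. Kinect skeletal tracking and servo control are omitted; the
# image size, component count, and threshold below are placeholder values.
import cv2
import numpy as np

CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
SIZE = (64, 64)

def detect_face(gray):
    """Return the first detected face as a flattened, resized vector, or None."""
    faces = CASCADE.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    return cv2.resize(gray[y:y + h, x:x + w], SIZE).flatten().astype(np.float64)

def train_pca(face_vectors, n_components=10):
    """Build an eigenface basis from the registered images of the participant."""
    X = np.stack(face_vectors)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = vt[:n_components]                   # eigenfaces
    return mean, basis, (X - mean) @ basis.T    # projected training set

def is_active_participant(face_vec, mean, basis, train_proj, threshold=2500.0):
    """Accept the face if its nearest projection distance is under the threshold."""
    proj = (face_vec - mean) @ basis.T
    return np.linalg.norm(train_proj - proj, axis=1).min() < threshold
```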


Asia Pacific Conference on Circuits and Systems | 2010

Audio mixer with Automatic Gain Controller for software based Multipoint Control Unit

Vishnu Monn Baskaran; KokSheik Wong

This paper proposes two audio mixing algorithms for a software-based Multipoint Control Unit (MCU) using an Automatic Gain Controller (AGC). The objectives of the proposed algorithms include performing selective mixing, minimizing audio clipping, assigning higher amplitude priority to the loudest speaker, and ensuring a smooth transition as the primary speaker changes from one participant to another. The proposed algorithms suppress unexpected signal spikes by considering previously computed gain value(s) when calculating the current gain. They are also able to mix multiple VoIP packets for up to eight conference participants within 150 µs, which is well below the default RTP audio packet timeframe of 20 ms. The proposed mixing algorithms are integrated with the audio decoder and encoder modules to complete the MCU system. Experiments were carried out to verify the basic performance of the proposed mixing algorithms.
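To illustrate the gain-smoothing idea (the current gain depends on previously computed gains so that spikes are suppressed and clipping is avoided), here is a minimal sketch for one 20 ms frame of 16-bit PCM. The smoothing factor, target level, and frame size are assumptions for illustration and are not the coefficients used in the paper.

```python
# AGC-style mixing of one audio frame from several participants.
# The smoothing factor and target level are illustrative assumptions.
import numpy as np

def mix_frame(frames, prev_gain, smooth=0.9, target=0.6 * 32767):
    """frames: list of int16 NumPy arrays, one per selected participant."""
    mixed = np.sum([f.astype(np.int32) for f in frames], axis=0)
    peak = np.abs(mixed).max() or 1
    inst_gain = min(1.0, target / peak)                    # avoid clipping
    gain = smooth * prev_gain + (1 - smooth) * inst_gain   # use previous gain(s)
    out = np.clip(mixed * gain, -32768, 32767).astype(np.int16)
    return out, gain

# 20 ms at 8 kHz = 160 samples per RTP audio packet
rng = np.random.default_rng(0)
frames = [rng.integers(-20000, 20000, 160, dtype=np.int16) for _ in range(8)]
out, g = mix_frame(frames, prev_gain=1.0)
```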


Asia Pacific Conference on Circuits and Systems | 2010

Building a real-time multiple H.264 video streaming system based on Intel IPP

Vishnu Monn Baskaran; YeongSheng Low; KokSheik Wong

This paper describes the building blocks required to develop a software-based real-time multiple H.264 video streaming system. Video capture is performed using Microsoft® DirectShow, video coding is based on the Intel® Integrated Performance Primitives (IPP) H.264 codec library, and the Real-time Transport Protocol (RTP) is used to transport the compressed video packets. These modules are implemented and combined into a system that is able to stream and receive four simultaneous CIF video streams or one 720p high-definition video stream. Experiments were carried out to verify the basic performance of the developed system, and its performance is also compared against Intel's IPP H.263 codec library and a commercial video streaming solution.
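The DirectShow capture and IPP encoding stages are platform and library specific, so they are not reproduced here. As a small sketch of the transport side, the following builds a basic RTP header (per RFC 3550) around a compressed payload; payload type 96 and the 90 kHz video timestamp are common conventions assumed for illustration.

```python
# Minimal RTP packetization sketch (RFC 3550 fixed header, no CSRC/extension).
# Payload type 96 is an assumed dynamic payload type for H.264.
import struct

def rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
               payload_type: int = 96, marker: bool = False) -> bytes:
    first = 2 << 6                                   # version 2, no padding/extension
    second = (int(marker) << 7) | payload_type
    header = struct.pack("!BBHII", first, second, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload

# One second into the stream on a 90 kHz video clock.
pkt = rtp_packet(b"\x00" * 1200, seq=1, timestamp=90000, ssrc=0x1234ABCD)
```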


Journal of Electronic Imaging | 2015

Real-time high-resolution downsampling algorithm on many-core processor for spatially scalable video coding

Adamu Muhammad Buhari; Huo-Chong Ling; Vishnu Monn Baskaran; KokSheik Wong

The progression toward spatially scalable video coding (SVC) solutions for ubiquitous endpoint systems introduces challenges in sustaining real-time frame rates when downsampling high-resolution videos into multiple layers. In addressing these challenges, we put forward a hardware-accelerated downsampling algorithm on a parallel computing platform. First, we investigate the principal architecture of a serial downsampling algorithm in the Joint Scalable Video Model (JSVM) reference software to identify the performance limitations for spatially scalable video coding. Then, a parallel multicore-based downsampling algorithm is studied as a benchmark. Experimental results for this algorithm on an 8-core processor exhibit a speedup of 5.25× over the serial algorithm when downsampling a quantum extended graphics array (QXGA, 1536p) video into three lower-resolution layers (Full-HD at 1080p, HD at 720p, and Quarter-HD at 540p). However, this speedup does not translate into the minimum frame rate of 15 frames per second (fps) required for real-time video processing. To improve the speedup, a many-core downsampling algorithm using the compute unified device architecture (CUDA) parallel computing platform is proposed. The proposed algorithm increases the speedup to 26.14× over the serial algorithm and, crucially, exceeds the target frame rate of 15 fps, which in turn benefits the overall performance of the video encoding process.
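For orientation, the sketch below generates the three lower-resolution layers from a QXGA frame using plain bilinear interpolation in NumPy. This is only a serial stand-in: it is not the JSVM downsampling filter analyzed in the paper, and the CUDA many-core kernel is not shown.

```python
# Serial stand-in for multi-layer downsampling: bilinear resize of an
# (H, W, C) frame. Not the JSVM filter or the CUDA kernel from the paper.
import numpy as np

def resize_bilinear(img, out_h, out_w):
    in_h, in_w = img.shape[:2]
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, in_h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None, None]; wx = (xs - x0)[None, :, None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

qxga = np.random.rand(1536, 2048, 3).astype(np.float32)       # QXGA source layer
layers = [resize_bilinear(qxga, h, w)
          for h, w in [(1080, 1920), (720, 1280), (540, 960)]]  # 1080p, 720p, 540p
```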


International Conference on Consumer Electronics | 2016

Low complexity watermarking scheme for scalable video coding

Adamu Muhammad Buhari; Huo Chong Ling; Vishnu Monn Baskaran; KokSheik Wong

This paper presents a low-complexity human visual system based watermarking algorithm for H.264 spatial scalable coding. The proposed algorithm extracts a textural feature from a set of 7 high-energy quantized coefficients in the 4 × 4 luma INTRA-predicted blocks of all slices and embeds the watermark into highly textured blocks that have at least one non-zero coefficient in 6 selected locations. Experiments were conducted by embedding up to 8192 watermark bits into a four-layer spatially scalable coded video. Results suggest that the proposed scheme produces watermarked video with an average visual quality degradation of 0.36 dB at the expense of a 2.18% bitrate overhead. In addition, the proposed watermarking scheme achieves detection rates of 0.96, 0.94, and 0.66 against re-encoding, recompression, and Gaussian filtering attacks, respectively.
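The exact coefficient positions, texture measure, and embedding rule belong to the paper and are not reproduced here. The sketch below only illustrates the general idea: rate a 4 × 4 quantized block's texture from a small set of coefficients, then hide one bit per qualifying block by nudging the parity of a selected non-zero coefficient. All positions and thresholds are illustrative assumptions.

```python
# Illustration of texture-gated bit embedding in a 4x4 quantized-luma block.
# Coefficient positions, the texture threshold, and the parity rule are
# illustrative assumptions, not the scheme's actual parameters.
import numpy as np

HIGH_ENERGY_POS = [(0, 1), (1, 0), (1, 1), (0, 2), (2, 0), (1, 2), (2, 1)]  # assumed
EMBED_POS = HIGH_ENERGY_POS[:6]                                             # assumed
TEXTURE_THRESHOLD = 12                                                      # assumed

def is_textured(block4x4: np.ndarray) -> bool:
    energy = sum(abs(int(block4x4[r, c])) for r, c in HIGH_ENERGY_POS)
    return energy >= TEXTURE_THRESHOLD

def embed_bit(block4x4: np.ndarray, bit: int) -> bool:
    """Force the parity of the first non-zero coefficient among EMBED_POS."""
    for r, c in EMBED_POS:
        coeff = int(block4x4[r, c])
        if coeff != 0:
            if (abs(coeff) & 1) != bit:
                block4x4[r, c] = coeff + (1 if coeff > 0 else -1)  # move away from zero
            return True
    return False  # skipped: no non-zero coefficient at the selected locations
```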


International Conference on Consumer Electronics | 2015

Dominant speaker detection using discrete Markov chain for multi-user video conferencing

Vishnu Monn Baskaran; Yoong Choon Chang; KokSheik Wong; Ming-Tao Gan

This paper puts forward a discrete-time Markov chain algorithm for predicting a pair of active (dominant) speakers in an ultra-high-definition multi-user video conferencing system. The applied Markov chain minimizes false dominant-speaker classification caused by transient noise during a video conferencing session. The algorithm also includes a set of linear weight-based assignments for both the initial state vector and the transition probability matrix, which improves the response of the algorithm to changing dominant speakers. Experimental results suggest that the algorithm accurately predicts the most dominant speaker at a rate of 83% for 11 clients in a combined video, with an 86% reduction in false dominant-speaker classification, based on a given set of artificial speaker data.
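To make the state-vector-times-transition-matrix mechanism concrete, here is a minimal sketch that estimates a recency-weighted transition matrix from recent speech activity and predicts the next dominant speaker. The linear weighting scheme, window length, and smoothing prior are assumptions; only the general Markov-chain mechanism follows the abstract.

```python
# Discrete-time Markov-chain dominant-speaker prediction sketch.
# The linear recency weights, window length, and prior are assumptions.
import numpy as np

def transition_matrix(history, n_speakers, window=50):
    """history: dominant-speaker index per audio frame, most recent last."""
    recent = history[-window:]
    weights = np.linspace(0.1, 1.0, num=len(recent))    # newer frames weigh more
    T = np.full((n_speakers, n_speakers), 1e-6)          # small prior avoids zero rows
    for w, (a, b) in zip(weights[1:], zip(recent, recent[1:])):
        T[a, b] += w
    return T / T.sum(axis=1, keepdims=True)

def predict_next(history, n_speakers):
    T = transition_matrix(history, n_speakers)
    state = np.zeros(n_speakers)
    state[history[-1]] = 1.0                             # current dominant speaker
    return int(np.argmax(state @ T))

history = [0, 0, 0, 2, 0, 0, 1, 1, 1, 1]   # brief burst by speaker 2 is transient
print(predict_next(history, n_speakers=3))
```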


Information Sciences | 2018

Dominant speaker detection in multipoint video communication using Markov chain with non-linear weights and dynamic transition window

Vishnu Monn Baskaran; Yoong Choon Chang; KokSheik Wong; Ming-Tao Gan

This paper proposes an enhanced discrete-time Markov chain algorithm for predicting dominant speaker(s) in a multipoint video communication system in the presence of transient speech. The proposed algorithm exploits statistical properties of past speech patterns to accurately predict the dominant speaker for the next time state. Non-linear weight-based coefficients are employed in the enhanced Markov chain for both the initial state vector and the transition probability matrix; these weights significantly improve the time taken to predict a new dominant speaker during a conference session. In addition, a mechanism to dynamically resize the transition probability matrix window (container) is introduced to improve the adaptability of the Markov chain to the variability of speech characteristics. Simulation results indicate that for a test scenario with 11 conference participants, the enhanced Markov chain prediction algorithm registered an 85% accuracy in predicting a dominant speaker when compared to an ideal case with no transient speech. Misclassification of dominant speakers due to transient speech was also reduced by 87%.
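The abstract does not give the resizing criterion, so the following is only a guess at the shape of a dynamic transition-window mechanism: shrink the observation window when dominant speakers change often (so the chain adapts quickly) and grow it when activity is stable. The change-rate thresholds and bounds are assumptions.

```python
# Hypothetical dynamic transition-window adjustment; thresholds and bounds
# are assumptions, not the rule used in the paper.
def adjust_window(history, window, min_w=20, max_w=200):
    recent = history[-window:]
    changes = sum(a != b for a, b in zip(recent, recent[1:]))
    change_rate = changes / max(len(recent) - 1, 1)
    if change_rate > 0.3:        # volatile speech pattern -> react faster
        return max(min_w, window // 2)
    if change_rate < 0.1:        # stable pattern -> use more history
        return min(max_w, window * 2)
    return window
```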


Frontiers in Psychology | 2018

A survey of automatic facial micro-expression analysis: Databases, methods, and challenges

Yee-Hui Oh; John See; Anh Cat Le Ngo; Raphael C.-W. Phan; Vishnu Monn Baskaran

Over the last few years, automatic facial micro-expression analysis has garnered increasing attention from experts across different disciplines because of its potential applications in fields such as clinical diagnosis, forensic investigation, and security systems. Advances in computer algorithms and video acquisition technology have rendered machine analysis of facial micro-expressions possible today, in contrast to decades ago when it was primarily the domain of psychiatrists and analysis was largely manual. Indeed, although the study of facial micro-expressions is a well-established field in psychology, it is still relatively new from the computational perspective, with many interesting problems. In this survey, we present a comprehensive review of state-of-the-art databases and methods for micro-expression spotting and recognition. The individual stages involved in automating these tasks are also described and reviewed at length. In addition, we deliberate on the challenges and future directions in this growing field of automatic facial micro-expression analysis.


International Conference on Consumer Electronics | 2016

Active surveillance using depth sensing technology — Part III: Real-time intrusion mapping with remote notification

Jun Chieh Chua; Vishnu Monn Baskaran

In the final part of a three-part series on active surveillance using depth-sensing technology, this paper proposes a system that provides both real-time geographical tracking of an intruder and remote alarm notification. This is achieved by first translating the skeletal depth and rotation angle from a set of cascaded Kinect depth sensors mounted on a pan-tilt unit into a geographical coordinate system. These coordinates are then relayed to multiple notification modules, forming a unified remote alarm notification system for the surveilled premises. The system also plots the intruder on a map in real time during the tracking phase and uses a proximity algorithm to compute the distance between the intruder and each premise. Experimental results validate the feasibility of the proposed system in realizing a unified real-time intruder mapping and notification platform.
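As a rough sketch of the coordinate translation step, the snippet below combines a sensor's known position and heading on a floor plan with the pan angle and skeletal depth reading to place the intruder on the map, then computes the proximity to a premise. The sensor pose, units, and the flat-floor assumption are all illustrative.

```python
# Map a tracked skeleton onto a 2-D floor plan from (pan angle, depth).
# Sensor pose values and the flat-floor assumption are illustrative only.
import math

def intruder_xy(sensor_x, sensor_y, sensor_heading_deg, pan_deg, depth_m):
    """Return (x, y) of the intruder on the floor plan, in metres."""
    bearing = math.radians(sensor_heading_deg + pan_deg)
    return (sensor_x + depth_m * math.sin(bearing),
            sensor_y + depth_m * math.cos(bearing))

def distance_to_premise(point, premise_xy):
    return math.dist(point, premise_xy)          # proximity check for notification

pos = intruder_xy(sensor_x=0.0, sensor_y=0.0, sensor_heading_deg=90.0,
                  pan_deg=-30.0, depth_m=3.5)
print(pos, distance_to_premise(pos, (5.0, 2.0)))
```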

Collaboration

Vishnu Monn Baskaran's top co-author:

KokSheik Wong, Monash University Malaysia Campus