Featured Research

Multimedia

A Multimodal CNN-based Tool to Censure Inappropriate Video Scenes

Due to the extensive use of video-sharing platforms and services for their storage, the amount of such media on the internet has become massive. This volume of data makes it difficult to control the kind of content that may be present in such video files. One of the main concerns regarding video content is whether it has inappropriate subject matter, such as nudity, violence, or other potentially disturbing material. Beyond telling whether a video is appropriate or inappropriate, it is also important to identify which parts of it contain such content, so as to preserve parts that would be discarded in a simple broad analysis. In this work, we present a multimodal (using audio and image features) architecture based on Convolutional Neural Networks (CNNs) for detecting inappropriate scenes in video files. In the task of classifying video files, our model achieved F1-scores of 98.95% and 98.94% for the appropriate and inappropriate classes, respectively. We also present a censoring tool that automatically censors inappropriate segments of a video file.
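The segment-level censoring step can be illustrated with a small helper that merges per-segment classifier decisions into time intervals to blank out. The function name and the fixed segment length are illustrative assumptions, not the paper's actual implementation:

```python
def censor_intervals(flags, seg_len=1.0):
    """Merge consecutive 'inappropriate' segment flags into (start, end)
    time intervals to censor.  flags[i] is True when a classifier labels
    the i-th fixed-length segment as inappropriate (hypothetical input)."""
    intervals = []
    start = None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i * seg_len                       # interval opens
        elif not flagged and start is not None:
            intervals.append((start, i * seg_len))    # interval closes
            start = None
    if start is not None:                             # clip ends censored
        intervals.append((start, len(flags) * seg_len))
    return intervals

# Per-second predictions for an 8-second clip:
print(censor_intervals([False, True, True, False, False, True, True, True]))
# → [(1.0, 3.0), (5.0, 8.0)]
```

A censoring tool would then blur or mute exactly these spans, keeping the rest of the video intact.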

A Noise-aware Enhancement Method for Underexposed Images

A novel contrast enhancement method is proposed for underexposed images in which heavy noise is hidden. Under low-light conditions, images taken by digital cameras have low contrast in dark or bright regions, due to the limited dynamic range of imaging sensors. For these reasons, various contrast enhancement methods have been proposed so far. These methods, however, have two problems: (1) details in bright regions are lost due to over-enhancement of contrast, and (2) noise is amplified in dark regions because conventional enhancement methods do not consider the noise included in images. The proposed method aims to overcome these problems. In the proposed method, a shadow-up function is applied to adaptive gamma correction with weighting distribution, and a denoising filter is also used to avoid amplifying noise in dark regions. As a result, the proposed method allows us not only to enhance the contrast of dark regions but also to avoid amplifying noise, even in strong-noise environments.
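The adaptive gamma correction with weighting distribution that the method builds on can be sketched as follows. This is a minimal illustration of that known base technique (the shadow-up function and the denoising filter are omitted), and `agc_wd` with its parameters is a hypothetical helper, not the authors' code:

```python
def agc_wd(pixels, alpha=0.5):
    """Adaptive gamma correction with a weighting distribution (sketch).
    `pixels` is a flat list of 8-bit intensities.  A weighted CDF of the
    intensity histogram drives a per-level gamma, so frequent dark levels
    get gamma < 1 and are lifted."""
    n = len(pixels)
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    pdf = [h / n for h in hist]
    pdf_max, pdf_min = max(pdf), min(pdf)
    # Weighting distribution: smooth the pdf before integrating it.
    wd = [pdf_max * ((q - pdf_min) / (pdf_max - pdf_min)) ** alpha
          for q in pdf]
    total = sum(wd)
    cdf_w, acc = [], 0.0
    for w in wd:
        acc += w / total
        cdf_w.append(acc)
    # gamma(l) = 1 - cdf_w(l): dark levels are brightened the most.
    return [round(255 * (p / 255) ** (1 - cdf_w[p])) if p else 0
            for p in pixels]

print(agc_wd([10, 20, 30, 40, 200]))  # dark pixels are lifted strongly
```

Applied naively, this mapping also amplifies any noise hiding in the shadows, which is exactly the problem the paper's denoising stage addresses.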

A Novel Local Binary Pattern Based Blind Feature Image Steganography

Steganography methods in general tend to embed more and more secret bits in the cover images. Most of these methods are designed to embed secret information such that the change in the visual quality of the resulting stego image is not detectable. There exist some methods which preserve the global structure of the cover after embedding; however, the embedding capacity of these methods is very low. In this paper, a novel feature-based blind image steganography technique is proposed, which preserves the LBP (local binary pattern) feature of the cover at comparable embedding rates. The local binary pattern is a well-known image descriptor used for image representation. The proposed scheme computes the local binary pattern to hide the bits of the secret image in such a way that the local relationships that exist in the cover are preserved in the resulting stego image. The performance of the proposed steganography method has been tested on several images of different types to show its robustness. State-of-the-art LSB-based steganography methods are compared with the proposed method to show the effectiveness of feature-based image steganography.
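For reference, the basic 8-neighbour LBP descriptor that the scheme preserves can be computed like this. `lbp_code` is a hypothetical helper showing the standard definition, not the paper's embedding algorithm:

```python
def lbp_code(img, r, c):
    """8-neighbour local binary pattern code of pixel (r, c).
    Each neighbour >= centre contributes a 1-bit, read clockwise from
    the top-left neighbour (most significant bit first)."""
    centre = img[r][c]
    neigh = [img[r-1][c-1], img[r-1][c], img[r-1][c+1],
             img[r][c+1],   img[r+1][c+1], img[r+1][c],
             img[r+1][c-1], img[r][c-1]]
    code = 0
    for g in neigh:
        code = (code << 1) | (1 if g >= centre else 0)
    return code

img = [[9, 8, 7],
       [5, 6, 3],
       [1, 2, 4]]
print(lbp_code(img, 1, 1))  # → 224 (binary 11100000)
```

An LBP-preserving embedding must only change pixel values in ways that leave these neighbour-versus-centre comparisons, and hence the codes, unchanged.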

A Revision Control System for Image Editing in Collaborative Multimedia Design

Revision control is a vital component in the collaborative development of artifacts such as software code and multimedia. While revision control has been widely deployed for text files, very few attempts to control the versioning of binary files can be found in the literature. This can be inconvenient for graphics applications that use a significant amount of binary data, such as images, videos, meshes, and animations. Existing strategies, such as storing whole files for individual revisions or simple binary deltas, respectively consume significant storage and obscure semantic information. To overcome these limitations, in this paper we present a revision control system for digital images that stores revisions in the form of graphs. Besides being integrated with Git, our revision control system also facilitates artistic creation processes in common image editing and digital painting workflows. A preliminary user study demonstrates the usability of the proposed system.
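The idea of storing revisions as a graph of edit operations, rather than as opaque binary snapshots, can be sketched as follows. The class, its methods, and the operation strings are illustrative assumptions, not the paper's data model:

```python
class RevisionGraph:
    """Toy revision graph: each revision records its parent revisions and
    the edit operation applied, so a revision can be replayed from its
    ancestry instead of being stored as a whole binary file."""
    def __init__(self):
        self.nodes = {}        # id -> (parents, operation)
        self.next_id = 0

    def commit(self, operation, parents=()):
        rid = self.next_id
        self.nodes[rid] = (tuple(parents), operation)
        self.next_id += 1
        return rid

    def history(self, rid):
        """Operations along the first-parent line, oldest first."""
        ops = []
        while True:
            parents, op = self.nodes[rid]
            ops.append(op)
            if not parents:
                break
            rid = parents[0]
        return list(reversed(ops))

g = RevisionGraph()
root = g.commit("load photo.png")
a = g.commit("crop 100x100", [root])
b = g.commit("adjust-contrast +10", [a])
print(g.history(b))  # → ['load photo.png', 'crop 100x100', 'adjust-contrast +10']
```

Keeping operations rather than pixels both shrinks storage and preserves the semantic information (what was edited) that plain binary deltas obscure.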

A Robust Billboard-based Free-viewpoint Video Synthesizing Algorithm for Sports Scenes

We present a billboard-based free-viewpoint video synthesizing algorithm for sports scenes that can robustly reconstruct and render a high-fidelity billboard model for each object, including occluded ones, in each camera. Its contributions are that it is (1) applicable to a challenging shooting condition in which a high-precision 3D model cannot be built because only a small number of wide-baseline cameras are available; and (2) capable of reproducing the appearance of occlusions, one of the most significant issues for billboard-based approaches due to ineffective detection of overlaps. To achieve these contributions, the proposed method does not attempt to find a high-quality 3D model but utilizes a raw 3D model obtained directly from space carving. Although this model is insufficiently accurate to produce an impressive visual effect, precise object segmentation and occlusion detection can be performed by back-projecting it onto each camera plane. The billboard model of each object in each camera is rendered according to whether it is occluded or not, and its location in the virtual stadium is determined from the location of its 3D model. We synthesized free-viewpoint videos of two soccer sequences recorded by five cameras with the proposed and state-of-the-art methods to demonstrate its performance.
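The occlusion detection relies on back-projecting the carved 3D model onto each camera plane, which rests on the standard pinhole projection x ~ K(RX + t). A minimal sketch with assumed intrinsics and pose (the numbers are illustrative, not the paper's calibration):

```python
def project(K, R, t, X):
    """Project 3-D point X onto the image plane of a pinhole camera with
    intrinsics K, rotation R and translation t:  x ~ K (R X + t)."""
    cam = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    pix = [sum(K[i][j] * cam[j] for j in range(3)) for i in range(3)]
    return pix[0] / pix[2], pix[1] / pix[2]   # perspective division

K = [[800, 0, 320], [0, 800, 240], [0, 0, 1]]  # focal 800 px, centre (320, 240)
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]          # identity orientation
t = [0, 0, 4]                                  # camera 4 units behind origin
print(project(K, R, t, [0, 0, 0]))  # → (320.0, 240.0): point on the optical axis
```

Projecting every surface voxel of two objects this way and checking where their image-plane footprints overlap is the essence of detecting which billboard occludes which.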

A Robust Blind 3-D Mesh Watermarking based on Wavelet Transform for Copyright Protection

Nowadays, three-dimensional meshes are extensively used in several applications, such as industrial, medical, computer-aided design (CAD), and entertainment, due to improvements in the processing capability of computers and the development of network infrastructure. Unfortunately, like digital images and videos, 3-D meshes can be easily modified, duplicated, and redistributed by unauthorized users. Digital watermarking came up while trying to solve this problem. In this paper, we propose a blind robust watermarking scheme for three-dimensional semiregular meshes for copyright protection. The watermark is embedded by modifying the norm of the wavelet coefficient vectors associated with the lowest resolution level, using the edge normal norms as synchronizing primitives. The experimental results show that, in comparison with alternative 3-D mesh watermarking approaches, the proposed method can resist a wide range of common attacks, such as similarity transformations (including translation, rotation, uniform scaling, and their combination), noise addition, Laplacian smoothing, and quantization, while preserving high imperceptibility.
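The general idea of hiding a bit by modifying a vector's norm can be sketched with quantisation index modulation (QIM) on the norm. This is a generic primitive under assumed parameters, not the paper's exact wavelet-domain scheme:

```python
import math

def _norm(v):
    return math.sqrt(sum(x * x for x in v))

def embed_bit(vec, bit, step=2.0):
    """Embed one bit by rescaling `vec` so its norm lands on one of two
    interleaved lattices: bit 0 uses multiples of `step`, bit 1 the same
    lattice shifted by step/2 (quantisation index modulation)."""
    n = _norm(vec)
    target = round((n - bit * step / 2) / step) * step + bit * step / 2
    if target <= 0:
        target += step
    return [x * target / n for x in vec]

def extract_bit(vec, step=2.0):
    """Blind extraction: pick the lattice whose point is nearest the norm."""
    n = _norm(vec)
    d0 = abs(n - round(n / step) * step)
    d1 = abs(n - (round((n - step / 2) / step) * step + step / 2))
    return 0 if d0 <= d1 else 1

v = [3.0, 4.0]                 # a coefficient vector with norm 5.0
w = embed_bit(v, 1)
print(extract_bit(w))          # → 1
```

Because the norm is invariant to rotation, and scales uniformly under uniform scaling, such norm-based embedding is naturally compatible with similarity-transformation robustness.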

A Robust Image Watermarking System Based on Deep Neural Networks

Digital image watermarking is the process of covertly embedding and extracting a watermark on a carrier image. Incorporating deep learning networks into image watermarking has attracted increasing attention in recent years. However, existing deep learning-based watermarking systems cannot achieve robustness, blindness, and automated embedding and extraction simultaneously. In this paper, a fully automated image watermarking system based on deep neural networks is proposed to generalize the image watermarking process. An unsupervised deep learning structure and a novel loss computation are proposed to achieve high capacity and high robustness without any prior knowledge of possible attacks. Furthermore, a challenging application, watermark extraction from camera-captured images, is presented to validate the practicality as well as the robustness of the proposed system. Experimental results show the superior performance of the proposed system compared with several currently available techniques.

A Simple Model for Subject Behavior in Subjective Experiments

In a subjective experiment to evaluate the perceptual audiovisual quality of multimedia and television services, raw opinion scores collected from test subjects are often noisy and unreliable. To produce the final mean opinion scores (MOS), recommendations such as ITU-R BT.500, ITU-T P.910 and ITU-T P.913 standardize post-test screening procedures to clean up the raw opinion scores, using techniques such as subject outlier rejection and bias removal. In this paper, we analyze the prior standardized techniques to demonstrate their weaknesses. As an alternative, we propose a simple model to account for two of the most dominant behaviors of subject inaccuracy: bias and inconsistency. We further show that this model can also effectively deal with inattentive subjects that give random scores. We propose to use maximum likelihood estimation to jointly solve the model parameters, and present two numeric solvers: the first based on the Newton-Raphson method, and the second based on an alternating projection (AP). We show that the AP solver generalizes the ITU-T P.913 post-test screening procedure by weighing a subject's contribution to the true quality score by her consistency (thus, the quality scores estimated can be interpreted as bias-subtracted consistency-weighted MOS). We compare the proposed methods with the standardized techniques using real datasets and synthetic simulations, and demonstrate that the proposed methods are the most valuable when the test conditions are challenging (for example, crowdsourcing and cross-lab studies), offering advantages such as better model-data fit, tighter confidence intervals, better robustness against subject outliers, the absence of hard coded parameters and thresholds, and auxiliary information on test subjects. The code for this work is open-sourced at this https URL.
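The alternating-projection idea described, estimating the true quality scores, subject biases, and subject inconsistencies in turn, can be sketched on synthetic scores as follows. The updates follow the model o[i][j] = x[j] + b[i] + v[i]·noise; the initialisation, iteration count, and weighting details here are assumptions for illustration, not the paper's exact solver:

```python
import random

def ap_solve(o, iters=50):
    """Alternating-projection sketch for the subject model
    o[i][j] = x[j] + b[i] + v[i]*noise.  Alternately refine the true
    quality x (consistency-weighted), subject bias b, and subject
    inconsistency v."""
    s, m = len(o), len(o[0])
    b = [0.0] * s
    v = [1.0] * s
    x = [0.0] * m
    for _ in range(iters):
        w = [1.0 / (vi * vi + 1e-9) for vi in v]   # weight by consistency
        for j in range(m):
            x[j] = sum(w[i] * (o[i][j] - b[i]) for i in range(s)) / sum(w)
        for i in range(s):
            res = [o[i][j] - x[j] for j in range(m)]
            b[i] = sum(res) / m                     # subject's mean offset
            v[i] = max(1e-6,                        # subject's scatter
                       (sum((r - b[i]) ** 2 for r in res) / m) ** 0.5)
    return x, b, v

# Synthetic experiment: 20 subjects score 40 stimuli.
random.seed(0)
true_x = [random.uniform(1, 5) for _ in range(40)]
bias   = [random.uniform(-0.8, 0.8) for _ in range(20)]
incons = [random.uniform(0.1, 0.6) for _ in range(20)]
o = [[true_x[j] + bias[i] + random.gauss(0, incons[i]) for j in range(40)]
     for i in range(20)]
x_hat, b_hat, v_hat = ap_solve(o)
```

The recovered x_hat is, as the abstract puts it, a bias-subtracted consistency-weighted MOS: inconsistent subjects receive small weights, so inattentive random scorers are downweighted automatically (x and b are only identifiable up to a shared constant shift).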

A Study of Annotation and Alignment Accuracy for Performance Comparison in Complex Orchestral Music

Quantitative analysis of commonalities and differences between recorded music performances is an increasingly common task in computational musicology. A typical scenario involves manual annotation of different recordings of the same piece along the time dimension, for comparative analysis of, e.g., the musical tempo, or for mapping other performance-related information between performances. This can be done by manually annotating one reference performance, and then automatically synchronizing other performances, using audio-to-audio alignment algorithms. In this paper we address several questions related to those tasks. First, we analyze different annotations of the same musical piece, quantifying timing deviations between the respective human annotators. A statistical evaluation of the marker time stamps will provide (a) an estimate of the expected timing precision of human annotations and (b) a ground truth for subsequent automatic alignment experiments. We then carry out a systematic evaluation of different audio features for audio-to-audio alignment, quantifying the degree of alignment accuracy that can be achieved, and relate this to the results from the annotation study.
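Audio-to-audio alignment of two performances is typically built on dynamic time warping (DTW) over frame-level features. A minimal sketch on scalar feature sequences (real systems use chroma or spectral features per frame; the example sequences are illustrative):

```python
def dtw_cost(a, b):
    """Dynamic-time-warping distance between two feature sequences:
    the minimal accumulated frame-to-frame cost over all monotone
    alignments (steps: match, insertion, deletion)."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # skip a frame of a
                                 D[i][j - 1],      # skip a frame of b
                                 D[i - 1][j - 1])  # match frames
    return D[n][m]

ref  = [0, 1, 2, 3, 2, 1]       # annotated reference performance
perf = [0, 1, 1, 2, 3, 2, 1]    # same passage, one note held longer
print(dtw_cost(ref, perf))       # → 0.0: perfectly alignable despite tempo
```

The backtracked warping path of this recursion is what transfers the reference annotations onto the other performance; the choice of features mainly changes the per-frame cost term.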

A Study on Impacts of Multiple Factors on Video Quality of Experience

HTTP Adaptive Streaming (HAS) has become a cost-effective means of multimedia delivery nowadays. However, how the quality of experience (QoE) is jointly affected by (1) varying perceptual quality and (2) interruptions is not well understood. In this paper, we present the first attempt to quantitatively characterize the relative impacts of these factors on the QoE of streaming sessions. For this purpose, we first model the impacts of the factors using histograms, which represent the frequency distributions of the individual factors in a session. Using a large dataset, various insights into the relative impacts of these factors are then provided, serving as suggestions for improving the QoE of streaming sessions.
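The histogram representation of a session's factors can be sketched as follows; the bin edges and the example per-segment quality values are illustrative assumptions, not the paper's dataset:

```python
def factor_histogram(values, edges):
    """Normalised frequency distribution of one per-session factor
    (e.g. per-segment quality levels or interruption durations) over
    fixed half-open bins [edges[k], edges[k+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for k in range(len(edges) - 1):
            if edges[k] <= v < edges[k + 1]:
                counts[k] += 1
                break
    total = len(values)
    return [c / total for c in counts]

# Per-segment quality levels of one streaming session (1 = worst, 5 = best):
quality = [5, 5, 4, 3, 5, 5, 4, 5]
print(factor_histogram(quality, [1, 2, 3, 4, 5, 6]))
# → [0.0, 0.0, 0.125, 0.25, 0.625]
```

One such histogram per factor (quality levels, interruption durations) gives a fixed-length session descriptor whose bins can then be related to the session's observed QoE.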
