Computer Science Multimedia - Researchain

Featured Researches

An Ultra-Specific Image Dataset for Automated Insect Identification

Automated identification of insects is a tough task in which many challenges, such as limited data, imbalanced class counts, and background noise, need to be overcome for better performance. This paper describes such an image dataset, which consists of a limited, imbalanced number of images of six genera of the subfamily Cicindelinae (tiger beetles) of the order Coleoptera. The image collection is highly diverse, as the images were taken from different sources, at different angles, and on different scales. Thus, the salient regions of the images vary widely. Therefore, one of the main intentions in this process was to get an idea about the image dataset while comparing different unique patterns and features in images. The dataset was evaluated on different classification algorithms, including deep learning models based on different approaches, to provide a benchmark. The dynamic nature of the dataset poses a challenge to image classification algorithms. However, transfer learning models using a softmax classifier performed well on the current dataset. Tiger beetle classification can be challenging even to a trained human eye; therefore, this dataset opens a new avenue for classification algorithms to develop and to identify features that human eyes have not identified.

Multimedia

An adaptive algorithm for embedding information into compressed JPEG images using the QIM method

The widespread use of JPEG images makes them good covers for storing and transmitting secret messages. This paper proposes a new algorithm for embedding information in JPEG images based on the steganographic QIM method. The main problem of such embedding is the vulnerability to statistical steganalysis. To solve this problem, it is proposed to use a variable quantization step, which is adaptively selected for each block of the JPEG cover image. Experimental results show that the proposed approach successfully increases the security of embedding.
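As a rough illustration of the QIM idea with a per-block step, the sketch below embeds one bit per coefficient by snapping it to one of two interleaved lattices. The function names, coefficient values and step sizes are hypothetical, and the abstract does not say how the step is chosen for each block, so the steps here are arbitrary placeholders rather than the paper's adaptive rule.

```python
def qim_embed(coeff: float, bit: int, step: float) -> float:
    """Snap coeff to the nearest point of the lattice assigned to bit."""
    # Bit 0 uses multiples of step; bit 1 uses the same lattice shifted by step / 2.
    offset = bit * step / 2.0
    return step * round((coeff - offset) / step) + offset


def qim_extract(coeff: float, step: float) -> int:
    """Return the bit whose lattice lies closest to coeff."""
    d0 = abs(coeff - qim_embed(coeff, 0, step))
    d1 = abs(coeff - qim_embed(coeff, 1, step))
    return 0 if d0 <= d1 else 1


# Hypothetical data: one coefficient per block, with a placeholder step per block.
coeffs = [-7.3, 0.4, 12.9, 41.2]
steps = [4.0, 6.0, 4.0, 8.0]
bits = [1, 0, 1, 1]

marked = [qim_embed(c, b, q) for c, b, q in zip(coeffs, bits, steps)]
recovered = [qim_extract(m, q) for m, q in zip(marked, steps)]
```

In this basic scheme each coefficient moves by at most half a step, and extraction tolerates perturbations smaller than a quarter of the step, so the step size trades distortion against robustness.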

Multimedia

An authorship protection technology for electronic documents based on image watermarking

In the field of information technology, information security technologies hold a special place. They ensure the security of the use of information technology. One of the urgent tasks is the protection of electronic documents during their transfer in information systems. This paper proposes a technology for protecting electronic documents containing digital images. The main idea is that the electronic document authorship protection can be implemented by digital watermark embedding in the images that are contained in this document. The paper considers three cases of using the proposed technology: full copying of an electronic document, copying of images contained in the document, and copying of text. It is shown that in all three cases the authorship confirmation can be successfully implemented. Computational experiments are conducted with robust watermarking algorithms that can be used within the technology. A scenario of technology implementation is proposed, which provides for the joint use of algorithms of different classes.

Multimedia

An efficient multi-language Video Search Engine to facilitate the HADJ and the UMRA

Video clips have become the most important and prominent multimedia documents for illustrating the rituals of Hajj and Umrah. Therefore, it is necessary to develop a system to facilitate access to information related to the duties, the pillars, the stages and the prayers. In this paper, we present a new project that implements a search engine over a large video database, enabling any pilgrim to get the information they care about quickly and accurately. This project is based on two techniques: (a) a weighting method to determine the degree of affiliation of a video clip to a particular topic, and (b) organizing data using several layers.

Multimedia

An enhanced performance for H.265/SHVC based on combined AEGBM3D filter and back-propagation neural network

This paper deals with the latest video coding standard, H.265/SHVC, a scalable extension to High Efficiency Video Coding (HEVC). HEVC introduces new coding tools compared to its predecessor and is backward compatible with all types of electronic gadgets. A major problem is that gadgets with different display capabilities cannot be offered the same video quality because of constraints in transmission bandwidth. One solution to this problem is compression of the video sequence, which is the focus of this paper: a novel method implemented in the SHVC encoder aims to preserve or increase PSNR while reducing bit-rate. The novel method combines AEGBM3D (adaptive edge guided block-matching and 3D) filtering with a back-propagation (BP) technique. The AEGBM3D filter avoids spatial redundancy and de-noises frames; hence an enhancement in PSNR is achieved. The obtained PSNR of the video is compared with a set threshold PSNR, and AEGBM3D filtering is repeated to keep the PSNR above the threshold. The BP technique, based on a neural network machine learning approach, continually restrains the output if the input block does not contain a feature the network was trained to recognize. This frequent control over the output produces fewer bits; hence a reduction in bit-rate is achieved. The simulation results show that the proposed technique delivers an average increment of 0.16 and 0.25 dB in PSNR and an average decrement of 28% and 37% in bit-rate for 1.5 and 2 times spatial ratios respectively, compared with the existing methods.
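For readers unfamiliar with the metric being thresholded, the sketch below computes PSNR between a reference and a processed frame using the standard definition, 10·log10(peak² / MSE). It is only an illustration: the function name, the flat pixel lists and the 8-bit peak value are assumptions, and it does not implement the AEGBM3D filter or the paper's threshold loop.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], processed: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(processed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 8-bit pixel values for a tiny frame.
reference_frame = [100, 100, 120, 130]
processed_frame = [102, 99, 118, 131]
quality_db = psnr(reference_frame, processed_frame)
```

A threshold check such as the one the abstract describes would then compare `quality_db` against a chosen minimum value.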

Multimedia

An optimal mode selection algorithm for scalable video coding

Scalable video coding (SVC) extends its predecessor, advanced video coding (AVC), to allow flexible transmission to all types of gadgets. Although SVC is more flexible and scalable than AVC, it is computationally more complex. The traditional full search method in the standard H.264 SVC consumes more encoding time. This computational complexity needs to be reduced, and many fast mode decision (FMD) algorithms have been developed, but many fail to balance all three measures: peak signal-to-noise ratio (PSNR), encoding time and bit rate. In this paper, the proposed optimal mode selection algorithm, based on the orientation of pixels, achieves better time saving, good PSNR and coding efficiency. Compared with the standard H.264 JSVM reference software, the proposed algorithm achieves 57.44% time saving, a 0.43 dB increment in PSNR and a 0.23% reduction in bit rate.

Multimedia

Analysis and prediction of JND-based video quality model

The just-noticeable-difference (JND) visual perception property has received much attention in characterizing human subjective viewing experience of compressed video. In this work, we quantify the JND-based video quality assessment model using the satisfied user ratio (SUR) curve, and show that the SUR model can be greatly simplified since the JND points of multiple subjects for the same content in the VideoSet can be well modeled by the normal distribution. Then, we design an SUR prediction method with video quality degradation features and masking features and use them to predict the first, second and third JND points and their corresponding SUR curves. Finally, we verify the performance of the proposed SUR prediction method with different configurations on the VideoSet. The experimental results demonstrate that the proposed SUR prediction method achieves good performance in various resolutions with the mean absolute error (MAE) of the SUR smaller than 0.05 on average.
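To make the normal-distribution simplification concrete, the sketch below fits a normal distribution to per-subject JND points and evaluates the SUR as its complementary CDF, that is, the share of subjects whose JND lies beyond a given distortion level. The sample values, the function names and the assumption that JND points sit on a QP-like scale are illustrative and are not taken from the paper.

```python
from statistics import NormalDist
from typing import List, Sequence


def sur_curve(jnd_points: Sequence[float], levels: Sequence[float]) -> List[float]:
    """SUR at each level: fraction of subjects not yet noticing a difference."""
    dist = NormalDist.from_samples(jnd_points)  # fit mean and standard deviation
    return [1.0 - dist.cdf(level) for level in levels]


def level_at_sur(jnd_points: Sequence[float], target_sur: float = 0.75) -> float:
    """Distortion level at which the fitted SUR equals target_sur."""
    dist = NormalDist.from_samples(jnd_points)
    return dist.inv_cdf(1.0 - target_sur)


# Hypothetical first-JND points (one per subject) for a single clip.
jnd_samples = [24, 26, 27, 28, 29, 30, 31, 33]
levels = list(range(20, 38))
curve = sur_curve(jnd_samples, levels)
level_75 = level_at_sur(jnd_samples, 0.75)
```

Under this model the whole SUR curve is described by two parameters, the mean and standard deviation of the JND points, which is what makes predicting it from content features tractable.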

Multimedia

Analysis of Problem Tokens to Rank Factors Impacting Quality in VoIP Applications

User-perceived quality-of-experience (QoE) in internet telephony systems is commonly evaluated using subjective ratings computed as a Mean Opinion Score (MOS). In such systems, while user MOS can be tracked on an ongoing basis, it does not give insight into which factors of a call induced any perceived degradation in QoE -- it does not tell us what caused a user to have a sub-optimal experience. For effective planning of product improvements, we are interested in understanding the impact of each of these degrading factors, allowing the estimation of the return (i.e., the improvement in user QoE) for a given investment. To obtain such insights, we advocate the use of an end-of-call "problem token questionnaire" (PTQ) which probes the user about common call quality issues (e.g., distorted audio or frozen video) which they may have experienced. In this paper, we show the efficacy of this questionnaire using data from over 700,000 end-of-call surveys gathered from Skype (a large commercial VoIP application). We present a method to rank call quality and reliability issues and address the challenge of isolating independent factors impacting the QoE. Finally, we present representative examples of how these problem tokens have proven to be useful in practice.

Multimedia

Analysis of Rolling Shutter Effect on ENF based Video Forensics

ENF is a time-varying signal of the frequency of mains electricity in a power grid. It continuously fluctuates around a nominal value (50/60 Hz) due to changes in supply and demand of power over time. Depending on these ENF variations, the luminous intensity of a mains-powered light source also fluctuates. These fluctuations in luminance can be captured by video recordings. Accordingly, ENF can be estimated from such videos by analysis of steady content in the video scene. When videos are captured by using a rolling shutter sampling mechanism, as is done mostly with CMOS cameras, there is an idle period between successive frames. Consequently, a number of illumination samples of the scene are effectively lost due to the idle period. These missing samples affect ENF estimation, in the sense of the frequency shift caused and the power attenuation that results. This work develops an analytical model for videos captured using a rolling shutter mechanism. The model illustrates how the frequency of the main ENF harmonic varies depending on the idle period length, and how the power of the captured ENF attenuates as idle period increases. Based on this, a novel idle period estimation method for potential use in camera forensics that is able to operate independently of video frame rate is proposed. Finally, a novel time-of-recording verification approach based on use of multiple ENF components, idle period assumptions and interpolation of missing ENF samples is also proposed.
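As background for why frame sampling matters here, the sketch below computes where a light-flicker component would appear in a video's luminance signal under a deliberately simple model. It assumes one illumination sample per frame (as with a global shutter) and that light intensity flickers at twice the mains frequency; it is not the rolling-shutter model the paper develops, and the function names are hypothetical.

```python
def flicker_frequency(mains_hz: float) -> float:
    """Light intensity follows the rectified mains waveform, so it flickers at 2x."""
    return 2.0 * mains_hz


def aliased_frequency(signal_hz: float, frame_rate: float) -> float:
    """Apparent frequency of signal_hz after sampling once per frame."""
    # Distance from the signal to the nearest integer multiple of the frame rate.
    nearest_multiple = round(signal_hz / frame_rate) * frame_rate
    return abs(signal_hz - nearest_multiple)


# 50 Hz mains gives 100 Hz flicker; sampled at 30 frames per second.
alias_50hz_30fps = aliased_frequency(flicker_frequency(50.0), 30.0)
# 60 Hz mains gives 120 Hz flicker, an exact multiple of 30 frames per second.
alias_60hz_30fps = aliased_frequency(flicker_frequency(60.0), 30.0)
```

When the flicker frequency is an exact multiple of the frame rate, the alias falls at 0 Hz and the variations are hard to recover under this simple model, which is one reason the extra temporal samples of a rolling shutter are of interest.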

Multimedia

Application of Just-Noticeable Difference in Quality as Environment Suitability Test for Crowdsourcing Speech Quality Assessment Task

Crowdsourcing micro-task platforms facilitate subjective media quality assessment by providing access to a highly scalable, geographically distributed and demographically diverse pool of crowd workers. Those workers participate in the experiment remotely from their own working environment, using their own hardware. In the case of speech quality assessment, preliminary work showed that environmental noise at the listener's side and the listening device (loudspeaker or headphone) significantly affect perceived quality, and consequently the reliability and validity of subjective ratings. As a consequence, ITU-T Rec. P.808 specifies requirements for the listening environment of crowd workers when assessing speech quality. In this paper, we propose a new Just Noticeable Difference of Quality (JNDQ) test as a remote screening method for assessing the suitability of the work environment for participating in speech quality assessment tasks. In a laboratory experiment, participants performed this JNDQ test with different listening devices in different listening environments, including a silent room according to ITU-T Rec. P.800 and a simulated background noise scenario. Results show a significant impact of the environment and the listening device on the JNDQ threshold. Thus, the combination of listening device and background noise needs to be screened in a crowdsourcing speech quality test. We propose a minimum threshold of our JNDQ test as an easily applicable screening method for this purpose.
