Ewa Kijak
University of Rennes
Publication
Featured research published by Ewa Kijak.
international conference on multimedia and expo | 2003
Ewa Kijak; Guillaume Gravier; Patrick Gros; Lionel Oisel; Frédéric Bimbot
This paper focuses on the use of hidden Markov models (HMMs) for structure analysis of videos, and demonstrates how they can be efficiently applied to merge audio and visual cues. Our approach is validated in the particular domain of tennis videos. The basic temporal unit is the video shot. Visual features are used to characterize the type of shot view. Audio features describe the audio events within a video shot. The video structure parsing relies on the analysis of the temporal interleaving of video shots, with respect to prior information about tennis content and editing rules. As a result, typical tennis scenes are identified. In addition, each shot is assigned to a level in the hierarchy described in terms of point, game and set.
Multimedia Tools and Applications | 2006
Ewa Kijak; Guillaume Gravier; Lionel Oisel; Patrick Gros
This paper focuses on the integration of multimodal features for sport video structure analysis. The method relies on a statistical model which takes into account both the shot content and the interleaving of shots. This stochastic modelling is performed in the global framework of Hidden Markov Models (HMMs) that can be efficiently applied to merge audio and visual cues. Our approach is validated in the particular domain of tennis videos. The model integrates prior information about tennis content and editing rules. The basic temporal unit is the video shot. Visual features are used to characterize the type of shot view. Audio features describe the audio events within a video shot. Two sets of audio features are used in this study: the first is extracted from a manual segmentation of the soundtrack and is more reliable; the second is provided by an automatic segmentation and classification process. As a result of the overall HMM process, typical tennis scenes are simultaneously segmented and identified. The experiments illustrate the improvement of HMM-based fusion over indexing using only the best single medium, when both media are of similar quality.
IEEE Journal of Selected Topics in Signal Processing | 2011
Joaquin Zepeda; Christine Guillemot; Ewa Kijak
We introduce a new image coder which uses the Iteration Tuned and Aligned Dictionary (ITAD) as a transform to code image blocks taken over a regular grid. We establish experimentally that the ITAD structure results in lower-complexity representations that enjoy greater sparsity when compared to other recent dictionary structures. We show that this superior sparsity can be exploited successfully for compressing images belonging to specific classes of images (e.g., facial images). We further propose a global rate-distortion criterion that distributes the code bits across the various image blocks. Our evaluation shows that the proposed ITAD codec can outperform JPEG2000 by more than 2 dB at 0.25 bpp and by 0.5 dB at 0.45 bpp, accordingly producing qualitatively better reconstructions.
acm multimedia | 2010
Thanh-Toan Do; Ewa Kijak; Teddy Furon; Laurent Amsaleg
Content-Based Image Retrieval Systems used in forensics-related contexts require very good image recognition capabilities. They therefore often use the SIFT local-feature description scheme, as its robustness against a large spectrum of image distortions has been assessed. In contrast, the security of SIFT is still largely unexplored. We show in this paper that it is possible to conceal images from the SIFT-based recognition process by designing very SIFT-specific attacks. The attacks that are successful in deluding the system remove keypoints and simultaneously forge new keypoints in the images to be concealed. This paper details several strategies enforcing image concealment. A copy-detection-oriented experimental study using a database of 100,000 real images together with a state-of-the-art image search system shows that these strategies are effective. This is a very serious threat against systems, endangering forensic investigations.
international conference on image processing | 2003
Ewa Kijak; Lionel Oisel; Patrick Gros
This paper focuses on the use of hidden Markov models (HMMs) for structure analysis of sport videos. The video structure parsing relies on the analysis of the temporal interleaving of video shots, with respect to a priori information about video content and editing rules. The basic temporal unit is the video shot and visual features are used to characterize its type of view. Our approach is validated in the particular domain of tennis videos. As a result, typical tennis scenes are identified. In addition, each shot is assigned to a level in the hierarchy described in terms of point, game and set.
Storage and Retrieval for Image and Video Databases | 2003
Ewa Kijak; Lionel Oisel; Patrick Gros
This work aims at recovering the temporal structure of a broadcast tennis video from an analysis of the raw footage. Our method relies on a statistical model of the interleaving of shots, in order to group shots into predefined classes representing structural elements of a tennis video. This stochastic modeling is performed in the global framework of Hidden Markov Models (HMMs). The fundamental units are shots and transitions. In a first step, color and motion attributes of segmented shots are used to map shots into two classes: game (view of the full tennis court) and not game (medium and close-up views, and commercials). In a second step, a trained HMM is used to analyze the temporal interleaving of shots. This analysis results in the identification of more complex structures, such as first missed services, short rallies that could be aces or services, long rallies, breaks that signify the end of a game, and replays that highlight interesting points. These higher-level structural units can be used either to create summaries or to allow non-linear browsing of the video.
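The second step described above, decoding structure from the interleaving of game and not-game shots, can be illustrated with a minimal Viterbi sketch. This is not the trained model from the paper: the two hidden states ("rally", "break") and all probabilities below are hypothetical values chosen only to show how an HMM turns a sequence of shot classes into a sequence of structural labels.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a sequence of observations."""

    def log(p):
        # Work in the log domain to avoid underflow on long shot sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score + log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: hidden structural states and observed shot classes.
STATES = ["rally", "break"]
START = {"rally": 0.5, "break": 0.5}
TRANS = {
    "rally": {"rally": 0.5, "break": 0.5},
    "break": {"rally": 0.3, "break": 0.7},
}
EMIT = {
    "rally": {"game": 0.9, "not_game": 0.1},
    "break": {"game": 0.1, "not_game": 0.9},
}

shots = ["game", "not_game", "not_game", "game"]
print(viterbi(shots, STATES, START, TRANS, EMIT))
```

In a realistic setting the transition and emission probabilities would be estimated from annotated video, and the state set would cover the richer structures listed in the abstract (missed first serves, replays, and so on).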
international conference on acoustics, speech, and signal processing | 2011
Joaquin Zepeda; Christine Guillemot; Ewa Kijak
We present a new, block-based image codec based on sparse representations using a learned, structured dictionary called the Iteration-Tuned and Aligned Dictionary (ITAD). The question of selecting the number of atoms used in the representation of each image block is addressed with a new, global (image-wide), rate-distortion-based sparsity selection criterion. We show experimentally that our codec outperforms JPEG2000 in both quantitative evaluations (by 0.9 dB to 4 dB) and qualitative evaluations.
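The idea of choosing per-block sparsity with an image-wide rate-distortion view can be sketched with a simple greedy allocation. This is an illustrative stand-in, not the criterion from the paper: it assumes a hypothetical fixed bit cost per atom and a known distortion curve for each block, and repeatedly gives the next atom to the block where it reduces distortion the most per bit.

```python
import heapq


def allocate_atoms(distortion_curves, bits_per_atom, bit_budget):
    """Greedily distribute atoms over blocks under a global bit budget.

    distortion_curves[b][k] is the distortion of block b when coded with k atoms.
    Returns the number of atoms assigned to each block.
    """
    counts = [0] * len(distortion_curves)
    heap = []  # max-heap on distortion reduction per bit (stored negated)
    for b, curve in enumerate(distortion_curves):
        if len(curve) > 1:
            heapq.heappush(heap, (-(curve[0] - curve[1]) / bits_per_atom, b))

    spent = 0
    while heap and spent + bits_per_atom <= bit_budget:
        _, b = heapq.heappop(heap)
        counts[b] += 1
        spent += bits_per_atom
        k = counts[b]
        curve = distortion_curves[b]
        if k + 1 < len(curve):
            # Queue the gain of this block's next atom.
            heapq.heappush(heap, (-(curve[k] - curve[k + 1]) / bits_per_atom, b))
    return counts


# Hypothetical distortion curves for two blocks (index = number of atoms).
curves = [
    [100, 40, 30, 28],  # a textured block that benefits strongly from atoms
    [50, 45, 44, 44],   # a smooth block with little to gain
]
print(allocate_atoms(curves, bits_per_atom=10, bit_budget=30))
```

The greedy choice is optimal only when each distortion curve has diminishing returns (is convex in the number of atoms), which is a common but not guaranteed property.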
acm multimedia | 2010
Thanh-Toan Do; Ewa Kijak; Teddy Furon; Laurent Amsaleg
Many content-based image retrieval systems (CBIRS) describe images using the SIFT local features because of their very robust recognition capabilities. While SIFT features have proved to cope with a wide spectrum of general-purpose image distortions, their security has not yet been fully assessed. In one of their scenarios, Hsu et al. in [2] show that very specific anti-SIFT attacks can jeopardize the keypoint detection. These attacks can delude systems using SIFT that target applications such as image authentication and (pirated) copy detection. Having some expertise in CBIRS, we were extremely concerned by their analysis. This paper presents our own investigations on the impact of these anti-SIFT attacks on a real CBIRS indexing a large collection of images. The attacks are in fact not able to break the system. A detailed analysis explains this assessment.
international conference on acoustics, speech, and signal processing | 2012
Thanh-Toan Do; Ewa Kijak
Recently, the Histogram of Oriented Gradients (HOG) has been applied to face recognition. In this paper, we apply the Co-occurrence of Oriented Gradients (CoHOG), an extension of HOG, to the face recognition problem. Several weighting functions for the gradient magnitude are tested. We also propose a weighted approach for CoHOG, in which a weight is set for each subregion of the face image. Numerical experiments performed on the Yale and ORL datasets show that 1) CoHOG achieves higher recognition accuracy than HOG; 2) using gradient magnitude in CoHOG improves recognition results; and 3) the weighted CoHOG approach improves the recognition rate. The recognition results using CoHOG are competitive with some state-of-the-art methods. This demonstrates the effectiveness of the CoHOG descriptor for face recognition.
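As a rough illustration of the co-occurrence idea, the sketch below quantizes gradient orientations and counts how often pairs of orientations occur at a fixed pixel offset. It is a simplified stand-in, not the descriptor evaluated in the paper: the central-difference gradients, the 8 orientation bins, the single offset, and the product-of-magnitudes weighting are all assumptions made for the example.

```python
import math


def orientation_bins(image, n_bins=8):
    """Quantize gradient orientation at each interior pixel of a 2D list image."""
    h, w = len(image), len(image[0])
    bins = [[0] * w for _ in range(h)]
    mags = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # central differences
            gy = image[y + 1][x] - image[y - 1][x]
            angle = math.atan2(gy, gx) % (2 * math.pi)
            bins[y][x] = int(angle / (2 * math.pi) * n_bins) % n_bins
            mags[y][x] = math.hypot(gx, gy)  # flat pixels get magnitude 0 (bin 0)
    return bins, mags


def cooccurrence(bins, mags, offset, n_bins=8, weighted=False):
    """Count orientation pairs (pixel, pixel + offset) over interior pixels."""
    dy, dx = offset
    h, w = len(bins), len(bins[0])
    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            y2, x2 = y + dy, x + dx
            if 1 <= y2 < h - 1 and 1 <= x2 < w - 1:
                # Assumed weighting: product of the two gradient magnitudes.
                weight = mags[y][x] * mags[y2][x2] if weighted else 1.0
                matrix[bins[y][x]][bins[y2][x2]] += weight
    return matrix


# Toy 5x5 image with a horizontal intensity ramp: every gradient points right.
ramp = [[x for x in range(5)] for _ in range(5)]
bins, mags = orientation_bins(ramp)
plain = cooccurrence(bins, mags, offset=(0, 1))
weighted = cooccurrence(bins, mags, offset=(0, 1), weighted=True)
print(plain[0][0], weighted[0][0])
```

A full descriptor would concatenate such matrices over several offsets and over each subregion of the face image; a per-subregion weight, as proposed in the abstract, would then scale each subregion's contribution.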
international conference on acoustics, speech, and signal processing | 2012
Thanh-Toan Do; Ewa Kijak; Laurent Amsaleg; Teddy Furon
Content-Based Image Retrieval Systems (CBIRS) used in forensics-related contexts require very good image recognition capabilities. Whereas the robustness criterion has been extensively covered by the Computer Vision and Multimedia literature, neither of these communities has explored the security of CBIRS. Recently, preliminary studies have shown that real systems can be deluded by applying transformations to images that are very specific to the SIFT local description scheme commonly used for recognition. The work presented in this paper adds one strategy for attacking images, and thereby enlarges the toolbox that hackers can use for deluding systems. This paper shows how the orientation of keypoints can be tweaked, which in turn lowers the number of matches, since this deeply changes the final SIFT feature vectors. The method uses an SVM-based process to learn which visual patch should be applied to change the orientation of keypoints. Experiments with a database of 100,000 real-world images confirm the effectiveness of this keypoint-orientation attacking scheme.