Michail Krinidis
Aristotle University of Thessaloniki
Publication
Featured research published by Michail Krinidis.
IEEE Transactions on Image Processing | 2009
Michail Krinidis; Ioannis Pitas
This paper presents a new approach for the segmentation of color textured images, which is based on a novel energy function. The proposed energy function, which expresses the local smoothness of an image area, is derived by exploiting an intermediate step of modal analysis that is utilized in order to describe and analyze the deformations of a 3-D deformable surface model. The external forces that attract the 3-D deformable surface model combine the intensity of the image pixels with the spatial information of local image regions. The proposed image segmentation algorithm has two steps. First, a color quantization scheme, which is based on the node displacements of the deformable surface model, is utilized in order to decrease the number of colors in the image. Then, the proposed energy function is used as a criterion for a region growing algorithm. The final segmentation of the image is derived by a region merge approach. The proposed method was applied to the Berkeley segmentation database. The obtained results show good segmentation robustness when compared to other state-of-the-art image segmentation algorithms.
IEEE Transactions on Circuits and Systems for Video Technology | 2009
Nikos Nikolaidis; Michail Krinidis; Ioannis Pitas
In this letter, we introduce a novel approach for lip activity detection and speaker detection, using solely visual information. The main idea in this work is to apply signal detection algorithms to a simple and easily extracted feature from the mouth region. We argue that the increased average value and standard deviation of the number of low-intensity pixels in the mouth region of a speaking person can be used as visual cues for detecting visual speech. We then derive a statistical algorithm that utilizes this fact for the efficient characterization of visual speech and silence in video sequences. Furthermore, we employ the lip activity detection method in order to determine the active speaker(s) in a multi-person environment.
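The feature described in this abstract can be illustrated with a minimal Python sketch. It counts the low-intensity pixels in a mouth-region patch and flags a frame as speech when both the windowed mean and the windowed standard deviation of that count are high. The function names, the intensity threshold, the window length, and the fixed decision thresholds are all assumptions made for illustration; the statistical detector actually derived in the letter is not specified in the abstract and is not reproduced here.

```python
from statistics import mean, pstdev

def low_intensity_count(mouth_region, threshold=60):
    """Count pixels darker than `threshold` in a 2-D grayscale patch (hypothetical threshold)."""
    return sum(1 for row in mouth_region for px in row if px < threshold)

def detect_speech(counts, window=4, mean_thresh=4.0, std_thresh=1.0):
    """Return one boolean per frame: True when the mean and the standard
    deviation of the last `window` counts both exceed their thresholds.
    This fixed-threshold rule is an assumption, not the letter's algorithm."""
    labels = []
    for i in range(len(counts)):
        win = counts[max(0, i - window + 1): i + 1]
        labels.append(mean(win) > mean_thresh and pstdev(win) > std_thresh)
    return labels

# Synthetic per-frame counts: a closed mouth (few dark pixels, little variation)
# followed by a moving mouth (more dark pixels, strong variation).
counts = [1, 1, 1, 1, 1, 1, 3, 9, 2, 10, 4, 9]
labels = detect_speech(counts)
```

In practice the thresholds would have to be estimated from data, and a windowed detector of this kind lags the true onset of speech by up to one window length.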
Visual Communications and Image Processing | 2005
Georgios Stamou; Michail Krinidis; Nikos Nikolaidis; Ioannis Pitas
This paper presents a complete functional system capable of detecting people and tracking their motion in either live camera feed or pre-recorded video sequences. The system consists of two main modules, namely the detection and tracking modules. Automatic detection aims at locating human faces and is based on fusion of color and feature-based information. Thus, it is capable of handling faces in different orientations and poses (frontal, profile, intermediate). To avoid false detections, a number of decision criteria are employed. Tracking is performed using a variant of the well-known Kanade-Lucas-Tomasi tracker, while occlusion is handled through a re-detection stage. Manual intervention is allowed to assist both modules if required. In manual mode, the system can track any object of interest, so long as there are enough features to track.
International Conference on Acoustics, Speech, and Signal Processing | 2005
Michail Krinidis; Georgios Stamou; Heinz Teutsch; Sascha Spors; Nikos Nikolaidis; Rudolf Rabenstein; I. Pitas
This paper presents an audio-visual database that can be used as a reference database for testing and evaluation of video, audio or joint audio-visual person tracking algorithms, as well as speaker localization methods. Additional possible uses include the testing of face detection and pose estimation algorithms. A number of different scenes are included in the database, ranging from simple to complex scenes that can challenge existing algorithms. They include different subjects with appearances that can cause problems for video tracking algorithms (e.g., facial features such as beards and glasses), optimal and artificially created sub-optimal lighting conditions, subject movement based on simple as well as random motion trajectories, different distances from the camera/microphones, and occlusion. The database incorporates ground truth data (3D position in time) originating from a commercially available 4-camera infrared (IR) tracking system. Examples of how the database can be used to evaluate video and audio tracking algorithms are also provided.
IEEE Transactions on Circuits and Systems for Video Technology | 2007
Michail Krinidis; Nikos Nikolaidis; Ioannis Pitas
This paper presents a novel approach for selecting and tracking feature points in video sequences. In this approach, the image intensity is represented by a 3-D deformable surface model. The proposed approach relies on selecting and tracking feature points by exploiting the so-called generalized displacement vector that appears in the explicit surface deformation governing equations. This vector is proven to be a combination of the output of various line- and edge-detection masks, thus leading to distinct, robust features. The proposed method was compared, in terms of tracking accuracy and robustness, with a well-known tracking algorithm, Kanade-Lucas-Tomasi (KLT), and a tracking algorithm based on scale-invariant feature transform (SIFT) features. The proposed method was experimentally shown to be more precise and robust than both KLT and SIFT tracking. Moreover, the feature-point selection scheme was tested against the SIFT and Harris feature points, and it was demonstrated to provide superior results.
IEEE Transactions on Circuits and Systems for Video Technology | 2009
Michail Krinidis; Nikos Nikolaidis; Ioannis Pitas
This paper presents a novel approach for estimating 3-D head pose in single-view video sequences. Following initialization by a face detector, a tracking technique that utilizes a 3-D deformable surface model to approximate the facial image intensity is used to track the face in the video sequence. Head pose estimation is performed by using a feature vector which is a byproduct of the equations that govern the deformation of the surface model used in the tracking. The aforementioned vector is used as input to a radial basis function interpolation network in order to estimate the 3-D head pose. The proposed method was applied to the IDIAP head pose estimation database. The obtained results show that the method can estimate the head direction vector with very good accuracy.
Signal Processing: Image Communication | 2007
Michail Krinidis; Nikos Nikolaidis; Ioannis Pitas
This paper introduces the discrete modal transform (DMT), a 1D and 2D discrete, non-separable transform for signal processing, which, in the mathematical sense, is a generalization of the well-known discrete cosine transform (DCT). A 3D deformable surface model is used to represent the image intensity and the introduced discrete transform is a by-product of the explicit surface deformation governing equations. The properties of the proposed transform are similar to those of the DCT. To illustrate these properties, the proposed transform is applied to lossy image compression and the obtained results are compared to those of a DCT-based compression scheme. Experimental results show that DMT, which includes an embedded compression ratio selection mechanism, has excellent energy compaction properties and achieves comparable compression results to DCT at low compression ratios, while being in general better than DCT at high compression ratios.
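The energy compaction property mentioned in this abstract can be illustrated with the DCT baseline alone. The sketch below is plain Python and implements the standard orthonormal DCT-II and its inverse, not the DMT itself, whose definition is not given in the abstract. The helper `energy_kept` and the sample signal are hypothetical names and data chosen for illustration.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def energy_kept(x, keep):
    """Fraction of signal energy held by the first `keep` DCT coefficients."""
    c = dct2(x)
    total = sum(v * v for v in c)
    return sum(v * v for v in c[:keep]) / total

# A smooth ramp, standing in for a slowly varying row of image intensities.
signal = [10, 11, 12, 13, 14, 15, 16, 17]
fraction = energy_kept(signal, keep=2)
```

For smooth signals such as this ramp, almost all of the energy sits in the first few coefficients, which is why discarding the rest costs little in a lossy compression scheme.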
Journal on Multimodal User Interfaces | 2007
Georgios Stamou; Michail Krinidis; Nikos Nikolaidis; Ioannis Pitas
This paper presents a complete functional system capable of detecting people and tracking their motion in either live camera feed or pre-recorded video sequences. The system consists of two main modules, namely the detection and tracking modules. Automatic detection aims at locating human faces and is based on fusion of color and feature-based information. Thus, it is capable of handling faces in different orientations and poses (frontal, profile, intermediate). To avoid false detections, a number of decision criteria are employed. Tracking is performed using a variant of the well-known Kanade-Lucas-Tomasi tracker, while occlusion is handled through a re-detection stage. Manual intervention is allowed to assist both modules if required. In manual mode, the system can track any object of interest, so long as there are enough features to track. The system caters for calibrated cameras and can provide 3-D coordinates of any tracked object(s) of interest. It has been tested with very good results on a variety of video sequences, including a database of studio video sequences, for which 3-D ground truth data, originating from a 4-camera infrared tracking system, exist.
Information Sciences, Signal Processing and their Applications | 2007
Michail Krinidis; Nikos Nikolaidis; Ioannis Pitas
This paper presents a novel approach for estimating 3D head pose in single-view video sequences. Following initialization by a face detector, a tracking technique that utilizes a 3D deformable surface model to approximate the image intensity is used to track the face in the video sequence. Head pose estimation is performed by using a feature vector which is a by-product of the equations that govern the deformation of the surface model used in the tracking. The aforementioned vector is used for training support vector machines (SVM) in order to estimate the 3D head pose. The proposed method was applied to the IDIAP head pose estimation database. The obtained results show that the proposed method can achieve an accuracy of 82% if angles are estimated in 10° increments and 75% if angles are estimated in 5° increments.
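One plausible reading of accuracy "in 10° increments" is that true and estimated angles are snapped to the nearest multiple of the step and counted as correct when they land on the same value. The Python sketch below follows that reading. It is an assumption, since the abstract does not state the evaluation protocol, and the function names and angle values are invented for illustration.

```python
import math

def quantize(angle_deg, step):
    """Snap an angle to the nearest multiple of `step` degrees (ties round up)."""
    return step * math.floor(angle_deg / step + 0.5)

def binned_accuracy(true_angles, predicted_angles, step):
    """Fraction of predictions that fall in the same `step`-degree bin as the truth."""
    hits = sum(quantize(t, step) == quantize(p, step)
               for t, p in zip(true_angles, predicted_angles))
    return hits / len(true_angles)

# Made-up angles in degrees, for illustration only.
true_angles = [0, 12, -27, 41]
predicted_angles = [3, 14, -22, 33]
coarse = binned_accuracy(true_angles, predicted_angles, step=10)
fine = binned_accuracy(true_angles, predicted_angles, step=5)
```

Under this reading, a finer step gives the same estimator less tolerance, which matches the direction of the drop from 82% to 75% reported in the abstract.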
International Conference on Multimedia and Expo | 2007
Michail Krinidis; Nikos Nikolaidis; Ioannis Pitas
This paper introduces a 2D discrete, non-separable transform for image processing, which can be regarded as a combination of the well-known discrete cosine transform (DCT) with an analytically derived quantization table that includes a compression ratio selection parameter. A 3D deformable surface model is used to approximate the image intensity, and the introduced discrete transform is an intermediate step of the explicit surface deformation governing equations. The proposed transform is applied to lossy image compression and the obtained results are compared to those of a DCT-based compression scheme.