Peter Eisert
Humboldt University of Berlin
Publications
Featured research published by Peter Eisert.
international conference on multimedia and expo | 2006
Aljoscha Smolic; Karsten Mueller; Philipp Merkle; Christoph Fehn; Peter Kauff; Peter Eisert; Thomas Wiegand
An overview of 3D and free viewpoint video is given in this paper, with special focus on related standardization activities in MPEG. Free viewpoint video allows the user to freely navigate within real-world visual scenes, as known from virtual worlds in computer graphics. Examples are shown, highlighting standards-conformant realization using MPEG-4. Then the principles of 3D video are introduced, providing the user with a 3D depth impression of the observed scene. Example systems are described, again focusing on their realization based on MPEG-4. Finally, multi-view video coding is described as a key component for 3D and free viewpoint video systems. The conclusion is that the necessary technology, including standard media formats for 3D and free viewpoint video, is available or will be available in the near future, and that there is a clear demand from both industry and users for such applications. 3D TV at home and free viewpoint video on DVD will be available soon and will create huge new markets.
IEEE Computer Graphics and Applications | 1998
Peter Eisert; Bernd Girod
The authors present a model-based algorithm that estimates 3D motion and facial expressions from 2D image sequences showing head and shoulder scenes typical of video telephone and teleconferencing applications.
computer games | 2009
Audrius Jurgelionis; Philipp Fechteler; Peter Eisert; Francesco Bellotti; Haggai David; Jukka-Pekka Laulajainen; R. Carmichael; Vassilis Poulopoulos; Arto Laikari; P. Perälä; A. De Gloria; Christos Bouras
Video games are typically executed on Windows platforms with the DirectX API and require high-performance CPUs and graphics hardware. For pervasive gaming in various environments, such as homes, hotels, or Internet cafés, it is beneficial to run games also on mobile devices and modest-performance CE devices, avoiding the necessity of placing a noisy workstation in the living room or costly computers/consoles in each room of a hotel. This paper presents a new cross-platform approach for distributed 3D gaming in wired/wireless local networks. We introduce the novel system architecture and protocols used to transfer the game graphics data across the network to end devices. Simultaneous execution of video games on a central server and a novel approach for streaming the 3D graphics output to multiple end devices enable access to games on low-cost set-top boxes and handheld devices that natively lack the power to execute a game with high-quality graphical output.
Computers & Graphics | 1998
Frank Hartung; Peter Eisert; Bernd Girod
The enforcement of intellectual property rights is difficult for digital data. One possibility for supporting enforcement is the embedding of digital watermarks containing information about the copyright owner and/or the receiver of the data. Watermarking methods have already been presented for audio, images, video, and polygonal 3D models. In this paper, we present a method for digital watermarking of MPEG-4 facial animation parameter data sets. The watermarks are additively embedded into the parameter values and can be retrieved either from the watermarked parameters or from video sequences rendered using the watermarked animation parameters for head animation. For retrieval from rendered sequences, the facial animation parameters have to be estimated first. We use a model-based approach for the estimation of the facial parameters that combines a motion model of an explicit 3D textured wireframe with the optical flow constraint from the video data. This leads to a linear algorithm that is robustly solved in a hierarchical framework with low computational complexity. Experimental results confirm the applicability of the presented watermarking technique.
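The additive embedding described in the abstract can be illustrated with a minimal spread-spectrum sketch. This is not the paper's actual FAP pipeline; the parameter track, watermark length, and embedding strength below are hypothetical, and retrieval here assumes the original parameters are available (the paper additionally estimates them from rendered video):

```python
import numpy as np

def embed_watermark(faps, watermark, strength=0.5):
    """Additively embed a +/-1 watermark sequence into a parameter track."""
    return faps + strength * watermark

def detect_watermark(received, original, watermark):
    """Correlate the residual with the watermark; a value near the
    embedding strength indicates the watermark is present."""
    residual = received - original
    return float(np.dot(residual, watermark) / len(watermark))

rng = np.random.default_rng(0)
faps = rng.normal(0.0, 10.0, 256)           # hypothetical FAP values over time
wm = rng.choice([-1.0, 1.0], 256)           # pseudo-random watermark sequence
marked = embed_watermark(faps, wm, strength=0.5)
corr = detect_watermark(marked, faps, wm)   # close to the embedding strength
```

Because each watermark chip squares to one, the correlation recovers the embedding strength exactly when no distortion is applied; robustness against rendering and estimation noise is what the paper's experiments evaluate.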
international conference on image processing | 2005
Karsten Müller; Aljoscha Smolic; Matthias Kautzner; Peter Eisert; Thomas Wiegand
An efficient algorithm for compression of dynamic, time-consistent 3D meshes is presented. Such a sequence of meshes contains a large degree of temporal statistical dependencies that can be exploited for compression using DPCM. The vertex positions are predicted at the encoder from a previously decoded mesh. The difference vectors are further clustered in an octree approach; only a representative for each cluster of difference vectors is processed further, providing a significant reduction of the data rate. The representatives are scaled, quantized, and finally entropy coded using CABAC, the arithmetic coding technique used in H.264/MPEG-4 AVC. The mesh is then reconstructed at the encoder for prediction of the next mesh. In our experiments, we compare the proposed algorithm in terms of bit rate and quality against static mesh coding and interpolator compression, indicating a significant improvement in compression efficiency.
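The temporal DPCM loop at the heart of this scheme can be sketched as follows. This is a simplified illustration with uniform scalar quantization only; the octree clustering of difference vectors and the CABAC entropy coding stage are omitted, and the mesh data and step size are hypothetical:

```python
import numpy as np

def encode_frame(current, prev_decoded, step=0.01):
    """DPCM step: predict each vertex from the previously *decoded* mesh,
    quantize the difference vectors, and reconstruct at the encoder so that
    encoder and decoder stay in sync for the next frame."""
    diff = current - prev_decoded
    symbols = np.round(diff / step).astype(int)  # would be entropy coded (CABAC)
    decoded = prev_decoded + symbols * step      # encoder-side reconstruction
    return symbols, decoded

rng = np.random.default_rng(1)
mesh0 = rng.normal(size=(100, 3))                 # previous decoded mesh
mesh1 = mesh0 + 0.05 * rng.normal(size=(100, 3))  # small temporal motion
symbols, recon = encode_frame(mesh1, mesh0, step=0.01)
err = float(np.max(np.abs(recon - mesh1)))        # bounded by step / 2
```

Predicting from the previously decoded mesh, rather than the previous original mesh, prevents quantization error from accumulating over the sequence.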
IEEE Transactions on Circuits and Systems for Video Technology | 2000
Peter Eisert; Eckehard G. Steinbach; Bernd Girod
A system for the automatic reconstruction of real-world objects from multiple uncalibrated camera views is presented. The camera position and orientation for all views, the 3-D shape of the rigid object, as well as the associated color information, are recovered from the image sequence. The system proceeds in four steps. First, the internal camera parameters describing the imaging geometry are calibrated using a reference object. Second, an initial 3-D description of the object is computed from two views. This model information is then used in a third step to estimate the camera positions for all available views using a novel linear 3-D motion and shape estimation algorithm. The main feature of this third step is the simultaneous estimation of 3-D camera-motion parameters and object shape refinement with respect to the initial 3-D model. The initial 3-D shape model exhibits only a few degrees of freedom, and the object shape refinement is defined as a flexible deformation of the initial shape model. Our formulation of the shape deformation allows the object texture to slide on the surface, which differs from traditional flexible body modeling. This novel combined shape and motion estimation using sliding texture considerably improves the calibration data of the individual views in comparison to fixed-shape model-based camera-motion estimation. Since the shape model used for model-based camera-motion estimation is only approximate, a volumetric 3-D reconstruction process is initiated in the fourth step that combines the information from all views simultaneously. The recovered object consists of a set of voxels with associated color information that describes even fine structures and details of the object. New views of the object can be rendered from the recovered 3-D model, which has potential applications in virtual reality or multimedia systems and the emerging field of video coding using 3-D scene models.
visual communications and image processing | 2003
Peter Eisert
In this paper, a next-generation 3-D video conferencing system is presented that provides immersive tele-presence and natural representation of all participants in a shared virtual meeting space. The system is based on the principle of a shared virtual table environment, which guarantees correct eye contact and gesture reproduction and enhances the quality of human-centered communication. The virtual environment is modeled in MPEG-4, which also allows the seamless integration of explicit 3-D head models for a low-bandwidth connection to mobile users. In this case, facial expression and motion information is transmitted instead of video streams, resulting in bit rates of a few kbit/s per participant. Besides low bit rates, the model-based approach enables new possibilities for image enhancement, such as digital make-up, digital dressing, or modification of scene lighting.
IEEE Transactions on Circuits and Systems for Video Technology | 2000
Peter Eisert; Thomas Wiegand; Bernd Girod
We show that traditional waveform coding and 3-D model-based coding are not competing alternatives, but should be combined to support and complement each other. Both approaches are combined such that the generality of waveform coding and the efficiency of 3-D model-based coding are available where needed. The combination is achieved by providing the block-based video coder with a second reference frame for prediction, which is synthesized by the model-based coder. The model-based coder uses a parameterized 3-D head model, specifying the shape and color of a person. We therefore restrict our investigations to typical videotelephony scenarios that show head-and-shoulder scenes. Motion and deformation of the 3-D head model constitute facial expressions, which are represented by facial animation parameters (FAPs) based on the MPEG-4 standard. An intensity gradient-based approach that exploits the 3-D model information is used to estimate the FAPs, as well as illumination parameters that describe changes of the brightness in the scene. Model failures and objects that are not known at the decoder are handled by standard block-based motion-compensated prediction, which is not restricted to a special scene content, but results in lower coding efficiency. A Lagrangian approach is employed to determine the most efficient prediction for each block from either the synthesized model frame or the previous decoded frame. Experiments on five video sequences show that bit-rate savings of about 35% are achieved at equal average peak signal-to-noise ratio (PSNR) when comparing the model-aided codec to TMN-10, the state-of-the-art test model of the H.263 standard. This corresponds to a gain of 2-3 dB in PSNR when encoding at the same average bit rate.
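The per-block Lagrangian decision between the two reference frames can be sketched as below. The distortion measure (SSD), the block values, the rate figures, and the lambda value are all hypothetical stand-ins, not the codec's actual rate-control settings:

```python
import numpy as np

def choose_reference(block, pred_model, pred_inter, rate_model, rate_inter, lam):
    """Pick the prediction with the smaller Lagrangian cost J = D + lambda * R,
    where D is the SSD between the block and its prediction and R is the
    estimated bit cost of signaling that prediction."""
    j_model = float(np.sum((block - pred_model) ** 2)) + lam * rate_model
    j_inter = float(np.sum((block - pred_inter) ** 2)) + lam * rate_inter
    return "model" if j_model <= j_inter else "inter"

block = np.array([10.0, 12.0, 11.0, 13.0])
pred_model = np.array([10.0, 12.0, 11.0, 12.0])  # slight model mismatch, cheap rate
pred_inter = np.array([10.0, 12.0, 11.0, 13.0])  # perfect match, but costs more bits
mode = choose_reference(block, pred_model, pred_inter,
                        rate_model=2.0, rate_inter=20.0, lam=0.5)
```

With these numbers the model-frame prediction wins despite its small residual error, because its lower rate outweighs the distortion penalty; a larger lambda shifts the trade-off further toward cheap predictions.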
international conference on acoustics speech and signal processing | 1999
Peter Eisert; Eckehard G. Steinbach; Bernd Girod
In this paper we present a volumetric method for the 3-D reconstruction of real world objects from multiple calibrated camera views. The representation of the objects is fully volume-based and no explicit surface description is needed. The approach is based on multi-hypothesis tests of the voxel model back-projected into the image planes. All camera views are incorporated in the reconstruction process simultaneously and no explicit data fusion is needed. In a first step each voxel of the viewing volume is filled with several color hypotheses originating from different camera views. This leads to an overcomplete representation of the 3-D object and each voxel typically contains multiple hypotheses. In a second step only those hypotheses remain in the voxels which are consistent with all camera views where the voxel is visible. Voxels without a valid hypothesis are considered to be transparent. The methodology of our approach combines the advantages of silhouette-based and image feature-based methods. Experimental results on real and synthetic image data show the excellent visual quality of the voxel-based 3-D reconstruction.
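The hypothesis-pruning step of this volumetric approach can be sketched as a simple consistency test. The RGB values, the number of views, and the tolerance below are hypothetical, and visibility handling (which views actually see a given voxel) is assumed to be resolved already:

```python
import numpy as np

def surviving_hypotheses(hypotheses, observations, tol=10.0):
    """Keep only the color hypotheses of a voxel that are consistent
    (within tol, Euclidean distance in RGB) with the observed color in
    every camera view where the voxel is visible. An empty result marks
    the voxel as transparent."""
    return [h for h in hypotheses
            if all(np.linalg.norm(np.asarray(h, float) - np.asarray(o, float)) <= tol
                   for o in observations)]

# Hypothetical colors observed for one voxel in three camera views:
views = [(200, 40, 40), (198, 43, 41), (203, 38, 44)]
hyps = [(200, 40, 40), (90, 90, 90)]  # one plausible and one spurious hypothesis
kept = surviving_hypotheses(hyps, views, tol=10.0)
```

The spurious hypothesis fails the test against every view and is discarded, while the consistent one survives; a voxel whose list empties out is treated as transparent, which is how the method combines silhouette-like carving with color consistency.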
international conference on image processing | 2010
Frederik Zilly; Marcus Müller; Peter Eisert; Peter Kauff
The paper discusses an assistance system for stereo shooting and 3D production called the Stereoscopic Analyzer (STAN). A feature-based scene analysis estimates the relative pose of the two cameras in real time in order to allow optimal camera alignment and lens settings directly on set. The system automatically eliminates undesired vertical disparities and geometrical distortions through image rectification. In addition, it detects the position of near and far objects in the scene to derive the optimal inter-axial distance (stereo baseline), and gives a framing alert in case of a stereoscopic window violation. Against this background, the paper describes the system architecture, explains the theoretical background, and discusses future developments.
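Deriving a baseline from the detected near and far objects can be illustrated with the common parallel-rig disparity model; this is a textbook rule of thumb, not necessarily STAN's exact computation, and the focal length, depth range, and disparity budget below are hypothetical:

```python
def interaxial_distance(focal_px, z_near, z_far, disparity_budget_px):
    """Parallel-camera model: the disparity spread between the nearest and
    farthest scene point is d = f * b * (1/z_near - 1/z_far). Solve for the
    baseline b that exactly fills the allowed disparity budget."""
    return disparity_budget_px / (focal_px * (1.0 / z_near - 1.0 / z_far))

# Hypothetical shot: 1000 px focal length, scene depth from 2 m to 10 m,
# and a 40 px screen-disparity budget:
b = interaxial_distance(1000.0, 2.0, 10.0, 40.0)  # baseline in metres
```

Shrinking the depth range or enlarging the budget permits a wider baseline; this is why detecting the actual near and far objects on set, as STAN does, matters for choosing the inter-axial distance.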