Giorgio Zoia
École Polytechnique Fédérale de Lausanne
Publication
Featured research published by Giorgio Zoia.
IEEE Transactions on Audio, Speech, and Language Processing | 2008
Ruohua Zhou; Marco Mattavelli; Giorgio Zoia
This paper describes a new method for music onset detection. The novelty of the approach consists mainly of two elements: the time-frequency processing and the detection stages. The resonator time-frequency image (RTFI) is the basic time-frequency analysis tool. The time-frequency processing stage transforms the RTFI energy spectrum into more natural energy-change and pitch-change cues, which then serve as input to the onset detection stage. Two detection algorithms have been developed: an energy-based algorithm and a pitch-based one. The energy-based detection algorithm exploits energy-change cues and performs particularly well for the detection of hard onsets. The pitch-based algorithm successfully exploits stable pitch cues for onset detection in polyphonic music, and achieves much better performance than the energy-based algorithm when applied to the detection of soft onsets. Results for both the energy-based and pitch-based detection algorithms have been obtained on a large music dataset.
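The idea of an energy-change cue can be illustrated with a minimal sketch. It is not the RTFI-based method of the paper: it assumes plain frame energies instead of a time-frequency image, and the names `frame_energies` and `energy_onsets`, the frame sizes, and the threshold are all hypothetical choices made for illustration.

```python
import math


def frame_energies(signal, frame_size, hop):
    """Sum of squared samples for each (overlapping) frame."""
    energies = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        energies.append(sum(x * x for x in frame))
    return energies


def energy_onsets(signal, frame_size=64, hop=32, threshold=1.0):
    """Return indices of frames whose log-energy rise is a local peak
    above `threshold` (an assumed, hand-tuned value)."""
    eps = 1e-10  # avoids log(0) on silent frames
    log_e = [math.log(e + eps)
             for e in frame_energies(signal, frame_size, hop)]
    # Half-wave rectified difference: only increases in energy count.
    change = [max(0.0, b - a) for a, b in zip(log_e, log_e[1:])]
    onsets = []
    for i, c in enumerate(change):
        left = change[i - 1] if i > 0 else 0.0
        right = change[i + 1] if i + 1 < len(change) else 0.0
        if c > threshold and c >= left and c > right:
            onsets.append(i + 1)  # change[i] is the rise into frame i + 1
    return onsets
```

A detector of this kind reacts to abrupt energy rises (hard onsets). A soft onset, where pitch changes with little energy change, would produce no peak here, which is the case the abstract assigns to the pitch-based algorithm.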
IEEE MultiMedia | 2005
Pierfrancesco Bellini; Paolo Nesi; Giorgio Zoia
With the spread of computer technology into the artistic fields, new application scenarios for computer-based applications of symbolic music representation (SMR) have been identified. The integration of SMR in a versatile multimedia framework such as MPEG will enable the development of a huge number of new applications in the entertainment, education, and information delivery domains.
EURASIP Journal on Advances in Signal Processing | 2009
Ruohua Zhou; Joshua D. Reiss; Marco Mattavelli; Giorgio Zoia
This paper presents a computationally efficient method for polyphonic pitch estimation. The method employs the Fast Resonator Time-Frequency Image (RTFI) as the basic time-frequency analysis tool. The approach is composed of two main stages. First, a preliminary pitch estimation is obtained by means of a simple peak-picking procedure in the pitch energy spectrum. This spectrum is calculated from the original RTFI energy spectrum according to harmonic grouping principles. Then the incorrect estimations are removed according to spectral irregularity and knowledge of the harmonic structures of the music notes played on commonly used musical instruments. The new approach is compared with a variety of other frame-based polyphonic pitch estimation methods, and results demonstrate the high performance and computational efficiency of the approach.
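The first stage (harmonic grouping followed by peak picking) can be sketched roughly as follows. This is a simplified illustration, not the paper's algorithm: it assumes a linearly spaced energy spectrum rather than an RTFI, averages a fixed number of harmonics, and replaces the second-stage pruning with a plain threshold. The function names and parameters are hypothetical.

```python
def pitch_energy_spectrum(energy, num_harmonics=4):
    """For each candidate fundamental bin k, average the energy found
    at bins k, 2k, ..., num_harmonics * k (a crude harmonic grouping)."""
    n = len(energy)
    pitch = [0.0] * n
    for k in range(1, n):
        harmonics = [energy[h * k]
                     for h in range(1, num_harmonics + 1) if h * k < n]
        # Only score candidates whose harmonics all lie inside the spectrum.
        if len(harmonics) == num_harmonics:
            pitch[k] = sum(harmonics) / num_harmonics
    return pitch


def pick_peaks(values, threshold):
    """Indices of local maxima that exceed `threshold`."""
    return [i for i in range(1, len(values) - 1)
            if values[i] > threshold
            and values[i] > values[i - 1]
            and values[i] >= values[i + 1]]
```

With two notes whose fundamentals sit at bins 5 and 7, the octave candidates at bins 10 and 14 collect only half of the harmonic energy, so a threshold between the two levels keeps the true fundamentals and rejects the octave errors.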
multimedia signal processing | 2004
Giorgio Zoia; Ruohua Zhou; Daniel Mlynek
The automatic analysis of polyphonic sound is still a very challenging task, not only for computational reasons but also because of the lack of suitable techniques and restrictions in the application field, or sometimes because of unrealistic goals. Remarkable progress has been made in the last decade, but a practical and generic solution to this problem is still hard to find. In this paper, we propose a rather general solution for a chord/harmony analyzer, which is able to provide good results for different instruments and polyphonic sounds. It is based on the combination of signal processing and neural networks. The sound is analyzed in both the time and frequency domains by an original hybrid technique. A rather innovative approach is then used to classify the chords and extract their evolution time. The proposed overall method aims to implement a general-purpose listening machine, whose approximated results and approach are nevertheless general enough to allow the implementation of very useful applications.
IEEE Transactions on Multimedia | 2003
Giorgio Zoia; Claudio Alberti
The MPEG-4 audio standard provides several toolsets for natural and synthetic sound coding. Among them, the most innovative in terms of multimedia applications is Structured Audio (SA), which implements a high-level, structured description of sound instead of the usual compression techniques based on psychoacoustics and subband analysis. SA allows synthesis and processing algorithms to be encoded in its Structured Audio Orchestra Language (SAOL), and it can theoretically be used to specify any other audio decoder. This great flexibility introduces a challenging implementation problem, which in a normative framework has to be solved by a systematic approach. The first part of the paper describes how the SA decoding process can be analyzed in a platform-independent way in order to determine the fundamental figures of this coding technique; it then shows how the proposed method is being used for the MPEG-4 SA conformance test. The second part presents the design of a virtual digital signal processor (DSP) architecture, based on the results of the complexity analysis. This architecture is able to exploit the intrinsic data-level parallelism of many audio algorithms and to consistently reduce the implementation cost. Experimental results prove the effectiveness of the approach and its suitability for implementations on modern superscalar DSPs and multimedia processors.
international conference on consumer electronics | 1999
Laurent Le Bourhis; Giorgio Zoia; Marco Mattavelli; Daniel Mlynek
This paper presents an efficient software host/co-processor architecture for the implementation of MPEG-4 audio composition. The proposed solution is based on a specific partition between general-purpose tasks and DSP-oriented functionality, thus achieving portability, efficient partitioning of the processing resources and memory management.
bioRxiv | 2018
Claudio Alberti; Tom Paridaens; Jan Voges; Daniel Naro; Junaid Jameel Ahmad; Massimo Ravasi; Daniele Renzi; Giorgio Zoia; Idoia Ochoa; Marco Mattavelli; Jaime Delgado; Mikel Hernaez
The MPEG-G standardization initiative is a coordinated international effort to specify a compressed data format that enables large scale genomic data to be processed, transported and shared. The standard consists of a set of specifications (i.e., a book) describing: i) a normative format syntax, and ii) a normative decoding process to retrieve the information coded in a compliant file or bitstream. Such decoding process enables the use of leading-edge compression technologies that have exhibited significant compression gains over currently used formats for storage of unaligned and aligned sequencing reads. Additionally, the standard provides a wealth of much needed functionality, such as selective access, data aggregation, application programming interfaces to the compressed data, standard interfaces to support data protection mechanisms, support for streaming and a procedure to assess the conformance of implementations. ISO/IEC is engaged in supporting the maintenance and availability of the standard specification, which guarantees the perenniality of applications using MPEG-G. Finally, the standard ensures interoperability and integration with existing genomic information processing pipelines by providing support for conversion from the FASTQ/SAM/BAM file formats. In this paper we provide an overview of the MPEG-G specification, with particular focus on the main advantages and novel functionality it offers. As the standard only specifies the decoding process, encoding performance, both in terms of speed and compression ratio, can vary depending on specific encoder implementations, and will likely improve during the lifetime of MPEG-G. Hence, the performance statistics provided here are only indicative baseline examples of the technologies included in the standard.
international conference on multimedia and expo | 2002
Stefano Battista; Giorgio Zoia; Aleksandar Simeonov; Ruohua Zhou
Natural and structured audio representations can be characterized by the lack or presence of a model describing the sound, respectively; combination of the two approaches can lead to efficient and improved storage and transmission of both speech and music, mixing less efficient but general technologies with more compact and specialized models. Integration of natural audio tracks with structured sound and 3D spatial processing is a challenging effort, especially when the audio scene requires high quality and precise synchronization with video and graphic information, as is the case in professional multimedia and virtual reality frameworks. In this paper natural and structured sound are surveyed and a new player is presented, which supports all the mentioned technologies in a normative context.
international conference on image processing | 1998
Marco Mattavelli; Giorgio Zoia
Motion estimation represents the most computationally intensive task for all efficient motion-compensated compression standards. This fact, despite several efforts aimed at reducing its complexity, still constitutes a serious obstacle to obtaining the highest quality results theoretically achievable by the standard. This is particularly evident when critical conditions occur, such as when very large search windows are needed to have efficient motion prediction for sequences containing large displacements. In this paper we present a new block motion estimation technique, based on the combination of the tracing of trajectories obtained from the already coded motion field and a genetic heuristic search. This technique can be applied to any group-of-pictures structure of any block-based compression standard, with a complexity reduction factor of up to more than two orders of magnitude at optimal coding results.
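To see why large search windows are costly, consider the exhaustive baseline that such techniques try to avoid. The sketch below is a generic full-search block matcher using the sum of absolute differences (SAD); it is not the trajectory-tracing or genetic search described in the abstract, and the function names and frame layout (lists of rows of integer pixel values) are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x]
                         - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, radius):
    """Test every displacement within +/- radius and return
    (dx, dy, cost) for the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= by + dy and by + dy + size <= height
                    and 0 <= bx + dx and bx + dx + size <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

The number of candidates grows as (2 * radius + 1) squared, each costing size squared pixel comparisons, so doubling the search radius roughly quadruples the work per block. A heuristic search evaluates only a small subset of these candidates.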
international symposium/conference on music information retrieval | 2005
Nicolas Scaringella; Giorgio Zoia