Andrea Cogliati
University of Rochester
Publications
Featured research published by Andrea Cogliati.
Optics Express | 2016
Andrea Cogliati; Cristina Canavesi; Adam Hayes; Patrice Tankam; Virgil-Florin Duma; Anand P. Santhanam; Kevin P. Thompson; Jannick P. Rolland
High-speed scanning in optical coherence tomography (OCT) often comes with compromises in image quality, the need for post-processing of the acquired images, or both. We report on distortion-free OCT volumetric imaging with a dual-axis micro-electro-mechanical system (MEMS)-based handheld imaging probe. In the context of an imaging probe with optics located between the 2D MEMS and the sample, we report on how pre-shaped open-loop input signals with tailored non-linear parts were implemented in a custom control board and, unlike the sinusoidal signals typically used for MEMS, achieved real-time distortion-free imaging without post-processing. The MEMS mirror was integrated into a compact, lightweight handheld probe. The MEMS scanner achieved a 12-fold reduction in volume and a 17-fold reduction in weight over a previous dual-mirror galvanometer-based scanner. Distortion-free imaging with no post-processing with a Gabor-domain optical coherence microscope (GD-OCM), with 2 μm axial and lateral resolutions over a field of view of 1 × 1 mm², is demonstrated experimentally through volumetric images of a regular microscopic structure, an excised human cornea, and in vivo human skin.
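The key idea (a scan that is linear over most of the period, with smooth non-linear turnarounds, rather than a sinusoid) can be sketched as follows. This is only an illustration: the triangle-smoothed-by-a-Hann-window construction below is an assumption made for the example, not the paper's actual waveform pre-shaping or control-board implementation.

```python
import numpy as np

def shaped_scan_waveform(n=1000, smooth_frac=0.08):
    """One period of an illustrative pre-shaped open-loop scan signal:
    linear sweeps joined by smooth turnarounds, so that the scan speed is
    constant over most of the period instead of varying sinusoidally."""
    t = np.arange(n) / n
    # triangle wave in [-1, 1]: linear up on [0, 0.5), linear down on [0.5, 1)
    tri = 2.0 * np.abs(2.0 * (t - np.floor(t + 0.5))) - 1.0
    # smooth the sharp corners with a normalized Hann kernel (odd length)
    m = max(3, int(smooth_frac * n) | 1)
    kern = np.hanning(m)
    kern /= kern.sum()
    # circular convolution (via FFT) keeps the waveform periodic
    return np.real(np.fft.ifft(np.fft.fft(tri) * np.fft.fft(kern, n)))
```

With `n = 1000` and `smooth_frac = 0.08`, most of each half-period is exactly linear, which is what makes distortion-free imaging possible without resampling the acquired data.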
IEEE International Workshop on Machine Learning for Signal Processing (MLSP) | 2015
Andrea Cogliati; Zhiyao Duan; Brendt Wohlberg
Automatic music transcription (AMT) is the process of converting an acoustic musical signal into a symbolic musical representation, such as a MIDI file, which contains the pitches, the onsets and offsets of the notes and, possibly, their dynamics and sources (i.e., instruments). Most existing algorithms for AMT operate in the frequency domain, which introduces the well-known time/frequency resolution trade-off of the Short-Time Fourier Transform and its variants. In this paper, we propose a time-domain transcription algorithm based on an efficient convolutional sparse coding algorithm in an instrument-specific scenario, i.e., the dictionary is trained and tested on the same piano. The proposed method outperforms a current state-of-the-art AMT method by over 26% in F-measure, achieving a median F-measure of 93.6%, and drastically increases both time and frequency resolutions, especially for the lowest octaves of the piano keyboard.
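A minimal time-domain convolutional sparse coding step can be illustrated with plain ISTA. The paper uses an efficient CSC solver and a dictionary recorded from the target piano; the loop below is only a didactic stand-in with made-up parameter values.

```python
import numpy as np

def csc_ista(s, D, lam=0.01, n_iter=300):
    """Time-domain convolutional sparse coding by plain ISTA (didactic sketch).

    Minimizes 0.5 * ||s - sum_k d_k * x_k||_2^2 + lam * sum_k ||x_k||_1,
    where * is linear convolution. s: signal (N,); D: dictionary (K, M) of
    note waveforms; returns activations X of shape (K, N - M + 1).
    """
    K, M = D.shape
    T = len(s) - M + 1
    X = np.zeros((K, T))
    # a safe Lipschitz bound for the data-term gradient: sum_k ||d_k||_1^2
    L = np.sum(np.sum(np.abs(D), axis=1) ** 2) + 1e-12
    for _ in range(n_iter):
        # residual of the current reconstruction
        r = sum(np.convolve(X[k], D[k]) for k in range(K)) - s
        for k in range(K):
            # gradient w.r.t. x_k is the correlation of the residual with d_k
            X[k] -= np.correlate(r, D[k], mode="valid") / L
        # proximal step for the l1 penalty: soft-thresholding
        X = np.sign(X) * np.maximum(np.abs(X) - lam / L, 0.0)
    return X
```

Because everything stays in the time domain, an activation spike localizes an onset to one sample, which is where the time-resolution gain over spectrogram-based methods comes from.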
IEEE Signal Processing Letters | 2017
Andrea Cogliati; Zhiyao Duan; Brendt Wohlberg
This letter extends our prior work on context-dependent piano transcription to estimate the length of the notes in addition to their pitch and onset. This approach employs convolutional sparse coding along with lateral inhibition constraints to approximate a musical signal as the sum of piano note waveforms (dictionary elements) convolved with their temporal activations. The waveforms are pre-recorded for the specific piano to be transcribed in the specific environment. A dictionary containing multiple waveforms per pitch is generated by truncating a long waveform for each pitch to different lengths. During transcription, the dictionary elements are fixed and their temporal activations are estimated and post-processed to obtain the pitch, onset, and note length estimation. A sparsity penalty promotes globally sparse activations of the dictionary elements, and a lateral inhibition term penalizes concurrent activations of different waveforms corresponding to the same pitch within a temporal neighborhood, to achieve note length estimation. Experiments on the MAPS (MIDI Aligned Piano Sounds) dataset show that the proposed approach significantly outperforms a state-of-the-art music transcription method trained in the same context-dependent setting in transcription accuracy.
IEEE Transactions on Audio, Speech, and Language Processing | 2016
Andrea Cogliati; Zhiyao Duan; Brendt Wohlberg
This paper presents a novel approach to automatic transcription of piano music in a context-dependent setting. This approach employs convolutional sparse coding to approximate the music waveform as the summation of piano note waveforms (dictionary elements) convolved with their temporal activations (onset transcription). The piano note waveforms are pre-recorded for the specific piano to be transcribed in the specific environment. During transcription, the note waveforms are fixed and their temporal activations are estimated and post-processed to obtain the pitch and onset transcription. This approach works in the time domain, models temporal evolution of piano notes, and estimates pitches and onsets simultaneously in the same framework. Experiments show that it significantly outperforms a state-of-the-art music transcription method trained in the same context-dependent setting, in both transcription accuracy and time precision, in various scenarios including synthetic, anechoic, noisy, and reverberant environments.
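The post-processing step, turning estimated activations into pitch and onset events, can be sketched as simple thresholding plus peak merging. The threshold and merging-window values below are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def activations_to_notes(X, fs, threshold=0.1, min_gap=0.05):
    """Toy post-processing of CSC activations into (pitch_index, onset_seconds).

    Peaks below `threshold` (relative to the global maximum) are discarded,
    and peaks of the same pitch closer than `min_gap` seconds are merged,
    keeping the earliest one in each run.
    """
    events = []
    gap = int(min_gap * fs)
    thr = threshold * np.max(np.abs(X))
    for k, xk in enumerate(np.abs(X)):
        last = -gap - 1
        for t in np.flatnonzero(xk >= thr):
            if t - last > gap:
                events.append((k, t / fs))
            last = t
    return sorted(events, key=lambda e: e[1])
```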
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2015
Andrea Cogliati; Zhiyao Duan
Automatic music transcription (AMT) is the process of converting an acoustic musical signal into a symbolic musical representation such as a MIDI piano roll, which contains the pitches, the onsets and offsets of the notes and, possibly, their dynamics and sources (i.e., instruments). Existing algorithms for AMT commonly identify pitches and their saliences in each frame and then form notes in a post-processing stage, which applies a combination of thresholding, pruning and smoothing operations. Very few existing methods consider the note temporal evolution over multiple frames during the pitch identification stage. In this work, we propose a note-based spectrogram factorization method that uses the entire temporal evolution of piano notes as a template dictionary. The method uses an artificial neural network to detect note onsets from the audio spectral flux. Next, it estimates the notes present in each audio segment between two successive onsets with a greedy search algorithm. Finally, the spectrogram of each segment is factorized using a discrete combination of note templates composed of full note spectrograms of individual piano notes sampled at different dynamic levels. We also propose a new psychoacoustically informed measure for spectrogram similarity.
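The greedy search can be illustrated as follows: at each step, add the note template whose inclusion most reduces the least-squares residual of the segment, and stop when no remaining template helps. This is a generic sketch operating on flattened template vectors with an ordinary least-squares fit; the paper's actual search, its dictionary of full note spectrograms at multiple dynamic levels, and its psychoacoustic similarity measure are not reproduced here.

```python
import numpy as np

def greedy_note_selection(segment, templates, max_notes=4, tol=1e-3):
    """Greedily pick note templates that best explain a segment.

    segment: (F,) flattened magnitude spectrogram of one inter-onset segment;
    templates: (K, F) flattened note templates. Returns the sorted indices of
    the selected templates.
    """
    chosen = []
    residual_norm = np.linalg.norm(segment)
    for _ in range(max_notes):
        best, best_norm = None, residual_norm
        for k in range(len(templates)):
            if k in chosen:
                continue
            A = templates[chosen + [k]].T           # basis of candidate set
            coef, *_ = np.linalg.lstsq(A, segment, rcond=None)
            norm = np.linalg.norm(segment - A @ coef)
            if norm < best_norm - tol:              # must improve by > tol
                best, best_norm = k, norm
        if best is None:
            break
        chosen.append(best)
        residual_norm = best_norm
    return sorted(chosen)
```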
Optifab 2015 | 2015
Cristina Canavesi; Andrea Cogliati; Adam Hayes; Anand P. Santhanam; Patrice Tankam; Jannick P. Rolland
Fast, robust, nondestructive 3D imaging is needed for characterization of microscopic structures in industrial and clinical applications. A custom micro-electromechanical system (MEMS)-based 2D scanner system was developed to achieve 55 kHz A-scan acquisition in a Gabor-domain optical coherence microscopy (GD-OCM) instrument with a novel multilevel GPU architecture for high-speed imaging. GD-OCM yields high-definition volumetric imaging with dynamic depth of focusing through a bio-inspired liquid lens-based microscope design, which has no moving parts and is suitable for use in a manufacturing setting or in a medical environment. A dual-axis MEMS mirror was chosen to replace two single-axis galvanometer mirrors; as a result, the astigmatism caused by the mismatch between the optical pupil and the scanning location was eliminated and a 12× reduction in volume of the scanning system was achieved. Imaging at an invariant resolution of 2 μm was demonstrated throughout a volume of 1 × 1 × 0.6 mm³, acquired in less than 2 minutes. The MEMS-based scanner resulted in improved image quality, increased robustness and lighter weight of the system – all factors that are critical for deployment in the field. A custom integrated feedback system consisting of a laser diode and a position-sensing detector was developed to investigate the impact of the resonant frequency of the MEMS and the driving signal of the scanner on the movement of the mirror. Results on the metrology of manufactured materials and characterization of tissue samples with GD-OCM are presented.
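A back-of-envelope check of the quoted figures, assuming one A-scan per 2 μm resolution element across the 1 mm lateral field (the actual sampling density and the number of Gabor focus zones are not stated in the abstract):

```python
# Acquisition budget from the figures quoted above (sampling at the
# resolution limit is an assumption made for this estimate)
fov_mm = 1.0            # lateral field of view per axis
sampling_um = 2.0       # assume one A-scan per 2 um resolution element
a_scan_rate_hz = 55e3   # quoted A-scan acquisition rate

points_per_axis = int(fov_mm * 1000 / sampling_um)   # 500 per axis
a_scans = points_per_axis ** 2                        # 250,000 per focus zone
seconds_per_zone = a_scans / a_scan_rate_hz           # ~4.5 s of raw acquisition
```

Roughly 4.5 s of raw acquisition per focus zone, which is consistent with a total time under 2 minutes once multiple focus zones and GPU reconstruction are included.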
Proceedings of SPIE | 2017
Cristina Canavesi; Andrea Cogliati; Adam Hayes; Patrice Tankam; Anand P. Santhanam; Jannick P. Rolland
Real-time volumetric high-definition wide-field-of-view in-vivo cellular imaging requires micron-scale resolution in 3D. Compactness of the handheld device and distortion-free images with cellular resolution are also critically required for onsite use in clinical applications. By integrating a custom liquid lens-based microscope and a dual-axis MEMS scanner in a compact handheld probe, Gabor-domain optical coherence microscopy (GD-OCM) breaks the lateral resolution limit of optical coherence tomography through depth, overcoming the tradeoff between numerical aperture and depth of focus, enabling advances in biotechnology. Furthermore, distortion-free imaging with no post-processing is achieved with a compact, lightweight handheld MEMS scanner that obtained a 12-fold reduction in volume and 17-fold reduction in weight over a previous dual-mirror galvanometer-based scanner. Approaching the holy grail of medical imaging – noninvasive real-time imaging with histologic resolution – GD-OCM demonstrates invariant resolution of 2 μm throughout a volume of 1 × 1 × 0.6 mm³, acquired and visualized in less than 2 minutes with parallel processing on graphics processing units. Results on the metrology of manufactured materials and imaging of human tissue with GD-OCM are presented.
Journal of the Acoustical Society of America | 2016
Andrea Cogliati; Zhiyao Duan; Brendt Wohlberg
Automatic music transcription is the process of automatically inferring a high-level symbolic representation, such as music notation or a piano roll, from a music performance. It has many applications in music education, content-based music search, musicological analysis of non-notated music, and music enjoyment. Existing approaches often operate in the frequency domain, where the fundamental time-frequency resolution tradeoff prevents them from obtaining satisfactory transcription accuracies. In this project, we develop a novel approach in the time domain for piano transcription using convolutional sparse coding. It models the music waveform as a summation of piano note waveforms (dictionary elements) convolved with their temporal activations (onsets). The piano note waveforms are pre-recorded in a context-dependent way, i.e., for the specific piano to be transcribed in the specific environment. During transcription, the note waveforms are fixed, and their temporal activations are estimated and post-processed to obtain the pitch and onset transcription.
International Society for Music Information Retrieval Conference (ISMIR) | 2016
Andrea Cogliati; David Temperley; Zhiyao Duan
International Society for Music Information Retrieval Conference (ISMIR) | 2017
Andrea Cogliati; Zhiyao Duan