Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Jinyu Han is active.

Publication


Featured research published by Jinyu Han.


IEEE Transactions on Audio, Speech, and Language Processing | 2014

Multi-pitch Streaming of Harmonic Sound Mixtures

Zhiyao Duan; Jinyu Han; Bryan Pardo

Multi-pitch analysis of concurrent sound sources is an important but challenging problem. It requires estimating pitch values of all harmonic sources in individual frames and streaming the pitch estimates into trajectories, each of which corresponds to a source. We address the streaming problem for monophonic sound sources. We take the original audio, plus frame-level pitch estimates from any multi-pitch estimation algorithm, as inputs, and output a pitch trajectory for each source. Our approach does not require pre-training of source models from isolated recordings. Instead, it casts the problem as a constrained clustering problem, where each cluster corresponds to a source. The clustering objective is to minimize the timbre inconsistency within each cluster. We explore different timbre features for music and speech. For music, harmonic structure and a newly proposed feature called the uniform discrete cepstrum (UDC) are found effective, while for speech, MFCC and UDC work well. We also show that timbre consistency alone is insufficient for effective streaming. Constraints are therefore imposed on pairs of pitch estimates according to their time-frequency relationships. We propose a new constrained clustering algorithm that satisfies as many constraints as possible while optimizing the clustering objective. We compare the proposed approach with other state-of-the-art supervised and unsupervised multi-pitch streaming approaches that are specifically designed for music or speech. Better or comparable results are shown.
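The streaming step can be illustrated as a toy constrained clustering loop. This is a minimal, hypothetical sketch (not the authors' algorithm), assuming precomputed timbre feature vectors, at most one pitch estimate per source per frame, and only the cannot-link constraints that come from simultaneity; all names are illustrative:

```python
import numpy as np

def constrained_stream(features, frames, n_sources, n_iters=20, seed=0):
    """Toy constrained clustering for multi-pitch streaming.

    features : (N, D) timbre feature per pitch estimate (e.g. MFCC or UDC)
    frames   : (N,) frame index of each estimate; estimates that share a
               frame carry an implicit cannot-link constraint
    Assumes at most n_sources estimates per frame. Returns (N,) labels.
    """
    rng = np.random.default_rng(seed)
    N = len(features)
    labels = rng.integers(0, n_sources, N)
    for _ in range(n_iters):
        # Cluster centroids = mean timbre of current members.
        cents = np.array([features[labels == k].mean(axis=0)
                          if np.any(labels == k) else features[rng.integers(N)]
                          for k in range(n_sources)])
        # Reassign frame by frame so cannot-links stay satisfied:
        # greedily take the closest (estimate, source) pair, then retire
        # that estimate and that source for the rest of the frame.
        for f in np.unique(frames):
            idx = np.where(frames == f)[0]
            d = np.linalg.norm(features[idx, None] - cents[None], axis=2)
            for _ in idx:
                i, k = np.unravel_index(np.argmin(d), d.shape)
                labels[idx[i]] = k
                d[i, :] = np.inf
                d[:, k] = np.inf
    return labels
```

Minimizing within-cluster timbre distance plays the role of the timbre-consistency objective, while the per-frame greedy assignment enforces that simultaneous pitches land in different sources.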


International Conference on Acoustics, Speech, and Signal Processing | 2011

Improving melody extraction using Probabilistic Latent Component Analysis

Jinyu Han; Ching-Wei Chen

We propose a new approach for automatic melody extraction from polyphonic audio, based on Probabilistic Latent Component Analysis (PLCA). An audio signal is first divided into vocal and non-vocal segments using a trained Gaussian Mixture Model (GMM) classifier. A statistical model of the non-vocal segments of the signal is then learned adaptively from this particular input music by PLCA. This model is then employed to remove the accompaniment from the mixture, leaving mainly the vocal components. The melody line is extracted from the vocal components using an auto-correlation algorithm. Quantitative evaluation shows that the new system performs significantly better than two existing melody extraction algorithms for polyphonic single-channel mixtures.
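PLCA with a fixed dictionary is, up to normalization, equivalent to NMF under the KL divergence, so the accompaniment-removal step can be sketched with plain KL-NMF in NumPy. This is a simplified stand-in for the paper's model: the GMM vocal/non-vocal segmentation is assumed already done, and all names are illustrative:

```python
import numpy as np

def kl_nmf(V, W=None, rank=8, n_iter=200, seed=0):
    """KL-divergence NMF (same update family as PLCA, up to normalization).
    If a dictionary W is supplied, it is held fixed and only H is learned."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    learn_W = W is None
    if learn_W:
        W = rng.random((F, rank)) + 1e-3
    H = rng.random((W.shape[1], T)) + 1e-3
    for _ in range(n_iter):
        WH = W @ H + 1e-9
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + 1e-9)
        if learn_W:
            WH = W @ H + 1e-9
            W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + 1e-9)
    return W, H

def remove_accompaniment(V_nonvocal, V_vocal, rank=8):
    """Learn an accompaniment model on non-vocal segments, then subtract
    its best explanation of the vocal segments."""
    W_acc, _ = kl_nmf(V_nonvocal, rank=rank)   # adaptive accompaniment dictionary
    _, H = kl_nmf(V_vocal, W=W_acc)            # dictionary fixed, fit activations
    return np.maximum(V_vocal - W_acc @ H, 0.0)  # residual ~ vocal components
```

The residual spectrogram would then feed the auto-correlation pitch tracker that extracts the melody line.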


International Conference on Latent Variable Analysis and Signal Separation | 2012

Audio imputation using the non-negative hidden Markov model

Jinyu Han; Gautham J. Mysore; Bryan Pardo

Missing data in corrupted audio recordings poses a challenging problem for audio signal processing. In this paper we present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Non-negative Hidden Markov Model, enables more temporally coherent estimation for the missing data by taking into account both the spectral and temporal information of the audio signal. This approach is able to reconstruct highly corrupted audio signals with large parts of the spectrogram missing. We demonstrate this approach on real-world polyphonic music signals. The initial experimental results show that our approach has advantages over a previous missing data imputation method.
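A simplified way to see the imputation setup is masked KL-NMF, which is close in spirit to the non-temporal baseline the paper improves on; the Non-negative Hidden Markov Model adds the temporal dynamics that this hypothetical sketch omits:

```python
import numpy as np

def impute_spectrogram(V, M, rank=6, n_iter=300, seed=0):
    """Fill missing time-frequency bins of a magnitude spectrogram V.

    M is a binary mask: 1 = observed bin, 0 = missing bin. Only observed
    bins drive the factorization; missing bins are read off the model.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + 1e-3
    H = rng.random((rank, T)) + 1e-3
    for _ in range(n_iter):
        WH = W @ H + 1e-9
        H *= (W.T @ (M * V / WH)) / (W.T @ M + 1e-9)
        WH = W @ H + 1e-9
        W *= ((M * V / WH) @ H.T) / (M @ H.T + 1e-9)
    V_hat = W @ H
    return np.where(M > 0, V, V_hat)   # keep observed bins, fill the rest
```
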


International Conference on Acoustics, Speech, and Signal Processing | 2010

Song-level multi-pitch tracking by heavily constrained clustering

Zhiyao Duan; Jinyu Han; Bryan Pardo

Given a set of monophonic, harmonic sound sources (e.g. human voices or wind instruments), multi-pitch estimation (MPE) is the task of determining the instantaneous pitches of each source. Multi-pitch tracking (MPT) connects the instantaneous pitch estimates provided by MPE algorithms into pitch trajectories of sources. A trajectory can be short (within a musical note) or long (an entire piece of music). While note-level MPT methods usually utilize local time-frequency proximity of pitches to connect them into a note, song-level MPT is much more difficult and needs more information. This is because pitches evolve discontinuously from note to note, and pitch trajectories can even interweave. In this paper, we cast the song-level MPT problem as a constrained clustering problem. The constraints are time-frequency locality of pitches and the clustering objective is their timbre consistency. Due to this problem's unique properties, existing constrained clustering algorithms cannot be directly applied. We propose a new constrained clustering algorithm. Experiments show that our approach produces good results on real-world music recordings of four musical instruments.
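The constraint side of the formulation can be illustrated directly: simultaneous pitch estimates must come from different sources (cannot-link), while estimates close in both time and frequency very likely belong to the same note and hence the same source (must-link). A hypothetical constraint builder, with thresholds chosen arbitrarily for illustration:

```python
import numpy as np

def build_constraints(times, pitches, dt=1, df=0.3):
    """Derive clustering constraints from time-frequency locality.

    times   : frame index of each pitch estimate
    pitches : pitch of each estimate (semitones, e.g. MIDI numbers)
    Returns (must_link, cannot_link) as lists of index pairs:
      - same frame                          -> cannot-link
      - within dt frames and df semitones   -> must-link (same note)
    """
    must, cannot = [], []
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                cannot.append((i, j))
            elif (abs(times[i] - times[j]) <= dt
                  and abs(pitches[i] - pitches[j]) <= df):
                must.append((i, j))
    return must, cannot
```

The new clustering algorithm then has to honor as many of these pairwise constraints as possible while keeping each cluster's timbre consistent.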


International Conference on Acoustics, Speech, and Signal Processing | 2011

Reconstructing completely overlapped notes from musical mixtures

Jinyu Han; Bryan Pardo

In mixtures of musical sounds, the problem of overlapped harmonics poses a significant challenge to source separation. Common Amplitude Modulation (CAM) is one of the most effective methods to resolve this problem. It, however, relies on non-overlapped harmonics from the same note being available. We propose an alternate technique for harmonic envelope estimation, based on Harmonic Temporal Envelope Similarity (HTES). We learn a harmonic envelope model for each instrument from the non-overlapped harmonics of notes of the same instrument, wherever they occur in the recording. This model is used to reconstruct the harmonic envelopes for overlapped harmonics. This allows reconstruction of completely overlapped notes. Experiments show our algorithm performs better than an existing system based on CAM when the harmonics of pitched instruments are strongly overlapped.
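The CAM-style reconstruction step can be sketched in a few lines: borrow the envelope shape from clean harmonics and rescale it to the overlapped harmonic's onset amplitude. A toy version, assuming the overlapped harmonic's onset frame is still measurable; names are illustrative:

```python
import numpy as np

def reconstruct_overlapped(clean_envs, anchor_amp):
    """Reconstruct one overlapped harmonic's amplitude envelope.

    clean_envs : (K, T) envelopes of non-overlapped harmonics. Under CAM
                 these come from the same note; HTES instead borrows them
                 from other notes of the same instrument, wherever they
                 occur in the recording.
    anchor_amp : amplitude of the overlapped harmonic at its onset frame.
    """
    # Normalize each clean envelope by its onset value, average the shapes,
    # then rescale the mean shape to the overlapped harmonic's onset level.
    shapes = clean_envs / (clean_envs[:, :1] + 1e-12)
    return anchor_amp * shapes.mean(axis=0)
```
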


International Workshop on Machine Learning for Signal Processing | 2012

Language informed bandwidth expansion

Jinyu Han; Gautham J. Mysore; Bryan Pardo

High-level knowledge of language helps the human auditory system understand speech with missing information such as missing frequency bands. The automatic speech recognition community has shown that the use of this knowledge in the form of language models is crucial to obtaining high quality recognition results. In this paper, we apply this idea to the bandwidth expansion problem to automatically estimate missing frequency bands of speech. Specifically, we use language models to constrain the recently proposed non-negative hidden Markov model for this application. We compare the proposed method to a bandwidth expansion algorithm based on non-negative spectrogram factorization and show improved results on two standard signal quality metrics.
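Ignoring the language-model constraint, the core bandwidth-expansion mechanics can be sketched with a fixed full-band dictionary: fit activations using only the observed low band, then read off the high band from the full-band reconstruction. A hypothetical NumPy sketch (the dictionary-training step and the N-HMM temporal/language constraints are omitted):

```python
import numpy as np

def expand_bandwidth(V_low, W_full, n_low, n_iter=300, seed=0):
    """Estimate missing high-frequency bands of a narrowband spectrogram.

    V_low  : (n_low, T) observed low-band magnitude spectrogram
    W_full : (F, K) full-band spectral dictionary learned from clean
             wideband speech (training not shown)
    n_low  : number of observed low-band frequency bins
    """
    rng = np.random.default_rng(seed)
    W_low = W_full[:n_low] + 1e-9          # dictionary restricted to low band
    H = rng.random((W_full.shape[1], V_low.shape[1])) + 1e-3
    for _ in range(n_iter):                # KL-NMF activation updates, W fixed
        WH = W_low @ H + 1e-9
        H *= (W_low.T @ (V_low / WH)) / (W_low.sum(axis=0)[:, None] + 1e-9)
    V_full = W_full @ H                    # full-band reconstruction
    V_full[:n_low] = V_low                 # keep the observed band as-is
    return V_full
```
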


Workshop on Applications of Signal Processing to Audio and Acoustics | 2009

Improving separation of harmonic sources with iterative estimation of spatial cues

Jinyu Han; Bryan Pardo

Recent work in source separation of two-channel mixtures has used spatial cues (cross-channel amplitude and phase difference coefficients) to estimate time-frequency masks for separating sources. As sources increasingly overlap in the time-frequency domain or the spatial angle between sources decreases, these spatial cues become unreliable. We introduce a method to re-estimate the spatial cues for mixtures of harmonic sources. The newly estimated spatial cues are fed to the system to update each source estimate and the pitch estimate of each source. This iterative procedure is repeated until the difference between the current estimate of the spatial cues and the previous one is under a pre-set threshold. Results on a set of three-source mixtures of musical instruments show this approach significantly improves separation performance of two existing time-frequency masking systems.
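The spatial cues in question can be computed directly from the two-channel STFTs, DUET-style. A minimal sketch, assuming per-source prototype cues are given (re-estimating those prototypes from harmonic pitch models is the paper's iterative contribution, not shown here); names are illustrative:

```python
import numpy as np

def spatial_masks(XL, XR, centers):
    """Binary time-frequency masks from cross-channel spatial cues.

    XL, XR  : complex STFTs of the left and right channels
    centers : (K, 2) per-source prototypes of (amplitude ratio, phase
              difference) across channels
    Each bin is assigned to the source whose prototype is nearest.
    """
    eps = 1e-12
    amp = np.abs(XR) / (np.abs(XL) + eps)        # cross-channel level ratio
    phase = np.angle(XR * np.conj(XL))           # cross-channel phase difference
    cues = np.stack([amp, phase], axis=-1)       # (F, T, 2)
    d = np.linalg.norm(cues[..., None, :] - centers, axis=-1)  # (F, T, K)
    lab = d.argmin(axis=-1)
    return [lab == k for k in range(len(centers))]
```
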


ACM/IEEE Joint Conference on Digital Libraries | 2008

The VocalSearch music search engine

Bryan Pardo; David Little; Rui Jiang; Hagai Livni; Jinyu Han

The VocalSearch system is a music search engine developed at Northwestern University and available on the internet (vocalsearch.org). This system lets the user query for the desired song in a number of ways: sung queries, queries entered as music notation, and text-based lyrics search. Users are also able to contribute songs to the system, making them searchable for future users. The result is a flexible system that lets the user find a song using their preferred modality (singing, music notation, or text). This demonstration lets users try out the VocalSearch system.


Journal of the Acoustical Society of America | 2010

Reconstructing individual monophonic instruments from musical mixtures using scene completion.

Jinyu Han; Bryan Pardo

Monaural sound source separation is the process of separating sound sources from a single channel mixture. In mixtures of pitched musical instruments, the problem of overlapping harmonics poses a significant challenge to source separation and reconstruction. One standard method to resolve overlapped harmonics is based on the assumption that harmonics of the same source have correlated amplitude envelopes: common amplitude modulation (CAM). Based on CAM, overlapped harmonics are approximated using the amplitude envelope from the non-overlapped harmonics of the same note. CAM assumes non-overlapped harmonics from the same note are available and have similar amplitude envelopes to the overlapped harmonics. This is not always the case. A technique is proposed for harmonic temporal envelope estimation based on the idea of scene completion. The system learns the harmonic envelope for each instrument's notes from the non-overlapped harmonics of other notes played by that instrument, wherever they occur in the recording.


International Symposium/Conference on Music Information Retrieval | 2009

Harmonically informed multi-pitch tracking

Zhiyao Duan; Jinyu Han; Bryan Pardo

Collaboration


Dive into Jinyu Han's collaborations.

Top Co-Authors

Bryan Pardo, Northwestern University
Zhiyao Duan, University of Rochester
David Little, Northwestern University
Hagai Livni, Northwestern University
Rui Jiang, Northwestern University
Zafar Rafii, Northwestern University