Network


Yi-Hsuan Yang's latest external collaborations at the country level.

Hotspot


Dive into the research topics where Yi-Hsuan Yang is active.

Publication


Featured research published by Yi-Hsuan Yang.


IEEE Transactions on Audio, Speech, and Language Processing | 2008

A Regression Approach to Music Emotion Recognition

Yi-Hsuan Yang; Yu-Ching Lin; Ya-Fan Su; Homer H. Chen

Content-based retrieval has emerged in the face of content explosion as a promising approach to information access. In this paper, we focus on the challenging issue of recognizing the emotional content of music signals, or music emotion recognition (MER). Specifically, we formulate MER as a regression problem to predict the arousal and valence values (AV values) of each music sample directly. Associated with the AV values, each music sample becomes a point in the arousal-valence plane, so users can efficiently retrieve music samples by specifying a desired point in the emotion plane. Because no categorical taxonomy is used, the regression approach is free of the ambiguity inherent in conventional categorical approaches. To improve performance, we apply principal component analysis to reduce the correlation between arousal and valence, and RReliefF to select important features. An extensive performance study is conducted to evaluate the accuracy of the regression approach for predicting AV values. The best performance, evaluated in terms of the R² statistic, reaches 58.3% for arousal and 28.1% for valence with support vector machine as the regressor. We also apply the regression approach to detect the emotion variation within a music selection and find the prediction accuracy superior to existing work. A group-wise MER scheme is also developed to address the subjectivity of emotion perception.
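
To make the formulation concrete, here is a minimal sketch of the regression pipeline, not the authors' implementation: the features and annotations are random placeholders, SVR is kept as the regressor, scikit-learn's mutual-information selector stands in for RReliefF, and the PCA step the paper applies to decorrelate arousal and valence is omitted for brevity.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))            # placeholder audio features
y = {"arousal": rng.uniform(-1, 1, 200),  # placeholder AV annotations
     "valence": rng.uniform(-1, 1, 200)}

model = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_regression, k=20),  # stand-in for RReliefF
    SVR(kernel="rbf", C=1.0),                   # regressor used in the paper
)

for dim, target in y.items():
    scores = cross_val_score(model, X, target, cv=5, scoring="r2")
    print(f"{dim}: mean R^2 = {scores.mean():.3f}")
```

Replacing the placeholder arrays with real clip-level features and subject ratings reproduces the evaluation setup in spirit, with R² as the figure of merit for each emotion dimension.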


ACM Transactions on Intelligent Systems and Technology | 2012

Machine Recognition of Music Emotion: A Review

Yi-Hsuan Yang; Homer H. Chen

The proliferation of MP3 players and the exploding amount of digital music content call for novel ways of music organization and retrieval to meet the ever-increasing demand for easy and effective information access. As almost every music piece is created to convey emotion, music organization and retrieval by emotion is a reasonable way of accessing music information. A good deal of effort has been made in the music information retrieval community to train a machine to automatically recognize the emotion of a music signal. A central issue of machine recognition of music emotion is the conceptualization of emotion and the associated emotion taxonomy. Different viewpoints on this issue have led to the proposal of different ways of emotion annotation, model training, and result visualization. This article provides a comprehensive review of the methods that have been proposed for music emotion recognition. Moreover, as music emotion recognition is still in its infancy, there are many open issues. We review the solutions that have been proposed to address these issues and conclude with suggestions for further research.


ACM Multimedia | 2006

Music Emotion Classification: A Fuzzy Approach

Yi-Hsuan Yang; Chia Chu Liu; Homer H. Chen

Due to the subjective nature of human perception, classification of the emotion of music is a challenging problem. Simply assigning an emotion class to a song segment in a deterministic way does not work well because not all people share the same feeling for a song. In this paper, we consider a different approach to music emotion classification: for each music segment, the approach determines how likely it is that the segment belongs to each emotion class. Two fuzzy classifiers are adopted to provide this measurement of emotion strength. The measurement is also found useful for tracking the variation of music emotion within a song. Results are shown to illustrate the effectiveness of the approach.
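
The sketch below illustrates the general idea with a fuzzy k-NN classifier; the paper's exact fuzzy classifiers and features differ, and all data here are placeholders.

```python
import numpy as np

def fuzzy_knn_membership(X_train, y_train, x, k=5, m=2.0):
    """Return class-membership degrees for sample x over classes 0..C-1."""
    d = np.linalg.norm(X_train - x, axis=1)
    nn = np.argsort(d)[:k]
    w = 1.0 / (d[nn] ** (2.0 / (m - 1.0)) + 1e-12)  # inverse-distance weights
    u = np.zeros(int(y_train.max()) + 1)
    for idx, wi in zip(nn, w):
        u[y_train[idx]] += wi
    return u / u.sum()  # memberships sum to 1: emotion "strength" per class

rng = np.random.default_rng(1)
X_train = rng.normal(size=(100, 8))      # placeholder segment features
y_train = rng.integers(0, 4, size=100)   # 4 emotion classes
print(fuzzy_knn_membership(X_train, y_train, rng.normal(size=8)))
```

The membership vector can be read directly as per-class emotion strength, and tracking it across consecutive segments gives the within-song emotion variation the abstract describes.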


IEEE Transactions on Audio, Speech, and Language Processing | 2011

Ranking-Based Emotion Recognition for Music Organization and Retrieval

Yi-Hsuan Yang; Homer H. Chen

Determining the emotion that best characterizes the affective content of a song is a challenging issue due to the difficulty of collecting reliable ground-truth data and the semantic gap between human perception and the music signal. To address this issue, we represent an emotion as a point in the Cartesian space with valence and arousal as the dimensions and determine the coordinates of a song by its emotion relative to other songs. We also develop an RBF-ListNet algorithm to optimize the ranking-based objective function of our approach. The cognitive load of annotation, the accuracy of emotion recognition, and the subjective quality of the proposed approach are extensively evaluated. Experimental results show that this ranking-based approach simplifies emotion annotation and enhances the reliability of the ground truth. The performance of our algorithm for valence recognition reaches a Gamma statistic of 0.326.
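
As a rough illustration of the ranking objective, the toy example below minimizes a ListNet-style top-one cross entropy, with a linear scorer standing in for the paper's RBF network; the songs and features are synthetic.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())  # shift for numerical stability
    return e / e.sum()

def listnet_loss_grad(W, X, y):
    """Cross entropy between top-one probabilities of targets and scores."""
    p_true, p_model = softmax(y), softmax(X @ W)
    loss = -(p_true * np.log(p_model + 1e-12)).sum()
    grad = X.T @ (p_model - p_true)  # gradient w.r.t. the linear weights
    return loss, grad

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 10))                            # 50 songs, 10 features
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=50)  # latent valence order

W = np.zeros(10)
for _ in range(200):  # plain gradient descent
    loss, grad = listnet_loss_grad(W, X, y)
    W -= 0.1 * grad
print(f"final ListNet loss: {loss:.4f}")
```

Because only the relative order of the scores matters, annotators can supply rankings instead of absolute AV values, which is what reduces the cognitive load of annotation.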


Archive | 2011

Music Emotion Recognition

Yi-Hsuan Yang; Homer H. Chen

Providing a complete review of existing work on music emotion in psychology and engineering, Music Emotion Recognition explains how to account for the subjective nature of emotion perception in the development of automatic music emotion recognition (MER) systems. Among the first publications dedicated to automatic MER, it begins with a comprehensive introduction to the essential aspects of MER, including background, key techniques, and applications. This ground-breaking reference examines emotion from a dimensional perspective. It defines emotions in music as points in a 2D plane in terms of the two most fundamental emotion dimensions according to psychologists: valence and arousal. The authors present a computational framework that generalizes emotion recognition from the categorical domain to real-valued 2D space. They also introduce novel emotion-based music retrieval and organization methods, describe a ranking-based emotion annotation and model training method, present methods that integrate information extracted from lyrics, chord sequences, and genre metadata for improved accuracy, and consider an emotion-based music retrieval system that is particularly useful for mobile devices. The book details techniques for addressing the ambiguity and granularity of emotion description, the heavy cognitive load of emotion annotation, the subjectivity of emotion perception, and the semantic gap between the low-level audio signal and high-level emotion perception. Complete with more than 360 references, 12 example MATLAB programs, and a listing of key abbreviations and acronyms, this cutting-edge guide supplies the technical understanding and tools needed to develop your own automatic MER system.


ACM Multimedia | 2013

1000 Songs for Emotional Analysis of Music

Mohammad Soleymani; Micheal N. Caro; Erik M. Schmidt; Cheng-Ya Sha; Yi-Hsuan Yang

Music is composed to be emotionally expressive, and emotional associations provide an especially natural domain for indexing and recommendation in today's vast digital music libraries. But such libraries require powerful automated tools, and the development of systems for automatic prediction of musical emotion presents myriad challenges. The perceptual nature of musical emotion necessitates the collection of data from human subjects, and because the interpretation of emotion varies between listeners, each clip needs to be annotated by a distribution of subjects. In addition, the sharing of large music content libraries for the development of such systems, even for academic research, presents complicated legal issues that vary by country. This work presents a new publicly available dataset for music emotion recognition research, together with a baseline system. To address the difficulties of emotion annotation we turned to crowdsourcing, using Amazon Mechanical Turk, and developed a two-stage procedure for filtering out poor-quality workers. The dataset consists entirely of Creative Commons music from the Free Music Archive, which, as the name suggests, can be shared freely. The final dataset contains 1000 songs, each annotated by a minimum of 10 subjects, which is larger than many currently available music emotion datasets.
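
The snippet below sketches the kind of annotation quality control such crowdsourcing requires; it is an assumed consensus-correlation filter, not the paper's actual two-stage procedure, and the ratings are simulated.

```python
import numpy as np

rng = np.random.default_rng(3)
true = rng.uniform(1, 9, size=100)                  # latent per-song ratings
ratings = true + rng.normal(0, 1, size=(30, 100))   # 30 mostly honest workers
ratings[:5] = rng.uniform(1, 9, size=(5, 100))      # 5 random-clicking spammers

consensus = ratings.mean(axis=0)
keep = [w for w in range(ratings.shape[0])
        if np.corrcoef(ratings[w], consensus)[0, 1] > 0.5]  # assumed threshold

labels = ratings[keep].mean(axis=0)  # final per-song annotations
print(f"kept {len(keep)}/30 workers")
```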


ACM Multimedia | 2008

ContextSeer: Context Search and Recommendation at Query Time for Shared Consumer Photos

Yi-Hsuan Yang; Po Tun Wu; Ching Wei Lee; Kuan Hung Lin; Winston H. Hsu; Homer H. Chen

The advent of media-sharing sites like Flickr has drastically increased the volume of community-contributed multimedia resources on the web. However, due to their magnitude, these collections are increasingly difficult to understand, search, and navigate. To tackle these issues, a novel search system, ContextSeer, is developed to improve search quality (by reranking) and to recommend supplementary information (i.e., search-related tags and canonical images) by leveraging rich context cues, including visual content, high-level concept scores, and time and location metadata. First, we propose an ordinal reranking algorithm to enhance the semantic coherence of text-based search results by mining contextual patterns in an unsupervised fashion. A novel feature selection method, wc-tf-idf, is also developed to select informative context cues. Second, to represent the diversity of the search result, we propose an efficient algorithm, cannoG, that selects multiple canonical images without clustering. Finally, ContextSeer enhances the search experience by further recommending relevant tags. Besides being effective and unsupervised, the proposed methods are efficient and can be completed at query time, which is vital for practical online applications. To evaluate ContextSeer, we collected 0.5 million consumer photos from Flickr and manually annotated a number of queries by pooling to form a new benchmark, Flickr550. Ordinal reranking achieves significant performance gains on both the Flickr550 and TRECVID search benchmarks. A subjective test shows that cannoG recommends representative and diverse canonical images.
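
As a hedged illustration of reranking by contextual tag statistics, the toy example below uses plain tf-idf over an initial result set as a stand-in for the paper's wc-tf-idf; the photo collection and the text-search result are made up.

```python
import math
from collections import Counter

# toy collection: each photo is represented by its tag set
collection = [
    {"sunset", "beach", "taiwan"}, {"sunset", "city"}, {"dog", "park"},
    {"sunset", "beach"}, {"city", "night"}, {"beach", "surf"},
]
results = [0, 1, 3, 5]  # indices returned by a text-based search

df = Counter(t for tags in collection for t in tags)  # document frequency
tf = Counter(t for i in results for t in collection[i])

def weight(tag):
    # tags frequent in the result set but rare overall get high weight
    return tf[tag] * math.log(len(collection) / df[tag])

reranked = sorted(results, key=lambda i: -sum(weight(t) for t in collection[i]))
print(reranked)
```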


Pacific Rim Conference on Multimedia | 2008

Toward Multi-modal Music Emotion Classification

Yi-Hsuan Yang; Yu-Ching Lin; Heng-Tze Cheng; I-Bin Liao; Yeh-Chin Ho; Homer H. Chen

The performance of categorical music emotion classification that divides emotion into classes and uses audio features alone for emotion classification has reached a limit due to the presence of a semantic gap between the object feature level and the human cognitive level of emotion perception. Motivated by the fact that lyrics carry rich semantic information of a song, we propose a multi-modal approach to help improve categorical music emotion classification. By exploiting both the audio features and the lyrics of a song, the proposed approach improves the 4-class emotion classification accuracy from 46.6% to 57.1%. The results also show that the incorporation of lyrics significantly enhances the classification accuracy of valence.
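
A minimal early-fusion sketch of the multi-modal idea follows: audio and lyric feature matrices (placeholders here) are simply concatenated before classification, which is one common way to combine the two modalities; the paper's actual features and fusion method may differ.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(4)
audio = rng.normal(size=(200, 30))   # placeholder audio features
lyrics = rng.normal(size=(200, 50))  # placeholder lyric features
y = rng.integers(0, 4, size=200)     # 4 emotion classes

for name, X in [("audio only", audio),
                ("audio + lyrics", np.hstack([audio, lyrics]))]:
    acc = cross_val_score(SVC(), X, y, cv=5).mean()
    print(f"{name}: accuracy = {acc:.3f}")
```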


IEEE Transactions on Multimedia | 2009

Smooth Control of Adaptive Media Playout for Video Streaming

Ya-Fan Su; Yi-Hsuan Yang; Meng-Ting Lu; Homer H. Chen

Client-side data buffering is a common technique for dealing with playout interruptions of streaming video caused by the network jitter and packet loss of best-effort networks. However, stronger protection against playout interruption inevitably requires larger data buffering, which results in larger memory requirements and longer playout delay. Adaptive media playout (AMP), also a client-side technique, can reduce the buffer requirement and avoid buffer outage, but at the expense of visual quality degradation due to fluctuation of the playout speed. In this paper, we propose a novel AMP scheme that keeps the video playout as smooth as possible while adapting to the channel condition. The triggering of the playout control is based on buffer variation rather than buffer fullness. Experimental results show that our AMP scheme surpasses conventional schemes in unfriendly network conditions. Unlike previous schemes that are tuned for a specific range of packet loss and network instability, the proposed AMP scheme maintains consistent performance across a wide range of network conditions.
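
The toy simulation below illustrates variation-triggered playout control with an assumed control law, not the paper's exact scheme: the playout rate is reduced while the buffer level is falling and eased back toward normal speed as it recovers.

```python
def playout_rate(buffer_now, buffer_prev, rate_prev,
                 normal=1.0, step=0.05, floor=0.75):
    """React to buffer variation (the trend), not buffer fullness."""
    if buffer_now < buffer_prev:               # buffer draining: slow down
        return max(rate_prev - step, floor)
    return min(rate_prev + step, normal)       # recovering: speed back up

# simulate a burst of packet loss followed by recovery
buffers = [10, 9, 7, 5, 4, 5, 7, 9, 10]        # buffer level in frames
rate = 1.0
for prev, now in zip(buffers, buffers[1:]):
    rate = playout_rate(now, prev, rate)
    print(f"buffer={now:2d} frames -> playout rate {rate:.2f}x")
```

Keying the control on the trend rather than on a fullness threshold is what lets the rate stay near normal when the buffer is low but stable, which is the smoothness property the abstract emphasizes.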


International Conference on Multimedia and Expo | 2007

Music Emotion Classification: A Regression Approach

Yi-Hsuan Yang; Yu-Ching Lin; Ya-Fan Su; Homer H. Chen

Typical music emotion classification (MEC) approaches categorize emotions and apply pattern recognition methods to train a classifier. However, categorized emotions are too ambiguous for efficient music retrieval. In this paper, we model emotions as continuous variables composed of arousal and valence values (AV values), and formulate MEC as a regression problem. The multiple linear regression, support vector regression, and AdaBoost.RT are adopted to evaluate the prediction accuracy. Since the regression approach is inherently continuous, it is free of the ambiguity problem existing in its categorical counterparts.
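
For a concrete starting point, the sketch below cross-validates the three regressor families the abstract names on placeholder data; scikit-learn's AdaBoostRegressor (the AdaBoost.R2 variant) stands in for AdaBoost.RT.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(5)
X = rng.normal(size=(150, 20))    # placeholder audio features
y = rng.uniform(-1, 1, size=150)  # placeholder valence values

for name, reg in [("multiple linear regression", LinearRegression()),
                  ("support vector regression", SVR()),
                  ("AdaBoost (R2 variant)", AdaBoostRegressor(random_state=0))]:
    r2 = cross_val_score(reg, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean R^2 = {r2:.3f}")
```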

Collaboration


Dive into Yi-Hsuan Yang's collaborations.

Top Co-Authors

Homer H. Chen
National Taiwan University

Li Su
Center for Information Technology

Yu-Ching Lin
National Taiwan University

Jen-Yu Liu
Center for Information Technology

Chih-Ming Chen
National Chengchi University

Ming-Feng Tsai
National Chengchi University

Chin-Chia Michael Yeh
Center for Information Technology

Ping-Keng Jao
Center for Information Technology