Masahiro Niitsuma
Keio University
Publications
Featured research published by Masahiro Niitsuma.
Computer Analysis of Images and Patterns | 2013
Masahiro Niitsuma; Lambertus Schomaker; Jean-Paul van Oosten; Yo Tomita
Although most previous studies of writer identification in music scores assumed successful prior staff-line removal, this assumption does not hold when the music scores suffer from a certain level of degradation or deformation. The impact of staff-line removal on writer identification results in such documents is unclear. In this study, we propose a novel writer identification method that requires no staff-line removal and no segmentation. Staff-line removal is effectively achieved without image processing, by dimensionality reduction with an autoencoder in Contour-Hinge feature space. The experimental results on a wide range of music manuscripts show that the proposed method can achieve favourable results without prior staff-line removal.
Multimedia Tools and Applications | 2016
Masahiro Niitsuma; Lambert Schomaker; Jean-Paul van Oosten; Yo Tomita; David A. Bell
Recent renewed interest in computational writer identification has resulted in an increased number of publications. In historical musicology, however, its application has so far been limited. One of the obstacles seems to be that the scans available for computational analysis are often not clear enough. In this paper, the use of the Hinge feature is proposed to avoid segmentation and staff-line removal for effective feature extraction from low-quality scans. The use of an autoencoder in Hinge feature space is suggested as an alternative to staff-line removal by image processing, and the performance of the two is compared. The experiment shows an accuracy of 87% on a dataset containing samples from 84 writers, and the superiority of our approach, which requires neither segmentation nor staff-line removal. A practical analysis of Bach’s autograph manuscript of the Well-Tempered Clavier II (Additional MS. 35021 in the British Library, London) is also presented, demonstrating the wide applicability of our approach.
Journal of the Acoustical Society of America | 2016
Chihiro Terayama; Masahiro Niitsuma; Yoichi Yamashita
This paper proposes a technique for embedding information into an audio signal so that tablet-type devices can communicate using their built-in loudspeakers and microphones. The technique is based on amplitude modulation of high audible frequency bands. The embedded information is represented by the presence of sine waves at 11 different frequencies. In consideration of the frequency characteristics of tablet-type devices, two frequency bands were chosen for embedding: (1) frequencies up to 12 kHz; (2) frequencies up to 16 kHz. The experiment showed 85.9% correct decoding in case (1) and 76.8% in case (2). Moreover, less audible noise was perceived in case (2).
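The idea of signalling bits by the presence or absence of 11 sine waves can be illustrated with a minimal sketch. This is not the authors' method: it uses plain on-off keying of tones added to silence rather than amplitude modulation of an existing audio signal, and the sample rate, frame length, carrier frequencies, amplitude, and threshold are all assumptions chosen for illustration. Detection uses the Goertzel algorithm.

```python
import math

SAMPLE_RATE = 44_100                           # assumed sample rate (Hz)
FRAME = 4_410                                  # assumed frame length: 0.1 s
FREQS = [10_000 + 200 * i for i in range(11)]  # hypothetical carriers, all <= 12 kHz


def encode(bits, amplitude=0.05):
    """Build one frame containing a sine wave for every bit that is set."""
    active = [f for f, b in zip(FREQS, bits) if b]
    return [
        amplitude * sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in active)
        for n in range(FRAME)
    ]


def tone_power(frame, freq):
    """Squared spectral magnitude of the frame at freq (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in frame:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def decode(frame, amplitude=0.05):
    """Mark a bit as set when its carrier exceeds half the expected magnitude."""
    threshold = (amplitude * len(frame) / 4) ** 2
    return [1 if tone_power(frame, f) > threshold else 0 for f in FREQS]
```

With these assumed values every carrier falls on an exact analysis bin of the frame, so the tones do not leak into each other. A real loudspeaker-to-microphone channel adds noise, reverberation, and device-dependent frequency response, none of which this sketch models.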
Journal of the Acoustical Society of America | 2016
Yuto Yamamoto; Masahiro Niitsuma; Yoichi Yamashita
This paper addresses negative emotion recognition using paralinguistic information in speech for spoken dialogue systems. Speech conveys not only linguistic information but also paralinguistic and non-linguistic information such as emotions, attitudes, and intentions. This easily perceivable information plays a key role in a spoken dialogue system. However, most previous speech recognition systems fail to consider this information, focusing only on linguistic content, which hinders the development of more natural spoken dialogue systems. To make this information usable in spoken dialogue systems, this paper focuses on negative emotion recognition from Japanese utterances. 6552-dimensional acoustic features were extracted from 6300 Japanese utterances by 50 people in three emotional states: negative, positive, and neutral. Negative emotion includes anger, sadness, and dislike, while positive emotion includes favor, joy, and relief. They were classified by SVM and evaluated by ...
Journal of the Acoustical Society of America | 2016
Atsushi Morimoto; Masahiro Niitsuma; Yoichi Yamashita
This paper addresses age estimation from Japanese speech utterances using paralinguistic information in speech. Speech conveys not only linguistic information but also paralinguistic and non-linguistic information. Non-linguistic information includes potentially valuable social information such as personal parameters, body conditions, gender, and age. This information, however, has not been investigated enough. The main purpose of our study was to develop less invasive methods to extract this information, by proposing a method to classify speakers’ ages into five groups (20s, 30s, 40s, 50s, and 60s). 6375-dimensional audio features were extracted from 1579 samples of 913 male speakers taken from CSJ (Corpus of Spontaneous Japanese), and feature selection was conducted based on Fisher’s ratio. Naive Bayes with the top 20 parameters ranked by Fisher’s ratio yielded the highest accuracy, 51.2%.
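Feature ranking by Fisher's ratio can be sketched as follows. The abstract does not give the exact formula, so this assumes one common multi-class form: the size-weighted between-class scatter of the class means divided by the size-weighted within-class variance, computed per feature. The function names are hypothetical, and the Naive Bayes classification step is not shown.

```python
from statistics import fmean, pvariance


def fisher_ratio(values, labels):
    """Between-class over within-class variance for a single feature (assumed form)."""
    overall = fmean(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    between = sum(len(g) * (fmean(g) - overall) ** 2 for g in groups.values())
    within = sum(len(g) * pvariance(g) for g in groups.values())
    if within == 0:
        # Classes with no internal spread: perfectly separable or constant.
        return float("inf") if between > 0 else 0.0
    return between / within


def top_k_features(rows, labels, k):
    """Indices of the k features with the largest Fisher's ratio."""
    n_features = len(rows[0])
    scores = [fisher_ratio([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]
```

Under this assumed definition, calling `top_k_features(rows, labels, 20)` on a feature matrix would keep the 20 highest-scoring columns, which could then be passed to any classifier.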
International Symposium/Conference on Music Information Retrieval | 2011
Masahiro Niitsuma; Yo Tomita
International Symposium/Conference on Music Information Retrieval | 2009
Masahiro Niitsuma; Tsutomu Fujinami; Yo Tomita
The Missouri Review | 2018
Masahiro Niitsuma; Yo Tomita; Wei Qi Yan; David A. Bell
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2017
Keisuke Imoto; Nobutaka Ono; Masahiro Niitsuma; Yoichi Yamashita
IEICE Technical Report | 2017
Chihiro Terayama; Masahiro Niitsuma; Yoichi Yamashita