Publication


Featured research published by Masaaki Ishigame.


multimedia signal processing | 2008

Highlight scene extraction of sports broadcasts using sports news programs

Yoshiaki Itoh; Shigenobu Sakaki; Kazunori Kojima; Masaaki Ishigame

This paper proposes a new approach for extracting highlight scenes from sports broadcasts by using sports news programs. To extract the highlight scenes reliably, we use sports news programs and identify identical or similar sections between a sports broadcast and the news programs that cover it. To extract such sections between two video data sets efficiently, we developed a two-step method that combines relay-CDP and active search. We evaluated this method in terms of highlight-scene extraction accuracy and computation time through experiments using actual broadcast data sets.
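The core matching step, finding sections of a broadcast that also appear in a news program, can be sketched as a naive sliding-window similarity search. This is a simplified stand-in for the paper's relay-CDP plus active-search combination, not the authors' algorithm; the scalar per-frame features, window length, and threshold are all illustrative assumptions.

```python
def window_similarity(a, b):
    """Cosine similarity between two equal-length feature windows."""
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return num / den if den else 0.0

def find_similar_sections(broadcast, news, win=4, threshold=0.99):
    """Slide every win-frame window of the news features over the
    broadcast and report start indices of similar broadcast sections."""
    hits = set()
    for ns in range(len(news) - win + 1):
        ref = news[ns:ns + win]
        for bs in range(len(broadcast) - win + 1):
            if window_similarity(broadcast[bs:bs + win], ref) >= threshold:
                hits.add(bs)
    return sorted(hits)
```

A real system would replace the exhaustive inner loop with active search to prune windows that cannot exceed the threshold.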


multimedia signal processing | 2010

Time-space acoustical feature for fast video copy detection

Yoshiaki Itoh; Masahiro Erokuumae; Kazunori Kojima; Masaaki Ishigame; Kazuyo Tanaka

We propose a new time-space acoustical feature for fast video copy detection, which searches a number of video streams for a given video segment to find illegal video copies on Internet video sites. We extract a small number of feature vectors from acoustically peculiar points, namely the local maxima and minima in the time sequence of the acoustical power envelope of the video data. Because the recording volume differs across recording environments, we extract the relative values of these feature points, which we call the time-space acoustical feature. The features can be obtained quickly compared with representative features such as MFCC, and they require little processing time for matching because both the number and the dimension of the feature vectors are small. The accuracy and computation time of the proposed method are evaluated using recorded TV movie programs as input data and 30 sec. to 3 min. segments from DVDs as reference data, assuming that the copyright holder of a movie searches video streams for illegal copies. The proposed method completed all processing within the time required for conventional feature extraction alone, with an F-measure of 93.2% in 3-minute video segment detection.
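The feature described above can be illustrated as peak-picking on a power envelope followed by taking relative values between consecutive peaks. This is a minimal sketch of the idea; the exact peak definition and the (time-gap, level-ratio) encoding are assumptions, not the paper's specification.

```python
def peak_points(power):
    """Indices of local maxima and minima in an acoustic power envelope."""
    pts = []
    for i in range(1, len(power) - 1):
        if (power[i] > power[i - 1] and power[i] > power[i + 1]) or \
           (power[i] < power[i - 1] and power[i] < power[i + 1]):
            pts.append(i)
    return pts

def time_space_features(power):
    """Relative (time-gap, level-ratio) pairs between consecutive peak
    points; ratios make the feature robust to overall recording volume."""
    pts = peak_points(power)
    feats = []
    for a, b in zip(pts, pts[1:]):
        ratio = power[b] / power[a] if power[a] else 0.0
        feats.append((b - a, ratio))
    return feats
```

Because only peak points are kept, both the number and dimension of feature vectors stay small, which is what makes matching fast.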


asia-pacific signal and information processing association annual summit and conference | 2013

High priority in highly ranked documents in spoken term detection

Kazuma Konno; Yoshiaki Itoh; Kazunori Kojima; Masaaki Ishigame; Kazuyo Tanaka; Shi-wook Lee

In spoken term detection, the retrieval of OOV (out-of-vocabulary) query terms is very important because query terms are likely to be OOV terms. To improve retrieval performance for OOV query terms, this paper proposes a re-scoring method applied after candidate segments are determined. Each candidate segment has a matching score and a segment number. Highly ranked candidates are usually reliable, and a user is assumed to select query terms that are specific to the target documents and appear frequently in them; we therefore give high priority to candidate segments included in highly ranked documents by adjusting their matching scores. We conducted performance evaluation experiments for the proposed method using the open test collections for SpokenDoc-2 in NTCIR-10. Results showed that retrieval performance improved by more than 7.0 points for two test sets in the test collections, demonstrating the effectiveness of the proposed method.
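The re-scoring step above can be sketched as follows: rank documents by their best candidate score, then boost candidates that fall in the top-ranked documents. The additive boost and the document-ranking rule are illustrative assumptions; the paper's actual score adjustment may differ.

```python
def rescore(candidates, boost=1, top_docs=1):
    """candidates: list of (segment_id, doc_id, score).
    Documents are ranked by their best candidate score; candidates in
    the top-ranked documents get their score boosted."""
    best = {}
    for _, doc, score in candidates:
        best[doc] = max(best.get(doc, float("-inf")), score)
    ranked = sorted(best, key=best.get, reverse=True)[:top_docs]
    return [(seg, doc, score + boost if doc in ranked else score)
            for seg, doc, score in candidates]
```

The intuition: if a document already contains a high-confidence hit, its other candidate segments deserve extra credit, since query terms tend to recur within target documents.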


spoken language technology workshop | 2008

Open vocabulary spoken document retrieval by subword sequence obtained from speech recognizer

Go Kuriki; Yoshiaki Itoh; Kazunori Kojima; Masaaki Ishigame; Kazuyo Tanaka; Shi-wook Lee

We present a method for open-vocabulary retrieval based on a spoken document retrieval (SDR) system using subword models. The proposed approach to open-vocabulary SDR uses subword models but does not require subword recognition. Instead, subword sequences are obtained from the phone sequence output by a speech recognizer: for speech containing an out-of-vocabulary (OOV) word, the recognizer outputs a word sequence whose phone sequence is considered similar to that of the OOV word. When OOV words are provided in a query, the proposed system retrieves the target section by comparing the phone sequence of the query with that of the word sequence generated by the speech recognizer.
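The comparison of query and recognizer phone sequences can be sketched with a plain Levenshtein edit distance. This is a simplified stand-in for whatever subword matching the system actually uses; the distance threshold is an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (pa != pb)))  # substitution
        prev = cur
    return prev[-1]

def retrieve(query_phones, doc_phone_seqs, max_dist=1):
    """Indices of recognized phone sequences within max_dist edits of
    the OOV query's phone sequence."""
    return [i for i, seq in enumerate(doc_phone_seqs)
            if edit_distance(query_phones, seq) <= max_dist]
```

Because matching happens at the phone level, a query word never seen by the recognizer's vocabulary can still hit the acoustically similar word the recognizer substituted for it.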


international conference on acoustics speech and signal processing | 1999

Error correction for speaker-independent isolated word recognition through likelihood compensation using phonetic bigram

Hiroshi Matsuo; Masaaki Ishigame

We propose an error correction technique for speaker-independent isolated word recognition that compensates for a word's likelihood. The likelihood is compensated by a likelihood calculated with a phonetic bigram, a phoneme model expressing the frame correlation within an utterance. A speaker-independent isolated word recognition experiment showed that the proposed technique reduces recognition errors compared with conventional techniques. Without speaker adaptation, the proposed technique achieves performance almost equal to that of a conventional phoneme model adapted using several words.
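One plausible reading of the compensation step is a weighted combination of the word model's log-likelihood with the phonetic-bigram log-probability of the word's phone sequence. The interpolation weight and the table-lookup form of the bigram are illustrative assumptions, not the paper's formulation.

```python
import math

def compensated_score(word_loglik, phone_seq, bigram_logprob, weight=0.5):
    """Combine an HMM word log-likelihood with a phonetic-bigram
    log-probability summed over consecutive phone pairs.
    bigram_logprob: dict mapping (phone, next_phone) -> log probability."""
    bigram_ll = sum(bigram_logprob[(a, b)]
                    for a, b in zip(phone_seq, phone_seq[1:]))
    return word_loglik + weight * bigram_ll
```

Word hypotheses whose phone sequences have implausible frame-to-frame phonetic structure are thereby penalized relative to the acoustically similar correct word.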


Eurasip Journal on Audio, Speech, and Music Processing | 2008

Automatic music boundary detection using short segmental acoustic similarity in a music piece

Yoshiaki Itoh; Akira Iwabuchi; Kazunori Kojima; Masaaki Ishigame; Kazuyo Tanaka; Shi-wook Lee

The present paper proposes a new approach for detecting music boundaries, such as the boundary between music pieces or the boundary between a music piece and a speech section, for automatic segmentation of musical video data and retrieval of a designated music piece. The proposed approach is able to capture each music piece using acoustic similarity defined for short-term segments in the music piece. The short segmental acoustic similarity is obtained by means of a new algorithm called segmental continuous dynamic programming, or segmental CDP. The location of each music piece and its music boundaries are then identified by referring to multiple similar segments and their location information, avoiding oversegmentation within a music piece. The performance of the proposed method is evaluated for music boundary detection using actual music datasets. The present paper demonstrates that the proposed method enables accurate detection of music boundaries for both the evaluation data and a real broadcast music program.
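The underlying idea, that a music piece repeats itself while boundaries do not, can be sketched with a brute-force similar-segment search in place of segmental CDP. The segment length, the summed-absolute-difference distance, and the threshold are all simplifying assumptions for illustration.

```python
def similar_segment_pairs(frames, seg_len=3, threshold=1.0):
    """All (i, j) start pairs, i < j, whose seg_len-frame segments have
    a small summed absolute difference (a stand-in for segmental CDP)."""
    pairs = []
    n = len(frames)
    for i in range(n - seg_len + 1):
        for j in range(i + 1, n - seg_len + 1):
            d = sum(abs(frames[i + k] - frames[j + k]) for k in range(seg_len))
            if d <= threshold:
                pairs.append((i, j))
    return pairs

def covered_by_repetition(frames, seg_len=3, threshold=1.0):
    """Frame indices lying inside some pair of similar segments; gaps in
    this coverage are candidate music boundaries."""
    covered = set()
    for i, j in similar_segment_pairs(frames, seg_len, threshold):
        covered.update(range(i, i + seg_len))
        covered.update(range(j, j + seg_len))
    return covered
```

Frames that no repeated segment covers mark places where the self-similar structure of a piece breaks off, i.e., candidate boundaries.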


multimedia signal processing | 2007

Music Boundary Detection Using Similarity in a Music Selection

Yoshiaki Itoh; A. Iwabuchi; K. Kojima; Masaaki Ishigame; K. Tanaka; Shi-wook Lee

This paper proposes a new method of extracting music boundaries, such as a boundary between musical selections or a boundary between a musical selection and a speech section, for automatic segmentation of video data and other applications. The method utilizes acoustic similarity in a music selection. Similar partial sections are first extracted by means of a new algorithm called Segmental Continuous Dynamic Programming, or Segmental CDP. The music boundary is identified by reference to multiple similar sections and their location information, as extracted by Segmental CDP. The performance of the proposed method is evaluated for music boundary extraction using actual music data sets. The study demonstrates that the proposed method extracts music boundaries well for both evaluation data and a real broadcast music program.


Journal of the Acoustical Society of America | 2006

A proposal of discrimination method between voice and music using Gaussian mixture models and similarity in a music selection

Takuma Yoshida; Masaaki Ishigame; Yoshiaki Itoh; Kazunori Kojima

We previously proposed a method for automatic music boundary detection using similarity in a music selection. That method can capture the whole extent of a consecutive music selection, although the boundary positions are only roughly estimated. In this paper, we propose a new method that combines a GMM (Gaussian mixture model) discrimination method between music and voice with the previously proposed method. The GMM enables us to determine precise boundary positions. On the other hand, the GMM often misdetects points that are not actual boundaries, such as changes of musical instruments or the point of modulation within a music selection. We therefore exclude GMM misdiscriminations within a music selection using the previously proposed method, while realizing precise detection of music boundaries with the GMM. We conducted various experiments using open music selections provided by RWC, and the results showed the proposed method could improve the performa...
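The combination can be sketched as a smoothing pass: per-frame GMM music/voice decisions are overridden inside the spans that the similarity method marked as one continuous music selection, while frames outside those spans keep the GMM's fine-grained decisions. The label encoding and span format here are assumptions for illustration.

```python
def smooth_labels(frame_labels, music_spans):
    """frame_labels: per-frame GMM decisions ('m' music / 'v' voice).
    music_spans: (start, end) spans the similarity method identified as
    single continuous music selections.  Inside a span, GMM 'v'
    mislabels are overridden to 'm'; elsewhere the GMM decision stands,
    preserving its precise boundary positions."""
    out = list(frame_labels)
    for start, end in music_spans:
        for i in range(start, end):
            out[i] = 'm'
    return out
```

Each component covers the other's weakness: the similarity method suppresses spurious GMM boundaries inside a piece, and the GMM sharpens the roughly located span edges.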


international conference on parallel and distributed systems | 2000

Network based genetic algorithm

Kazunori Kojima; Masaaki Ishigame; Wataru Kawamata; Hiroshi Matsuo

Parallel genetic algorithms are effective for solving large problems. Most of them are implemented on massively parallel computers, and their efficiency depends on the parallel computing system, making them inappropriate for a distributed computing system connected by a network. This paper proposes a client-server approach to parallel genetic algorithms with a delegate management model, in which the server manages string exchange between subpopulations and inter-subpopulation communication is eliminated. The model is easy to port and implement without any parallel computing system. Experiments solving the traveling salesman problem (100 cities) were carried out, and the results show the effectiveness of the proposed model.
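The server-side exchange step can be sketched as follows: the server collects the best individuals from each client subpopulation and injects them into the others, so clients never communicate with one another directly. The elite-selection and worst-replacement policy is an illustrative assumption, not necessarily the paper's exact delegate management rule.

```python
def server_exchange(subpops, fitness, k=1):
    """Delegate-management sketch: take the best k individuals of each
    client subpopulation and inject them into every other subpopulation,
    replacing that subpopulation's worst individuals."""
    elites = [sorted(p, key=fitness, reverse=True)[:k] for p in subpops]
    new_pops = []
    for idx, pop in enumerate(subpops):
        migrants = [e for j, es in enumerate(elites) if j != idx for e in es]
        kept = sorted(pop, key=fitness, reverse=True)[:len(pop) - len(migrants)]
        new_pops.append(kept + migrants)
    return new_pops
```

In the client-server setting this routine runs only on the server between client generations, so no parallel computing infrastructure is required.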


Journal of the Acoustical Society of America | 1996

Training section tolerant HMM in concatenated training

Hiroshi Matsuo; Masaaki Ishigame

Concatenated training of phoneme HMMs can use speech data without hand labels, but it tends to decrease the recognition rate because of improper training sections. Speaker-independent isolated word recognition experiments showed that a non-left-right HMM is more tolerant of the quality of the training section in concatenated training than the conventional left-to-right HMM. The non-left-right HMM has a structure in which state transitions within a phoneme are ergodic and state transitions between two successive phonemes are left-to-right. The non-left-right HMM and the conventional left-to-right HMM show much the same performance as long as the training section is given properly by hand labels. When the training section contains undesirable data, the recognition rate for both HMMs decreases, but the decrease for the non-left-right HMM is smaller than that for the conventional left-to-right HMM. A similar tendency was shown when the training section w...
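The transition structure described above can be sketched by building the allowed-transition set for a word HMM: fully connected (ergodic) within each phoneme, forward-only between successive phonemes. This reading, any state of one phoneme to any state of the next, is one plausible interpretation of "left-to-right between phonemes", not a definitive reconstruction.

```python
def non_left_right_transitions(phoneme_states):
    """phoneme_states: list of state-index lists, one per phoneme in the
    word.  Returns the set of allowed (from_state, to_state) pairs."""
    allowed = set()
    for states in phoneme_states:
        for s in states:
            for t in states:
                allowed.add((s, t))      # ergodic inside a phoneme
    for prev, nxt in zip(phoneme_states, phoneme_states[1:]):
        for s in prev:
            for t in nxt:
                allowed.add((s, t))      # forward-only between phonemes
    return allowed
```

Because states within a phoneme can be revisited in any order, frames misassigned by an imprecise training section do less damage than in a strict left-to-right topology.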

Collaboration


Masaaki Ishigame's top co-authors:

Kazunori Kojima (Iwate Prefectural University)

Yoshiaki Itoh (National Institute of Advanced Industrial Science and Technology)

Shi-wook Lee (National Institute of Advanced Industrial Science and Technology)

Go Kuriki (Iwate Prefectural University)

Akira Iwabuchi (Iwate Prefectural University)