Publications


Featured research published by Kari Torkkola.


International Symposium on Neural Networks | 1992

LVQPAK: A software package for the correct application of Learning Vector Quantization algorithms

Teuvo Kohonen; Jari Kangas; Jorma Laaksonen; Kari Torkkola

An overview of the software package LVQPAK, which has been developed for convenient and effective application of learning vector quantization algorithms, is presented. Two new features are included: fast conflict-free initial distribution of codebook vectors into the class zones and the optimized-learning-rate algorithm OLVQ1.
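
The OLVQ1 rule mentioned above gives each codebook vector its own learning rate; the sketch below shows only the basic LVQ1 update, which the optimized variant builds on. It is a minimal NumPy illustration, not the LVQ_PAK implementation: class counts, dimensions, and the learning-rate value are assumptions made for the example.

```python
import numpy as np

def lvq1_train(X, y, codebook, labels, alpha=0.05, epochs=10, seed=None):
    """Basic LVQ1: pull the nearest codebook vector towards samples of its
    own class and push it away from samples of other classes."""
    rng = np.random.default_rng(seed)
    codebook = codebook.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            x, c = X[i], y[i]
            # nearest codebook vector (best-matching unit)
            k = np.argmin(np.linalg.norm(codebook - x, axis=1))
            sign = 1.0 if labels[k] == c else -1.0
            codebook[k] += sign * alpha * (x - codebook[k])
    return codebook

# Illustrative usage with random data: two classes in 2-D, codebook
# initialised from a few training samples of each class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
labels = np.array([0, 0, 1, 1])
init = np.vstack([X[y == l][:2] for l in (0, 1)])
codebook = lvq1_train(X, y, init, labels, seed=1)
```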


International Conference on Acoustics, Speech, and Signal Processing | 1993

An efficient way to learn English grapheme-to-phoneme rules automatically

Kari Torkkola

An efficient way to learn grapheme-to-phoneme mapping rules for English automatically, using Kohonen's concept of dynamically expanding context, is presented. The method constructs rules that are most general in the sense of an explicitly defined specificity hierarchy; as the hierarchy, the amount of expanding context around the symbol to be transformed, weighted towards the right, is used. To apply this concept to English text-to-speech mapping, the authors have used the 20008-word corpus provided in the public domain by T. Sejnowski and C.R. Rosenberg (Complex Systems, vol. 1, no. 1, pp. 145-168, 1987), which was also used in the NETtalk experiments. Phoneme-level mapping accuracies of 91% on data not used in training demonstrate that the dynamically expanding context captures the context-dependent relationships in the corpus quite efficiently.
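
The specificity hierarchy can be pictured as a rule table keyed by progressively longer contexts around the target letter, with the most specific matching rule winning. The toy sketch below illustrates that lookup only; the rule entries are invented, and for simplicity it expands just the right context, whereas the paper weights expansion towards the right rather than ignoring the left entirely.

```python
# Hypothetical rules: (left context, letter, right context) -> phoneme.
RULES = {
    ("", "c", ""): "k",
    ("", "c", "e"): "s",
    ("", "c", "h"): "tS",
}

def transcribe_letter(word, i, max_context=2):
    """Return the phoneme for word[i], preferring the most specific
    (longest-context) rule; fall back to the letter itself."""
    for width in range(max_context, -1, -1):
        key = ("", word[i], word[i + 1:i + 1 + width])
        if key in RULES:
            return RULES[key]
    return word[i]

print([transcribe_letter("cell", i) for i in range(4)])   # ['s', 'e', 'l', 'l']
```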


International Conference on Acoustics, Speech, and Signal Processing | 1988

Phonetic typewriter for Finnish and Japanese

Teuvo Kohonen; Kari Torkkola; M. Shozakai; Jari Kangas; Olli Ventä

A microprocessor-based real-time speech recognition system is described. It is able to produce orthographic transcriptions for arbitrary words or phrases uttered in Finnish or Japanese, and it can also be used as a large-vocabulary isolated word recognizer. The acoustic processor of the system, which transcribes speech into phonemes, is based on neural network principles: so-called phonotopic maps constructed by a self-organizing process are employed. Coarticulation effects in the phonetic transcriptions are compensated for by means of automatically derived rules that describe the morphology of errors at the acoustic processor output. Without applying any language model, the recognition result is correct up to 92 or even 97 percent in terms of individual letters.
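
A phonotopic map is a self-organizing map trained on short-time spectral vectors, so that neighbouring map units come to respond to acoustically similar sounds. Below is a minimal self-organizing-map training sketch; the map size, feature dimensionality, and training schedule are illustrative assumptions, not the values used in the described system.

```python
import numpy as np

def train_som(data, rows=8, cols=12, epochs=20, sigma0=3.0, alpha0=0.5, seed=0):
    """Train a small self-organizing map with a Gaussian neighbourhood."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            t = step / n_steps
            alpha = alpha0 * (1 - t)              # shrinking learning rate
            sigma = max(sigma0 * (1 - t), 0.5)    # shrinking neighbourhood
            d = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(d), d.shape)   # best-matching unit
            # Gaussian neighbourhood around the BMU on the map grid
            g = np.exp(-np.sum((grid - np.array(bmu)) ** 2, axis=2)
                       / (2 * sigma ** 2))
            weights += alpha * g[..., None] * (x - weights)
            step += 1
    return weights

# Illustrative usage with random "spectral" vectors (15-dimensional).
som = train_som(np.random.default_rng(1).normal(size=(500, 15)))
```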


Journal of the Acoustical Society of America | 1993

Self‐organized acoustic feature map in detection of fricative‐vowel coarticulation

Lea Leinonen; Tapio Hiltunen; Kari Torkkola; Jari Kangas

The self-organizing map, a neural network algorithm of Kohonen, was used for the detection of coarticulatory variation of fricative [s] preceding vowels [a:], [i:], and [u:]. The results were compared with the psychoacoustic classification of the same samples to find out whether the map had extracted perceptually meaningful features of [s]. The map distinguished samples of [s] in front of [u:] from those in front of [a:] or [i:] throughout the fricative duration. Samples of [s] preceding [a:] and [i:] were distinguished from each other only just before (about 40 ms) the vowel onset. The results agreed with the perceptual classifications. Most judgments (82%) of [s] in front of [u:] were correct, and this variant of [s] was recognized from the first and second halves of segmented fricatives equally well. Samples of [s] in front of [a:] and [i:] were distinguished from each other less accurately. When halves of segmented [s] were perceptually judged, the differentiation between the following [a] and [i] was possible only on the basis of the second half of the fricative. The results demonstrate that the self-organizing map is a useful tool for the extraction of intersubject regularities in speech spectra. The map also provides an easily understandable, on-line visualization of speech that can be used as feedback in therapy and education.


International Conference on Acoustics, Speech, and Signal Processing | 1988

Automatic alignment of speech with phonetic transcriptions in real time

Kari Torkkola

A system to align speech waveforms with the corresponding phonetic transcriptions is described. The alignment is mainly based on labeling speech frames, a centisecond apart, into phonetic classes; a novel method based on neural network principles is used to accomplish the labeling. Another major source of information utilized is spectral stationarity. The alignment is performed in two main stages. First, a list of phonetic events having stationary properties is constructed, and the phonetic transcription is roughly aligned with this list. A more detailed boundary refinement is then carried out using heuristic speech-specific knowledge. The system runs on a standard IBM PC/AT in real time. It is used for on-line speaker enrollment and syntactic correction analysis, in addition to establishing a database for speech recognition research.
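
The rough alignment stage relies on finding spectrally stationary stretches of the signal to match against the transcription. The sketch below shows one simple way to score per-frame spectral stationarity; the feature dimensionality, frame rate, and threshold are assumptions for illustration, not the criteria used in the system.

```python
import numpy as np

def stationary_regions(features, threshold=0.5):
    """Mark frames whose spectrum changes little relative to its neighbours.

    `features` is a (n_frames, n_coeffs) array of short-time feature
    vectors, e.g. one frame per centisecond. Returns a boolean mask."""
    # spectral change between consecutive frames
    delta = np.linalg.norm(np.diff(features, axis=0), axis=1)
    delta = np.concatenate([[delta[0]], delta])        # pad back to n_frames
    return delta < threshold * delta.mean()

# Illustrative usage: five quasi-steady segments with small jitter.
rng = np.random.default_rng(0)
frames = np.repeat(rng.normal(size=(5, 15)), 20, axis=0)
frames += 0.01 * rng.normal(size=frames.shape)
mask = stationary_regions(frames)
print(mask.sum(), "of", len(mask), "frames judged stationary")
```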


International Conference on Acoustics, Speech, and Signal Processing | 1991

Using the topology-preserving properties of SOFMs in speech recognition

Kari Torkkola; M. Kokkonen

Self-organizing feature maps (SOFMs) are used as speech feature extractors followed by a classifier based on multilayer feedforward networks. Usually SOFMs have been used in speech recognition as static pattern classifiers or vector quantizers, ignoring their property of preserving the local topology of the input pattern space. Here, the topological ordering of the acoustic speech data in the SOFM is utilized to form trajectories in the map, which are then fed into a classifier. Viewing the trajectories at multiple resolution levels yields feature vectors that take contextual information into account. Experiments with such feature vectors indicate that better accuracies can be obtained than by using a simple SOFM classifier based on instantaneous acoustic features.
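
The trajectory idea can be sketched as follows: each frame is replaced by the grid coordinates of its best-matching map unit, and a feature vector is formed by averaging those coordinates over windows of several sizes centred on the frame. The code below is only an illustration of that construction; the window sizes and map shape are assumptions, not the paper's configuration.

```python
import numpy as np

def bmu_trajectory(frames, som_weights):
    """Map each frame to the (row, col) coordinates of its best-matching unit."""
    rows, cols, dim = som_weights.shape
    flat = som_weights.reshape(-1, dim)
    idx = np.argmin(np.linalg.norm(flat[None, :, :] - frames[:, None, :],
                                   axis=2), axis=1)
    return np.stack(np.unravel_index(idx, (rows, cols)), axis=1).astype(float)

def multires_features(traj, windows=(1, 3, 7)):
    """Concatenate trajectory averages over several window sizes per frame,
    so each feature vector carries context at multiple resolutions."""
    n = len(traj)
    feats = []
    for w in windows:
        half = w // 2
        padded = np.pad(traj, ((half, half), (0, 0)), mode="edge")
        feats.append(np.stack([padded[i:i + w].mean(axis=0) for i in range(n)]))
    return np.concatenate(feats, axis=1)

# Illustrative usage with random frames and a random 8x12 "map".
rng = np.random.default_rng(0)
som = rng.normal(size=(8, 12, 15))
frames = rng.normal(size=(50, 15))
features = multires_features(bmu_trajectory(frames, som))
print(features.shape)   # (50, 6): (row, col) averaged at three resolutions
```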


Speech Communication | 1994

Mapping context dependent acoustic information into context independent form by LVQ

Jyri Mäntysalo; Kari Torkkola; Teuvo Kohonen

In the framework of phonemic speech recognition using Hidden Markov Models (HMMs) together with codebooks trained by Learning Vector Quantization (LVQ), a novel way to model context dependencies in speech is presented. We use LVQ to map context-dependent acoustic data into context-independent phonemic form. The acoustic data take the form of concatenated averages of successive short-time feature vectors. This mapping eliminates the need to employ context-dependent phonemic HMMs, for example triphone HMMs, and the difficulties associated with them; simpler context-independent discrete-observation HMMs suffice instead. We report excellent results for a speaker-dependent task in Finnish.
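
One way to read the construction above: average a few adjacent blocks of consecutive short-time feature vectors, concatenate the block means into one context-carrying vector, and label it with the phoneme of the nearest codebook vector. The sketch below illustrates that pipeline under assumed block sizes and a random codebook; it is not the paper's exact configuration.

```python
import numpy as np

def context_vector(frames, t, block=3, n_blocks=3):
    """Concatenate averages of `n_blocks` consecutive blocks of `block`
    frames centred on frame t, yielding one context-carrying vector."""
    span = block * n_blocks
    start = int(np.clip(t - span // 2, 0, len(frames) - span))
    window = frames[start:start + span]
    return window.reshape(n_blocks, block, -1).mean(axis=1).ravel()

def lvq_label(x, codebook, labels):
    """Context-independent phoneme label: label of the nearest codebook vector."""
    return labels[np.argmin(np.linalg.norm(codebook - x, axis=1))]

# Illustrative usage: 15-dim frames, a tiny codebook over 45-dim context vectors.
rng = np.random.default_rng(0)
frames = rng.normal(size=(100, 15))
codebook = rng.normal(size=(10, 45))
labels = np.array(list("aeiouklmns"))
x = context_vector(frames, t=50)
print(x.shape, lvq_label(x, codebook, labels))
```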


International Conference on Acoustics, Speech, and Signal Processing | 1992

Using SOMs as feature extractors for speech recognition

Jari Kangas; Kari Torkkola; Mikko Kokkonen

The authors demonstrate that the self-organizing maps (SOMs) of Kohonen can be used as speech feature extractors that are able to take temporal context into account. They have investigated two alternatives for using SOMs as such feature extractors, one based on tracing the location of highest activity on a SOM, the other on integrating the activity of the whole SOM for a period of time. The experiments indicated that an improvement is achievable by using these methods.
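
The second alternative, integrating map activity over time, can be pictured by turning each frame's distances to the map units into a soft activity surface and summing those surfaces over a window. The sketch below assumes a Gaussian unit response and a fixed window length; both are illustrative choices, not the paper's.

```python
import numpy as np

def som_activity(frame, som_weights, width=1.0):
    """Soft activity of every map unit for one frame: a Gaussian of the
    distance between the frame and each unit's weight vector."""
    d = np.linalg.norm(som_weights - frame, axis=2)
    return np.exp(-(d ** 2) / (2 * width ** 2))

def integrated_activity(frames, som_weights, window=5):
    """Sum the activity surfaces of `window` consecutive frames and flatten
    the result into one feature vector."""
    acts = np.stack([som_activity(f, som_weights) for f in frames[:window]])
    return acts.sum(axis=0).ravel()

# Illustrative usage with random frames and a random 8x12 map of 15-dim units.
rng = np.random.default_rng(0)
som = rng.normal(size=(8, 12, 15))
frames = rng.normal(size=(5, 15))
print(integrated_activity(frames, som).shape)   # (96,)
```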


Speech Communication | 1990

Using self-organizing maps and multi-layered feed-forward nets to obtain phonemic transcriptions of spoken utterances

Mikko Kokkonen; Kari Torkkola

A new approach to constructing phonemic transcriptions of spoken utterances is described. The Self-Organizing Feature Maps of Kohonen are first applied to vector-quantize speech into a sequence of phoneme labels a centisecond apart. This code sequence is converted into a phoneme string using a multi-layered feed-forward network trained with error back-propagation. The trained network acts as a filter removing undesired transitional and coarticulatory effects from the code sequence, which makes it an almost trivial task to convert the code sequence into a phoneme sequence. The need for any statistical speech models, such as Hidden Markov Models, is thus eliminated. The new approach is compared to an existing one used in a speech recognition system, in which simple durational rules perform the same transformation. The accuracy of the produced phonemic transcriptions is 4.8 percentage points better using the proposed multi-layered network approach (88.4% as opposed to 83.6%).
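
The filtering step can be sketched as a feed-forward network that classifies a sliding window of one-hot-encoded quantization labels into a phoneme per centre frame, after which runs of identical labels are collapsed into the phoneme string. The sketch below uses scikit-learn's MLPClassifier purely for brevity; the window length, label inventory, and network size are assumptions, and the original work used its own back-propagation training rather than this library.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def window_onehot(codes, n_codes, width=5):
    """One-hot encode a window of `width` quantization labels around each frame."""
    half = width // 2
    padded = np.pad(codes, half, mode="edge")
    eye = np.eye(n_codes)
    return np.stack([eye[padded[i:i + width]].ravel() for i in range(len(codes))])

def collapse(labels):
    """Collapse runs of identical frame labels into a phoneme string."""
    return "".join(l for i, l in enumerate(labels) if i == 0 or l != labels[i - 1])

# Illustrative usage with synthetic data: quantization codes 0..19 and frame
# labels from a tiny phoneme set. Real sequences would come from the SOM
# front end described above.
rng = np.random.default_rng(0)
codes = rng.integers(0, 20, size=400)
frame_labels = np.array(list("aeiou"))[codes % 5]          # toy mapping to learn
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(window_onehot(codes, 20), frame_labels)
predicted = clf.predict(window_onehot(codes[:50], 20))
print(collapse(list(predicted)))
```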


Folia Phoniatrica et Logopaedica | 1993

Acoustic Pattern Recognition of /s/ Misarticulation by the Self-Organizing Map

Riitta Mujunen; Lea Leinonen; Jari Kangas; Kari Torkkola

The [s] samples of 11 women, psychoacoustically classified as acceptable/unacceptable, were studied with the self-organizing map, the neural network algorithm of Kohonen. The measurement map had been previously computed with nondisordered speech samples. Fifteen-component spectral vectors, analyzed with the map, were calculated from short-time FFT spectra at 10-ms intervals. The degree of audible acceptability correlated with the location of the sample on the map. Spectral model vectors in different map locations depicted distinguishing spectral features in the [s] samples analyzed. The results demonstrate that self-organized maps are suitable for the extraction and measurement of acoustic features underlying psychoacoustic classifications, and for on-line visual imaging of speech.
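
The measurement pipeline described above, a short-time FFT every 10 ms reduced to a 15-component spectral vector and then located on a pre-trained map, can be sketched as follows. The window length, the crude grouping into 15 bands, and the random map are assumptions for illustration, not the exact analysis of the study.

```python
import numpy as np

def spectral_vectors(signal, fs=16000, frame_ms=20, hop_ms=10, n_bands=15):
    """Short-time FFT magnitude spectra averaged into `n_bands` components,
    one vector every `hop_ms` milliseconds."""
    frame = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    window = np.hanning(frame)
    vectors = []
    for start in range(0, len(signal) - frame + 1, hop):
        spectrum = np.abs(np.fft.rfft(signal[start:start + frame] * window))
        bands = np.array_split(spectrum, n_bands)     # crude band grouping
        vectors.append([b.mean() for b in bands])
    return np.array(vectors)

def map_locations(vectors, som_weights):
    """Locate each spectral vector on a (rows, cols, n_bands) map."""
    rows, cols, _ = som_weights.shape
    d = np.linalg.norm(som_weights[None] - vectors[:, None, None, :], axis=3)
    return np.array([np.unravel_index(np.argmin(di), (rows, cols)) for di in d])

# Illustrative usage with a synthetic noise burst and a random map.
rng = np.random.default_rng(0)
signal = rng.normal(size=16000)                       # 1 s of noise at 16 kHz
som = rng.normal(size=(8, 12, 15))
vecs = spectral_vectors(signal)
print(vecs.shape, map_locations(vecs, som)[:3])
```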

Collaboration


Dive into Kari Torkkola's collaboration.

Top Co-Authors

Teuvo Kohonen (Helsinki University of Technology)
Jari Kangas (Helsinki University of Technology)
Mikko Kokkonen (Helsinki University of Technology)
Jyri Mäntysalo (Helsinki University of Technology)
Anja Juvas (Helsinki University Central Hospital)
Olli Ventä (Helsinki University of Technology)