Hindi-English Code-Switching Speech Corpus
Ganji Sreeram, Kunal Dhawan and Rohit Sinha
Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati-781039, India.
Abstract
Code-switching refers to the usage of two languages within a sentence or discourse. It is a global phenomenon among multilingual communities and has emerged as an independent area of research. With the increasing demand for code-switching automatic speech recognition (ASR) systems, the development of a code-switching speech corpus has become highly desirable. However, very limited code-switched resources are available as yet for training such systems. In this work, we present our first efforts in building a code-switching ASR system in the Indian context. For that purpose, we have created a Hindi-English code-switching speech database. The database not only contains speech utterances with code-switching properties but also covers session and speaker variations such as pronunciation, accent, age, gender, etc. This database can be applied in several speech signal processing applications, such as code-switching ASR, language identification, language modeling, speech synthesis, etc. This paper mainly presents an analysis of the statistics of the collected code-switching speech corpus. Later, the performance results for the ASR task are reported for the created database.
Keywords: code-switching, speech corpus, automatic speech recognition
1. Introduction
In multilingual communities, speakers often switch or mix between two or more languages or language varieties during communication in their day-to-day lives. In linguistics, this phenomenon is referred to as code-switching [1, 2]. It poses some interesting research challenges to the speech recognition [3, 4, 5], language identification [6] and language modelling [7, 8, 9] domains. Over the years, due to urbanization and geographical distribution, people have moved from one place to another for a better livelihood. Hence, communicating in two or more languages helps to interact better with people from different places and cultures. There are many reasons for the occurrence of code-switching. People belonging to bilingual communities say that the main reason for code-switching between languages is the lack of words in the vocabulary of their native language [10]. According to [11, 12, 13, 14], some possible reasons for code-switching are: (i) to qualify the message by emphasizing specific words, (ii) to convey a personalized message, (iii) to maintain confidentiality during verbal communication, (iv) to show expertise, authority, status, etc. Another reason for code-switching is to enrich communication between speakers without any change in the situation. Hence, a native language speaker actively embeds meanings into the conversation by mixing non-native language words [15]. Based on the locations of the non-native words, code-switching can be broadly classified into two modes. When the switching happens within the sentence, it is referred to as intra-sentential code-switching, and when it predominantly happens at the sentence boundary, it is referred to as inter-sentential code-switching [16]. The intra-sentential mode of switching is a common phenomenon and has become an identifying characteristic of bilingual communities.

* Corresponding author. Email address: {s.ganji, k.dhawan, rsinha}@iitg.ac.in (Ganji Sreeram, Kunal Dhawan and Rohit Sinha)

Over the years, the English language has become the most widely spoken language in the world. After gaining independence from British rule, though the Indian constitution declared Hindi as the primary official language, the usage of English was continued as a secondary language for its dominance in administration, education, and law [17]. Thereby, the urban population has started a trend to communicate in English for economic and social purposes. Over the years, substantial code-switching to English while speaking Hindi, as well as many other Indian languages, has become a common feature [18, 19]. Note that 41.1% of the Indian population are native Hindi speakers and hence switching between Hindi and English is very common. Also, in the recent past, researchers have reported that the native language of the speaker influences foreign (non-native) language acquisition [20]. In India, English is taught in schools from the elementary level across the country, but very few schools are able to impart correct English pronunciations devoid of native language influences to their pupils. The recent works [21, 22] have highlighted that the code-switching phenomenon is also observed in chats, comments, and messages posted on social media sites like Facebook, Twitter, WhatsApp, YouTube, etc. Table 1 shows a few example sentences of different modes of code-switching while highlighting the differences in the contextual information carried by the non-native words. In Type-1 intra-sentential code-switching, the non-native language words either occur in sequence or form a phrase, and thus carry some contextual information. In the Type-2 case, by contrast, the non-native language words are embedded into the native language sentences in such a manner that virtually no contextual information can be derived from those words. Also, we observe that the majority of code-switched sentences belong to the Type-2 intra-sentential mode. However, due to the lack of domain-specific resources, the research activity is somewhat limited.

Monolingual automatic speech recognition (ASR) systems may be capable of recognizing a few words from a foreign language but are unable to handle a significant amount of code-switching in the data. On account of the existence of different variants of English pronunciations and code-switching effects, the development of an ASR system for Hindi-English (Hinglish) code-switching speech data is a challenging task. To the best of our knowledge, there is no large-sized Hinglish corpus available for carrying out the research. Towards addressing that constraint, we recently created a Hinglish corpus covering all typical sources of variation such as accent, session, channel, age, gender, etc. In this work, we describe the details of that corpus and also present a basic experimental evaluation performed on it.

The remainder of this paper is organized as follows: In Section 2, we review the code-switching corpora currently reported in the literature. In Section 3, the details of the Hinglish speech and text corpus, along with the lexical resources necessary for developing the Hinglish ASR system, are presented. Section 4 provides a statistical analysis of the database. The experimental evaluations using the created Hinglish corpus are presented in Section 5. The paper is concluded in Section 6.

Table 1: Example Hinglish sentences showing the inter-sentential code-switching and the variants of the intra-sentential code-switching. Type-1 and Type-2 variants of intra-sentential code-switching refer to high and low contextual information being carried by the non-native (English) words, respectively.

Hindi:
    वर्ग और वस्तु के बीच रिश्ता क्या है?
Acceptable code-switching:
    class और object के बीच relationship क्या है?
    वर्ग और वस्तु के बीच relationship क्या है?
Odd code-switching:
    class और object के between रिश्ता what है?
    वर्ग और वस्तु के बीच relationship what है?

Hinglish:
    क्या आप मुझे deccan queen का departure time बता सकते है
    कृपया मुझे मेरा current account balance बताएं
    functions के नाम भी lowercase letter से ही शुरू होते हैं
    मेरा atm card खो गया है तो मैं अपने payment को कैसे रोक सकता हूँ
Hindi:
    क्या आप मुझे deccan रानी का प्रस्थान का समय बता सकते हैं
    कृपया मुझे मेरा चालू खाता शेष राशि बताएं
    कार्यों के नाम भी छोटे अक्षर से ही शुरू होते हैं
    मेरा atm पत्रक खो गया है तो मैं अपने भुगतान को कैसे रोक सकता हूँ
English:
    can you tell me the departure time of deccan queen
    please tell me my current account balance
    the names of the functions also start with a lowercase letter
    my atm card is lost so how can I stop my payment

Inter-sentential:
    she is the daughter of ceo, वह यहाँ दो दिन के लिए आई है
    मुझे अमेरीका में चार साल हो गए, but I still miss my country
Intra-sentential, Type-1:
    मुझे मेरा current account balance जानना है
    भारत में popular free virtual credit card services कितनी हैं
Intra-sentential, Type-2:
    अपने budget के अनुसार investments कर सकते हैं
    class और object के बीच relationship क्या है
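The Type-1 / Type-2 distinction introduced above can be illustrated with a small sketch. It assumes that English words are written in Latin script and Hindi words in Devanagari, as in the example sentences, and it treats a run of two or more consecutive English words as a phrase. The function names and the `min_phrase_len` threshold are illustrative assumptions, not part of the corpus tooling; the sketch also does not attempt to detect inter-sentential switching.

```python
import re

# Assumption: an English token is one made only of Latin letters
# (optionally with apostrophes or hyphens); everything else is Hindi.
LATIN_WORD = re.compile(r"^[A-Za-z][A-Za-z'\-]*$")


def english_runs(sentence):
    """Return the lengths of consecutive runs of English tokens."""
    runs, current = [], 0
    for token in sentence.split():
        if LATIN_WORD.match(token.strip(".,?!;:")):
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def intra_sentential_type(sentence, min_phrase_len=2):
    """Label a sentence as Type-1, Type-2 or monolingual (hypothetical rule)."""
    runs = english_runs(sentence)
    if not runs:
        return "monolingual"
    # Type-1: English words occur in sequence or form a phrase.
    # Type-2: only isolated English words are embedded.
    return "Type-1" if max(runs) >= min_phrase_len else "Type-2"
```

For instance, a sentence containing the phrase "current account balance" would be labelled Type-1, while a sentence with the isolated words "budget" and "investments" would be labelled Type-2.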
2. Literature Review on Code-switching Corpora
In the literature, a few code-switching speech corpora have already been reported, and they cover different native and non-native language combinations. In the following, we briefly review those code-switching corpora while summarizing their salient attributes.
• The CUMIX Cantonese-English code-switching speech corpus was developed by Joyce Y. C. Chan, et al., at the Chinese University of Hong Kong [23]. It contains code-switched speech utterances read by the speakers. The database contains 17 hours of data read by 40 speakers.
• A small Mandarin-Taiwanese code-switching speech corpus was developed for testing purposes in [24] by Dau-Cheng Lyu and Ren-Yuan Lyu. The corpus contains 4000 Mandarin-Taiwanese code-switching utterances recorded from 16 speakers.
• The English-Spanish code-switching speech corpus was compiled by Franco J. C. and Solorio at the University of Texas [25]. The corpus contains 40 minutes of transcribed spontaneous conversations of 3 speakers.
• SEAME is a Mandarin-English code-switching conversational speech corpus developed by Dau-Cheng Lyu and Tien Ping Tan from Nanyang Technological University, Singapore, and Universiti Sains Malaysia [26, 27]. The database contains 63 hours of spontaneous Mandarin-English code-switching interview and conversational speech uttered by 157 Singaporean and Malaysian speakers.
• Han-Ping Shen, et al., developed CECOS, a Chinese-English code-switching speech corpus, at the National Cheng Kung University in Taiwan [28]. It contains 12 .
• A small Hindi-English code-switching speech corpus was collected by Anik Dey and Pascale Fung at the Hong Kong University of Science and Technology. This corpus is primarily made up of student interview speech [14]. It comprises about 30 minutes of data collected from 9 speakers.
• A Sepedi-English code-switching speech corpus was created by the South African CSIR [29]. The database consists of 10 hours of prompted speech, sourced from radio broadcasts and read by 20 Sepedi speakers.
• Emre Yilmaz, et al., developed FAME!, a Frisian-Dutch code-switching speech corpus of radio broadcast speech, at Radboud University, Nijmegen [30]. The recordings are collected from the archives of Omrop Fryslân, the regional public broadcaster of the province of Fryslân. The database covers a time span of almost 50 years.
• The Malay-English corpus developed by Basem H. A. Ahmed, et al., consists of 100 hours of Malaysian Malay-English code-switching speech data from 120 Chinese, 72 Malay and 16 Indian speakers [5].
• MediaParl is a Swiss accented bilingual database developed by David Imseng, et al., that contains recordings in both French and German as they are spoken in Switzerland. The data was recorded at the Valais Parliament. Valais is a bilingual Swiss canton with many local accents and dialects [31].
• FACST, a French-Arabic speech corpus, consists of recordings of code-switching read and conversational utterances by 20 bilingual adult speakers who tend to code-switch in their daily lives [32]. It is about 7 .
• A South African speech corpus containing English-isiZulu, English-isiXhosa, English-Setswana, and English-Sesotho code-switching speech utterances was created from South African soap operas by Ewald van der Westhuizen and Thomas Niesler. The soap opera speech is typically fast, spontaneous and may express emotion, with a speech rate higher than prompted speech in the same languages [33].
• An Arabic-English code-switching speech corpus was recently developed by Injy Hamed, et al., by conducting interviews with 12 participants [34].

From the literature review, it can be noted that only very small code-switching acoustic and linguistic resources have been available so far for the Indian context. This motivated us to create moderate-sized Hinglish resources so that current technological advances in acoustic and language modeling can be explored for the Hinglish ASR task.
3. Creation of Hinglish Corpus
This section describes the details of the creation of the Hinglish (code-switching) corpus. Firstly, we describe the context and means employed for the creation of the Hinglish sentences. Secondly, the procedure followed by the speakers while recording the speech data corresponding to the created Hinglish sentences is described. Finally, the creation of the lexical resources is discussed.
For experimental purposes, the Hinglish code-switching text data has been collected by crawling a few blogging websites with different contexts, including https://shoutmehindi.com and https://notesinhinglish.blogspot.in. The crawled data is normalized into meaningful sentences and further processed to remove extra spaces, special characters, emoticons, etc. The data thus obtained is used for training the language models, creating the lexicon, and also as the text transcription for recording the acoustic data. The salient details of the created Hinglish code-switching text corpus are summarized in Table 2.

Table 2:
Hindi    English    Hindi    English
13,071   179,798    71,143   3,649   4,980
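The cleaning step described above (removing extra spaces, special characters and emoticons) could look roughly like the following sketch. The character whitelist is an assumption: it keeps the Devanagari Unicode block, Latin letters and digits, and drops everything else. The function name `normalize_sentence` is hypothetical, and the actual rules used for the corpus are not specified in the text.

```python
import re

# Assumption: keep Devanagari (U+0900-U+097F), Latin letters, digits and
# whitespace; any other character (punctuation, emoticons, symbols) is dropped.
_DISALLOWED = re.compile(r"[^\u0900-\u097FA-Za-z0-9\s]")


def normalize_sentence(text):
    """Strip disallowed characters and collapse runs of whitespace."""
    cleaned = _DISALLOWED.sub(" ", text)
    return " ".join(cleaned.split())
```

Note that this whitelist also keeps the danda (।), which lies inside the Devanagari block, and removes zero-width joiners; a production pipeline would need to decide on both explicitly.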
The Hinglish code-switching acoustic data is recorded over the phone from speakers belonging to different states in India. A consultant was hired to enroll the speakers, who called a toll-free number from their mobile phones. The speakers called from various acoustic environments such as home, office, etc. Each speaker was given 100 unique sentences taken from the above-processed text data. These 100 sentences are partitioned into 5 groups of 20 sentences each. Each speaker is requested to record those 5 groups in 5 different sessions in order to capture session variations such as emotions, environment, etc. It is worth highlighting that the duration of the sentences given to each speaker varies from 2 to 30 seconds. Each speaker took about 10 minutes to record the 20 sentences in each session. On average, each speaker took about 50 minutes to record all 100 sentences. The volunteering speakers were compensated with ₹250 for their time and effort.

The speech data is recorded at 8 kHz sampling frequency and a bit rate of 128. This set of speech files was manually inspected and pruned. At the end of the data collection phase, the Hinglish code-switching database contained 7,005 utterances in total spoken by 71 speakers.
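The per-speaker recording plan described above (100 sentences split into 5 sessions of 20) can be sketched as follows. The function name and the choice of contiguous groups are assumptions made for illustration; the text does not say how sentences were assigned to groups.

```python
def split_into_sessions(sentences, n_sessions=5):
    """Partition a speaker's prompt list into equally sized session groups."""
    if len(sentences) % n_sessions != 0:
        raise ValueError("prompt list does not divide evenly into sessions")
    size = len(sentences) // n_sessions
    # Contiguous slices: session k gets sentences [k*size, (k+1)*size).
    return [sentences[k * size:(k + 1) * size] for k in range(n_sessions)]


# Hypothetical prompt identifiers standing in for the 100 real sentences.
prompts = [f"sentence-{i:03d}" for i in range(100)]
sessions = split_into_sessions(prompts)
```

With about 10 minutes per session, the 5 sessions account for the roughly 50 minutes of recording time per speaker reported above.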
For the creation of a lexicon for the development of an ASR system for Hinglish data, a unified phone list has been created for Hindi and English words. Also, a unique word list is extracted from the 13,071 sentences obtained in Sub-section 3.1. The phone-level transcription of those words has been done manually. The lexicon thus created covers all the pronunciation variations.
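A minimal sketch of the two lexicon steps described above, extracting a unique word list and storing several pronunciations per word, is given below. The phone symbols in the usage note are purely hypothetical placeholders, not the unified phone list used for the corpus, and the one-pronunciation-per-line text layout is simply a common convention assumed here.

```python
from collections import defaultdict


def unique_words(sentences):
    """Collect the sorted set of whitespace-separated words."""
    return sorted({word for sentence in sentences for word in sentence.split()})


def build_lexicon(pronunciations):
    """Map each word to its list of distinct pronunciation variants.

    `pronunciations` is an iterable of (word, phone_list) pairs, e.g. the
    output of a manual transcription pass.
    """
    lexicon = defaultdict(list)
    for word, phones in pronunciations:
        if phones not in lexicon[word]:
            lexicon[word].append(phones)
    return dict(lexicon)


def lexicon_lines(lexicon):
    """Render one 'word phone1 phone2 ...' line per pronunciation variant."""
    return [f"{word} {' '.join(phones)}"
            for word in sorted(lexicon)
            for phones in lexicon[word]]
```

For example, passing two different phone sequences for the word "card" yields two lines for that word, which is how accent-dependent pronunciation variants can be kept side by side.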
4. Statistical Analysis of the Database
This section provides the statistical analysis of the Hinglish code-switching speech corpus. The following subsection provides information about the speakers in the database. Later, a description of the size and linguistic features of the database is provided.
4.1. Speaker information
In order to collect the Hinglish code-switching speech data, the field data consultant recruited speakers from the Indian Institute of Technology Guwahati (IITG) who are natively from different states of India. A total of 71 speakers are involved in the development of this database. To model a robust ASR system for Hinglish code-switching data, we need a database that covers variations due to geographical distribution, gender, age, etc. With this aim, we have collected a database that covers all such variations. The details of the database are discussed below.
Since the speakers residing in IITG are from different states of India and from different geographical locations, diversity in the acoustic data is guaranteed. The geographical distribution of the speakers is shown as a pie chart in Figure 1. The area-wise distribution of the speakers involved in this study is provided in Table 3.

Figure 1: Geographical distribution of the speakers involved in the collection of the database (East 40.84%, North 30.98%, South 18.33%, West 9.85%). It is worth highlighting that the collected database covers speakers from 21 states of India.
The Hinglish code-switching speech data has been recorded from speakers between 20 and 64 years of age. The age distribution is shown as a bar diagram in Figure 2.
The Hinglish code-switching speech data is recorded from 27 female speakers and 44 male speakers, resulting in a total of 71 speakers from different states of India. The gender distribution is shown as a pie chart in Figure 3.
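As a quick consistency check on the speaker statistics, the per-region counts reported in Table 3 can be summed and converted into the gender percentages quoted for Figure 3. The dictionary layout and function names below are illustrative assumptions; only the counts come from the paper.

```python
# (male, female) speaker counts per region, as reported in Table 3.
SPEAKERS = {
    "East": (14, 15),
    "West": (5, 2),
    "North": (16, 6),
    "South": (9, 4),
}


def totals(table):
    """Return (male_total, female_total, overall_total)."""
    male = sum(m for m, _ in table.values())
    female = sum(f for _, f in table.values())
    return male, female, male + female


def percentage(part, whole):
    """Share of `part` in `whole`, in percent, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


male, female, overall = totals(SPEAKERS)
male_pct = percentage(male, overall)
female_pct = percentage(female, overall)
```

The totals reproduce the 44 male and 27 female speakers (71 overall), and the resulting shares match the gender distribution stated for Figure 3.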
5. Experimental Evaluation and Discussion
Figure 2: Age distribution of the speakers involved in the collection of the database (bar chart; vertical axis: Population (%)). Speakers between 18 and 62 years of age are involved in this study.

Figure 3: Gender distribution of the speakers involved in the creation of the Hinglish code-switching database (Male 61.97%, Female 38.03%).

The Hinglish code-switching database has been validated by developing an ASR system. For this purpose, the recorded 7,005 utterances are partitioned into training and testing sets containing 5,500 and 1,505 utterances, respectively. [...] ,566 sentences are used for training the LM. For developing the 3-gram LM, we have employed the IRSTLM toolkit [36]. The evaluation results in terms of percentage word error rate (%WER) are given in Table 4. The DNN-based acoustic model with the 3-gram LM resulted in the best %WER score when compared to the other models.
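For readers unfamiliar with the metric, %WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between the reference transcription and the recognizer output, divided by the number of reference words. The sketch below is a plain dynamic-programming version of that standard definition; it is not the scoring script used in the reported experiments, and the function names are illustrative.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_length) at the word level."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def percent_wer(pairs):
    """Corpus-level %WER over (reference, hypothesis) pairs."""
    errors = words = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        words += n
    return 100.0 * errors / words
```

Because code-switched transcripts mix scripts, the same English word may be hypothesized in Devanagari or Latin script; under this definition such a mismatch counts as a substitution unless the transcripts are normalized first.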
Table 3: The area- and gender-wise distribution of the speakers employed for the creation of the Hinglish code-switching database. In total, we have 71 speakers, of which 44 are male and 27 are female.

Geographical location   Male speakers   Female speakers   Total speakers
East                    14              15                29
West                    05              02                07
North                   16              06                22
South                   09              04                13
Total                   44              27                71

Table 4: Evaluation of the Hinglish code-switching speech corpus in the context of the ASR task. The performance results in terms of percentage word error rate (%WER) are reported.

Model   Features             %WER
Mono    MFCC                 53.51
Tri1    MFCC                 33.52
Tri2    MFCC + LDA           32.73
Tri3    MFCC + LDA + SAT     27.20
DNN     MFCC + LDA + SAT     25.40

The context-dependent GMM acoustic models are trained by tuning the number of senones. After tuning, the number of senones is set to 2500. The number of Gaussian mixtures per senone is set to 8 in all cases. The DNN-based acoustic models are trained with 5 hidden layers of 1024 nodes each, with tanh as the non-linearity in each of the hidden layers. These models are trained for 20 epochs with a mini-batch size of 128.
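To give a feel for the size of the DNN described above, the following sketch counts the weights and biases of a fully connected network with 5 hidden layers of 1024 units. Two values are assumptions made only for illustration: the input dimension (440, e.g. spliced feature frames) is hypothetical, and the output dimension is taken to equal the 2500 senones, which the text does not state explicitly for the DNN.

```python
def dnn_parameter_count(input_dim, output_dim, hidden_layers=5, hidden_units=1024):
    """Count weights and biases of a fully connected feed-forward network."""
    dims = [input_dim] + [hidden_units] * hidden_layers + [output_dim]
    # Each layer contributes a (d_in x d_out) weight matrix plus d_out biases.
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims, dims[1:]))


# Hypothetical input size; output size assumed equal to the senone count.
n_params = dnn_parameter_count(input_dim=440, output_dim=2500)
```

Under these assumptions the network has about 7.2 million parameters, most of them in the four hidden-to-hidden weight matrices and the output layer.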
6. Conclusion
In this work, the procedure followed to develop a Hinglish code-switching speech database has been presented. It contains 7,005 utterances spoken by 71 speakers from different parts of India. The database has been validated by developing an ASR system. The collection of the database is still in progress.
7. Acknowledgment
The authors gratefully acknowledge the financial assistance received towards data collection from an ongoing project grant no. 11(18)/2012-HCC(TDIL) from the Ministry of Electronics and Information Technology, Govt. of India.
References
[1] John J Gumperz, Discourse Strategies, Cambridge University Press, 1982.
[2] Chad Nilep, “‘Code switching’ in sociocultural linguistics,” Colorado Research in Linguistics, vol. 19(1), pp. 1–22, 2006.
[3] Dau Cheng Lyu, Ren Yuan Lyu, Yuang Chin Chiang, and Chun Nan Hsu, “Speech recognition on code-switching among the Chinese dialects,” in Proc. of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2006, vol. 1.
[4] Kiran Bhuvanagiri and Sunil Kumar Kopparapu, “Mixed language speech recognition without explicit identification of language,” American Journal of Signal Processing, vol. 2, no. 5, pp. 92–97, 2012.
[5] Basem HA Ahmed and Tien-Ping Tan, “Automatic speech recognition of code switching speech using 1-best rescoring,” in Proc. of the International Conference on Asian Language Processing (IALP). IEEE, 2012, pp. 137–140.
[6] Dau Cheng Lyu and Ren Yuan Lyu, “Language identification on code-switching utterances using multiple cues,” in Proc. of the Interspeech, an Annual Conference of International Speech Communication Association, 2008.
[7] Houwei Cao, PC Ching, Tan Lee, and Yu Ting Yeung, “Semantics-based language modeling for Cantonese-English code-mixing speech recognition,” in Proc. of the 7th International Symposium on Chinese Spoken Language Processing (ISCSLP). IEEE, 2010, pp. 246–250.
[8] Ching Feng Yeh, Chao Yu Huang, Liang Che Sun, Che Liang, and Lin Shan Lee, “An integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling,” in Proc. of the 7th International Symposium on Chinese Spoken Language Processing (ISCSLP). IEEE, 2010, pp. 214–219.
[9] Injy Hamed, Mohamed Elmahdy, and Slim Abdennadher, “Building a First Language Model for Code-switch Arabic-English,” Procedia Computer Science, vol. 117, pp. 208–216, 2017.
[10] François Grosjean, Life with Two Languages: An Introduction to Bilingualism, Harvard University Press, 1982.
[11] Carol Myers-Scotton, “Social motivations for code-switching: evidence from Africa. Clarendon,” 1993.
[12] Lalita Malik, Socio-linguistics: A study of code-switching, Anmol Publications PVT. LTD., 1994.
[13] Lesley Milroy and Pieter Muysken, One speaker, two languages: Cross-disciplinary perspectives on code-switching, Cambridge University Press, 1995.
[14] Anik Dey and Pascale Fung, “A Hindi-English Code-Switching Corpus,” in Proc. of the Language Resources and Evaluation Conference (LREC), 2014, pp. 2410–2413.
[15] Hsi-Yao Su, “Code-switching between Mandarin and Taiwanese in three telephone conversation: The negotiation of interpersonal relationships among bilingual speakers in Taiwan,” in Proc. of the Symposium about Language and Society, 2001.
[16] Carol Myers-Scotton, “Codeswitching with English: types of switching, types of communities,” World Englishes, vol. 8, no. 3, pp. 333–346, 1989.
[17] Sunita Malhotra, “Hindi-English, Code Switching and Language Choice in Urban, Uppermiddle-class Indian Families,” Kansas Working Papers in Linguistics, 1980.
[18] Ashok Kumar, “Certain aspects of the form and functions of Hindi-English code-switching,” Anthropological Linguistics, pp. 195–205, 1986.
[19] Smita Sinha, “Code Switching and Code Mixing Among Oriya Trilingual Children - A Study,” Academic Journal on Language in India, vol. 9(4), pp. 274, 2009.
[20] James E Flege, “Second-language speech learning: Theory, findings, and problems,” Speech perception and linguistic experience, 1995.
[21] Kalika Bali, Jatin Sharma, Monojit Choudhury, and Yogarshi Vyas, “I am borrowing ya mixing? An Analysis of English-Hindi Code Mixing in Facebook,” in Proc. of the First Workshop on Computational Approaches to Code Switching, 2014, pp. 116–126.
[22] Amitava Das and Björn Gambäck, “Code-mixing in social media text: the last language identification frontier?,” in Proc. of the Traitement Automatique des Langues (TAL), Special Issue on Social Networks and NLP, vol. 54(3), 2015.
[23] Chan Joyce Y. C., P. C. Ching, and Tan Lee, “Development of a Cantonese-English code-mixing speech corpus,” in Proc. of the 9th European Conference on Speech Communication and Technology, 2005.
[24] Dau-Cheng Lyu, Ren-Yuan Lyu, Yuang-chin Chiang, and Chun-Nan Hsu, “Speech recognition on code-switching among the Chinese dialects,” in Proc. of International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2006, vol. 1.
[25] Juan Carlos Franco and Thamar Solorio, “Baby-steps towards building a Spanglish language model,” in Proc. of International Conference on Intelligent Text Processing and Computational Linguistics. Springer, 2007, pp. 75–84.
[26] Dau-Cheng Lyu, Tien-Ping Tan, Eng Siong Chng, and Haizhou Li, “SEAME: a Mandarin-English code-switching speech corpus in south-east asia,” in Proc. of Interspeech, an Annual Conference of the International Speech Communication Association, 2010.
[27] Ngoc Thang Vu, Dau-Cheng Lyu, Jochen Weiner, Dominic Telaar, Tim Schlippe, Fabian Blaicher, Eng-Siong Chng, Tanja Schultz, and Haizhou Li, “A first speech recognition system for Mandarin-English code-switch conversational speech,” in Proc. of International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2012, pp. 4889–4892.
[28] Han-Ping Shen, Chung-Hsien Wu, Yan-Ting Yang, and Chun-Shan Hsu, “Cecos: A Chinese-English code-switching speech database,” in Proc. of International Conference on Speech Database and Assessments (Oriental COCOSDA). IEEE, 2011, pp. 120–123.
[29] Thipe I Modipa, Marelie H Davel, and Febe De Wet, “Implications of Sepedi/English code switching for ASR systems,” 2013.
[30] Emre Yilmaz, Maaike Andringa, Sigrid Kingma, Jelske Dijkstra, Frits Van der Kuip, Hans Van de Velde, Frederik Kampstra, Jouke Algra, H Heuvel, and David Van Leeuwen, “A longitudinal bilingual Frisian-Dutch radio broadcast database designed for code-switching research,” 2016.
[31] David Imseng, Hervé Bourlard, Holger Caesar, Philip N Garner, Gwénolé Lecorvé, and Alexandre Nanchen, “MediaParl: Bilingual mixed language accented speech database,” in Proc. of Spoken Language Technology Workshop (SLT). IEEE, 2012, pp. 263–268.
[32] Djegdjiga Amazouz, Martine Adda-Decker, and Lori Lamel, “The French-Algerian Code-Switching Triggered audio corpus (FACST),” in Proc. of Language Resources and Evaluation Conference (LREC), 2018.
[33] Ewald van der Westhuizen and Thomas Niesler, “A first south african corpus of multilingual code-switched soap opera speech,” in Proc. of Language Resources and Evaluation Conference (LREC), 2018.
[34] Injy Hamed, Mohamed Elmahdy, and Slim Abdennadher, “Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus,” in Proc. of Language Resources and Evaluation Conference (LREC), 2018.
[35] Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, et al., “The Kaldi speech recognition toolkit,” in Workshop on automatic speech recognition and understanding. IEEE Signal Processing Society, 2011, number EPFL-CONF-192584.
[36] Marcello Federico, Nicola Bertoldi, and Mauro Cettolo, “IRSTLM: an open source toolkit for handling large scale language models,” in Proc. of 19th Annual Conference of the International Speech Communication Association, 2008.