Agha Ali Raza
Carnegie Mellon University
Publication
Featured research published by Agha Ali Raza.
information and communication technologies and development | 2012
Agha Ali Raza; Mansoor Pervaiz; Christina Milo; Samia Razaq; Guy Alster; Jahanzeb Sherwani; Umar Saif; Roni Rosenfeld
Entertainment has recently been shown to be a powerful motivator for mastering new technologies. We therefore set out to use viral entertainment to introduce telephone-based, speech-based services to low-literate people in developing countries. We describe Polly, a simple voice manipulation and forwarding system that went viral in Pakistan last year. Seeded once by 32 low-skilled office workers in a Pakistani university, in 3 weeks Polly amassed 2,032 users and 10,629 interactions. From analyzing the traffic and its content, it is evident that Polly has been used extensively for entertainment and social contact, but it has also been put to an unintended use as a voicemail and group messaging facility. This demonstrated the potential for speech-based services, and the pent-up demand for entertainment, among our target population. Also of note, Polly's viral spread crossed gender and age boundaries and even established itself in a female population. However, it appears not to have crossed socioeconomic boundaries.
frontiers of information technology | 2010
Huda Sarfraz; Sarmad Hussain; Riffat Bokhari; Agha Ali Raza; Inam Ullah; Zahid Sarfraz; Sophia Pervez; Asad Mustafa; Iqra Javed; Rahila Parveen
This paper presents the development of acoustic and language models for robust Urdu speech recognition using the CMU Sphinx Open Source Toolkit for speech recognition. Three models have been developed incrementally, with the addition of speech data of up to two speakers per pass; one model using data from 40 female speakers only, one from 41 male speakers only, and one with both male and female speakers (81 speakers). This paper presents the current recognition results, and discusses approaches for improving these recognition rates.
2009 Oriental COCOSDA International Conference on Speech Database and Assessments | 2009
Agha Ali Raza; Sarmad Hussain; Huda Sarfraz; Inam Ullah; Zahid Sarfraz
Phonetically rich speech corpora play a pivotal role in speech research. The significance of such resources becomes crucial in the development of Automatic Speech Recognition systems and Text to Speech systems. This paper presents details of designing and developing an optimal context-based, phonetically rich speech corpus for Urdu that will serve as a baseline model for training a Large Vocabulary Continuous Speech Recognition system for the Urdu language.
acm symposium on computing and development | 2013
Agha Ali Raza; Roni Rosenfeld; Farhan ul Haq; Zain Tariq; Umar Saif
We have been developing techniques for spreading telephone-based services to low-literate people in the developing world, bypassing the need for explicit user training. We achieve this by using entertainment as a viral conduit to spread and popularize development-related voice-based services. Polly, our telephone-based voice manipulation and forwarding system, has been in continuous operation in Pakistan since May 2012. In this poster, we show the geographical spread of Polly over the initial four months of its deployment. We then describe our attempts at reducing our operating costs by shifting some of them to users, and the impact this had on user behavior, demonstrated via randomized controlled trials and by the usage of landline vs. mobile phones.
pacific-asia conference on knowledge discovery and data mining | 2014
Yibin Lin; Agha Ali Raza; Jay Yoon Lee; Danai Koutra; Roni Rosenfeld; Christos Faloutsos
When a free, catchy application shows up, how quickly will people notify their friends about it? Will the enthusiasm drop exponentially with time, or oscillate? What other patterns emerge?
information and communication technologies and development | 2017
Waleed Riaz; Harris Durrani; Suleman Shahid; Agha Ali Raza
This paper presents the results of a field study of an interactive voice response (IVR) system developed for the agricultural community of Punjab, Pakistan. We studied the information requirements and the user demographics to develop a basic IVR system that disseminates agro-information such as weather forecasts and pesticide and fertilizer information. In terms of usability and information extraction, simple menu-based navigation was relatively easy to understand and use. Usage was evaluated, and the results show that such a system is a viable option for delivering agro-information.
human factors in computing systems | 2018
Agha Ali Raza; Bilal Saleem; Shan Randhawa; Zain Tariq; Awais Athar; Umar Saif; Roni Rosenfeld
Speech is more natural than text for a large part of the world, including hard-to-reach populations (low-literate, poor, tech-novice, visually-impaired, marginalized) and oral cultures. Voice-based services over simple mobile phones are an effective means to provide orality-driven social connectivity to such populations. We present Baang, a versatile and inclusive voice-based social platform that allows audio content creation and sharing among its open community of users. Within 8 months, Baang spread virally to 10,721 users (69% of them blind) who participated in 269,468 calls and shared their thoughts via 44,178 audio-posts, 343,542 votes, 124,389 audio-comments and 94,864 shares. We show that the ability to vote, comment and share leads to viral spread, deeper engagement, longer retention and emergence of true dialog among participants. Beyond connectivity, Baang provides its users with a voice and a social identity as well as means to share information and get community support.
human computer interaction with mobile devices and services | 2018
Hira Ejaz; Syed Ali Hussain; Agha Ali Raza
Freedom of expression is a fundamental human right. Unfortunately, this right is denied to the majority of people because they cannot read and write. This is because most modern means of communication rely on textual interfaces that are not inclusive of less educated and visually impaired people. However, simple and feature mobile phones are becoming widely available to such populations. In this paper, we present Mehfil, an IVR-based citizen journalism platform that was deployed in Pakistan for 41 days. It received 789 calls from 535 users (2.4% of them blind) from all provinces of Pakistan. Mehfil provides a platform for its users to report their local area problems by recording their grievances on a range of social issues including unemployment, personal safety, health, education, corruption and the rights of the disabled (especially the visually impaired). This paper reveals a demand for mobile phone-based citizen journalism and grievance reporting platforms among low-literate people in Pakistan.
conference of the international speech communication association | 2018
Agha Ali Raza; Awais Athar; Shan Randhawa; Zain Tariq; Muhammad Bilal Saleem; Haris Bin Zia; Umar Saif; Roni Rosenfeld
We present a novel technique for rapid collection of spontaneous speech data over the mobile phone channel using telephonic community forums. Our public forum allows users to post audio messages, listen to messages posted by others, post votes and audio comments, and share content with friends through subsidized phone calls. The entertainment aspects and sharing features of the forum led to its viral spread in Pakistan. Within 8 months, it reached 11,017 users and gathered 1,207 hours of speech data comprising 57,454 audio-posts and 130,685 audio-comments, spanning Urdu and 9 regional languages. We trained an ASR using just 9.5 hours of the corpus to obtain 24.19% WER. Community forums automatically overcome common spontaneous speech data collection challenges like speaker recruitment, natural speech elicitation, content diversity, informed consent, sampling real-world ambient noise, and reach (for geographically remote linguistic communities). This technique is especially useful for gathering speech corpora for under-resourced languages, hence enabling the development of speech recognition, keyword spotting, speaker ID, and noise classification systems (among others) for such languages. It also allows rapid, automatic preservation of spoken languages and oral aspects of culture. This technique can be extended to collect speech data for endangered languages, oral cultures, and linguistic minorities.
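For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the example sentences are illustrative assumptions; the abstract does not describe the authors' scoring setup, and this is not their evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only; splitting on whitespace
    is an assumption about tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("b" -> "x") and one deletion ("d") over 4 reference words.
example_wer = word_error_rate("a b c d", "a x c")
```

Under this definition, a WER of 24.19% corresponds to roughly one word error for every four reference words.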
acm symposium on computing and development | 2016
Agha Ali Raza; Samia Razaq; Amna Raja; Rizwan Naru; Ali Gibran; Abdullah Sabri; Haroon Niaz; Muhammad Bilal Saleem; Umar Saif
This paper explores the use of Interactive Voice Response (IVR) systems for automatic surveys, data validation and prescreening. We report a deployment aimed at employing voice-based, telephone services to conduct automated, structured interviews of low-literate users and to advertise relevant development-related services to them. Survey calls were placed to 67,000 vocational training recipients to validate their phone numbers and to find out their current job status. Of these, 45,500 answered these calls and 11,500 (25%) responded to the survey questions. Manually conducted follow-up interviews found more than 70% of the survey results to be consistent and also revealed the impact of phone sharing (among family members), call timing, simplicity of interface and surveyor-participant interpretation mismatch regarding certain survey questions on participant involvement and the validity of survey results. The paper discusses the use of IVR to collect information, possible system design considerations and factors affecting the accuracy of gathered information.