Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Madina Hasan is active.

Publication


Featured research published by Madina Hasan.


Journal of Computational and Applied Mathematics | 2010

A non-linear structure preserving matrix method for the low rank approximation of the Sylvester resultant matrix

Joab R. Winkler; Madina Hasan

A non-linear structure preserving matrix method for the computation of a structured low rank approximation S(f̃,g̃) of the Sylvester resultant matrix S(f,g) of two inexact polynomials f=f(y) and g=g(y) is considered in this paper. It is shown that considerably improved results are obtained when f(y) and g(y) are processed prior to the computation of S(f̃,g̃), and that these preprocessing operations introduce two parameters. These parameters can either be held constant during the computation of S(f̃,g̃), which leads to a linear structure preserving matrix method, or they can be incremented during the computation of S(f̃,g̃), which leads to a non-linear structure preserving matrix method. It is shown that the non-linear method yields a better structured low rank approximation of S(f,g) and that the assignment of f(y) and g(y) is important because S(f̃,g̃) may be a good structured low rank approximation of S(f,g), but S(g̃,f̃) may be a poor structured low rank approximation of S(g,f) because its numerical rank is not defined. Examples that illustrate the differences between the linear and non-linear structure preserving matrix methods, and the importance of the assignment of f(y) and g(y), are shown.
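
The central object above is easy to make concrete. The following is a minimal numpy sketch of the standard Sylvester matrix construction together with the singular-value test for numerical rank; the example polynomials and the rank tolerance are illustrative choices, not values taken from the paper.

```python
import numpy as np

def sylvester(f, g):
    """Sylvester resultant matrix S(f, g) of two polynomials.

    f, g: coefficient arrays, highest-degree coefficient first.
    The matrix is (m+n) x (m+n): n shifted copies of f's
    coefficients followed by m shifted copies of g's.
    """
    m, n = len(f) - 1, len(g) - 1
    S = np.zeros((m + n, m + n))
    for j in range(n):
        S[j:j + m + 1, j] = f          # column j holds f shifted down j rows
    for j in range(m):
        S[j:j + n + 1, n + j] = g      # column n+j holds g shifted down j rows
    return S

# f and g share the root y = 1, so rank S(f, g) = m + n - deg(gcd) = 3.
f = np.array([1.0, -3.0, 2.0])   # (y - 1)(y - 2)
g = np.array([1.0,  2.0, -3.0])  # (y - 1)(y + 3)

sigma = np.linalg.svd(sylvester(f, g), compute_uv=False)
print(sigma)                             # smallest singular value is ~0
print(np.sum(sigma > 1e-10 * sigma[0]))  # numerical rank: 3
```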


Journal of Computational and Applied Mathematics | 2013

An improved non-linear method for the computation of a structured low rank approximation of the Sylvester resultant matrix

Joab R. Winkler; Madina Hasan

This paper reports on improvements to recent work on the computation of a structured low rank approximation of the Sylvester resultant matrix S(f,g) of two inexact polynomials f=f(y) and g=g(y). Specifically, it has been shown in previous work that these polynomials must be processed before a structured low rank approximation of S(f,g) is computed. The existing algorithm may still, however, yield a structured low rank approximation of S(f,g), but not a structured low rank approximation of S(g,f), which is unsatisfactory. Moreover, a structured low rank approximation of S(f,g) must be equal to, apart from permutations of its columns, a structured low rank approximation of S(g,f), but the existing algorithm does not guarantee the satisfaction of this condition. This paper addresses these issues by modifying the existing algorithm, such that these deficiencies are overcome. Examples that illustrate these improvements are shown.
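
The column-permutation condition enforced by the improved algorithm can be checked directly: S(g,f) is exactly S(f,g) with its block of f-columns and block of g-columns swapped, so the two matrices necessarily share singular values and hence numerical rank. A quick numpy check, reusing the sylvester helper from the sketch above:

```python
import numpy as np

# Reuses sylvester() from the previous sketch.
f = np.array([1.0, -3.0, 2.0])
g = np.array([1.0,  2.0, -3.0])
m, n = len(f) - 1, len(g) - 1

Sfg, Sgf = sylvester(f, g), sylvester(g, f)

# Swap the block of n f-columns with the block of m g-columns.
perm = list(range(n, n + m)) + list(range(n))
assert np.allclose(Sgf, Sfg[:, perm])

# Permuting columns preserves singular values, so any structured
# low rank approximation must assign the same numerical rank to both.
assert np.allclose(np.linalg.svd(Sfg, compute_uv=False),
                   np.linalg.svd(Sgf, compute_uv=False))
```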


IEEE Automatic Speech Recognition and Understanding Workshop | 2015

The 2015 Sheffield system for transcription of Multi-Genre Broadcast media

Oscar Saz; Mortaza Doulaty; Salil Deena; Rosanna Milner; Raymond W. M. Ng; Madina Hasan; Yulan Liu; Thomas Hain

We describe the University of Sheffield system for participation in the 2015 Multi-Genre Broadcast (MGB) challenge task of transcribing multi-genre broadcast shows. Transcription was one of four tasks proposed in the MGB challenge, with the aim of advancing the state of the art of automatic speech recognition, speaker diarisation and automatic alignment of subtitles for broadcast media. Four topics are investigated in this work: Data selection techniques for training with unreliable data, automatic speech segmentation of broadcast media shows, acoustic modelling and adaptation in highly variable environments, and language modelling of multi-genre shows. The final system operates in multiple passes, using an initial unadapted decoding stage to refine segmentation, followed by three adapted passes: a hybrid DNN pass with input features normalised by speaker-based cepstral normalisation, another hybrid stage with input features normalised by speaker feature-MLLR transformations, and finally a bottleneck-based tandem stage with noise and speaker factorisation. The combination of these three system outputs provides a final error rate of 27.5% on the official development set, consisting of 47 multi-genre shows.
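
One concrete ingredient of the adapted passes is speaker-based cepstral normalisation: each speaker's feature frames are shifted and scaled so that every cepstral dimension has zero mean and unit variance over that speaker's data. A minimal numpy sketch, assuming features arrive as per-speaker (frames x dims) arrays; the dimensions are illustrative, not the system's actual configuration.

```python
import numpy as np

def speaker_cmvn(features_by_speaker):
    """Per-speaker cepstral mean and variance normalisation.

    features_by_speaker: dict mapping speaker id to a
    (frames x dims) array of cepstral features.
    Returns a dict of normalised arrays of the same shapes.
    """
    normalised = {}
    for spk, feats in features_by_speaker.items():
        mean = feats.mean(axis=0)
        std = feats.std(axis=0) + 1e-8   # guard against constant dims
        normalised[spk] = (feats - mean) / std
    return normalised

# Two hypothetical speakers with 13-dimensional cepstra.
rng = np.random.default_rng(0)
feats = {"spk1": rng.normal(3.0, 2.0, (500, 13)),
         "spk2": rng.normal(-1.0, 0.5, (300, 13))}
out = speaker_cmvn(feats)
print(out["spk1"].mean(axis=0).round(6))  # ~0 in every dimension
```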


Odyssey 2016 | 2016

The Sheffield language recognition system in NIST LRE 2015

Raymond W. M. Ng; Mauro Nicolao; Oscar Saz; Madina Hasan; Bhusan Chettri; Mortaza Doulaty; Tan Lee; Thomas Hain

The Speech and Hearing Research Group of the University of Sheffield submitted a fusion language recognition system to NIST LRE 2015. It combines three language classifiers. Two are acoustic-based, using i-vectors and a tandem DNN language recogniser respectively. The third classifier is a phonotactic language recogniser. Two sets of training data, with durations of approximately 170 and 300 hours, were composed for LR training. Using the larger set of training data, the primary Sheffield LR system gives a min DCF of 32.44 on the official LR 2015 eval data. A post-evaluation system enhancement was carried out in which i-vectors were extracted from the bottleneck features of an English DNN. The min DCF was reduced to 29.20.
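
The fusion step can be pictured as follows: each classifier emits a score per target language, the scores are stacked into one feature vector per trial, and a discriminative backend combines them. The paper does not specify its fusion backend, so the multinomial logistic regression below is an illustrative stand-in with invented trial counts and labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_trials, n_langs = 200, 5

# Hypothetical per-language scores from the three classifiers:
# i-vector, tandem-DNN and phonotactic recognisers.
ivector = rng.normal(size=(n_trials, n_langs))
tandem = rng.normal(size=(n_trials, n_langs))
phonotactic = rng.normal(size=(n_trials, n_langs))
labels = rng.integers(0, n_langs, size=n_trials)

# Stack the three score vectors per trial and train the fusion backend.
X = np.hstack([ivector, tandem, phonotactic])
fusion = LogisticRegression(max_iter=1000).fit(X, labels)

# Fused per-language posteriors come straight from predict_proba.
print(fusion.predict_proba(X[:2]).round(3))
```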


Conference of the International Speech Communication Association | 2016

webASR 2 - Improved cloud based speech technology

Thomas Hain; Jeremy Christian; Oscar Saz; Salil Deena; Madina Hasan; Raymond W. M. Ng; Rosanna Milner; Mortaza Doulaty; Yulan Liu

This paper presents the most recent developments of the webASR service (www.webasr.org), the world's first web-based fully functioning automatic speech recognition platform for scientific use. Initially released in 2008, the functionalities of webASR have recently been expanded with three main goals in mind: facilitating access through a RESTful architecture that allows easy use through either the web interface or an API; allowing the use of input metadata, when provided by the user, to improve system performance; and increasing the coverage of available systems beyond speech recognition. Several new systems for transcription, diarisation, lightly supervised alignment and translation are currently available through webASR. The results in a series of well-known benchmarks (RT'09, IWSLT'12 and MGB'15 evaluations) show that these webASR systems provide state-of-the-art performance across these tasks.
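
In practice, a RESTful design means a job can be submitted with a single HTTP request. The endpoint path, field names and authentication header below are hypothetical placeholders (the real API is documented at www.webasr.org); the sketch only illustrates the access pattern, using the requests library.

```python
import requests

# Hypothetical endpoint and credentials: consult www.webasr.org
# for the real API routes and authentication scheme.
API_URL = "https://www.webasr.org/api/v2/upload"   # placeholder path
API_KEY = "your-api-key-here"                      # placeholder key

with open("show.wav", "rb") as audio:
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},   # assumed scheme
        files={"audio": audio},
        data={"system": "transcription"},                 # assumed field
        timeout=60,
    )
response.raise_for_status()
print(response.json())   # job id / status, as defined by the actual API
```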


Multimedia Tools and Applications | 2018

Lightly supervised alignment of subtitles on multi-genre broadcasts

Oscar Saz; Salil Deena; Mortaza Doulaty; Madina Hasan; Bilal Khaliq; Rosanna Milner; Raymond W. M. Ng; Julia Olcoz; Thomas Hain

This paper describes a system for performing alignment of subtitles to audio on multi-genre broadcasts using a lightly supervised approach. Accurate alignment of subtitles plays a substantial role in the daily work of media companies and still requires considerable human effort. Here, a comprehensive approach to performing this task in an automated way using lightly supervised alignment is proposed. The paper explores the different alternatives for speech segmentation, lightly supervised speech recognition and alignment of text streams. The proposed system uses lightly supervised decoding to improve the alignment accuracy by performing language model adaptation using the target subtitles. The system thus built achieves the third best reported result in the alignment of broadcast subtitles in the Multi-Genre Broadcast (MGB) challenge, with an F1 score of 88.8%. This system is available for research and other non-commercial purposes through webASR, the University of Sheffield's cloud-based speech technology web service. Taking as inputs an audio file and untimed subtitles, webASR can produce timed subtitles in multiple formats, including TTML, WebVTT and SRT.
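
For intuition on the reported F1 score: a hypothesised word can be counted as correctly aligned when its timings fall close enough to the reference, with F1 the harmonic mean of precision and recall over such matches. The tolerance-based matcher below is a simplified illustration, not the official MGB scoring tool.

```python
def alignment_f1(ref, hyp, tol=0.1):
    """F1 for word alignments.

    ref, hyp: lists of (word, start_sec, end_sec) tuples.
    A hypothesis word is a hit if an unmatched reference word with
    the same text has both boundaries within `tol` seconds of it.
    Simplified illustration; not the official MGB scorer.
    """
    matched = set()
    hits = 0
    for word, start, end in hyp:
        for i, (rword, rstart, rend) in enumerate(ref):
            if (i not in matched and word == rword
                    and abs(start - rstart) <= tol
                    and abs(end - rend) <= tol):
                matched.add(i)
                hits += 1
                break
    precision = hits / len(hyp) if hyp else 0.0
    recall = hits / len(ref) if ref else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

ref = [("hello", 0.00, 0.42), ("world", 0.45, 0.90)]
hyp = [("hello", 0.02, 0.40), ("world", 0.60, 1.10)]
print(alignment_f1(ref, hyp))  # 0.5: one of two words within tolerance
```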


International Conference on Statistical Language and Speech Processing | 2017

Detecting Stuttering Events in Transcripts of Children’s Speech

Sadeen Alharbi; Madina Hasan; Anthony J. H. Simons; Shelagh Brumfitt; Phil D. Green

Stuttering is a common problem in childhood that may persist into adulthood if not treated in its early stages. Techniques from spoken language understanding may be applied to provide automated diagnosis of stuttering from children's speech. The main challenges, however, lie in the scarcity of training data and its high dimensionality. This study investigates the applicability of machine learning approaches for detecting stuttering events in transcripts. Two machine learning approaches were applied, namely HELM and CRF. The performance of these two approaches is compared, and the effect of data augmentation is examined for both. Experimental results show that CRF outperforms HELM by 2.2% in the baseline experiments. Data augmentation helps improve system performance, especially for rare events. In addition to the annotated augmented data, this study also contributes annotated human transcriptions of real stuttered children's speech to help expand research in this field.
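
As a concrete picture of the CRF approach: detection can be framed as sequence labelling, tagging each transcript token as fluent or as a stuttering-event type. The sketch below uses the sklearn-crfsuite package with an invented toy label set and features; it does not reproduce the paper's actual feature set or label inventory.

```python
import sklearn_crfsuite

def token_features(tokens, i):
    """Per-token features: identity, neighbours and a repeated-word flag."""
    return {
        "word": tokens[i].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
        "repeat": i > 0 and tokens[i].lower() == tokens[i - 1].lower(),
    }

# Toy data: "O" = fluent token, "REP" = repetition of the previous word.
sents = [["I", "I", "want", "want", "to", "go"],
         ["she", "likes", "likes", "cats"]]
labels = [["O", "REP", "O", "REP", "O", "O"],
          ["O", "O", "REP", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, labels)
print(crf.predict(X))   # recovers the REP tags on this toy set
```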


Conference of the International Speech Communication Association | 2016

Combining feature and model-based adaptation of RNNLMs for multi-genre broadcast speech recognition

Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain

Recurrent neural network language models (RNNLMs) have consistently outperformed n-gram language models when used in automatic speech recognition (ASR). This is because RNNLMs provide robust parameter estimation through the use of a continuous-space representation of words, and can generally model longer context dependencies than n-grams. The adaptation of RNNLMs to new domains remains an active research area, and the two main approaches are: feature-based adaptation, where the input to the RNNLM is augmented with auxiliary features; and model-based adaptation, which includes model fine-tuning and the introduction of adaptation layer(s) in the network. This paper explores the properties of both types of adaptation on multi-genre broadcast speech recognition. Two hybrid adaptation techniques are proposed, namely the fine-tuning of feature-based RNNLMs and the use of a feature-based adaptation layer. A method for the semi-supervised adaptation of RNNLMs, using topic model-based genre classification, is also presented and investigated. The gains obtained with RNNLM adaptation on a system trained on 700 hours of speech are consistent for RNNLMs trained on both a small (10M words) and a large (660M words) set, with 10% perplexity and 2% word error rate improvements on a 28.3-hour test set.
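
The feature-based variant is easy to sketch in model code: an auxiliary vector (for example, topic or genre posteriors) is concatenated to the word embedding at every time step, conditioning the recurrent state on the domain. The PyTorch model below is an illustrative LSTM language model with invented dimensions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FeatureAdaptedRNNLM(nn.Module):
    """LSTM LM whose input is [word embedding ; auxiliary feature]."""

    def __init__(self, vocab=10000, emb=256, aux=32, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb + aux, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, words, aux_feat):
        # words: (batch, time); aux_feat: (batch, aux), e.g. genre posteriors.
        emb = self.embed(words)                                   # (B, T, emb)
        aux = aux_feat.unsqueeze(1).expand(-1, emb.size(1), -1)   # (B, T, aux)
        hidden_seq, _ = self.lstm(torch.cat([emb, aux], dim=-1))
        return self.out(hidden_seq)                               # (B, T, vocab)

model = FeatureAdaptedRNNLM()
logits = model(torch.randint(0, 10000, (4, 20)), torch.rand(4, 32))
print(logits.shape)  # torch.Size([4, 20, 10000])

# Model-based adaptation would instead fine-tune these weights on
# in-domain text or insert an extra adaptation layer; the paper's
# hybrid techniques combine the two ideas.
```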


Conference of the International Speech Communication Association | 2016

The Sheffield Wargame Corpus - Day Two and Day Three

Yulan Liu; Charles W. Fox; Madina Hasan; Thomas Hain

Improving the performance of distant speech recognition is of considerable current interest, driven by a desire to bring speech recognition into people's homes. Standard approaches to this task aim to enhance the signal prior to recognition, typically using beamforming techniques on multiple channels. Only a few real-world recordings are available that allow experimentation with such techniques. This has become even more pertinent with recent work using deep neural networks to learn beamforming from data. Such approaches require large multi-channel training sets, ideally with location annotation for moving speakers, which is scarce in existing corpora. This paper presents a new, freely available extended corpus of English speech recordings in a natural setting, with moving speakers. The data is recorded with diverse microphone arrays and, uniquely, with ground truth location tracking. It extends the 8.0-hour Sheffield Wargames Corpus released at Interspeech 2013 with a further 16.6 hours of fully annotated data, including 6.1 hours of female speech to improve gender balance. Additional blog-based language model data is provided alongside, as well as a Kaldi baseline system. Results are reported with a standard Kaldi configuration and a baseline meeting recognition system.
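
The ground-truth location tracking matters because the classical enhancement baseline, delay-and-sum beamforming, needs the time difference of arrival at each microphone; with known speaker and microphone positions those delays follow from geometry rather than estimation. A minimal numpy sketch with made-up array geometry:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
FS = 16000              # sample rate, Hz

def delay_and_sum(channels, mic_pos, src_pos):
    """Delay-and-sum beamformer steered at a known source position.

    channels: (n_mics, n_samples) array of synchronised signals.
    mic_pos:  (n_mics, 3) microphone coordinates in metres.
    src_pos:  (3,) speaker position, e.g. from location tracking.
    """
    dists = np.linalg.norm(mic_pos - src_pos, axis=1)
    # Advance each channel so all arrivals line up with the nearest mic.
    delays = np.round((dists - dists.min()) / SPEED_OF_SOUND * FS).astype(int)
    aligned = [np.roll(ch, -d) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)

# Toy example: a 4-mic linear array and a fixed source position.
mics = np.array([[0.0, 0, 0], [0.1, 0, 0], [0.2, 0, 0], [0.3, 0, 0]])
src = np.array([2.0, 1.0, 0.0])
signals = np.random.default_rng(2).normal(size=(4, FS))
enhanced = delay_and_sum(signals, mics, src)
print(enhanced.shape)  # (16000,)
```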


Calcolo | 2012

Two methods for the calculation of the degree of an approximate greatest common divisor of two inexact polynomials

Joab R. Winkler; Madina Hasan; Xin Lao

Collaboration


Dive into Madina Hasan's collaboration.

Top Co-Authors

Thomas Hain

University of Sheffield

Oscar Saz

University of Zaragoza


Salil Deena

University of Sheffield


Yulan Liu

University of Sheffield
