Sadiq Sani | Researchain

Publication

Featured research published by Sadiq Sani.

International Conference on Case-Based Reasoning | 2017

kNN Sampling for Personalised Human Activity Recognition

Sadiq Sani; Stewart Massie; Kay Cooper

The need to adhere to recommended physical activity guidelines for a variety of chronic disorders calls for high-precision Human Activity Recognition (HAR) systems. In the SelfBACK system, HAR is used to monitor activity types and intensities to enable self-management of low back pain (LBP). HAR is typically modelled as a classification task where sensor data associated with activity labels are used to train a classifier to predict future occurrences of those activities. An important consideration in HAR is whether to use training data from a general population (subject-independent), or personalised training data from the target user (subject-dependent). Previous evaluations have shown that using personalised data results in more accurate predictions. However, from a practical perspective, collecting sufficient training data from the end user may not be feasible. This has made using subject-independent data by far the more common approach in commercial HAR systems. In this paper, we introduce a novel approach which uses nearest neighbour similarity to identify examples from a subject-independent training set that are most similar to sample data obtained from the target user and uses these examples to generate a personalised model for the user. This nearest neighbour sampling approach enables us to avoid many of the practical limitations associated with training a classifier exclusively with user data, while still achieving the benefit of personalisation. Evaluations show that our approach significantly outperforms a general subject-independent model by up to 5%.
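
The sampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_sample`, the use of Euclidean distance, the value of `k`, and the toy feature vectors are all assumptions made for illustration. The sketch keeps every example from a subject-independent pool that is among the `k` nearest neighbours of at least one sample from the target user.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_sample(pool, user_samples, k):
    """Select pool examples that are among the k nearest to any user sample.

    pool: list of (feature_vector, label) pairs from the general population.
    user_samples: list of feature vectors from the target user (hypothetical).
    Returns the selected (feature_vector, label) pairs in pool order.
    """
    selected = set()
    for sample in user_samples:
        # Rank pool indices by distance to this user sample.
        ranked = sorted(range(len(pool)),
                        key=lambda i: euclidean(pool[i][0], sample))
        selected.update(ranked[:k])
    return [pool[i] for i in sorted(selected)]


# Toy data, invented for illustration only.
pool = [
    ([0.0, 0.0], "sitting"),
    ([0.1, 0.2], "sitting"),
    ([5.0, 5.0], "walking"),
    ([5.2, 4.9], "walking"),
    ([9.0, 9.0], "running"),
]
user_samples = [[0.05, 0.1], [5.1, 5.0]]
personalised = knn_sample(pool, user_samples, k=2)
```

The resulting `personalised` list could then be used to train any classifier in place of the full pool; which classifier and distance measure suit real sensor data is left open here.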

Conference on Information and Knowledge Management | 2012

Two-part segmentation of text documents

Deepak P; Karthik Visweswariah; Sadiq Sani

We consider the problem of segmenting text documents that have a two-part structure such as a problem part and a solution part. Documents of this genre include incident reports that typically involve description of events relating to a problem followed by those pertaining to the solution that was tried. Segmenting such documents into their two component parts would render them usable in knowledge reuse frameworks such as Case-Based Reasoning. This segmentation problem presents a hard case for traditional text segmentation due to the lexical inter-relatedness of the segments. We develop a two-part segmentation technique that can harness a corpus of similar documents to model the behavior of the two segments and their inter-relatedness using language models and translation models respectively. In particular, we use separate language models for the problem and solution segment types, whereas the inter-relatedness between segment types is modeled using an IBM Model 1 translation model. We model documents as being generated starting from the problem part, which comprises words sampled from the problem language model, followed by the solution part, whose words are sampled either from the solution language model or from a translation model conditioned on the words already chosen in the problem part. We show, through an extensive set of experiments on real-world data, that our approach outperforms state-of-the-art text segmentation algorithms in the accuracy of segmentation, and that such improved accuracy translates well to improved usability in Case-Based Reasoning systems. We also analyze the robustness of our technique to varying amounts and types of noise and empirically illustrate that our technique is quite noise tolerant and degrades gracefully with increasing amounts of noise.

International Conference on Case-Based Reasoning | 2011

Term similarity and weighting framework for text representation

Sadiq Sani; Stewart Massie; Robert Lothian

Expressiveness of natural language is a challenge for text representation since the same idea can be expressed in many different ways. Therefore, terms in a document should not be treated independently of one another since together they help to disambiguate and establish meaning. Term-similarity measures are often used to improve representation by capturing semantic relationships between terms. Another consideration for representation involves the importance of terms. Feature selection techniques address this by using statistical measures to quantify term usefulness for retrieval. In this paper we present a framework that combines term-similarity and weighting for text representation. This allows us to comparatively study the impact of term similarity, term weighting and any synergistic effect that may exist between them. Study of term similarity is based on approaches that exploit term co-occurrences within document and sentence contexts whilst term weighting uses the popular Chi-squared test. Our results on text classification tasks show that the combined effect of similarity and weighting is superior to each technique independently and that this synergistic effect is obtained regardless of co-occurrence context granularity.
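
For readers unfamiliar with the Chi-squared test mentioned above, the sketch below computes the standard statistic for a 2x2 term-versus-class contingency table. How the paper turns this statistic into term weights is not stated in the abstract, so this shows only the textbook formula; the function name and argument names are hypothetical.

```python
def chi_squared(a, b, c, d):
    """Chi-squared statistic for a 2x2 term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        # A degenerate table (an empty row or column) carries no evidence.
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


# A term that appears in every document of one class and nowhere else.
perfect = chi_squared(10, 0, 0, 10)   # 20.0
# A term spread evenly across both classes.
uninformative = chi_squared(5, 5, 5, 5)   # 0.0
```

A higher value indicates a stronger association between the term and the class, which is why the statistic is a common choice for ranking terms.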

International Conference on Innovative Techniques and Applications of Artificial Intelligence | 2016

SELFBACK—Activity Recognition for Self-management of Low Back Pain

Sadiq Sani; Stewart Massie; Kay Cooper

Low back pain (LBP) is the most significant contributor to years lived with disability in Europe and results in significant financial cost to European economies. Guidelines for the management of LBP have self-management as their cornerstone, where patients are advised against bed rest and encouraged to remain active. In this paper, we introduce SELFBACK, a decision support system used by the patients themselves to improve and reinforce self-management of LBP. SELFBACK uses activity recognition from wearable sensors in order to automatically determine the type and level of activity of a user. This is used by the system to automatically determine how well users adhere to prescribed physical activity guidelines. Important parameters of an activity recognition system include windowing, feature extraction and classification. The choices of these parameters for the SELFBACK system are supported by empirical comparative analyses which are presented in this paper. In addition, two approaches are presented for detecting step counts for ambulation activities (e.g. walking and running) which help to determine activity intensity. Evaluation shows the SELFBACK system is able to distinguish between five common daily activities with a macro-averaged F1 of 0.9 and to detect step counts with root mean squared errors of 6.4 and 5.6 for walking and running respectively.
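
The two evaluation measures quoted in this abstract, macro-averaged F1 and root mean squared error, can be computed as in the sketch below. The labels and step counts are invented toy values, not data from the paper, and the function names are hypothetical.

```python
import math


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy activity labels and step counts, invented for illustration.
f1 = macro_f1(["walk", "walk", "run", "sit"], ["walk", "run", "run", "sit"])
step_error = rmse([100, 200], [103, 196])
```

Macro-averaging gives each activity class equal weight regardless of how often it occurs, which matters when some activities are much rarer than others.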

International Conference on Case-Based Reasoning | 2014

Supervised Semantic Indexing Using Sub-spacing

Sadiq Sani; Stewart Massie; Robert Lothian

Indexing of textual cases is commonly affected by the problem of variation in vocabulary. Semantic indexing is commonly used to address this problem by discovering semantic or conceptual relatedness between individual terms and using this to improve textual case representation. However, representations produced using this approach are not optimal for supervised tasks because standard semantic indexing approaches do not take into account class membership of these textual cases. Supervised semantic indexing approaches, e.g. sprinkled Latent Semantic Indexing (SpLSI) and supervised Latent Dirichlet Allocation (sLDA), have been proposed for addressing this limitation. However, both SpLSI and sLDA are computationally expensive and require parameter tuning. In this work, we present an approach called Supervised Sub-Spacing (S3) for supervised semantic indexing of documents. S3 works by creating a separate sub-space for each class within which class-specific term relations and term weights are extracted. The power of S3 lies in its ability to modify document representations such that documents that belong to the same class are made more similar to one another while, at the same time, reducing their similarity to documents of other classes. In addition, S3 is flexible enough to work with a variety of semantic relatedness metrics and yet powerful enough that it leads to significant improvements in text classification accuracy. We evaluate our approach on a number of supervised datasets, and results show that classification performance on S3-based representations significantly outperforms both SpLSI and sLDA.

International Conference on Case-Based Reasoning | 2013

Should Term-Relatedness Be Used in Text Representation?

Sadiq Sani; Stewart Massie; Robert Lothian

The variation in natural language vocabulary remains a challenge for text representation as the same idea can be expressed in many different ways. Thus document representations often rely on generalisation to map low-level lexical expressions to higher-level concepts in order to capture the inherent semantics of the documents. Term-relatedness measures are often used to generalise document representations by capturing semantic relationships between terms. In this work we conduct a comparative study of common term-relatedness metrics on 43 datasets and discover that generalisation is not always beneficial. Hence, the ability to predict whether or not to generalise the indexing vocabulary of a dataset is important given the computational overhead of generalisation. Accordingly, we present a case-based approach that predicts, given a text dataset, whether or not using generalisation will improve text retrieval performance. The evaluation shows that our approach is able to correctly predict datasets that are likely to benefit from generalisation with over 90% accuracy.

International Conference on Case-Based Reasoning | 2018

Improving kNN for Human Activity Recognition with Privileged Learning Using Translation Models

Anjana Wijekoon; Sadiq Sani; Stewart Massie; Kay Cooper

Multiple sensor modalities provide more accurate Human Activity Recognition (HAR) compared to using a single modality, yet the latter is preferred by consumers as it is more convenient and less intrusive. This presents a challenge to researchers, as a single modality is likely to pick up movement that is both relevant and extraneous to the human activity being tracked, leading to poorer performance. The goal of an optimal HAR solution is therefore to utilise the fewest sensors at deployment, while maintaining performance levels achievable using all available sensors. To this end, we introduce two translation approaches, capable of generating missing modalities from available modalities. These can be used to generate missing or “privileged” modalities at deployment to augment case representations and improve HAR. We evaluate the presented translators with k-NN classifiers on two HAR datasets and achieve up to 5% performance improvements using representations augmented with privileged modalities. This suggests that non-intrusive modalities suited for deployment benefit from translation models that generate privileged modalities.

International Conference on Case-Based Reasoning | 2018

Personalised Human Activity Recognition Using Matching Networks

Sadiq Sani; Stewart Massie; Kay Cooper

Human Activity Recognition (HAR) is typically modelled as a classification task where sensor data associated with activity labels are used to train a classifier to recognise future occurrences of these activities. An important consideration when training HAR models is whether to use training data from a general population (subject-independent), or personalised training data from the target user (subject-dependent). Previous evaluations have shown personalised training to be more accurate because of the ability of resulting models to better capture individual users’ activity patterns. From a practical perspective, however, collecting sufficient training data from end users may not be feasible. This has made using subject-independent training far more common in real-world HAR systems. In this paper, we introduce a novel approach to personalised HAR using a neural network architecture called a matching network. Matching networks perform nearest-neighbour classification by reusing the class label of the most similar instances in a provided support set, which makes them very relevant to case-based reasoning. A key advantage of matching networks is that they use metric learning to produce feature embeddings or representations that maximise classification accuracy, given a chosen similarity metric. Evaluations show that our approach substantially outperforms general subject-independent models by at least 6% macro-averaged F1 score.
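
The classification step of a matching network can be sketched as attention over a labelled support set. The sketch below is a simplification under stated assumptions: it uses cosine similarity with a softmax, and it skips the learned embedding network entirely by treating the raw vectors as if they were already embedded. The function names and toy vectors are hypothetical and do not come from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def matching_predict(query, support):
    """Return a label -> probability mapping for the query.

    support: list of (embedding, label) pairs. In a real matching network
    the embeddings come from a trained network; here they are assumed given.
    """
    similarities = [cosine(query, embedding) for embedding, _ in support]
    # Softmax over similarities gives the attention weights.
    peak = max(similarities)
    weights = [math.exp(s - peak) for s in similarities]
    total = sum(weights)
    probabilities = {}
    for weight, (_, label) in zip(weights, support):
        probabilities[label] = probabilities.get(label, 0.0) + weight / total
    return probabilities


# Toy support set standing in for a user's few labelled examples.
support = [
    ([1.0, 0.0], "walking"),
    ([0.9, 0.1], "walking"),
    ([0.0, 1.0], "sitting"),
]
probabilities = matching_predict([1.0, 0.05], support)
predicted = max(probabilities, key=probabilities.get)
```

Because the prediction depends only on the support set supplied at query time, swapping in a different user's examples changes the output without retraining, which is one reason this architecture lends itself to personalisation.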

International Conference on Innovative Techniques and Applications of Artificial Intelligence | 2013

Sentiment Classification Using Supervised Sub-Spacing

Sadiq Sani; Stewart Massie; Robert Lothian

An important application domain for machine learning is sentiment classification. Here, the traditional approach is to represent documents using a Bag-Of-Words (BOW) model, where individual terms are used as features. However, the BOW model is unable to sufficiently model the variation inherent in natural language text. Term-relatedness metrics are commonly used to overcome this limitation by capturing latent semantic concepts or topics in documents. However, representations produced using standard term-relatedness approaches do not take into account class membership of documents. In this work, we present a novel approach called Supervised Sub-Spacing (S3) for introducing supervision to term-relatedness extraction. S3 works by creating a separate sub-space for each class within which term relations are extracted such that documents belonging to the same class are made more similar to one another. Recent approaches in sentiment classification have proposed combining machine learning with background knowledge from sentiment lexicons for improved performance. Thus, we present a simple yet effective approach for augmenting S3 with background knowledge from SentiWordNet. Evaluation shows that S3 significantly outperforms the state-of-the-art SVM classifier. Results also show that using background knowledge from SentiWordNet significantly improves the performance of S3.

International Conference on Case-Based Reasoning | 2012

Event Extraction for Reasoning with Text

Sadiq Sani; Stewart Massie; Robert Lothian

Textual Case-Based Reasoning (TCBR) aims at effective reuse of past problem-solving experiences that are predominantly captured in unstructured form. The absence of structure and a well-defined feature space makes comparison of these experiential cases difficult. Since reasoning is primarily dependent on retrieval of similar cases, the acquisition of a suitable indexing vocabulary is crucial for case representation. The challenge is to ensure that this vocabulary is selective and is representative enough to be able to capture the intended meaning in text, beyond simply the surface meaning. Indexing strategies that rely on bag of words (BOW) have the advantage of low knowledge acquisition costs, but only facilitate case comparison at a superficial level. In this paper we study the influence of semantic and lexical indexing constructs on a retrieve-only TCBR system applied to incident reporting. We introduce Rubee (RUle-Based Event Extraction), an unsupervised approach for automatically extracting events from incident reports. A novel aspect of Rubee is its use of polarity information to distinguish between events that occurred and any non-event occurrences. Our results show that whilst semantic indexing is important, there is evidence that case representation benefits from a combined vocabulary (both semantic and lexical). A comparative study involving a popular event extraction system, Evita, and several baseline algorithms also indicates that events extracted by Rubee lead to significantly better retrieval performance.
