
Publication


Featured research published by Yasar Khan.


Sprachwissenschaft | 2016

A fine-grained evaluation of SPARQL endpoint federation systems

Muhammad Saleem; Yasar Khan; Ali Hasnain; Ivan Ermilov; Axel-Cyrille Ngonga Ngomo

The Web of Data has grown enormously over the last years. Currently, it comprises a large compendium of interlinked and distributed datasets from multiple domains. Running complex queries on this compendium often requires accessing data from different endpoints within one query. The abundance of datasets and the need for running complex queries have thus motivated a considerable body of work on SPARQL query federation systems, the dedicated means to access data distributed over the Web of Data. However, the granularity of previous evaluations of such systems has not allowed insights to be derived concerning their behavior in the different steps involved in federated query processing. In this work, we perform extensive experiments to compare state-of-the-art SPARQL endpoint federation systems using the comprehensive performance evaluation framework FedBench. In addition to considering the traditional query runtime as an evaluation criterion, we extend the scope of our performance evaluation by considering criteria that have received little attention in previous studies. In particular, we consider the number of sources selected, the total number of SPARQL ASK requests used, the completeness of answers, and the source selection time. Although these criteria have been largely overlooked, we show that they have a significant impact on the overall query runtime of existing systems. Moreover, we extend FedBench to mirror a highly distributed data environment and assess the behavior of existing systems using the same performance criteria. As a result, we provide a detailed analysis of the experimental outcomes that reveals novel insights for improving current and future SPARQL federation systems.
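The criteria named in this abstract (sources selected, number of SPARQL ASK requests) can be illustrated with a minimal sketch of ASK-based source selection. This is not the code of any evaluated system: the endpoints are simulated as in-memory sets of triples, and `ask`, `select_sources`, the endpoint names, and the `ex:` identifiers are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# A triple pattern; None stands for a variable position.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def ask(triples: Set[Triple], pattern: Pattern) -> bool:
    """Stand-in for a SPARQL ASK request: does any triple match the pattern?"""
    return any(
        all(p is None or p == t for p, t in zip(pattern, triple))
        for triple in triples
    )


def select_sources(
    endpoints: Dict[str, Set[Triple]], patterns: List[Pattern]
) -> Tuple[Dict[Pattern, List[str]], int]:
    """Return the relevant endpoints per pattern and the number of ASK requests sent."""
    selected: Dict[Pattern, List[str]] = {}
    ask_count = 0
    for pattern in patterns:
        relevant = []
        for name, triples in endpoints.items():
            ask_count += 1  # one ASK per (pattern, endpoint) pair
            if ask(triples, pattern):
                relevant.append(name)
        selected[pattern] = relevant
    return selected, ask_count


# Hypothetical toy federation with two endpoints.
ENDPOINTS = {
    "drugs": {("ex:drug1", "ex:name", "Drug One"), ("ex:drug1", "ex:targets", "ex:gene1")},
    "genes": {("ex:gene1", "ex:label", "Gene One")},
}
QUERY = [(None, "ex:targets", None), (None, "ex:label", None)]
SELECTED, ASK_COUNT = select_sources(ENDPOINTS, QUERY)
```

In this naive scheme the ASK count grows as patterns times endpoints, which is one reason the number of ASK requests and the source selection time are worth measuring separately from total runtime.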


Journal of Biomedical Semantics | 2017

SAFE: SPARQL federation over RDF data cubes with access control

Yasar Khan; Muhammad Saleem; Muntazir Mehdi; Aidan Hogan; Qaiser Mehmood; Dietrich Rebholz-Schuhmann; Ratnesh Sahay

Background: Several query federation engines have been proposed for accessing public Linked Open Data sources. However, in many domains, resources are sensitive and access to these resources is tightly controlled by stakeholders; consequently, privacy is a major concern when federating queries over such datasets. In the Healthcare and Life Sciences (HCLS) domain, real-world datasets contain sensitive statistical information: strict ownership is granted to individuals working in hospitals, research labs, clinical trial organisers, etc. Therefore, the legal and ethical concerns on (i) preserving the anonymity of patients (or clinical subjects) and (ii) respecting data ownership through access control are key challenges faced by the data analytics community working within the HCLS domain. Likewise, statistical data play a key role in the domain, where the RDF Data Cube Vocabulary has been proposed as a standard format to enable the exchange of such data. However, to the best of our knowledge, no existing approach has looked to optimise federated queries over such statistical data. Results: We present SAFE: a query federation engine that enables policy-aware access to sensitive statistical datasets represented as RDF data cubes. SAFE is designed specifically to query statistical RDF data cubes in a distributed setting, where access control is coupled with source selection, user profiles and their access rights. SAFE proposes a join-aware source selection method that avoids wasteful requests to irrelevant and unauthorised data sources. In order to preserve anonymity and enforce stricter access control, SAFE's indexing system does not hold any data instances; it stores only predicates and endpoints. The resulting data summary has a significantly lower index generation time and size compared to existing engines, which allows for faster updates when sources change. Conclusions: We validate the performance of the system with experiments over real-world datasets provided by three clinical organisations as well as legacy linked datasets. We show that SAFE enables granular graph-level access control over distributed clinical RDF data cubes and efficiently reduces the source selection and overall query execution time when compared with general-purpose SPARQL query federation engines in the targeted setting.
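The idea of an index that holds only predicates and endpoints, with access control applied during source selection, can be sketched as follows. This is an illustrative assumption rather than SAFE's implementation: the index layout, the `authorised_sources` helper, the endpoint URLs, and the graph names are hypothetical, and the join-aware part of the method is deliberately left out.

```python
from typing import Dict, Iterable, List, Set, Tuple

Source = Tuple[str, str]  # (endpoint URL, named graph); both hypothetical

# Hypothetical data summary: predicates mapped to sources, no data instances stored.
INDEX: Dict[str, Set[Source]] = {
    "qb:dataSet": {
        ("http://ep1.example/sparql", "graph:trialA"),
        ("http://ep2.example/sparql", "graph:trialB"),
    },
    "ex:bloodPressure": {("http://ep2.example/sparql", "graph:trialB")},
}


def authorised_sources(
    predicates: Iterable[str], allowed_graphs: Set[str]
) -> Dict[str, List[Source]]:
    """For each query predicate, keep only sources in graphs the user may read."""
    result: Dict[str, List[Source]] = {}
    for predicate in predicates:
        candidates = INDEX.get(predicate, set())
        # Unauthorised sources are dropped before any request would be sent.
        result[predicate] = sorted(s for s in candidates if s[1] in allowed_graphs)
    return result


# A user who may read only graph:trialA.
SELECTION = authorised_sources(["qb:dataSet", "ex:bloodPressure"], {"graph:trialA"})
```

Filtering at the index level means an unauthorised endpoint is never contacted at all, which matches the abstract's point about avoiding wasteful requests to irrelevant and unauthorised sources.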


International Conference on Semantic Systems | 2015

A linked data platform for finite element biosimulations

Muntazir Mehdi; Yasar Khan; André Freitas; Joao Jares; Stefan Decker; Ratnesh Sahay

Biosimulation models have recently been introduced to understand the exact causative factors that give rise to impairment in human organs. The Finite Element Method (FEM) provides a mathematical framework to simulate dynamic biological systems, with applications ranging from human-ear and cardiovascular to neurovascular research. Owing to the lack of a well-integrated data infrastructure, the steps involved in the execution and comparative evaluation of large Finite Element (FE) simulations are time-consuming and are performed in isolated environments. In this paper, we present a Linked Data platform to improve automation in the integration, analysis and visualisation of biosimulation models for inner-ear (cochlea) mechanics. The proposed platform aims to help domain scientists and clinicians explore and analyse Finite Element (FE) numerical data and simulation results obtained from multiple domains such as biological, geometrical, mathematical, and physical models. We validate the platform by conducting a qualitative survey and performing quantitative experiments to record overall performance.


Journal of Biomedical Semantics | 2017

Towards precision medicine: discovering novel gynecological cancer biomarkers and pathways using linked data

Alokkumar Jha; Yasar Khan; Muntazir Mehdi; Rezaul Karim; Qaiser Mehmood; Achille Zappa; Dietrich Rebholz-Schuhmann; Ratnesh Sahay

Background: Next Generation Sequencing (NGS) is playing a key role in therapeutic decision making for cancer prognosis and treatment. NGS technologies are producing a massive amount of sequencing datasets. Often, these datasets are published by isolated and different sequencing facilities. Consequently, the process of sharing and aggregating multisite sequencing datasets is thwarted by issues such as the need to discover relevant data from different sources, to build scalable repositories, the automation of data linkage, the volume of the data, efficient querying mechanisms, and information-rich intuitive visualisation. Results: We present an approach to link and query different sequencing datasets (TCGA, COSMIC, REACTOME, KEGG and GO) to indicate risks for four cancer types, namely Ovarian Serous Cystadenocarcinoma (OV), Uterine Corpus Endometrial Carcinoma (UCEC), Uterine Carcinosarcoma (UCS), and Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC), covering the 16 healthy tissue-specific genes from Illumina Human Body Map 2.0. The differentially expressed genes from Illumina Human Body Map 2.0 are analysed together with the gene expressions reported in the COSMIC and TCGA repositories, leading to the discovery of potential biomarkers for a tissue-specific cancer. Conclusion: We analyse the tissue expression of genes, copy number variation (CNV), somatic mutation, and promoter methylation to identify associated pathways and find novel biomarkers. We discovered twenty (20) mutated genes and three (3) potential pathways causing promoter changes in different gynaecological cancer types. We propose a data-interlinked platform called BIOOPENER that glues together heterogeneous cancer and biomedical repositories. The key approach is to find correspondences (or data links) among genetic, cellular and molecular features across isolated cancer datasets, giving insight into cancer progression from normal to diseased tissues. The proposed BIOOPENER platform enriches mutations by filling in missing links from the TCGA, COSMIC, REACTOME, KEGG and GO datasets and provides an interlinking mechanism to understand cancer progression from normal to diseased tissues with pathway components, which in turn helped to map mutations, associated phenotypes, pathways, and mechanisms.


Bioinformatics and Bioengineering | 2013

Querying phenotype-genotype associations across multiple knowledge bases using Semantic Web technologies

Oya Deniz Beyan; Aftab Iqbal; Yasar Khan; Athos Antoniades; John A. Keane; Panagiotis Hasapis; Christos Georgousopoulos; Myrto Ioannidi; Stefan Decker; Ratnesh Sahay

Biomedical and genomic data are inherently heterogeneous, and their recent proliferation over the Web has demanded innovative querying methods to help domain experts in their clinical and research studies. In this paper we present the use of Semantic Web technologies in querying diverse phenotype-genotype associations for supporting personalized medicine and potentially helping to discover new associations. Our initial results suggest that Semantic Web technologies have competitive advantages in extracting, consolidating and presenting phenotype-genotype associations that reside in various bioinformatics resources. The developed querying method could support researchers and medical professionals in discovering and utilizing information on published associations relating disease, treatment, adverse events and environmental factors to genetic markers from multiple repositories.


International Conference on Big Data | 2017

Querying web polystores

Yasar Khan; Antoine Zimmermann; Alokkumar Jha; Dietrich Rebholz-Schuhmann; Ratnesh Sahay

The database, semantic web, and linked data communities have proposed solutions that federate queries over multiple data sources using a single data model. Nowadays, the data retrieval requirements originating from versatile and broad domains like healthcare and life sciences (HCLS) are changing this conventional trend of federating queries over a single data model, primarily due to the simultaneous use of different data models (CSV, JSON, RDB, RDF, XML, etc.) in real-life scenarios. It is now impractical to assume that the variety (graph, key-value, stream, text, table, tree, etc.) of high-volume data residing in specialised storage engines will first be converted to a common data model, stored in a general-purpose data storage engine, and finally be queried over the Web. Nevertheless, in this era where genomics datasets are growing from petascale to exascale, it is now important to exploit such vast domain resources in their native data models. The key approach is to query the vast data resources from their native data models and specialised storage engines. In this paper, we propose a Web-based query federation mechanism, called PolyWeb, that unifies query answering over multiple native data models (CSV, RDB, and RDF). We demonstrate PolyWeb on a cancer genomics use-case where it is often the case that a description of biological and chemical entities (e.g., gene, disease, drug, pathways) spans multiple data models. In order to assess the benefits and limitations of evaluating queries over native data models, we evaluate PolyWeb against a state-of-the-art query federation engine in terms of result completeness, source selection, and overall query execution time.
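The notion of answering one question across CSV, relational, and RDF sources without first converting them to a common model can be sketched with the Python standard library. This is not PolyWeb's mechanism: the three toy sources, the `federated_lookup` mediator, and every gene, disease, drug, and pathway name are hypothetical placeholders, and the RDF source is simulated as a list of triples.

```python
import csv
import io
import sqlite3

# CSV source (kept in its native form): gene-to-disease rows.
CSV_TEXT = "gene,disease\nGENE1,DiseaseX\nGENE2,DiseaseY\n"

# RDF-like source, simulated as (subject, predicate, object) triples.
TRIPLES = [
    ("GENE1", "ex:inPathway", "PathwayP"),
    ("GENE2", "ex:inPathway", "PathwayQ"),
]


def make_db() -> sqlite3.Connection:
    """Relational source: an in-memory table of drug-to-gene targets."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE drug_target (drug TEXT, gene TEXT)")
    conn.executemany(
        "INSERT INTO drug_target VALUES (?, ?)",
        [("DrugA", "GENE1"), ("DrugB", "GENE2")],
    )
    return conn


def query_csv(gene: str) -> list:
    reader = csv.DictReader(io.StringIO(CSV_TEXT))
    return [row["disease"] for row in reader if row["gene"] == gene]


def query_rdb(conn: sqlite3.Connection, gene: str) -> list:
    rows = conn.execute(
        "SELECT drug FROM drug_target WHERE gene = ? ORDER BY drug", (gene,)
    )
    return [row[0] for row in rows]


def query_rdf(gene: str) -> list:
    return [o for s, p, o in TRIPLES if s == gene and p == "ex:inPathway"]


def federated_lookup(conn: sqlite3.Connection, gene: str) -> dict:
    """Send one sub-query per native source and merge the answers on the gene key."""
    return {
        "gene": gene,
        "diseases": query_csv(gene),
        "drugs": query_rdb(conn, gene),
        "pathways": query_rdf(gene),
    }


RESULT = federated_lookup(make_db(), "GENE1")
```

Each source is queried with its own access method (CSV reader, SQL, triple matching), and only the small per-source answers are combined, which is the property the abstract contrasts with converting everything to one data model up front.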


Cancer Research | 2017

Abstract A27: A linked data approach to discover HPV oncoprotiens and RB1 induced mutation associations for the retinoblastoma research

Alokkumar Jha; Yasar Khan; Dietrich Rebholz-Schumann; Ratnesh Sahay

Background: Loss or gain in the tumor suppressor gene RB1 plays a significant role, as in the case of low-penetrance loss, where only 39% of eyes at risk develop retinoblastoma. This research covers the multiple mutation types and their effects, and identifies the major type of mutation involved in retinoblastoma because of HPV and RB1. Methods: First, we focus on exploring gene expression (GE) patterns for RB1 and HPV-associated genes from TCGA. Second, validated and non-validated standard CNVs are identified using COSMIC. Finally, the clinical profiles of filtered mutations have been validated based on ICGC pathological profiling data to infer the prognostic behavior from RB1 and HPV-associated genes. In order to link and retrieve patterns of a gene from the TCGA, COSMIC, and ICGC repositories, we performed the following steps: transforming heterogeneous data repositories and their storage formats into the standard Resource Description Framework (RDF) format; discovering associations by finding specific patterns (i.e. correlations) in the GE data sets; and scalably querying the large-volume, frequently updated datasets covering the GE data from different repositories. Results: HPV mutations indicated in more than 127 cancer studies show that deletions and amplifications are rare mutations. Retinoblastoma: The expression profile of RB1 shows mutations such as nonsense, missense or splice events, and in GBM and gliomas the expression values are (1500-200) in splice mutations and (200-600) in nonsense mutations. In principle, in HPV-associated retinoblastoma the higher expression of HPV genes results in splice junctions and the lower in nonsense mutations. Other tissues: For RB1, the results from more than 123 studies show a pattern of mutations similar to the results obtained from HPV-associated genes.
Alterations in HPV genes: the genes are altered in 90% of samples across 61 cases, where TP53 accounts for 90% of occurrences, mainly as normal mutations, and SNRNP70 and BRCA1 are mainly responsible for truncating mutations; other highly mutated genes are AP3D1, BRD4, CCHCR1, CPSF4, CREBBP, CUL3, DDX11, EP300, EP400, FRZ1, GNB2L1, GTF2B, KDM5C, NR4A1, PRP1, PLK1, SF1, SRSF1, SRSF7, SMARCB1, SNRNP70, TAF1, TBP, TMF1, TOPBP1. RB1, by contrast, is associated with deep deletion in 11% of cases, where V654L is the normal mutation and all others are highly truncating mutations. Survival graph for HPV- and RB1-associated genes: the median survival with alteration in these query genes is 103 months (m), whereas in RB1 the median survival is 7.63 m; however, the disease-free survival in RB1 cases is 4.50 m. The p-values are 0.76, 0.67 and 0.38, respectively. To demonstrate the pattern of survival, gene set enrichment was performed on both gene lists; in the case of RB1, the genes with the highest interactions are MDM2, CDK4 and TP53. In HPV genes, the interacting hubs are TOP1, PARP1, TP53 and ODF2. Higher interacting genes are associated with drugs. RB1 corresponded to Insulin, a non-cancer FDA-approved drug, whereas HPV genes, and especially TOP1, are associated with Lucanthone, Innotecan, BTBD1 and Topotecan in the FDA-approved drugs category. Cancer drugs with HPV genes are mainly associated with TOP1, PARP1 and PLK1, namely BTBD1, AZD 2281, AG14361, BI2536 and GW84382X. In ICGC, the effect of RB1 is on cancers such as melanoma (51.91%), esophageal (45.38%), ovarian (31.18%), liver (27.96%) and pancreatic (22.55%). Associations of RB1 with other cancer mutations are splice mutations, as in TCGA, and the other mutation associated with RB1 is LPAR6. These mutations are mainly in the exon region, and further examination of the other mutations in TCGA and ICGC reveals that most of these are SNPs at chromosome 13, which defines the locus of RB1 and for HPV. The higher interacting hubs from TCGA are TOP1, PAPR1 and ODF2.
TOP1 is associated with melanoma (two donor hubs of 42.08% and 40.91%), liver cancer hepatocellular carcinoma (virus) with 23.08%, esophageal (15.05%) and ovarian (11.36%). Mutation types are SNPs and splice junctions. Citation Format: Alokkumar Jha, Yasar Khan, Dietrich Rebholz-Schumann, Ratnesh Sahay. A linked data approach to discover HPV oncoprotiens and RB1 induced mutation associations for the retinoblastoma research. [abstract]. In: Proceedings of the AACR Special Conference on Engineering and Physical Sciences in Oncology; 2016 Jun 25-28; Boston, MA. Philadelphia (PA): AACR; Cancer Res 2017;77(2 Suppl):Abstract nr A27.


Very Large Data Bases | 2016

Drug Dosage Balancing Using Large Scale Multi-omics Datasets

Alokkumar Jha; Muntazir Mehdi; Yasar Khan; Qaiser Mehmood; Dietrich Rebholz-Schuhmann; Ratnesh Sahay

Cancer is a disease of biological and cell cycle processes, driven by dosage of the limited set of drugs, resistance, mutations, and side effects. The identification of such a limited set of drugs and their targets, pathways, and effects based on large-scale multi-omics, multi-dimensional datasets is one of the key challenging tasks in data-driven cancer genomics. This paper demonstrates the use of public databases associated with Drug-Target-Gene/Protein-Disease to support an in-depth analysis of approved cancer drugs, their genetic associations, and their pathways in order to establish a dosage balancing mechanism. This paper will also help to understand cancer as a disease, its associated pathways, and the effect of drug treatment on cancer cells. We employ the Semantic Web approach to provide an integrated knowledge discovery process and a network of integrated datasets. The approach is employed to address biological questions involving (1) associated drugs and their omics signature; (2) identification of gene association with integrated Drug-Target databases; (3) mutations, variants, and alterations from these targets; (4) their PPI interactions and associated oncogenic pathways; and (5) associated biological processes aligned with these mutations and pathways, to identify the IC-50 level of each drug along with adverse events and alternate indications. In principle, this large semantically integrated database of around 30 databases will serve as Semantic Linked Association Prediction in drug discovery to explore and expand dosage balancing and drug re-purposing.


IEEE International Conference on Semantic Computing | 2016

Demonstrating a Linked Data Visualiser for Finite Element Biosimulations

Muntazir Mehdi; Yasar Khan; André Freitas; Joao Jares; Saleem Raza; Panagiotis Hasapis; Ratnesh Sahay

Healthcare experts have recently turned towards the use of biosimulation models to understand the multiple or different causative factors that cause impairment in human organs. Biosimulations have been applied to different biological systems, ranging from human-ear and cardiovascular to neurovascular research, using the Finite Element Method (FEM). FEM provides a mathematical framework to simulate these dynamic biological systems. Visualizing and analyzing huge amounts of Finite Element (FE) biosimulation numerical data is a strenuous task. In this paper, we demonstrate a Linked Data Visualiser, called SIFEM Visualiser, to help domain experts visualise, analyze and compare biosimulation results from heterogeneous, complex, and high-volume numerical data. The SIFEM Visualiser aims to help healthcare experts in exploring and analyzing Finite Element (FE) numerical data and simulation results obtained from different aspects of the inner-ear (cochlear) model, such as biological, geometrical, mathematical, and physical models.


Bioinformatics and Bioengineering | 2015

Extending inner-ear anatomical concepts in the Foundational Model of Anatomy (FMA) ontology

Yasar Khan; Muntazir Mehdi; Alokkumar Jha; Saleem Raza; André Freitas; Marggie Jones; Ratnesh Sahay

The inner ear is physically inaccessible in living humans, which leads to unique difficulties in studying its normal function and pathology compared with other human organs. Recently, biosimulation models have gained significant attention as a way to understand the exact causative factors that give rise to impairment in human organs. However, to build a biosimulation model for a human organ, concepts and their topological relationships from multiple and semantically overlapping domains, such as biology, anatomy, and geometrical, mathematical and physical models, are required. In this paper, we focus on modelling the inner-ear macro anatomical concepts and their topological relationships. We extended the Foundational Model of Anatomy (FMA) ontology to cover a micro-level version of human inner-ear anatomy, in which the connections between simulating tissues, liquids, soft tissues and connecting adjacent parts (e.g. hair cells, perilymph) are studied in detail, included and implemented.

Collaboration


Dive into Yasar Khan's collaborations.

Top Co-Authors

Ratnesh Sahay
National University of Ireland

Muntazir Mehdi
National University of Ireland

Aftab Iqbal
National University of Ireland

Alokkumar Jha
National University of Ireland

Stefan Decker
National University of Ireland

Ali Hasnain
National University of Ireland

Joao Jares
National University of Ireland

Qaiser Mehmood
National University of Ireland