Turki

King Abdulaziz University

Publication

Featured research published by Turki.

Intelligent Data Engineering and Automated Learning | 2015

A New Approach to Link Prediction in Gene Regulatory Networks

Turki Turki; Jason Tsong-Li Wang

Link prediction is an important data mining problem that has many applications in different domains such as social network analysis and computational biology. For example, biologists model gene regulatory networks (GRNs) as directed graphs where nodes are genes and links show regulatory relationships between the genes. By predicting links in GRNs, biologists can gain a better understanding of the cell regulatory circuits and functional elements. Existing supervised methods for GRN inference work by building a feature-based classifier from gene expression data and using the classifier to predict links in the GRNs. In this paper we present a new supervised approach for link prediction in GRNs. Our approach employs both gene expression data and topological features extracted from the GRNs, in combination with three machine learning algorithms including random forests, support vector machines and neural networks. Experimental results on different datasets demonstrate the good performance of the proposed approach and its superiority over the existing methods.
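As a rough illustration of combining topological and expression features for one candidate link, the sketch below builds a feature vector for a gene pair. The particular topological features (out-degree of the candidate regulator, in-degree of the candidate target, number of shared targets), the function names, and the toy data are all assumptions made for this example; the abstract does not say which features the authors extracted.

```python
def topological_features(edges, source, target):
    """Illustrative (assumed) topological features for the candidate link source -> target."""
    out_nbrs, in_nbrs = {}, {}
    for u, v in edges:
        out_nbrs.setdefault(u, set()).add(v)
        in_nbrs.setdefault(v, set()).add(u)
    src_out = out_nbrs.get(source, set())
    tgt_out = out_nbrs.get(target, set())
    return [
        len(src_out),                     # out-degree of the candidate regulator
        len(in_nbrs.get(target, set())),  # in-degree of the candidate target
        len(src_out & tgt_out),           # genes regulated by both
    ]


def pair_feature_vector(edges, expression, source, target):
    """Concatenate topological features with both genes' expression profiles."""
    return (
        topological_features(edges, source, target)
        + list(expression[source])
        + list(expression[target])
    )


# Toy data (hypothetical): known regulatory links and two expression samples per gene.
known_edges = [("g1", "g2"), ("g1", "g3"), ("g4", "g2"), ("g4", "g3")]
expression = {
    "g1": [0.2, 0.9],
    "g2": [0.1, 0.3],
    "g3": [0.5, 0.5],
    "g4": [0.4, 0.7],
}
vector = pair_feature_vector(known_edges, expression, "g1", "g4")
```

Vectors like this one, labeled as link or non-link, could then be passed to any of the classifiers named in the abstract.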

BMC Genomics | 2014

MaxSSmap: a GPU program for mapping divergent short reads to genomes with the maximum scoring subsequence

Turki Turki; Usman Roshan

Background: Programs based on hash tables and Burrows-Wheeler are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm, but it can take hours or days to map millions of reads even for bacterial genomes. Results: We introduce a GPU program called MaxSSmap with the aim of achieving comparable accuracy to Smith-Waterman but with faster runtimes. Similar to most programs, MaxSSmap identifies a local region of the genome followed by exact alignment. Instead of using hash tables or Burrows-Wheeler in the first part, MaxSSmap calculates the maximum scoring subsequence score between the read and disjoint fragments of the genome in parallel on a GPU and selects the highest scoring fragment for exact alignment. We evaluate MaxSSmap's accuracy and runtime when mapping simulated Illumina E. coli and human chromosome one reads of different lengths and 10% to 30% mismatches with gaps to the E. coli genome and human chromosome one. We also demonstrate applications on real data by mapping ancient horse DNA reads to modern genomes and unmapped paired reads from NA12878 in 1000 genomes. Conclusions: We show that MaxSSmap attains comparable high accuracy and low error to fast Smith-Waterman programs yet has much lower runtimes. We show that MaxSSmap can map reads rejected by BWA and NextGenMap with high accuracy and low error much faster than if Smith-Waterman were used. On short read lengths of 36 and 51, both MaxSSmap and Smith-Waterman have lower accuracy than at longer read lengths. On real data MaxSSmap produces many alignments with high score and mapping quality that are not given by NextGenMap and BWA. The MaxSSmap source code in CUDA and OpenCL is freely available from http://www.cs.njit.edu/usman/MaxSSmap.
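The fragment-selection step described above can be sketched on the CPU as follows. This is a minimal serial illustration, not the MaxSSmap implementation: the match and mismatch scores, the choice of fragment length equal to the read length, and the function names are assumptions, and the real program evaluates fragments in parallel on a GPU.

```python
def max_scoring_subsequence(scores):
    """Largest sum over any contiguous run of scores (0 if every run is negative)."""
    best = current = 0
    for s in scores:
        current = max(0, current + s)
        best = max(best, current)
    return best


def best_fragment(read, genome, match=1, mismatch=-1):
    """Score the read against disjoint genome fragments and return (start, score)
    of the highest scoring fragment. Scoring values are assumed for illustration."""
    frag_len = len(read)
    best_start, best_score = -1, -1
    for start in range(0, len(genome) - frag_len + 1, frag_len):
        fragment = genome[start:start + frag_len]
        # Position-wise comparison: reward matches, penalize mismatches.
        scores = [match if r == g else mismatch for r, g in zip(read, fragment)]
        score = max_scoring_subsequence(scores)
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Toy example (hypothetical sequences): the read matches the second fragment exactly.
start, score = best_fragment("ACGT", "TTTTACGTGGGG")
```

In this simplified version a read that straddles a fragment boundary is scored only partially; the selected fragment would then be handed to an exact aligner such as Smith-Waterman.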

IEEE Access | 2017

Transfer Learning Approaches to Improve Drug Sensitivity Prediction in Multiple Myeloma Patients

Turki Turki; Zhi Wei; Jason Tsong-Li Wang

Traditional machine learning approaches to drug sensitivity prediction assume that training data and test data must be in the same feature space and have the same underlying distribution. However, in real-world applications, this assumption does not hold. For example, we sometimes have limited training data for the task of drug sensitivity prediction in multiple myeloma patients (target task), but we have sufficient auxiliary data for the task of drug sensitivity prediction in patients with another cancer type (related task), where the auxiliary data for the related task are in a different feature space or have a different distribution. In such cases, transfer learning, if applied correctly, would improve the performance of prediction algorithms on the test data of the target task via leveraging the auxiliary data from the related task. In this paper, we present two transfer learning approaches that combine the auxiliary data from the related task with the training data of the target task to improve the prediction performance on the test data of the target task. We evaluate the performance of our transfer learning approaches exploiting three auxiliary data sets and compare them against baseline approaches using the area under the receiver operating characteristic curve on the test data of the target task. Experimental results demonstrate the good performance of our approaches and their superiority over the baseline approaches when auxiliary data are incorporated.

BMC Systems Biology | 2017

A link prediction approach to cancer drug sensitivity prediction

Turki Turki; Zhi Wei

Background: Predicting the response to a drug for cancer patients based on genomic information is an important problem in modern clinical oncology. This problem occurs in part because many available drug sensitivity prediction algorithms do not consider better quality cancer cell lines and the adoption of new feature representations, both of which lead to accurate prediction of drug responses. By predicting accurate drug responses to cancer, oncologists gain a more complete understanding of the effective treatments for each patient, which is a core goal in precision medicine. Results: In this paper, we model cancer drug sensitivity as a link prediction problem, which is shown to be an effective technique. We evaluate our proposed link prediction algorithms and compare them with an existing drug sensitivity prediction approach based on clinical trial data. The experimental results based on the clinical trial data show the stability of our link prediction algorithms, which yield the highest area under the ROC curve (AUC) and are statistically significant. Conclusions: We propose a link prediction approach to obtain a new feature representation. Compared with an existing approach, the results show that incorporating the new feature representation into the link prediction algorithms significantly improves performance.

Mexican Conference on Pattern Recognition | 2014

Weighted Maximum Variance Dimensionality Reduction

Turki Turki; Usman Roshan

Dimensionality reduction procedures such as principal component analysis and the maximum margin criterion discriminant are special cases of a weighted maximum variance (WMV) approach. We present a simple two parameter version of WMV that we call 2P-WMV. We study the classification error given by the 1-nearest neighbor algorithm on features extracted by our and other dimensionality reduction methods on several real datasets. Our results show that our method yields the lowest average error across the datasets with statistical significance.

MIKE | 2014

Top-k Parametrized Boost

Turki Turki; Muhammad Ihsan; Nouf Turki; Jie Zhang; Usman Roshan; Zhi Wei

Ensemble methods such as AdaBoost are popular machine learning methods that create a highly accurate classifier by combining the predictions from several classifiers. We present a parametrized variant of AdaBoost that we call Top-k Parametrized Boost. We evaluate our method and other popular ensemble methods from a classification perspective on several real datasets. Our empirical study shows that our method gives the minimum average error with statistical significance on the datasets.

BioMed Research International | 2017

MapReduce Algorithms for Inferring Gene Regulatory Networks from Time-Series Microarray Data Using an Information-Theoretic Approach

Yasser Abduallah; Turki Turki; Kevin Byron; Zongxuan Du; Miguel Cervantes-Cervantes; Jason Tsong-Li Wang

Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, prodding through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising as reported in the literature. Here, we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool while achieving slightly better prediction accuracy than the existing tool.
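To make the map and reduce roles concrete, here is a small single-machine sketch in plain Python. It assumes mutual information between discretized expression profiles as the information-theoretic score, which is a common choice but is not confirmed by the abstract; the function names, the threshold, and the toy profiles are hypothetical, and the generator and list stand in for a Hadoop job.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


def map_pairs(profiles):
    """Map step: emit ((gene1, gene2), score) for every unordered gene pair."""
    for g1, g2 in combinations(sorted(profiles), 2):
        yield (g1, g2), mutual_information(profiles[g1], profiles[g2])


def reduce_edges(scored_pairs, threshold):
    """Reduce step: keep pairs whose score reaches the (assumed) threshold."""
    return [pair for pair, score in scored_pairs if score >= threshold]


# Toy discretized time series (hypothetical): geneA and geneB move together.
profiles = {
    "geneA": [0, 0, 1, 1],
    "geneB": [0, 0, 1, 1],
    "geneC": [0, 1, 0, 1],
}
edges = reduce_edges(map_pairs(profiles), threshold=0.5)
```

Because each pair is scored independently, the map step is the part that distributes naturally across cluster nodes. The sketch yields undirected candidate links only.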

Machine Learning and Data Mining in Pattern Recognition | 2016

A Learning Framework to Improve Unsupervised Gene Network Inference

Turki Turki; William Bassett; Jason Tsong-Li Wang

Network inference through link prediction is an important data mining problem that finds many applications in computational social science and biomedicine. For example, by predicting links, i.e., regulatory relationships, between genes to infer gene regulatory networks (GRNs), computational biologists gain a better understanding of the functional elements and regulatory circuits in cells. Unsupervised methods have been widely used to infer GRNs; however, these methods often create missing and spurious links. In this paper, we propose a learning framework to improve the unsupervised methods. Given a network constructed by an unsupervised method, the proposed framework employs a graph sparsification technique for network sampling and principal component analysis for feature selection to obtain better quality training data, which guides three classifiers to predict and clean the links of the given network. The three classifiers include neural networks, random forests and support vector machines. Experimental results on several datasets demonstrate the good performance of the proposed learning framework and the classifiers used in the framework.

International Conference on Machine Learning and Applications | 2016

Inferring Gene Regulatory Networks by Combining Supervised and Unsupervised Methods

Turki Turki; Jason Tsong-Li Wang; Ibrahim Rajikhan

Supervised methods for inferring gene regulatory networks (GRNs) perform well with good training data. However, when training data is absent, these methods are not applicable. Unsupervised methods do not need training data but their accuracy is low. In this paper, we combine supervised and unsupervised methods to infer GRNs using time-series gene expression data. Specifically, we use results obtained from unsupervised methods to train supervised methods. Since the results contain noise, we develop a data cleaning algorithm to remove noise, hence improving the quality of the training data. These refined training data are then used to guide classifiers including support vector machines and deep learning tools to infer GRNs through link prediction. Experimental results on several data sets demonstrate the good performance of the classifiers and the effectiveness of our data cleaning algorithm.

IEEE Systems Conference | 2016

A greedy-based oversampling approach to improve the prediction of mortality in MERS patients

Turki Turki; Zhi Wei

Predicting mortality of Middle East respiratory syndrome (MERS) patients with identified outcomes is a core goal for hospitals in deciding whether a new patient should be hospitalized when hospital resources are limited. We present an oversampling approach that we call Greedy-Based Oversampling Approach (GBOA). We evaluate our approach and compare it against the standard oversampling approach from a classification perspective on a real dataset collected from the Saudi Ministry of Health using two popular supervised classification methods, Random Forests and Support Vector Machines. Our results demonstrate that our approach outperforms the standard approach from a classification perspective by giving the highest accuracy with statistical significance on the 20 simulations of the real dataset.
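For context, the baseline that GBOA is compared against can be sketched as random oversampling with replacement, which is what "standard oversampling" usually means. That reading is an assumption, and the sketch does not implement GBOA, whose greedy selection rule is not described in the abstract. The function name and toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes
    are as large as the largest class. Original samples are kept unchanged."""
    rng = random.Random(seed)
    counts = Counter(y)
    majority_size = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(majority_size - count):
            X_out.append(rng.choice(pool))  # sample with replacement
            y_out.append(label)
    return X_out, y_out


# Toy imbalanced data (hypothetical): four samples of class 0, one of class 1.
X = [[0], [1], [2], [3], [9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
```

The balanced set would then be used to train a classifier such as a random forest or a support vector machine, with oversampling applied to the training split only so that duplicated samples do not leak into the test data.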