Pooya Zakeri | Researchain

Publication

Featured research published by Pooya Zakeri.

Bioinformatics | 2014

Protein fold recognition using geometric kernel data fusion

Pooya Zakeri; Ben Jeuris; Raf Vandebril; Yves Moreau

Motivation: Various approaches based on features extracted from protein sequences, often combined with machine learning methods, have been used to predict protein folds. Finding an efficient technique for integrating these different protein features has received increasing attention. In particular, kernel methods are an interesting class of techniques for integrating heterogeneous data. Various methods have been proposed to fuse multiple kernels. Most techniques for multiple kernel learning focus on learning a convex linear combination of base kernels. In addition to the limitation of linear combinations, working with such approaches could cause a loss of potentially useful information.
Results: We design several techniques to combine kernel matrices by taking more involved, geometry-inspired means of these matrices instead of convex linear combinations. We consider various sequence-based protein features, including information extracted directly from position-specific scoring matrices and local sequence alignment. We evaluate our methods for classification on the SCOP PDB-40D benchmark dataset for protein fold recognition. The best overall accuracy on the protein fold recognition test set obtained by our methods is ∼86.7%. This is an improvement over the results of the best existing approach. Moreover, our computational model has been developed by incorporating the functional domain composition of proteins through a hybridization model. Using our proposed hybridization model further improves the protein fold recognition accuracy to 89.30%. Furthermore, we investigate the performance of our approach on the protein remote homology detection problem by fusing multiple string kernels.
Availability and implementation: The MATLAB code used for our proposed geometric kernel fusion frameworks is publicly available at http://people.cs.kuleuven.be/~raf.vandebril/homepage/software/geomean.php?menu=5/
Contact: [email protected] or [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
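As a rough illustration of what "geometry-inspired means" of kernel matrices can look like, the sketch below computes two standard means of symmetric positive definite matrices with NumPy: the two-matrix geometric mean A # B = A^(1/2) (A^(-1/2) B A^(-1/2))^(1/2) A^(1/2) and the log-Euclidean mean of several matrices. Treating these as the means meant in the abstract is an assumption, and the function names and the toy kernel generator are hypothetical; this is not the authors' MATLAB code.

```python
import numpy as np


def _sym_fun(mat, fun):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    sym = (mat + mat.T) / 2.0  # remove round-off asymmetry before eigh
    vals, vecs = np.linalg.eigh(sym)
    return (vecs * fun(vals)) @ vecs.T


def geometric_mean(a, b):
    """Geometric mean A # B of two symmetric positive definite matrices."""
    a_half = _sym_fun(a, np.sqrt)
    a_inv_half = _sym_fun(a, lambda v: 1.0 / np.sqrt(v))
    middle = _sym_fun(a_inv_half @ b @ a_inv_half, np.sqrt)
    return a_half @ middle @ a_half


def log_euclidean_mean(kernels):
    """Log-Euclidean mean: matrix exponential of the averaged matrix logarithms."""
    logs = [_sym_fun(k, np.log) for k in kernels]
    return _sym_fun(np.mean(logs, axis=0), np.exp)


def toy_kernel(seed, n=5, n_features=8, ridge=1e-3):
    """Hypothetical stand-in for a base kernel: a linear kernel on random features.

    The small ridge keeps the matrix strictly positive definite.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, n_features))
    return x @ x.T + ridge * np.eye(n)


if __name__ == "__main__":
    k1, k2 = toy_kernel(0), toy_kernel(1)
    fused = geometric_mean(k1, k2)
    # The fused matrix is again symmetric positive definite, so it can be
    # passed to a kernel classifier in place of a convex linear combination.
    print(np.linalg.eigvalsh((fused + fused.T) / 2.0).min() > 0)
```

Unlike a weighted sum, both means stay on the manifold of positive definite matrices, and for commuting kernels they reduce to the element-wise geometric mean of the eigenvalues.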

Scientific Reports | 2018

Linking drug target and pathway activation for effective therapy using multi-task learning

Mi Yang; Jaak Simm; Chi Chung Lam; Pooya Zakeri; Gerard J. P. van Westen; Yves Moreau; Julio Saez-Rodriguez

Despite the abundance of large-scale molecular and drug-response data, the insights gained about the mechanisms underlying treatment efficacy in cancer have in general been limited. Machine learning algorithms applied to those datasets are most often used to provide predictions without interpretation, or they reveal single drug-gene associations and fail to derive robust insights. We propose to use Macau, a Bayesian multitask multi-relational algorithm, to generalize from individual drugs and genes and explore the interactions between drug targets and the activation of signaling pathways. A typical insight would be: “Activation of pathway Y will confer sensitivity to any drug targeting protein X”. We applied our methodology to the Genomics of Drug Sensitivity in Cancer (GDSC) screening, using gene expression of 990 cancer cell lines, activity scores of 11 signaling pathways derived from the tool PROGENy as cell line input and 228 nominal targets for 265 drugs as drug input. These interactions can guide a tissue-specific combination treatment strategy, for example by suggesting which pathway to modulate to maximize the drug response for a given tissue. We confirmed in the literature drug combination strategies derived from our results for brain, skin and stomach tissues. Such an analysis of interactions across tissues might help target discovery, drug repurposing and patient stratification strategies.

International Conference on Bioinformatics and Biomedical Engineering | 2016

A Comprehensive Comparison of Two MEDLINE Annotators for Disease and Gene Linkage: Sometimes Less is More

Sarah Elshal; Jaak Simm; Adam Arany; Pooya Zakeri; Jesse Davis; Yves Moreau

Text mining is popular in biomedical applications because it allows retrieving highly relevant information. For us in particular, it is practical for linking diseases to the genes involved in them. However, text mining involves multiple challenges, such as (1) recognizing named entities (e.g., diseases and genes) inside the text, (2) constructing specific vocabularies that efficiently represent the available text, and (3) applying the correct statistical criteria to link biomedical entities with each other. We have previously developed Beegle, a tool that allows prioritizing genes for any search query of interest. The method starts with a search phase, where relevant genes are identified via the literature. Once known genes are identified, a second phase allows prioritizing novel candidate genes through a data fusion strategy. Many aspects of our method could potentially be improved. Here we evaluate two MEDLINE annotators that recognize biomedical entities inside a given abstract using different dictionaries and annotation strategies. We compare the contribution of each of the two annotators in associating genes with diseases under different vocabulary settings. Somewhat surprisingly, with fewer recognized entities and a more compact vocabulary, we obtain better associations between genes and diseases. We also propose a novel but simple association criterion to link genes with diseases, which relies on recognizing only gene entities inside the biomedical text. These refinements significantly improve the performance of our method.

Bioinformatics and Biomedicine | 2015

Gene prioritization through geometric-inspired kernel data fusion

Pooya Zakeri; Sarah Elshal; Yves Moreau

In biology there is often a need to discover, among a large list of candidate genes, the most promising genes for further investigation. While a single data source might not be effective enough, integrating several complementary genomic data sources leads to more accurate prediction. We propose a kernel-based gene prioritization framework using geometric kernel fusion, which we have recently developed as a powerful tool for protein fold classification [1]. It has been shown that taking more involved, geometry-inspired means of kernel matrices is less sensitive when dealing with complementary and noisy kernel matrices than standard multiple kernel learning methods. Since genomic kernels often encode the complementary characteristics of biological data, this leads us to investigate the application of geometric kernel fusion to the gene prioritization task. We utilize an unbiased and prospective benchmark based on the OMIM [2] associations. Experimental results on our prospective benchmark show that our model can improve the accuracy of the state-of-the-art gene prioritization model.

Intelligent Systems in Molecular Biology | 2018

Gene Prioritization Using Bayesian Matrix Factorization with Genomic and Phenotypic Side Information

Pooya Zakeri; Jaak Simm; Adam Arany; Sarah Elshal; Yves Moreau

Motivation: Most gene prioritization methods model each disease or phenotype individually, but this fails to capture patterns common to several diseases or phenotypes. To overcome this limitation, we formulate the gene prioritization task as the factorization of a sparsely filled gene-phenotype matrix, where the objective is to predict the unknown matrix entries. To deliver more accurate gene-phenotype matrix completion, we extend classical Bayesian matrix factorization to work with multiple side information sources. The availability of side information allows us to make non-trivial predictions for genes for which no previous disease association is known.
Results: A novel aspect of our gene prioritization method is that it integrates not only data sources describing genes but also data sources describing Human Phenotype Ontology terms. Experimental results on our benchmarks show that our proposed model can effectively improve accuracy over the well-established gene prioritization method, Endeavour. In particular, compared to Endeavour, our proposed method offers promising results on diseases of the nervous system; diseases of the eye and adnexa; endocrine, nutritional and metabolic diseases; and congenital malformations, deformations and chromosomal abnormalities.
Availability and implementation: The Bayesian data fusion method is implemented as a Python/C++ package: https://github.com/jaak-s/macau. It is also available as a Julia package: https://github.com/jaak-s/BayesianDataFusion.jl. All data and benchmarks generated or analyzed during this study can be downloaded at https://owncloud.esat.kuleuven.be/index.php/s/UGb89WfkZwMYoTn.
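The idea of completing a sparsely filled gene-phenotype matrix with the help of gene side information can be sketched with a much simpler, non-Bayesian stand-in. The code below is not Macau and does not use its API: it is a ridge-regularized alternating least squares model in which each gene's latent vector is a linear function of its features (U = XW), so a gene with no observed associations still receives scores. All names (`fit_side_info_mf`, `toy_data`) and the toy data are hypothetical.

```python
import numpy as np


def objective(y, mask, x, w, v, lam):
    """Squared error on observed entries plus ridge penalties on W and V."""
    resid = (y - x @ w @ v.T) * mask
    return float((resid ** 2).sum() + lam * ((w ** 2).sum() + (v ** 2).sum()))


def fit_side_info_mf(y, mask, x, rank=2, lam=0.1, sweeps=20, seed=0):
    """Alternating least squares for y[i, j] ~ (x[i] @ W) . v[j] on observed cells."""
    rng = np.random.default_rng(seed)
    n_feat, n_pheno = x.shape[1], y.shape[1]
    w = 0.1 * rng.normal(size=(n_feat, rank))
    v = 0.1 * rng.normal(size=(n_pheno, rank))
    rows, cols = np.nonzero(mask)
    history = [objective(y, mask, x, w, v, lam)]
    for _ in range(sweeps):
        # Update each phenotype vector by ridge regression on its observed genes.
        u = x @ w
        for j in range(n_pheno):
            obs = mask[:, j] > 0
            u_obs = u[obs]
            gram = u_obs.T @ u_obs + lam * np.eye(rank)
            v[j] = np.linalg.solve(gram, u_obs.T @ y[obs, j])
        # Update W: each observed cell contributes the regressor outer(x_i, v_j).
        design = np.einsum("nd,nk->ndk", x[rows], v[cols]).reshape(len(rows), -1)
        gram = design.T @ design + lam * np.eye(n_feat * rank)
        w = np.linalg.solve(gram, design.T @ y[rows, cols]).reshape(n_feat, rank)
        history.append(objective(y, mask, x, w, v, lam))
    return w, v, history


def toy_data(seed=0, n_genes=30, n_feat=4, n_pheno=6, rank=2):
    """Hypothetical low-rank data; gene 0 has no observed associations."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_genes, n_feat))
    y = x @ rng.normal(size=(n_feat, rank)) @ rng.normal(size=(n_pheno, rank)).T
    mask = rng.random(y.shape) < 0.6
    mask[0] = False  # cold-start gene: every entry in its row is unobserved
    return y, mask, x


if __name__ == "__main__":
    y, mask, x = toy_data()
    w, v, history = fit_side_info_mf(y, mask, x)
    # Scores for the cold-start gene come entirely from its side information.
    print(x[0] @ w @ v.T)
```

A Bayesian treatment such as the one described in the abstract would place priors on the factors and sample them rather than compute point estimates; the sketch only shows how side information lets the model score rows that have no observations.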

bioRxiv | 2017