Publication


Featured research published by Ali Faisal.


Genes, Chromosomes and Cancer | 2011

Integrative analysis of microRNA, mRNA and aCGH data reveals asbestos- and histology-related changes in lung cancer

Penny Nymark; Mohamed Guled; Ioana Borze; Ali Faisal; Leo Lahti; Kaisa Salmenkivi; Eeva Kettunen; Sisko Anttila; Sakari Knuutila

Lung cancer has the highest mortality rate of all of the cancers in the world and asbestos‐related lung cancer is one of the leading occupational cancers. The identification of asbestos‐related molecular changes has long been a topic of increasing research interest. The aim of this study was to identify novel asbestos‐related molecular correlates by integrating miRNA expression profiling with previously obtained profiling data (aCGH and mRNA expression) from the same patient material. miRNA profiling was performed on 26 tumor and corresponding normal lung tissue samples from highly asbestos‐exposed and non‐exposed patients, and on eight control lung tissue samples. Data analyses on miRNA expression, and integration of miRNA and previously obtained mRNA data were performed using Chipster. A separate analysis was used to integrate miRNA and previously obtained aCGH data. Both known and new lung cancer‐associated miRNAs and target genes with inverse correlation were discovered. Furthermore, DNA copy number alterations (e.g., gain at 12p13.31) were correlated with the deregulated miRNAs. Specifically, thirteen novel asbestos‐related miRNAs (over‐expressed: miR‐148b, miR‐374a, miR‐24‐1*, Let‐7d, Let‐7e, miR‐199b‐5p, miR‐331‐3p, and miR‐96 and under‐expressed: miR‐939, miR‐671‐5p, miR‐605, miR‐1224‐5p and miR‐202) and inversely correlated target genes (e.g., GADD45A, LTBP1, FOSB, NCALD, CACNA2D2, MTSS1, EPB41L3) were identified. In addition, over‐expression of the well known squamous cell carcinoma‐associated miR‐205 was linked to down‐regulation of the DOK4 gene. The miRNAs/genes presented here may represent interesting targets for further investigation and could eventually have potential diagnostic implications.


BMC Bioinformatics | 2009

Probabilistic retrieval and visualization of biologically relevant microarray experiments

José Caldas; Nils Gehlenborg; Ali Faisal; Alvis Brazma; Samuel Kaski

Motivation: As ArrayExpress and other repositories of genomewide experiments are reaching a mature size, it is becoming more meaningful to search for related experiments, given a particular study. We introduce methods that allow for the search to be based upon measurement data, instead of the more customary annotation data. The goal is to retrieve experiments in which the same biological processes are activated. This can be due either to experiments targeting the same biological question, or to as yet unknown


Ecological Informatics | 2010

Inferring species interaction networks from species abundance data: A comparative evaluation of various statistical and machine learning methods

Ali Faisal; Frank Dondelinger; Dirk Husmeier; Colin M. Beale

The complexity of ecosystems is staggering, with hundreds or thousands of species interacting in a number of ways from competition and predation to facilitation and mutualism. Understanding the networks that form the systems is of growing importance, e.g. to understand how species will respond to climate change, or to predict potential knock-on effects of a biological control agent. In recent years, a variety of summary statistics for characterising the global and local properties of such networks have been derived, which provide a measure for gauging the accuracy of a mathematical model for network formation processes. However, the critical underlying assumption is that the true network is known. This is not a straightforward task to accomplish, and typically requires minute observations and detailed field work. More importantly, knowledge about species interactions is restricted to specific kinds of interactions. For instance, while the interactions between pollinators and their host plants are amenable to direct observation, other types of species interactions, like those mentioned above, are not, and might not even be clearly defined from the outset. To discover information about complex ecological systems efficiently, new tools for inferring the structure of networks from field data are needed. In the present study, we investigate the viability of various statistical and machine learning methods recently applied in molecular systems biology: graphical Gaussian models, L1-regularised regression with least absolute shrinkage and selection operator (LASSO), sparse Bayesian regression and Bayesian networks. We have assessed the performance of these methods on data simulated from food webs of known structure, where we combined a niche model with a stochastic population model in a 2-dimensional lattice. We assessed the network reconstruction accuracy in terms of the area under the receiver operating characteristic (ROC) curve, which was typically in the range between 0.75 and 0.9, corresponding to the recovery of about 60% of the true species interactions at a false prediction rate of 5%. We also applied the models to presence/absence data for 39 European warblers, and found that the inferred species interactions showed a weak yet significant correlation with phylogenetic similarity scores, which tended to weakly increase when including bio-climate covariates and allowing for spatial autocorrelation. Our findings demonstrate that relevant patterns in ecological networks can be identified from large-scale spatial data sets with machine learning methods, and that these methods have the potential to contribute novel important tools for gaining deeper insight into the structure and stability of ecosystems.
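The accuracy measure named in this abstract (area under the ROC curve, and the fraction of true interactions recovered at a fixed false prediction rate) can be illustrated with a small sketch. This is not the authors' code: the function names `roc_auc` and `tpr_at_fpr`, the toy scores, and the choice to treat the "false prediction rate" as the false positive rate are all assumptions made for illustration. Each candidate species interaction (edge) is assumed to carry a confidence score from some inference method and a 0/1 label saying whether it exists in the known food web.

```python
def roc_auc(scores, labels):
    """AUC via the rank-sum identity: the probability that a randomly chosen
    true edge scores higher than a randomly chosen non-edge (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tpr_at_fpr(scores, labels, max_fpr):
    """Largest fraction of true edges recovered by thresholding the scores
    while keeping the false positive rate at or below max_fpr."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    # Lowering the threshold can only raise both rates, so stop at the
    # first threshold that exceeds the allowed false positive rate.
    for t in sorted(set(scores), reverse=True):
        fpr = sum(n >= t for n in neg) / len(neg)
        if fpr > max_fpr:
            break
        best = sum(p >= t for p in pos) / len(pos)
    return best


# Hypothetical edge scores and ground-truth labels (1 = true interaction).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
print(roc_auc(toy_scores, toy_labels))
print(tpr_at_fpr(toy_scores, toy_labels, max_fpr=0.05))
```

The pairwise loop is quadratic in the number of edges, which is acceptable for a toy example; a sorted-rank implementation would be the usual choice for larger networks.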


BMC Bioinformatics | 2012

Comprehensive data-driven analysis of the impact of chemoinformatic structure on the genome-wide biological response profiles of cancer cells to 1159 drugs

Suleiman A. Khan; Ali Faisal; John Patrick Mpindi; Juuso Parkkinen; Tuomo Kalliokoski; Antti Poso; Olli Kallioniemi; Krister Wennerberg; Samuel Kaski

Background: Detailed and systematic understanding of the biological effects of millions of available compounds on living cells is a significant challenge. As most compounds impact multiple targets and pathways, traditional methods for analyzing structure-function relationships are not comprehensive enough. Therefore more advanced integrative models are needed for predicting biological effects elicited by specific chemical features. As a step towards creating such computational links we developed a data-driven chemical systems biology approach to comprehensively study the relationship of 76 structural 3D-descriptors (VolSurf, chemical space) of 1159 drugs with the microarray gene expression responses (biological space) they elicited in three cancer cell lines. The analysis covering 11350 genes was based on data from the Connectivity Map. We decomposed the biological response profiles into components, each linked to a characteristic chemical descriptor profile. Results: Integrated analysis of both the chemical and biological space was more informative than either dataset alone in predicting drug similarity as measured by shared protein targets. We identified ten major components that link distinct VolSurf chemical features across multiple compounds to specific cellular responses. For example, component 2 (hydrophobic properties) strongly linked to DNA damage response, while component 3 (hydrogen bonding) was associated with metabolic stress. Individual structural and biological features were often linked to one cell line only, such as leukemia cells (HL-60) specifically responding to cardiac glycosides. Conclusions: In summary, our approach identified several novel links between specific chemical structure properties and distinct biological responses in cells incubated with these drugs. Importantly, the analysis focused on chemical-biological properties that emerge across multiple drugs. The decoding of such systematic relationships is necessary to build better models of drug effects, including unanticipated types of molecular properties having strong biological effects.


Bioinformatics | 2012

Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma

José Caldas; Nils Gehlenborg; Eeva Kettunen; Ali Faisal; Mikko Rönty; Andrew G. Nicholson; Sakari Knuutila; Alvis Brazma; Samuel Kaski

Motivation: Genome-wide measurement of transcript levels is a ubiquitous tool in biomedical research. As experimental data continues to be deposited in public databases, it is becoming important to develop search engines that enable the retrieval of relevant studies given a query study. While retrieval systems based on meta-data already exist, data-driven approaches that retrieve studies based on similarities in the expression data itself have a greater potential of uncovering novel biological insights. Results: We propose an information retrieval method based on differential expression. Our method deals with arbitrary experimental designs and performs competitively with alternative approaches, while making the search results interpretable in terms of differential expression patterns. We show that our model yields meaningful connections between biological conditions from different studies. Finally, we validate a previously unknown connection between malignant pleural mesothelioma and SIM2s suggested by our method, via real-time polymerase chain reaction in an independent set of mesothelioma samples. Availability: Supplementary data and source code are available from http://www.ebi.ac.uk/fg/research/rex. Supplementary Information: Supplementary data are available at Bioinformatics online.


PLOS ONE | 2014

Toward Computational Cumulative Biology by Combining Models of Biological Datasets

Ali Faisal; Jaakko Peltonen; Elisabeth Georgii; Johan Rung; Samuel Kaski

A main challenge of data-driven sciences is how to make maximal use of the progressively expanding databases of experimental datasets in order to keep research cumulative. We introduce the idea of a modeling-based dataset retrieval engine designed for relating a researcher's experimental dataset to earlier work in the field. The search is (i) data-driven to enable new findings, going beyond the state of the art of keyword searches in annotations, (ii) modeling-driven, to include both biological knowledge and insights learned from data, and (iii) scalable, as it is accomplished without building one unified grand model of all data. Assuming each dataset has been modeled beforehand, by the researchers or automatically by database managers, we apply a rapidly computable and optimizable combination model to decompose a new dataset into contributions from earlier relevant models. By using the data-driven decomposition, we identify a network of interrelated datasets from a large annotated human gene expression atlas. While tissue type and disease were major driving forces for determining relevant datasets, the found relationships were richer, and the model-based search was more accurate than the keyword search; moreover, it recovered biologically meaningful relationships that are not straightforwardly visible from annotations, for instance between cells in different developmental stages such as thymocytes and T-cells. Data-driven links and citations matched to a large extent; the data-driven links even uncovered corrections to the publication data, as two of the most linked datasets were not highly cited and turned out to have wrong publication entries in the database.


Neurocomputing | 2013

Transfer learning using a nonparametric sparse topic model

Ali Faisal; Jussi Gillberg; Gayle Leen; Jaakko Peltonen

In many domains data items are represented by vectors of counts; count data arises, for example, in bioinformatics or analysis of text documents represented as word count vectors. However, often the amount of data available from an interesting data source is too small to model the data source well. When several data sets are available from related sources, exploiting their similarities by transfer learning can improve the resulting models compared to modeling sources independently. We introduce a Bayesian generative transfer learning model which represents similarity across document collections by sparse sharing of latent topics controlled by an Indian buffet process. Unlike a prominent previous model, hierarchical Dirichlet process (HDP) based multi-task learning, our model decouples topic sharing probability from topic strength, making sharing of low-strength topics easier. In experiments, our model outperforms the HDP approach both on synthetic data and in the first of the two case studies on text collections, and achieves performance similar to the HDP approach in the second case study.


Games and Culture | 2015

Establishing Video Game Genres Using Data-Driven Modeling and Product Databases

Ali Faisal; Mirva Peltoniemi

Establishing genres is the first step toward analyzing games and how the genre landscape evolves over the years. We use data-driven modeling that distils genres from textual descriptions of a large collection of games. We analyze the evolution of game genres from 1979 to 2010. Our results indicate that until 1990 many genres competed for dominance, but thereafter sport-racing, strategy, and action became the most prevalent genres. Moreover, we find that games vary to a great extent as to whether they belong mostly to one genre or to a combination of several genres. We also compare the results of our data-driven model with two product databases, Metacritic and Mobygames, and observe that the classifications of games into genres differ substantially, even between product databases. We conclude with a discussion of potential future applications and how they may further our understanding of video game genres.


Systems Biomedicine | 2013

Systematic use of computational methods allows stratification of treatment responders in glioblastoma multiforme

Riku Louhimo; Viljami Aittomäki; Ali Faisal; Marko Laakso; Ping Chen; Kristian Ovaska; Erkka Valo; Leo Lahti; Vladimir Rogojin; Samuel Kaski; Sampsa Hautaniemi

Background: Cancers are complex diseases whose comprehensive characterization requires genome-scale molecular data at multiple levels from genetics to transcriptomics and clinical data. Using our recently published Anduril bioinformatics framework and novel computational approaches, such as dependency analysis, we identify key variables at miRNA, copy number variation, expression, methylation, and pathway levels in glioblastoma multiforme (GBM) progression and drug resistance. Furthermore, we identify characteristics of clinically relevant subgroups, such as patients treated with temozolomide and patients with an EGFRvIII mutation, which is a constitutively active variant of EGFR. Results: We identify several novel genomic regions and transcript profiles that may contribute to GBM progression and drug resistance. All results and Anduril scripts are available at http://csbi.ltdk.helsinki.fi/camda/. Conclusions: Our results highlight the need for approaches that define context at several levels in order to identify genomic regions or transcript profiles playing key roles in cancer progression and drug resistance.


bioRxiv | 2018

Reconstructing meaning from bits of information

Sasa L. Kivisaari; Marijn van Vliet; Annika Hultén; Tiina Lindh-Knuutila; Ali Faisal; Riitta Salmelin

We can easily identify a dog merely by the sound of barking or an orange by its citrus scent. In this work, we study the neural underpinnings of how the brain combines bits of information into meaningful object representations. Modern theories of semantics posit that the meaning of words can be decomposed into a unique combination of individual semantic features (e.g., “barks”, “has citrus scent”). Here, participants received clues of individual objects in the form of three isolated semantic features, given as verbal descriptions. We used machine-learning-based neural decoding to learn a mapping between individual semantic features and BOLD activation patterns. We discovered that the recorded brain patterns were best decoded using a combination of not only the three semantic features that were presented as clues, but a far richer set of semantic features typically linked to the target object. We conclude that our experimental protocol allowed us to observe how fragmented information is combined into a complete semantic representation of an object and suggest neuroanatomical underpinnings for this process.

Collaboration


Dive into Ali Faisal's collaborations.

Top Co-Authors

Leo Lahti

Wageningen University and Research Centre

Elisabeth Georgii

Helsinki Institute for Information Technology

Erkka Valo

University of Helsinki
