Publication


Featured research published by Malika Smaïl-Tabbone.


BMC Bioinformatics | 2010

IntelliGO: a new vector-based semantic similarity measure including annotation origin

Sidahmed Benabderrahmane; Malika Smaïl-Tabbone; Olivier Poch; Amedeo Napoli; Marie-Dominique Devignes

Background: The Gene Ontology (GO) is a well-known controlled vocabulary describing the biological process, molecular function and cellular component aspects of gene annotation. It has become a widely used knowledge source in bioinformatics for annotating genes and measuring their semantic similarity. These measures generally involve the GO graph structure, the information content of GO aspects, or a combination of both. However, only a few of the semantic similarity measures described so far can handle GO annotations differently according to their origin (i.e. their evidence codes). Results: We present here a new semantic similarity measure called IntelliGO which integrates several complementary properties in a novel vector space model. The coefficients associated with each GO term that annotates a given gene or protein include its information content as well as a customized value for each type of GO evidence code. The generalized cosine similarity measure, used for calculating the dot product between two vectors, has been rigorously adapted to the context of the GO graph. The IntelliGO similarity measure is tested on two benchmark datasets consisting of KEGG pathways and Pfam domains grouped as clans, considering the GO biological process and molecular function terms, respectively, for a total of 683 yeast and human genes and involving more than 67,900 pair-wise comparisons. The ability of the IntelliGO similarity measure to express the biological cohesion of sets of genes compares favourably to four existing similarity measures. For inter-set comparison, it consistently discriminates between distinct sets of genes. Furthermore, the IntelliGO similarity measure allows the influence of weights assigned to evidence codes to be checked. Finally, the results obtained with a complementary reference technique give intermediate but correct correlation values with the sequence similarity, Pfam, and Enzyme classifications when compared to previously published measures. Conclusions: The IntelliGO similarity measure provides a customizable and comprehensive method for quantifying gene similarity based on GO annotations. It also displays a robust set-discriminating power which suggests it will be useful for functional clustering. Availability: An on-line version of the IntelliGO similarity measure is available at: http://bioinfo.loria.fr/Members/benabdsi/intelligo_project/
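
The vector model described in this abstract can be illustrated with a minimal Python sketch. It is not the published IntelliGO implementation: the function names, the rule of keeping the largest coefficient when a term is annotated more than once, and the caller-supplied `term_sim` function (standing in for the GO-graph-based similarity between terms) are all assumptions made for illustration.

```python
import math

def gene_vector(annotations, info_content, evidence_weight):
    """Build {GO term: coefficient} from (term, evidence_code) pairs.

    Coefficient = evidence-code weight * information content of the term.
    Keeping the maximum for repeated terms is an assumption of this sketch.
    """
    vec = {}
    for term, code in annotations:
        coeff = evidence_weight.get(code, 0.0) * info_content.get(term, 0.0)
        vec[term] = max(vec.get(term, 0.0), coeff)
    return vec

def generalized_dot(u, v, term_sim):
    """Dot product in a non-orthogonal basis: term_sim(s, t) plays the role
    of the dot product between the basis vectors of terms s and t."""
    return sum(a * b * term_sim(s, t) for s, a in u.items() for t, b in v.items())

def generalized_cosine(u, v, term_sim):
    """Generalized cosine similarity between two gene vectors."""
    denom = math.sqrt(generalized_dot(u, u, term_sim) * generalized_dot(v, v, term_sim))
    return generalized_dot(u, v, term_sim) / denom if denom else 0.0
```

With `term_sim` returning 1 for identical terms and 0 otherwise, this reduces to the ordinary cosine similarity; a graph-aware `term_sim` lets two genes annotated with different but related terms still score above zero.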


International Conference on Conceptual Structures | 2005

Querying a bioinformatic data sources registry with concept lattices

Nizar Messai; Marie-Dominique Devignes; Amedeo Napoli; Malika Smaïl-Tabbone

Bioinformatic data sources available on the web are numerous and heterogeneous. The lack of documentation and the difficulty of interacting with these data banks require user competence in both informatics and biology to make optimal use of source contents, which remain rather under-exploited. In this paper we present an approach based on formal concept analysis to classify and search relevant bioinformatic data sources for a given user query. It consists in building the concept lattice from the binary relation between bioinformatic data sources and their associated metadata. The concept built from a given user query is then merged into the concept lattice. The result is obtained by extracting the set of sources belonging to the extents of the subsumers of the query concept in the resulting concept lattice. The source ranking is given by the concept specificity order in the concept lattice. The approach can be improved by automatically refining the query using domain ontologies. Two forms of refinement are possible: by generalisation and by specialisation.
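
The query procedure outlined above can be sketched in Python for a small context. This is an illustrative reading of the abstract, not the authors' code: the function names, the `__query__` pseudo-object, the decision to skip the empty intent, and the use of intent size as a stand-in for the specificity order are all assumptions.

```python
def concept_intents(context):
    """All concept intents of a small context {object: set_of_attributes},
    obtained by closing the object intents under intersection."""
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        row = frozenset(attrs)
        intents |= {row & i for i in intents}
    return intents

def extent(context, intent):
    """Objects that carry every attribute of the intent."""
    return {obj for obj, attrs in context.items() if intent <= attrs}

def query_sources(context, query_attrs):
    """Insert the query as a pseudo-object, then walk the subsumers of the
    query concept (intents included in the query), most specific first."""
    query = frozenset(query_attrs)
    merged = dict(context)
    merged["__query__"] = set(query)  # hypothetical name for the query object
    subsumers = [i for i in concept_intents(merged) if i and i <= query]
    ranked, seen = [], set()
    for intent in sorted(subsumers, key=len, reverse=True):
        for source in sorted(extent(context, intent)):
            if source not in seen:
                seen.add(source)
                ranked.append((source, sorted(intent)))
    return ranked
```

For example, with sources `A` (protein, human), `B` (protein) and `C` (dna), the query `{"protein", "human"}` returns `A` before `B` and omits `C`. The naive closure enumerates every concept, so it is only suitable for small registries.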


Bioinformatics | 2011

Spatial clustering of protein binding sites for template based protein docking

Anisah W. Ghoorah; Marie-Dominique Devignes; Malika Smaïl-Tabbone; David W. Ritchie

Motivation: In recent years, much structural information on protein domains and their pair-wise interactions has been made available in public databases. However, it is not yet clear how best to use this information to discover general rules or interaction patterns about structural protein-protein interactions. Improving our ability to detect and exploit structural interaction patterns will help to provide a better 3D picture of the known protein interactome, and will help to guide docking-based predictions of the 3D structures of unsolved protein complexes. Results: This article presents KBDOCK, a 3D database approach for spatially clustering protein binding sites and for performing template-based (knowledge-based) protein docking. KBDOCK combines residue contact information from the 3DID database with the Pfam protein domain family classification together with coordinate data from the Protein Data Bank. This allows the 3D configurations of all known hetero domain-domain interactions to be superposed and clustered for each Pfam family. We find that most Pfam domain families have up to four hetero binding sites, and over 60% of all domain families have just one hetero binding site. The utility of this approach for template-based docking is demonstrated using 73 complexes from the Protein Docking Benchmark. Overall, up to 45 out of 73 complexes may be modelled by direct homology to existing domain interfaces, and key binding site information is found for 24 of the 28 remaining complexes. These results show that KBDOCK can often provide useful information for predicting the structures of unknown protein complexes. Availability: http://kbdock.loria.fr/ Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Bioinformatics | 2009

Gene–disease relationship discovery based on model-driven data integration and database view definition

Saliha Yilmaz; Philippe Jonveaux; Cedric Bicep; Laurent Pierron; Malika Smaïl-Tabbone; Marie-Dominique Devignes

Motivation: Computational methods are widely used to discover gene–disease relationships hidden in vast masses of available genomic and post-genomic data. In most current methods, a similarity measure is calculated between gene annotations and known disease genes or disease descriptions. However, more explicit gene–disease relationships are required for better insights into the molecular bases of diseases, especially for complex multi-gene diseases. Results: Explicit relationships between genes and diseases are formulated as candidate gene definitions that may include intermediary genes, e.g. orthologous or interacting genes. These definitions guide data modelling in our database approach for gene–disease relationship discovery and are expressed as views which ultimately lead to the retrieval of documented sets of candidate genes. A system called ACGR (Approach for Candidate Gene Retrieval) has been implemented and tested with three case studies including a rare orphan gene disease. Availability: The ACGR sources are freely available at http://bioinfo.loria.fr/projects/acgr/acgr-software/. See especially the file ‘disease_description’ and the folders ‘Xcollect_scenarios’ and ‘ACGR_views’. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
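
The idea of expressing a candidate gene definition as a database view can be sketched with Python's built-in `sqlite3` module. The schema below (`known_disease_gene`, `interaction`, `candidate_gene`) and the sample rows are hypothetical and are not taken from the ACGR sources; the sketch only shows one definition, "a gene that interacts with a known disease gene".

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema and
    one view encoding a candidate gene definition via an intermediary gene."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE known_disease_gene (gene TEXT, disease TEXT);
        CREATE TABLE interaction (gene_a TEXT, gene_b TEXT);
        CREATE VIEW candidate_gene AS
            SELECT i.gene_b AS candidate, k.gene AS via_gene, k.disease AS disease
            FROM known_disease_gene AS k
            JOIN interaction AS i ON i.gene_a = k.gene;
    """)
    # Made-up sample data, for illustration only.
    con.executemany("INSERT INTO known_disease_gene VALUES (?, ?)", [("G1", "D1")])
    con.executemany("INSERT INTO interaction VALUES (?, ?)",
                    [("G1", "G2"), ("G3", "G4")])
    return con

def candidates(con, disease):
    """Return (candidate, intermediary gene) pairs documented for a disease."""
    rows = con.execute(
        "SELECT candidate, via_gene FROM candidate_gene "
        "WHERE disease = ? ORDER BY candidate",
        (disease,),
    )
    return rows.fetchall()
```

Because the view keeps the intermediary gene alongside each candidate, every retrieved gene comes with the relationship that justifies it, which is the sense in which the abstract speaks of documented sets of candidate genes.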


Journal of Molecular Modeling | 2008

Multiple-step virtual screening using VSM-G: overview and validation of fast geometrical matching enrichment

Alexandre Beautrait; Vincent Leroux; Matthieu Chavent; Leo Ghemtio; Marie-Dominique Devignes; Malika Smaïl-Tabbone; Wensheng Cai; Xuegang Shao; Gilles Moreau; Peter Bladon; Jianhua Yao; Bernard Maigret

Numerous methods are available for use as part of a virtual screening strategy but, as yet, no single method is able to guarantee both a level of confidence comparable to experimental screening and a level of computing efficiency that could drastically cut the costs of early phase drug discovery campaigns. Here, we present VSM-G (virtual screening manager for computational grids), a virtual screening platform that combines several structure-based drug design tools. VSM-G aims to be as user-friendly as possible while retaining enough flexibility to accommodate other in silico techniques as they are developed. In order to illustrate VSM-G concepts, we present a proof-of-concept study of a fast geometrical matching method based on spherical harmonics expansions surfaces. This technique is implemented in VSM-G as the first module of a multiple-step sequence tailored for high-throughput experiments. We show that, using this protocol, notable enrichment of the input molecular database can be achieved against a specific target, here the liver-X nuclear receptor. The benefits, limitations and applicability of the VSM-G approach are discussed. Possible improvements of both the geometrical matching technique and its implementation within VSM-G are suggested. Figure: Basic principle of the virtual screening funnel process.
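
The abstract does not say how enrichment was measured. One commonly used metric is the enrichment factor, which compares the proportion of active compounds in the top-ranked fraction of a library with their proportion in the whole library. The sketch below assumes that a higher score means a better match; the function name and parameters are illustrative.

```python
def enrichment_factor(scores, actives, top_fraction=0.1):
    """Enrichment factor of a ranked library.

    scores: {compound: score}, higher is assumed to be better.
    actives: set of compounds known to be active.
    top_fraction: share of the ranked library kept by the filter.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if not ranked:
        return 0.0
    n_top = max(1, round(len(ranked) * top_fraction))
    hits_top = sum(1 for c in ranked[:n_top] if c in actives)
    hits_all = sum(1 for c in ranked if c in actives)
    if hits_all == 0:
        return 0.0
    return (hits_top / n_top) / (hits_all / len(ranked))
```

A value above 1 means the filter concentrates actives in the retained subset; a value of 1 is what random selection would give on average.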


Lecture Notes in Computer Science | 2005

BioRegistry: a structured metadata repository for bioinformatic databases

Malika Smaïl-Tabbone; Shazia Osman; Nizar Messai; Amedeo Napoli; Marie-Dominique Devignes

One of the major challenges in the post genomic era consists in exploiting the vast amounts of biological data stored in the numerous heterogeneous biological databases distributed worldwide. Most research projects in bioinformatics start with data retrieval from selected sources. However, identifying appropriate data sources is not trivial and requires the representation of the knowledge about data sources. We present here the BioRegistry project which aims at providing means to represent and exploit knowledge associated with biological databases. As a first step, a repository structure has been designed to organise metadata associated with databases consisting of five metadata categories: database identification, topics covered, quality information, access/availability, and tracking of the metadata. The BioRegistry model and its relationships with the DCMI (Dublin Core Metadata Initiative) are described. Prototypes with various functionalities to feed, maintain and exploit the repository are presented.
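
The five metadata categories listed above map naturally onto a record type. The following Python sketch is only an illustration of that structure: the class name, field names and field types are assumptions, and the actual BioRegistry model and its Dublin Core mapping are not reproduced here.

```python
from dataclasses import dataclass, field

@dataclass
class BioRegistryEntry:
    """One database record with the five metadata categories (hypothetical fields)."""
    identification: dict                          # e.g. {"name": ..., "url": ...}
    topics: list = field(default_factory=list)    # topics covered
    quality: dict = field(default_factory=dict)   # quality information
    access: dict = field(default_factory=dict)    # access/availability
    tracking: dict = field(default_factory=dict)  # tracking of the metadata itself

    def covers(self, topic):
        """Case-insensitive check of whether the database covers a topic."""
        return topic.lower() in (t.lower() for t in self.topics)

def find_by_topic(entries, topic):
    """Names of the registered databases that cover the given topic."""
    return [e.identification["name"] for e in entries if e.covers(topic)]
```

Keeping the tracking information as its own category makes it possible to record who entered or updated each piece of metadata without mixing it with the description of the database itself.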


Information Integration and Web-based Applications & Services | 2008

BioRegistry: automatic extraction of metadata for biological database retrieval and discovery

Marie-Dominique Devignes; Philippe Franiatte; Nizar Messai; Amedeo Napoli; Malika Smaïl-Tabbone

Biological databases are blooming today at an increasing rate to deal with the huge amount of data produced by genomic and post-genomic research. The need for a well-maintained searchable directory is therefore an important issue for a good exploitation of these databases. The BioRegistry repository is automatically generated from a publicly available list of biological databases (The Molecular Biology Database Collection published in Nucleic Acids Research) and aims at associating content metadata with each database in view of database retrieval and/or discovery. Such content metadata are either simple keywords or terms belonging to a medical thesaurus. Querying modalities including a search by semantic similarity are described. The use of conceptual clustering methods is proposed to build a semantic classification of biological databases enabling browsing through the BioRegistry repository and discovering previously unknown databases.


Advances in Experimental Medicine and Biology | 2011

Ontology-Based Knowledge Discovery in Pharmacogenomics

Adrien Coulet; Malika Smaïl-Tabbone; Amedeo Napoli; Marie-Dominique Devignes

One current challenge in biomedicine is to analyze large amounts of complex biological data for extracting domain knowledge. This work relies on the use of knowledge-based techniques such as knowledge discovery (KD) and knowledge representation (KR) in pharmacogenomics, where knowledge units represent genotype-phenotype relationships in the context of a given treatment. One objective is to design a knowledge base (KB, here also referred to as an ontology) and then to use it in the KD process itself. A method is proposed for dealing with two main tasks: (1) building a KB from heterogeneous data related to genotype, phenotype, and treatment, and (2) applying KD techniques to knowledge assertions for extracting genotype-phenotype relationships. An application was carried out on a clinical trial concerned with the variability of drug response to montelukast treatment. Genotype-genotype and genotype-phenotype associations were retrieved together with new associations, allowing the extension of the initial KB. This experiment shows the potential of KR and KD processes, especially for designing a KB, checking KB consistency, and reasoning for problem solving.


Journal of Chemical Information and Modeling | 2010

Comparison of Three Preprocessing Filters Efficiency in Virtual Screening: Identification of New Putative LXRβ Regulators As a Test Case

Leo Ghemtio; Marie-Dominique Devignes; Malika Smaïl-Tabbone; Michel Souchet; Vincent Leroux; Bernard Maigret

In silico screening methodologies are widely recognized as efficient approaches in the early steps of drug discovery. However, in the virtual high-throughput screening (VHTS) context, where hit compounds are searched for among millions of candidates, three-dimensional comparison techniques and knowledge discovery from databases should find novel drug leads more efficiently than computationally expensive molecular docking. Therefore, the present study aims at developing a filtering methodology to efficiently eliminate unsuitable compounds in the VHTS process. Several filters are evaluated in this paper. The first two are structure-based and rely on either geometrical docking or pharmacophore depiction. The third filter is ligand-based and uses knowledge-based and fingerprint similarity techniques. These filtering methods were tested with the Liver X Receptor (LXR) as a target of therapeutic interest, as LXR is a key regulator in maintaining cholesterol homeostasis. The results show that the three considered filters are complementary so that their combination should generate consistent compound lists of potential hits.
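
The fingerprint similarity part of a ligand-based filter can be illustrated with the Tanimoto coefficient, a standard measure for comparing binary fingerprints. The abstract does not state which fingerprint or threshold was used, so the representation (sets of on-bit indices), the 0.5 threshold and the function names below are assumptions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def ligand_filter(library, reference_fps, threshold=0.5):
    """Keep compounds similar enough to at least one known reference ligand.

    library: {compound name: fingerprint}; reference_fps: list of fingerprints.
    """
    return [
        name
        for name, fp in library.items()
        if any(tanimoto(fp, ref) >= threshold for ref in reference_fps)
    ]
```

A cheap test of this kind can discard most of a multi-million-compound library before any docking is attempted, which is the role the abstract assigns to preprocessing filters.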


International Journal of Metadata, Semantics and Ontologies | 2010

BioRegistry: Automatic extraction of metadata for biological database retrieval and discovery

Marie-Dominique Devignes; Philippe Franiatte; Nizar Messai; Emmanuel Bresso; Amedeo Napoli; Malika Smaïl-Tabbone

The need for a well-maintained searchable directory is an important issue with regard to the numerous biological databases produced by genomic and post-genomic research. The BioRegistry repository aims to associate content metadata belonging to a biomedical thesaurus with biological databases in view of retrieval or discovery. It is automatically generated from a publicly available list of biological databases. The querying modalities include a search by semantic similarity. The system performance is evaluated in terms of precision and recall on a test collection. A classification method is proposed for browsing and discovering databases through the BioRegistry.
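
Precision and recall, the two evaluation measures mentioned above, can be computed for a single query as follows. This is the standard definition; the function name and the convention of returning 0.0 for empty sets are choices made for this sketch.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one query result.

    retrieved: databases returned by the search.
    relevant: databases judged relevant in the test collection.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

For instance, if a query returns four databases of which two are relevant, and three relevant databases exist in total, precision is 0.5 and recall is 2/3.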

Collaboration


Dive into Malika Smaïl-Tabbone's collaborations.

Top Co-Authors

Marie-Dominique Devignes

Centre national de la recherche scientifique

Nizar Messai

François Rabelais University

Bernard Maigret

Centre national de la recherche scientifique

Emmanuel Bresso

Empresa Brasileira de Pesquisa Agropecuária

Yannick Toussaint

Free University of Bozen-Bolzano
