Publication


Featured research published by Marc Weeber.


Journal of the Association for Information Science and Technology | 2005

A probabilistic similarity metric for Medline records: a model for author name disambiguation.

Vetle I. Torvik; Marc Weeber; Don R. Swanson; Neil R. Smalheiser

We present a model for automatically generating training sets and estimating the probability that a pair of Medline records sharing a last and first name initial are authored by the same individual, based on shared title words, journal name, co-authors, medical subject headings, language, and affiliation, as well as distinctive features of the name itself (i.e., presence of middle initial, suffix, and prevalence in Medline).


Bioinformatics | 2004

Distribution of information in biomedical abstracts and full-text publications

Martijn J. Schuemie; Marc Weeber; Bob J. A. Schijvenaars; E.M. van Mulligen; C C van der Eijk; Rob Jelier; Barend Mons; Jan A. Kors

Motivation: Full-text documents potentially hold more information than their abstracts, but require more resources for processing. We investigated the added value of full text over abstracts in terms of information content and occurrences of gene symbol-gene name combinations that can resolve gene-symbol ambiguity. Results: We analyzed a set of 3902 biomedical full-text articles. Different keyword measures indicate that information density is highest in abstracts, but that the information coverage in full texts is much greater than in abstracts. Analysis of five different standard sections of articles shows that the highest information coverage is located in the results section. Still, 30-40% of the information mentioned in each section is unique to that section. Only 30% of the gene symbols in the abstract are accompanied by their corresponding names, and a further 8% of the gene names are found in the full text. In the full text, only 18% of the gene symbols are accompanied by their gene names.


BMC Bioinformatics | 2005

Thesaurus-based disambiguation of gene symbols.

Bob J. A. Schijvenaars; Barend Mons; Marc Weeber; Martijn J. Schuemie; Erik M. van Mulligen; Hester M. Wain; Jan A. Kors

Background: Massive text mining of the biological literature holds great promise of relating disparate information and discovering new knowledge. However, disambiguation of gene symbols is a major bottleneck. Results: We developed a simple thesaurus-based disambiguation algorithm that can operate with very little training data. The thesaurus comprises the information from five human genetic databases and MeSH. The extent of the homonym problem for human gene symbols is shown to be substantial (33% of the genes in our combined thesaurus had one or more ambiguous symbols), not only because one symbol can refer to multiple genes, but also because a gene symbol can have many non-gene meanings. A test set of 52,529 Medline abstracts, containing 690 ambiguous human gene symbols taken from OMIM, was automatically generated. Overall accuracy of the disambiguation algorithm was up to 92.7% on the test set. Conclusion: The ambiguity of human gene symbols is substantial, not only because one symbol may denote multiple genes but particularly because many symbols have other, non-gene meanings. The proposed disambiguation approach resolves most ambiguities in our test set with high accuracy, including the important gene/not a gene decisions. The algorithm is fast and scalable, enabling gene-symbol disambiguation in massive text mining applications.


Journal of Biomedical Informatics | 2007

Evaluation of techniques for increasing recall in a dictionary approach to gene and protein name identification

Martijn J. Schuemie; Barend Mons; Marc Weeber; Jan A. Kors

Gene and protein name identification in text requires a dictionary approach to relate synonyms to the same gene or protein, and to link names to external databases. However, existing dictionaries are incomplete. We investigate two complementary methods for automatic generation of a comprehensive dictionary: combination of information from existing gene and protein databases and rule-based generation of spelling variations. Both methods have been reported in literature before, but have hitherto not been combined and evaluated systematically. We combined gene and protein names from several existing databases of four different organisms. The combined dictionaries showed a substantial increase in recall on three different test sets, as compared to any single database. Application of 23 spelling variation rules to the combined dictionaries further increased recall. However, many rules appeared to have no effect and some appear to have a detrimental effect on precision.


Current Topics in Medicinal Chemistry | 2005

Chemical and Biological Profiling of an Annotated Compound Library Directed to the Nuclear Receptor Family

Montserrat Cases; Ricard Garcia-Serna; Kristina M. Hettne; Marc Weeber; Johan van der Lei; Scott Boyer; Jordi Mestres

Nuclear receptors form a family of ligand-activated transcription factors that regulate a wide variety of biological processes and are thus generally considered relevant targets in drug discovery. We have constructed an annotated compound library directed to nuclear receptors (NRacl) as a means for integrating the chemical and biological data being generated within this family. Special care has been put in the appropriate storage of annotations by using hierarchical classification schemes for both molecules and nuclear receptors, which takes the ability to extract knowledge from annotated compound libraries to another level. Analysis of NRacl has ultimately led to the identification of scaffolds with highly promiscuous nuclear receptor profiles and to the classification of nuclear receptor groups with similar scaffold promiscuity patterns. This information can be exploited in the design of probing libraries for deorphanization activities as well as for devising screening batteries to address selectivity issues.


Journal of Biomedical Discovery and Collaboration | 2007

Applied information retrieval and multidisciplinary research: new mechanistic hypotheses in complex regional pain syndrome.

Kristina M. Hettne; Marissa de Mos; Anke Gj de Bruijn; Marc Weeber; Scott Boyer; Erik M. van Mulligen; Montserrat Cases; Jordi Mestres; Johan van der Lei

Background: Collaborative efforts of physicians and basic scientists are often necessary in the investigation of complex disorders. Difficulties can arise, however, when large amounts of information need to be reviewed. Advanced information retrieval can be beneficial in combining and reviewing data obtained from the various scientific fields. In this paper, a team of investigators with varying backgrounds has applied advanced information retrieval methods, in the form of text mining and entity relationship tools, to review the current literature, with the intention to generate new insights into the molecular mechanisms underlying a complex disorder. As an example of such a disorder the Complex Regional Pain Syndrome (CRPS) was chosen. CRPS is a painful and debilitating syndrome with a complex etiology that is still largely unexplained, resulting in suboptimal diagnosis and treatment. Results: A text mining based approach combined with a simple network analysis identified Nuclear Factor kappa B (NFκB) as a possible central mediator in both the initiation and progression of CRPS. Conclusion: The result shows the added value of a multidisciplinary approach combined with information retrieval in hypothesis discovery in biomedical research. The new hypothesis, which was derived in silico, provides a framework for further mechanistic studies into the underlying molecular mechanisms of CRPS and requires evaluation in clinical and epidemiological studies.


Genome Biology | 2008

Calling on a million minds for community annotation in WikiProteins

Barend Mons; Michael Ashburner; Christine Chichester; Erik M. van Mulligen; Marc Weeber; Johan T. den Dunnen; Gert-Jan B. van Ommen; Mark A. Musen; Matthew Cockerill; Henning Hermjakob; Albert Mons; Abel Laerte Packer; Roberto Carlos dos Santos Pacheco; Suzanna E. Lewis; Alfred Berkeley; William Melton; Nickolas Barris; Jimmy Wales; Gerard Meijssen; Erik Moeller; Peter Jan Roes; Katy Börner; Amos Marc Bairoch


Briefings in Bioinformatics | 2005

Online tools to support literature-based discovery in the life sciences

Marc Weeber; Jan A. Kors; Barend Mons


Journal of Clinical Periodontology | 2007

Automatic mining of the literature to generate new hypotheses for the possible link between periodontitis and atherosclerosis: lipopolysaccharide as a case study

Kristina M. Hettne; Marc Weeber; Marja L. Laine; Hugo ten Cate; Scott Boyer; Jan A. Kors; Bruno G. Loos

Collaboration


Dive into Marc Weeber's collaborations.

Top Co-Authors

Barend Mons
Leiden University Medical Center

Erik M. van Mulligen
Erasmus University Medical Center

Jan A. Kors
Erasmus University Medical Center

Rob Jelier
Erasmus University Rotterdam

Bob J. A. Schijvenaars
Erasmus University Medical Center

Erik Van Mulligen
Nanyang Technological University

Kristina M. Hettne
Leiden University Medical Center

Wessel Kraaij
Radboud University Nijmegen

Christine Chichester
Swiss Institute of Bioinformatics