Publication


Featured research published by Jack A. M. Leunissen.


Trends in Genetics | 2008

The quest for orthologs: finding the corresponding gene across genomes

Arnold Kuzniar; Roeland C. H. J. van Ham; Sándor Pongor; Jack A. M. Leunissen

Orthology is a key evolutionary concept in many areas of genomic research. It provides a framework for subjects as diverse as the evolution of genomes, gene functions, cellular networks and functional genome annotation. Although orthologous proteins usually perform equivalent functions in different species, establishing true orthologous relationships requires a phylogenetic approach, which combines both trees and graphs (networks) using reliable species phylogeny and available genomic data from more than two species, and an insight into the processes of molecular evolution. Here, we evaluate the available bioinformatics tools and provide a set of guidelines to aid researchers in choosing the most appropriate tool for any situation.


BMC Bioinformatics | 2006

QualitySNP: a pipeline for detecting single nucleotide polymorphisms and insertions/deletions in EST data from diploid and polyploid species

Jifeng Tang; Ben Vosman; Roeland E. Voorrips; C. Gerard van der Linden; Jack A. M. Leunissen

Background: Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking, which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only.
Results: We have developed a new algorithm to detect reliable SNPs and insertions/deletions (indels) in EST data, both with and without quality files. Implemented in a pipeline called QualitySNP, it uses three filters for the identification of reliable SNPs. Filter 1 screens for all potential SNPs and identifies variation between or within genotypes. Filter 2 is the core filter that uses a haplotype-based strategy to detect reliable SNPs. Clusters with potential paralogs as well as false SNPs caused by sequencing errors are identified. Filter 3 screens SNPs by calculating a confidence score, based upon sequence redundancy and quality. Non-synonymous SNPs are subsequently identified by detecting open reading frames of consensus sequences (contigs) with SNPs. The pipeline includes a data storage and retrieval system for haplotypes, SNPs and alignments. QualitySNP's versatility is demonstrated by the identification of SNPs in EST datasets from potato, chicken and humans.
Conclusion: QualitySNP is an efficient tool for SNP detection, storage and retrieval in diploid as well as polyploid species. It is available for running on Linux or UNIX systems. The program, test data, and user manual are available at http://www.bioinformatics.nl/tools/snpweb/ and as Additional files.


Journal of Integrative Bioinformatics | 2011

MADMAX - Management and analysis database for multiple ~omics experiments

Ke Lin; Harrie J. Kools; Philip J. de Groot; Anand Gavai; Ram Kumar Basnet; Feng Cheng; Jian Wu; Xiaowu Wang; Arjen Lommen; Guido Hooiveld; Guusje Bonnema; Richard G. F. Visser; Michael Müller; Jack A. M. Leunissen

The rapid increase of ~omics datasets generated by microarray, mass spectrometry and next generation sequencing technologies requires an integrated platform that can combine results from different ~omics datasets to provide novel insights into biological systems. MADMAX is designed to provide a solution for storage and analysis of complex ~omics datasets. In addition, analysis results (such as lists of genes) will be merged to reveal candidate genes supported by all datasets. The system constitutes an ISA-Tab compliant LIMS part which is independent of different analysis pipelines. A pilot study of different types of ~omics data in Brassica rapa demonstrates the possible use of MADMAX. The web-based user interface provides easy access to data and analysis tools on top of the database.
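The merging step mentioned in this abstract, in which gene lists from several analyses are combined to find candidates supported by every dataset, amounts to a set intersection. The following is a minimal Python sketch, assuming each analysis yields a plain list of gene identifiers. The function name, the dataset labels and the gene identifiers are hypothetical illustrations and are not taken from MADMAX itself.

```python
from typing import Dict, Iterable, Set


def genes_supported_by_all(gene_lists: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the gene IDs that occur in every dataset's result list."""
    sets = [set(genes) for genes in gene_lists.values()]
    if not sets:
        # No datasets supplied, so no gene can be supported by all of them.
        return set()
    return set.intersection(*sets)


# Hypothetical result lists from three ~omics analyses (made-up identifiers).
results = {
    "microarray": ["geneA", "geneB", "geneC"],
    "metabolomics": ["geneB", "geneC", "geneD"],
    "sequencing": ["geneC", "geneB", "geneE"],
}
candidates = genes_supported_by_all(results)
print(sorted(candidates))
```

A strict intersection is the simplest reading of "supported by all datasets"; a real platform might instead rank genes by how many datasets support them.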


Molecular & Cellular Proteomics | 2013

The Human Leukocyte Antigen–presented Ligandome of B Lymphocytes

Chopie Hassan; Michel G.D. Kester; Arnoud H. de Ru; Pleun Hombrink; Jan W. Drijfhout; Harm Nijveen; Jack A. M. Leunissen; Mirjam H.M. Heemskerk; J.H. Frederik Falkenburg; Peter A. van Veelen

Peptides presented by human leukocyte antigen (HLA) molecules on the cell surface play a crucial role in adaptive immunology, mediating the communication between T cells and antigen presenting cells. Knowledge of these peptides is of pivotal importance in fundamental studies of T cell action and in cellular immunotherapy and transplantation. In this paper we present the in-depth identification and relative quantification of 14,500 peptide ligands constituting the HLA ligandome of B cells. This large number of identified ligands provides general insight into the presented peptide repertoire and antigen presentation. Our uniquely large set of HLA ligands allowed us to characterize in detail the peptides constituting the ligandome in terms of relative abundance, peptide length distribution, physicochemical properties, binding affinity to the HLA molecule, and presence of post-translational modifications. The presented B-lymphocyte ligandome is shown to be a rich source of information by the presence of minor histocompatibility antigens, virus-derived epitopes, and post-translationally modified HLA ligands, and it can be a good starting point for solving a wealth of specific immunological questions. These HLA ligands can form the basis for reversed immunology approaches to identify T cell epitopes based not on in silico predictions but on the bona fide eluted HLA ligandome.


Chromosoma | 2004

Satellite repeats in the functional centromere and pericentromeric heterochromatin of Medicago truncatula

Olga Kulikova; René Geurts; Monique Lamine; Dong-Jin Kim; Douglas R. Cook; Jack A. M. Leunissen; Hans de Jong; Bruce A. Roe; Ton Bisseling

Most eukaryotic centromeres contain long arrays of tandem repeats, with unit lengths of 150–300 bp. We searched for such repeats in the functional centromeres of the model legume Medicago truncatula (Medicago) accession Jemalong A17. To this end three repeats, MtR1, MtR2 and MtR3, were identified in 20 Mb of a low-pass, whole genome sequencing data set generated by a random shotgun approach. The nucleotide sequence composition, genomic organization and abundance of these repeats were characterized. Fluorescent in situ hybridization of these repeats on chromosomes at meiosis I showed that only the MtR3 repeat, encompassing stretches of 450 kb to more than 1.0 Mb, is located in the functional portion of all eight centromeres. MtR1 and MtR2 occupy distinct regions in pericentromeric heterochromatin. We also studied the presence and distribution of MtRs in Medicago accession R108-1, a genotype with a genome that is 20% smaller than that of Jemalong A17. We determined that while MtR3 is also centromeric on all pachytene bivalents in R108-1, MtR1 and MtR2 are not present in the R108 genome.


Molecular Breeding | 2010

A pipeline for high throughput detection and mapping of SNPs from EST databases

A. M. Anithakumari; Jifeng Tang; Herman J. van Eck; Richard G. F. Visser; Jack A. M. Leunissen; Ben Vosman; C. Gerard van der Linden

Single nucleotide polymorphisms (SNPs) represent the most abundant type of genetic variation that can be used as molecular markers. The SNPs that are hidden in sequence databases can be unlocked using bioinformatic tools. For efficient application of these SNPs, the sequence set should be as error-free as possible, targeting single loci and suitable for the SNP scoring platform of choice. We have developed a pipeline to effectively mine SNPs from public EST databases with or without quality information using QualitySNP software, select reliable SNPs and prepare the loci for analysis on the Illumina GoldenGate genotyping platform. The applicability of the pipeline was demonstrated using publicly available potato EST data, genotyping individuals from two diploid mapping populations and subsequently mapping the SNP markers (putative genes) in both populations. Over 7000 reliable SNPs were identified that met the criteria for genotyping on the GoldenGate platform. Of the 384 SNPs on the SNP array approximately 12% dropped out. For the two potato mapping populations 165 and 185 segregating SNP loci could be mapped on the respective genetic maps, illustrating the effectiveness of our pipeline for SNP selection and validation.


BMC Bioinformatics | 2008

Large-scale identification of polymorphic microsatellites using an in silico approach

Jifeng Tang; Samantha Baldwin; Jeanne M. E. Jacobs; C. Gerard van der Linden; Roeland E. Voorrips; Jack A. M. Leunissen; Herman J. van Eck; Ben Vosman

Background: Simple Sequence Repeat (SSR) or microsatellite markers are valuable for genetic research. Experimental methods to develop SSR markers are laborious, time-consuming and expensive. In silico approaches have become a practicable and relatively inexpensive alternative during the last decade, although testing putative SSR markers is still time-consuming and expensive. In many species only a relatively small percentage of SSR markers turn out to be polymorphic. This is particularly true for markers derived from expressed sequence tags (ESTs). In EST databases a large redundancy of sequences is present, which may contain information on length-polymorphisms in the SSR they contain, and whether they have been derived from heterozygotes or from different genotypes. Up to now, although a number of programs have been developed to identify SSRs in EST sequences, no software can detect putatively polymorphic SSRs.
Results: We have developed PolySSR, a new pipeline to identify polymorphic SSRs rather than just SSRs. Sequence information is obtained from public EST databases derived from heterozygous individuals and/or at least two different genotypes. The pipeline includes PCR-primer design for the putatively polymorphic SSR markers, taking into account Single Nucleotide Polymorphisms (SNPs) in the flanking regions, thereby improving the success rate of the potential markers. A large number of polymorphic SSRs were identified using publicly available EST sequences of potato, tomato, rice, Arabidopsis, Brassica and chicken. The SSRs obtained were divided into long and short based on the number of times the motif was repeated. Surprisingly, the frequency of polymorphic SSRs was much higher in the short SSRs.
Conclusion: PolySSR is a very effective tool to identify polymorphic SSRs. Using PolySSR, several hundred putative markers were developed and stored in a searchable database. Validation experiments showed that almost all markers that were indicated as putatively polymorphic by PolySSR were indeed polymorphic. This greatly improves the efficiency of marker development, especially in species where there are low levels of polymorphism, like tomato. When combined with the new sequencing technologies PolySSR will have a big impact on the development of polymorphic SSRs in any species. PolySSR and the polymorphic SSR marker database are available from http://www.bioinformatics.nl/tools/polyssr/.
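The core idea in this abstract, that redundant EST sequences of the same locus can reveal length polymorphism in an SSR, can be illustrated with a short Python sketch. This is not the PolySSR algorithm. It is a simplified illustration that assumes the ESTs are already grouped into one cluster per locus, that each motif occurs at most once per cluster, and that a regular expression is an adequate SSR detector. The function names, the `min_repeats` threshold and the toy sequences are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_ssrs(seq: str, min_repeats: int = 3) -> List[Tuple[str, int]]:
    """Return (motif, repeat_count) for tandem repeats of 2-6 bp motifs."""
    # A 2-6 bp motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) > 1:  # skip homopolymer runs such as AAAAAA
            hits.append((motif, len(match.group(0)) // len(motif)))
    return hits


def polymorphic_ssrs(cluster: Iterable[str], min_repeats: int = 3) -> Dict[str, List[int]]:
    """Report motifs whose repeat count differs between sequences of a cluster."""
    counts = defaultdict(set)
    for seq in cluster:
        for motif, n in find_ssrs(seq, min_repeats):
            counts[motif].add(n)
    return {motif: sorted(ns) for motif, ns in counts.items() if len(ns) > 1}


# Toy cluster: three made-up reads of one locus, two alleles of an (AT)n repeat.
cluster = [
    "GGCTA" + "AT" * 5 + "CCGTTA",
    "GGCTA" + "AT" * 8 + "CCGTTA",
    "GGCTA" + "AT" * 5 + "CCGTTA",
]
print(polymorphic_ssrs(cluster))
```

The sketch only compares repeat counts. The pipeline described in the abstract additionally accounts for whether sequences come from heterozygotes or different genotypes, and designs primers around SNPs in the flanking regions.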


Nucleic Acids Research | 2007

A Protein Classification Benchmark collection for machine learning

Paolo Sonego; Mircea Pacurar; Somdutta Dhir; Attila Kertész-Farkas; András Kocsor; Zoltán Gáspári; Jack A. M. Leunissen; Sándor Pongor

Protein classification by machine learning algorithms is now widely used in structural and functional annotation of proteins. The Protein Classification Benchmark collection was created in order to provide standard datasets on which the performance of machine learning methods can be compared. It is primarily meant for method developers and users interested in comparing methods under standardized conditions. The collection contains datasets of sequences and structures, and each set is subdivided into positive/negative, training/test sets in several ways. There is a total of 6405 classification tasks, 3297 on protein sequences, 3095 on protein structures and 10 on protein coding regions in DNA. Typical tasks include the classification of structural domains in the SCOP and CATH databases based on their sequences or structures, as well as various functional and taxonomic classification problems. In the case of hierarchical classification schemes, the classification tasks can be defined at various levels of the hierarchy (such as classes, folds, superfamilies, etc.). For each dataset there are distance matrices available that contain all vs. all comparison of the data, based on various sequence or structure comparison methods, as well as a set of classification performance measures computed with various classifier algorithms.


BMC Proceedings | 2009

Methods for interpreting lists of affected genes obtained in a DNA microarray experiment

Jakob Hedegaard; Cristina Arce; Silvio Bicciato; Agnès Bonnet; Bart Buitenhuis; Melania Collado-Romero; Lene Nagstrup Conley; Magali SanCristobal; Francesco Ferrari; Juan J. Garrido; M.A.M. Groenen; Henrik Hornshøj; Ina Hulsegge; Li Jiang; Ángeles Jiménez-Marín; Arun Kommadath; Sandrine Lagarrigue; Jack A. M. Leunissen; Laurence Liaubet; Pieter B. T. Neerincx; Haisheng Nie; Jan J. van der Poel; Dennis Prickett; M. Ramírez-Boo; J.M.J. Rebel; Christèle Robert-Granié; Axel Skarman; Mari A. Smits; Peter Sørensen; Gwenola Tosser-Klopp

Background: The aim of this paper was to describe and compare the methods used and the results obtained by the participants in a joint EADGENE (European Animal Disease Genomic Network of Excellence) and SABRE (Cutting Edge Genomics for Sustainable Animal Breeding) workshop focusing on post analysis of microarray data. The participating groups were provided with identical lists of microarray probes, including test statistics for three different contrasts, and the normalised log-ratios for each array, to be used as the starting point for interpreting the affected probes. The data originated from a microarray experiment conducted to study the host reactions in broilers occurring shortly after a secondary challenge with either a homologous or heterologous species of Eimeria.
Results: Several conceptually different analytical approaches, using both commercial and publicly available software, were applied by the participating groups. The following tools were used: Ingenuity Pathway Analysis, MAPPFinder, LIMMA, GOstats, GOEAST, GOTM, Globaltest, TopGO, ArrayUnlock, Pathway Studio, GIST and AnnotationDbi. The main focus of the approaches was to utilise the relation between probes/genes and their gene ontology and pathways to interpret the affected probes/genes. The lack of a well-annotated chicken genome did, however, limit the possibilities to fully explore the tools. The main results from these analyses showed that the biological interpretation is highly dependent on the statistical method used but that some common biological conclusions could be reached.
Conclusion: It is highly recommended to test different analytical methods on the same data set and compare the results to obtain a reliable biological interpretation of the affected genes in a DNA microarray experiment.


BMC Genetics | 2008

HaploSNPer: a web-based allele and SNP detection tool

Jifeng Tang; Jack A. M. Leunissen; Roeland E. Voorrips; C. Gerard van der Linden; Ben Vosman

Background: Single nucleotide polymorphisms (SNPs) and small insertions or deletions (indels) are the most common type of polymorphisms and are frequently used for molecular marker development. Such markers have become very popular for all kinds of genetic analysis, including haplotype reconstruction. Haplotypes can be reconstructed for whole chromosomes but also for specific genes, based on the SNPs present. Haplotypes in the latter context represent the different alleles of a gene. The computational approach to SNP mining is becoming increasingly popular because of the continuously increasing number of sequences deposited in databases, which allows a more accurate identification of SNPs. Several software packages have been developed for SNP mining from databases. Of these, QualitySNP is the only tool that combines SNP detection with the reconstruction of alleles, which results in a lower number of false positive SNPs and also works much faster than other programs. We have built a web-based SNP discovery and allele detection tool (HaploSNPer) based on QualitySNP.
Results: HaploSNPer is a flexible web-based tool for detecting SNPs and alleles in user-specified input sequences from both diploid and polyploid species. It includes BLAST for finding homologous sequences in public EST databases, CAP3 or PHRAP for aligning them, and QualitySNP for discovering reliable allelic sequences and SNPs. All possible and reliable alleles are detected by a mathematical algorithm using potential SNP information. Reliable SNPs are then identified based on the reconstructed alleles and on sequence redundancy.
Conclusion: Thorough testing of HaploSNPer (and the underlying QualitySNP algorithm) has shown that EST information alone is sufficient for the identification of alleles and that reliable SNPs can be found efficiently. Furthermore, HaploSNPer supplies a user-friendly interface for visualization of SNPs and alleles. HaploSNPer is available from http://www.bioinformatics.nl/tools/haplosnper/.

Collaboration


Top Co-Authors

Pieter B. T. Neerincx
Wageningen University and Research Centre

Harm Nijveen
Wageningen University and Research Centre

Haisheng Nie
Wageningen University and Research Centre

M.A.M. Groenen
Wageningen University and Research Centre

Arnold Kuzniar
Wageningen University and Research Centre

Ben Vosman
Wageningen University and Research Centre

C. Gerard van der Linden
Wageningen University and Research Centre

Jifeng Tang
Wageningen University and Research Centre

Sándor Pongor
Pázmány Péter Catholic University

Anand Gavai
Wageningen University and Research Centre