Pawel Herzyk | Researchain

Publication

Featured research published by Pawel Herzyk.

FEBS Letters | 2004

Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments

Rainer Breitling; Patrick Armengaud; Anna Amtmann; Pawel Herzyk

One of the main objectives in the analysis of microarray experiments is the identification of genes that are differentially expressed under two experimental conditions. This task is complicated by the noisiness of the data and the large number of genes that are examined simultaneously. Here, we present a novel technique for identifying differentially expressed genes that does not originate from a sophisticated statistical model but rather from an analysis of biological reasoning. The new technique, which is based on calculating rank products (RP) from replicate experiments, is fast and simple. At the same time, it provides a straightforward and statistically stringent way to determine the significance level for each gene and allows for the flexible control of the false‐detection rate and familywise error rate in the multiple testing situation of a microarray experiment. We use the RP technique on three biological data sets and show that in each case it performs more reliably and consistently than the non‐parametric t‐test variant implemented in Tusher et al.'s significance analysis of microarrays (SAM). We also show that the RP results are reliable in highly noisy data. An analysis of the physiological function of the identified genes indicates that the RP approach is powerful for identifying biologically relevant expression changes. In addition, using RP can lead to a sharp reduction in the number of replicate experiments needed to obtain reproducible results.
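The abstract above does not spell out the formula, so the following is only a minimal sketch of the rank product idea under a stated assumption: each gene is ranked by fold change within every replicate (rank 1 = most strongly up-regulated), and the score is the geometric mean of those ranks. The function name `rank_products` and the input layout are hypothetical, and the significance estimation mentioned in the abstract is not shown.

```python
import math


def rank_products(fold_changes):
    """Geometric mean rank per gene across replicates (illustrative sketch).

    fold_changes: list of replicates; each replicate is a list of
    (log) fold changes, one per gene, in the same gene order.
    Rank 1 is assigned to the largest fold change in a replicate.
    Ties are broken by gene index, which is a simplification.
    """
    n_genes = len(fold_changes[0])
    log_rank_sums = [0.0] * n_genes
    for replicate in fold_changes:
        # Gene indices ordered from largest to smallest fold change.
        order = sorted(range(n_genes), key=lambda g: replicate[g], reverse=True)
        for rank, gene in enumerate(order, start=1):
            # Summing logs avoids overflow of the raw product of ranks.
            log_rank_sums[gene] += math.log(rank)
    n_replicates = len(fold_changes)
    return [math.exp(s / n_replicates) for s in log_rank_sums]


# Two replicates, three genes: gene 0 is top-ranked in both replicates.
scores = rank_products([[2.0, 0.1, -1.0], [1.5, -0.2, 0.3]])
```

A gene that sits near the top of the list in every replicate gets a small score, which is why consistently changed genes stand out even when individual replicates are noisy.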

Genome Research | 2011

Chromosome and gene copy number variation allow major structural change between species and strains of Leishmania

Matthew B. Rogers; James D. Hilley; Nicholas J. Dickens; Jon Wilkes; Paul A. Bates; Daniel P. Depledge; David J. Harris; Yerim Her; Pawel Herzyk; Hideo Imamura; Thomas D. Otto; Mandy Sanders; Kathy Seeger; Jean-Claude Dujardin; Matthew Berriman; Deborah F. Smith; Christiane Hertz-Fowler; Jeremy C. Mottram

Leishmania parasites cause a spectrum of clinical pathology in humans ranging from disfiguring cutaneous lesions to fatal visceral leishmaniasis. We have generated a reference genome for Leishmania mexicana and refined the reference genomes for Leishmania major, Leishmania infantum, and Leishmania braziliensis. This has allowed the identification of a remarkably low number of genes or paralog groups (2, 14, 19, and 67, respectively) unique to one species. These were found to be conserved in additional isolates of the same species. We have predicted allelic variation and find that in these isolates, L. major and L. infantum have a surprisingly low number of predicted heterozygous SNPs compared with L. braziliensis and L. mexicana. We used short read coverage to infer ploidy and gene copy numbers, identifying large copy number variations between species, with 200 tandem gene arrays in L. major and 132 in L. mexicana. Chromosome copy number also varied significantly between species, with nine supernumerary chromosomes in L. infantum, four in L. mexicana, two in L. braziliensis, and one in L. major. A significant bias against gene arrays on supernumerary chromosomes was shown to exist, indicating that duplication events occur more frequently on disomic chromosomes. Taken together, our data demonstrate that there is little variation in unique gene content across Leishmania species, but large-scale genetic heterogeneity can result through gene amplification on disomic chromosomes and variation in chromosome number. Increased gene copy number due to chromosome amplification may contribute to alterations in gene expression in response to environmental conditions in the host, providing a genetic basis for disease tropism.

The Plant Cell | 2012

Alternative Splicing Mediates Responses of the Arabidopsis Circadian Clock to Temperature Changes

Allan B. James; Naeem H. Syed; Simon Bordage; Jacqueline Marshall; Gillian A. Nimmo; Gareth I. Jenkins; Pawel Herzyk; John W. S. Brown; Hugh G. Nimmo

The circadian clock is a timing device that allows plants to anticipate environmental changes rather than just respond to them. This work demonstrates that alternative splicing of clock gene transcripts is one of the mechanisms that regulate the clock, particularly in response to changes in temperature. Alternative splicing plays crucial roles by influencing the diversity of the transcriptome and proteome and regulating protein structure/function and gene expression. It is widespread in plants, and alteration of the levels of splicing factors leads to a wide variety of growth and developmental phenotypes. The circadian clock is a complex piece of cellular machinery that can regulate physiology and behavior to anticipate predictable environmental changes on a revolving planet. We have performed a system-wide analysis of alternative splicing in clock components in Arabidopsis thaliana plants acclimated to different steady state temperatures or undergoing temperature transitions. This revealed extensive alternative splicing in clock genes and dynamic changes in alternatively spliced transcripts. Several of these changes, notably those affecting the circadian clock genes LATE ELONGATED HYPOCOTYL (LHY) and PSEUDO RESPONSE REGULATOR7, are temperature-dependent and contribute markedly to functionally important changes in clock gene expression in temperature transitions by producing nonfunctional transcripts and/or inducing nonsense-mediated decay. Temperature effects on alternative splicing contribute to a decline in LHY transcript abundance on cooling, but LHY promoter strength is not affected. We propose that temperature-associated alternative splicing is an additional mechanism involved in the operation and regulation of the plant circadian clock.

Genome Biology | 2004

Function-informed transcriptome analysis of Drosophila renal tubule

Jing-jing Wang; Laura Kean; Jingli Yang; Adrian K. Allan; Shireen A. Davies; Pawel Herzyk; Julian A. T. Dow

Background: Comprehensive, tissue-specific, microarray analysis is a potent tool for the identification of tightly defined expression patterns that might be missed in whole-organism scans. We applied such an analysis to Drosophila melanogaster Malpighian (renal) tubule, a defined differentiated tissue.

Results: The transcriptome of the D. melanogaster Malpighian tubule is highly reproducible and significantly different from that obtained from whole-organism arrays. More than 200 genes are more than 10-fold enriched and over 1,000 are significantly enriched. Of the top 200 genes, only 18 have previously been named, and only 45% have even estimates of function. In addition, 30 transcription factors, not previously implicated in tubule development, are shown to be enriched in adult tubule, and their expression patterns respect precisely the domains and cell types previously identified by enhancer trapping. Of Drosophila genes with close human disease homologs, 50 are enriched threefold or more, and eight enriched 10-fold or more, in tubule. Intriguingly, several of these diseases have human renal phenotypes, implying close conservation of renal function across 400 million years of divergent evolution.

Conclusions: From those genes that are identifiable, a radically new view of the function of the tubule, emphasizing solute transport rather than fluid secretion, can be obtained. The results illustrate the phenotype gap: historically, the effort expended on a model organism has tended to concentrate on a relatively small set of processes, rather than on the spread of genes in the genome.

Science | 2008

The Circadian Clock in Arabidopsis Roots Is a Simplified Slave Version of the Clock in Shoots

Allan B. James; José A. Monreal; Gillian A. Nimmo; Ciarán L. Kelly; Pawel Herzyk; Gareth I. Jenkins; Hugh G. Nimmo

The circadian oscillator in eukaryotes consists of several interlocking feedback loops through which the expression of clock genes is controlled. It is generally assumed that all plant cells contain essentially identical and cell-autonomous multiloop clocks. Here, we show that the circadian clock in the roots of mature Arabidopsis plants differs markedly from that in the shoots and that the root clock is synchronized by a photosynthesis-related signal from the shoot. Two of the feedback loops of the plant circadian clock are disengaged in roots, because two key clock components, the transcription factors CCA1 and LHY, are able to inhibit gene expression in shoots but not in roots. Thus, the plant clock is organ-specific but not organ-autonomous.

BMC Bioinformatics | 2004

Iterative Group Analysis (iGA): A simple tool to enhance sensitivity and facilitate interpretation of microarray experiments

Rainer Breitling; Anna Amtmann; Pawel Herzyk

Background: The biological interpretation of even a simple microarray experiment can be a challenging and highly complex task. Here we present a new method (Iterative Group Analysis) to facilitate, improve, and accelerate this process.

Results: Our Iterative Group Analysis approach (iGA) uses elementary statistics to identify those functional classes of genes that are significantly changed in an experiment and at the same time determines which of the class members are most likely to be differentially expressed. iGA does not require that all members of a class change and is therefore robust against imperfect class assignments, which can be derived from public sources (e.g. GeneOntologies) or automated processes (e.g. key word extraction from gene names). In contrast to previous non-iterative approaches, iGA does not depend on the availability of fixed lists of differentially expressed genes, and thus can be used to increase the sensitivity of gene detection especially in very noisy or small data sets. In the extreme, iGA can even produce statistically meaningful results without any experimental replication. The automated functional annotation provided by iGA greatly reduces the complexity of microarray results and facilitates the interpretation process. In addition, iGA can be used as a fast and efficient tool for the platform-independent comparison of a microarray experiment to the vast number of published results, automatically highlighting shared genes of potential interest.

Conclusions: By applying iGA to a wide variety of data from diverse organisms and platforms we show that this approach enhances and accelerates the interpretation of microarray experiments.
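The abstract above describes iGA only as using "elementary statistics", so the sketch below rests on an assumption: the statistic is taken to be a hypergeometric tail probability, evaluated at each class member's position in a ranked gene list, with the smallest value kept as the class score. The function names and the toy gene identifiers are hypothetical, and no correction for testing many classes is included.

```python
from math import comb


def hypergeom_tail(n_total, n_class, n_top, n_hits):
    """P(at least n_hits class members among the top n_top of n_total genes)."""
    upper = min(n_class, n_top)
    favourable = sum(
        comb(n_class, i) * comb(n_total - n_class, n_top - i)
        for i in range(n_hits, upper + 1)
    )
    return favourable / comb(n_total, n_top)


def iterative_group_score(ranked_genes, class_members):
    """Scan a ranked gene list and return (best p-value, selected members).

    ranked_genes: gene identifiers, most strongly changed first.
    class_members: identifiers belonging to one functional class.
    The cutoff is moved to each class member in turn; the cutoff with
    the smallest tail probability defines which members are reported.
    """
    members = set(class_members)
    n_total = len(ranked_genes)
    n_class = sum(1 for g in ranked_genes if g in members)
    best_p, best_cut, hits = 1.0, 0, 0
    for position, gene in enumerate(ranked_genes, start=1):
        if gene not in members:
            continue
        hits += 1
        p = hypergeom_tail(n_total, n_class, position, hits)
        if p < best_p:
            best_p, best_cut = p, position
    selected = [g for g in ranked_genes[:best_cut] if g in members]
    return best_p, selected


# Toy example: two of the three class members sit at the top of the list.
p_value, selected = iterative_group_score(
    ["a", "b", "c", "d", "e", "f"], {"a", "b", "f"}
)
```

Because the cutoff is chosen per class rather than fixed in advance, a class can score well even when only some of its members change, which matches the robustness to imperfect class assignments described in the abstract.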

Proceedings of the National Academy of Sciences of the United States of America | 2011

High-resolution human cytomegalovirus transcriptome

Derek Gatherer; Sepher Seirafian; Charles Cunningham; Mary Holton; Derrick J. Dargan; Katarina Baluchova; Ralph D. Hector; Julie Galbraith; Pawel Herzyk; Gavin William Grahame Wilkinson; Andrew J. Davison

Deep sequencing was used to bring high resolution to the human cytomegalovirus (HCMV) transcriptome at the stage when infectious virion production is under way, and major findings were confirmed by extensive experimentation using conventional techniques. The majority (65.1%) of polyadenylated viral RNA transcription is committed to producing four noncoding transcripts (RNA2.7, RNA1.2, RNA4.9, and RNA5.0) that do not substantially overlap designated protein-coding regions. Additional noncoding RNAs that are transcribed antisense to protein-coding regions map throughout the genome and account for 8.7% of transcription from these regions. RNA splicing is more common than recognized previously, which was evidenced by the identification of 229 potential donor and 132 acceptor sites, and it affects 58 protein-coding genes. The great majority (94) of 96 splice junctions most abundantly represented in the deep-sequencing data was confirmed by RT-PCR or RACE or supported by involvement in alternative splicing. Alternative splicing is frequent and particularly evident in four genes (RL8A, UL74A, UL124, and UL150A) that are transcribed by splicing from any one of many upstream exons. The analysis also resulted in the annotation of four previously unrecognized protein-coding regions (RL8A, RL9A, UL150A, and US33A), and expression of the UL150A protein was shown in the context of HCMV infection. The overall conclusion, that HCMV transcription is complex and multifaceted, has implications for the potential sophistication of virus functionality during infection. The study also illustrates the key contribution that deep sequencing can make to the genomics of nuclear DNA viruses.

Journal of Bioinformatics and Computational Biology | 2005

Rank-based methods as a non-parametric alternative of the t-statistic for the analysis of biological microarray data

Rainer Breitling; Pawel Herzyk

We have recently introduced a rank-based test statistic, RankProducts (RP), as a new non-parametric method for detecting differentially expressed genes in microarray experiments. It has been shown to generate surprisingly good results with biological datasets. The basis for this performance and the limits of the method are, however, little understood. Here we explore the performance of such rank-based approaches under a variety of conditions using simulated microarray data, and compare it with classical Wilcoxon rank sums and t-statistics, which form the basis of most alternative differential gene expression detection techniques. We show that for realistic simulated microarray datasets, RP is more powerful and accurate for sorting genes by differential expression than t-statistics or Wilcoxon rank sums - in particular for replicate numbers below 10, which are most commonly used in biological experiments. Its relative performance is particularly strong when the data are contaminated by non-normal random noise or when the samples are very inhomogeneous, e.g. because they come from different time points or contain a mixture of affected and unaffected cells. However, RP assumes equal measurement variance for all genes and tends to give overly optimistic p-values when this assumption is violated. It is therefore essential that proper variance stabilizing normalization is performed on the data before calculating the RP values. Where this is impossible, another rank-based variant of RP (average ranks) provides a useful alternative with very similar overall performance. The Perl scripts implementing the simulation and evaluation are available upon request. Implementations of the RP method are available for download from the authors' website (http://www.brc.dcs.gla.ac.uk/glama).

Genome Biology | 2013

Hyperosmotic priming of Arabidopsis seedlings establishes a long-term somatic memory accompanied by specific changes of the epigenome

Emanuela Sani; Pawel Herzyk; Giorgio Perrella; Vincent Colot; Anna Amtmann

Background: In arid and semi-arid environments, drought and soil salinity usually occur at the beginning and end of a plant's life cycle, offering a natural opportunity for the priming of young plants to enhance stress tolerance in mature plants. Chromatin marks, such as histone modifications, provide a potential molecular mechanism for priming plants to environmental stresses, but whether transient exposure of seedlings to hyperosmotic stress leads to chromatin changes that are maintained throughout vegetative growth remains unclear.

Results: We have established an effective protocol for hyperosmotic priming in the model plant Arabidopsis, which includes a transient mild salt treatment of seedlings followed by an extensive period of growth in control conditions. Primed plants are identical to non-primed plants in growth and development, yet they display reduced salt uptake and enhanced drought tolerance after a second stress exposure. ChIP-seq analysis of four histone modifications revealed that the priming treatment altered the epigenomic landscape; the changes were small but they were specific for the treated tissue, varied in number and direction depending on the modification, and preferentially targeted transcription factors. Notably, priming leads to shortening and fractionation of H3K27me3 islands. This effect fades over time, but is still apparent after a ten day growth period in control conditions. Several genes with priming-induced differences in H3K27me3 showed altered transcriptional responsiveness to the second stress treatment.

Conclusion: Experience of transient hyperosmotic stress by young plants is stored in a long-term somatic memory comprising differences of chromatin status, transcriptional responsiveness and whole plant physiology.

Journal of Virology | 2011

Beyond the Consensus: Dissecting Within-Host Viral Population Diversity of Foot-and-Mouth Disease Virus by Using Next-Generation Genome Sequencing

Caroline F. Wright; Gaël Thébaud; Nick J. Knowles; Pawel Herzyk; David J. Paton; Daniel T. Haydon; Donald P. King

The diverse sequences of viral populations within individual hosts are the starting material for selection and subsequent evolution of RNA viruses such as foot-and-mouth disease virus (FMDV). Using next-generation sequencing (NGS) performed on a Genome Analyzer platform (Illumina), this study compared the viral populations within two bovine epithelial samples (foot lesions) from a single animal with the inoculum used to initiate experimental infection. Genomic sequences were determined in duplicate sequencing runs, and the consensus sequence of the inoculum determined by NGS was identical to that previously determined using the Sanger method. However, NGS revealed the fine polymorphic substructure of the viral population, from nucleotide variants present at just below 50% frequency to those present at fractions of 1%. Some of the higher-frequency polymorphisms identified encoded changes within codons associated with heparan sulfate binding and were present in both foot lesions, revealing intermediate stages in the evolution of a tissue culture-adapted virus replicating within a mammalian host. We identified 2,622, 1,434, and 1,703 polymorphisms in the inoculum and in the two foot lesions, respectively: most of the substitutions occurred in only a small fraction of the population and represented the progeny from recent cellular replication prior to onset of any selective pressures. We estimated the upper limit for the genome-wide mutation rate of the virus within a cell to be 7.8 × 10⁻⁴ per nucleotide. The greater depth of detection achieved by NGS demonstrates that this method is a powerful and valuable tool for the dissection of FMDV populations within hosts.
