Publication


Featured research published by Ferhat Ay.


Science | 2010

Identification of functional elements and regulatory circuits by Drosophila modENCODE

Sushmita Roy; Jason Ernst; Peter V. Kharchenko; Pouya Kheradpour; Nicolas Nègre; Matthew L. Eaton; Jane M. Landolin; Christopher A. Bristow; Lijia Ma; Michael F. Lin; Stefan Washietl; Bradley I. Arshinoff; Ferhat Ay; Patrick E. Meyer; Nicolas Robine; Nicole L. Washington; Luisa Di Stefano; Eugene Berezikov; Christopher D. Brown; Rogerio Candeias; Joseph W. Carlson; Adrian Carr; Irwin Jungreis; Daniel Marbach; Rachel Sealfon; Michael Y. Tolstorukov; Sebastian Will; Artyom A. Alekseyenko; Carlo G. Artieri; Benjamin W. Booth

From Genome to Regulatory Networks: For biologists, having a genome in hand is only the beginning; much more investigation is still needed to characterize how the genome is used to help produce a functional organism (see the Perspective by Blaxter). In this vein, Gerstein et al. (p. 1775) summarize for the Caenorhabditis elegans genome, and The modENCODE Consortium (p. 1787) summarize for the Drosophila melanogaster genome, full transcriptome analyses over developmental stages, genome-wide identification of transcription factor binding sites, and high-resolution maps of chromatin organization. Both studies identified highly occupied target (HOT) regions in the nematode and fly genomes, where DNA was bound by more than 15 of the transcription factors analyzed, and characterized the expression of related genes. Overall, the studies provide insights into the organization, structure, and function of the two genomes and provide basic information needed to guide and correlate both focused and genome-wide studies. The Drosophila modENCODE project demonstrates the functional regulatory network of flies.
To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.


Genome Research | 2014

Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts

Ferhat Ay; Timothy L. Bailey; William Stafford Noble

Our current understanding of how DNA is packed in the nucleus is most accurate at the fine scale of individual nucleosomes and at the large scale of chromosome territories. However, accurate modeling of DNA architecture at the intermediate scale of ∼50 kb-10 Mb is crucial for identifying functional interactions among regulatory elements and their target promoters. We describe a method, Fit-Hi-C, that assigns statistical confidence estimates to mid-range intra-chromosomal contacts by jointly modeling the random polymer looping effect and previously observed technical biases in Hi-C data sets. We demonstrate that our proposed approach computes accurate empirical null models of contact probability without any distribution assumption, corrects for binning artifacts, and provides improved statistical power relative to a previously described method. High-confidence contacts identified by Fit-Hi-C preferentially link expressed gene promoters to active enhancers identified by chromatin signatures in human embryonic stem cells (ESCs), capture 77% of RNA polymerase II-mediated enhancer-promoter interactions identified using ChIA-PET in mouse ESCs, and confirm previously validated, cell line-specific interactions in mouse cortex cells. We observe that insulators and heterochromatin regions are hubs for high-confidence contacts, while promoters and strong enhancers are involved in fewer contacts. We also observe that binding peaks of master pluripotency factors such as NANOG and POU5F1 are highly enriched in high-confidence contacts for human ESCs. Furthermore, we show that pairs of loci linked by high-confidence contacts exhibit similar replication timing in human and mouse ESCs and preferentially lie within the boundaries of topological domains for human and mouse cell lines.
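The idea of assigning confidence to a contact relative to what genomic distance alone would predict can be illustrated with a small sketch. This is not Fit-Hi-C itself: it assumes a much simpler null in which every pair of bins at the same distance is equally likely to receive a read, and it scores each pair with an exact binomial tail probability. The function names and the toy counts are hypothetical, and bias correction and spline fitting are omitted.

```python
from collections import defaultdict
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def contact_p_values(contacts):
    """Score intra-chromosomal contacts against a distance-stratified null.

    contacts: dict mapping (bin_i, bin_j) -> read count on one chromosome.
    Assumption (not from the abstract): under the null, all pairs at the
    same genomic distance share the same contact probability.
    """
    total = sum(contacts.values())
    by_distance = defaultdict(list)
    for (i, j), count in contacts.items():
        by_distance[abs(i - j)].append(count)

    p_values = {}
    for (i, j), count in contacts.items():
        counts_at_d = by_distance[abs(i - j)]
        # Null probability: mean share of all reads per pair at this distance.
        p_null = sum(counts_at_d) / len(counts_at_d) / total
        p_values[(i, j)] = binomial_tail(count, total, p_null)
    return p_values


# Hypothetical toy data: four bins, 100 reads in total.
toy_contacts = {(0, 1): 20, (1, 2): 22, (2, 3): 18, (0, 2): 5, (1, 3): 25, (0, 3): 10}
toy_p_values = contact_p_values(toy_contacts)
```

In the toy data, the pair (1, 3) has far more reads than the other pair at the same distance, so it receives a much smaller p-value than (0, 2). The exact tail sum is only practical for small read totals; a real data set would need a numerically stable survival function.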


Genome Research | 2014

Three-dimensional modeling of the P. falciparum genome during the erythrocytic cycle reveals a strong connection between genome architecture and gene expression

Ferhat Ay; Evelien M. Bunnik; Nelle Varoquaux; Sebastiaan M. Bol; Jacques Prudhomme; Jean-Philippe Vert; William Stafford Noble; Karine G. Le Roch

The development of the human malaria parasite Plasmodium falciparum is controlled by coordinated changes in gene expression throughout its complex life cycle, but the corresponding regulatory mechanisms are incompletely understood. To study the relationship between genome architecture and gene regulation in Plasmodium, we assayed the genome architecture of P. falciparum at three time points during its erythrocytic (asexual) cycle. Using chromosome conformation capture coupled with next-generation sequencing technology (Hi-C), we obtained high-resolution chromosomal contact maps, which we then used to construct a consensus three-dimensional genome structure for each time point. We observed strong clustering of centromeres, telomeres, ribosomal DNA, and virulence genes, resulting in a complex architecture that cannot be explained by a simple volume exclusion model. Internal virulence gene clusters exhibit domain-like structures in contact maps, suggesting that they play an important role in the genome architecture. Midway during the erythrocytic cycle, at the highly transcriptionally active trophozoite stage, the genome adopts a more open chromatin structure with increased chromosomal intermingling. In addition, we observed reduced expression of genes located in spatial proximity to the repressive subtelomeric center, and colocalization of distinct groups of parasite-specific genes with coordinated expression profiles. Overall, our results are indicative of a strong association between the P. falciparum spatial genome organization and gene expression. Understanding the molecular processes involved in genome conformation dynamics could contribute to the discovery of novel antimalarial strategies.


Nature Methods | 2015

Fine-scale chromatin interaction maps reveal the cis-regulatory landscape of human lincRNA genes

Wenxiu Ma; Ferhat Ay; Choli Lee; Günhan Gülsoy; Xinxian Deng; Savannah Cook; Jennifer Hesson; Christopher Cavanaugh; Carol B. Ware; Anton Krumm; Jay Shendure; Carl Anthony Blau; Christine M. Disteche; William Stafford Noble; Zhijun Duan

High-throughput methods based on chromosome conformation capture have greatly advanced our understanding of the three-dimensional (3D) organization of genomes but are limited in resolution by their reliance on restriction enzymes. Here we describe a method called DNase Hi-C for comprehensively mapping global chromatin contacts. DNase Hi-C uses DNase I for chromatin fragmentation, leading to greatly improved efficiency and resolution over that of Hi-C. Coupling this method with DNA-capture technology provides a high-throughput approach for targeted mapping of fine-scale chromatin architecture. We applied targeted DNase Hi-C to characterize the 3D organization of 998 large intergenic noncoding RNA (lincRNA) promoters in two human cell lines. Our results revealed that expression of lincRNAs is tightly controlled by complex mechanisms involving both super-enhancers and the Polycomb repressive complex. Our results provide the first glimpse of the cell type–specific 3D organization of lincRNA genes.


Bioinformatics | 2014

A statistical approach for inferring the 3D structure of the genome

Nelle Varoquaux; Ferhat Ay; William Stafford Noble; Jean-Philippe Vert

Motivation: Recent technological advances allow the measurement, in a single Hi-C experiment, of the frequencies of physical contacts among pairs of genomic loci at a genome-wide scale. The next challenge is to infer, from the resulting DNA–DNA contact maps, accurate 3D models of how chromosomes fold and fit into the nucleus. Many existing inference methods rely on multidimensional scaling (MDS), in which the pairwise distances of the inferred model are optimized to resemble pairwise distances derived directly from the contact counts. These approaches, however, often optimize a heuristic objective function and require strong assumptions about the biophysics of DNA to transform interaction frequencies to spatial distance, and thereby may lead to incorrect structure reconstruction. Methods: We propose a novel approach to infer a consensus 3D structure of a genome from Hi-C data. The method incorporates a statistical model of the contact counts, assuming that the counts between two loci follow a Poisson distribution whose intensity decreases with the physical distances between the loci. The method can automatically adjust the transfer function relating the spatial distance to the Poisson intensity and infer a genome structure that best explains the observed data. Results: We compare two variants of our Poisson method, with or without optimization of the transfer function, to four different MDS-based algorithms (two metric MDS methods using different stress functions, a non-metric version of MDS, and ChromSDE, a recently described, advanced MDS method) on a wide range of simulated datasets. We demonstrate that the Poisson models reconstruct better structures than all MDS-based methods, particularly at low coverage and high resolution, and we highlight the importance of optimizing the transfer function. On publicly available Hi-C data from mouse embryonic stem cells, we show that the Poisson methods lead to more reproducible structures than MDS-based methods when we use data generated using different restriction enzymes, and when we reconstruct structures at different resolutions. Availability and implementation: A Python implementation of the proposed method is available at http://cbio.ensmp.fr/pastis.
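The Poisson model described in the abstract can be sketched as a log-likelihood that scores a candidate 3D structure against observed contact counts. The abstract does not give the form of the transfer function, so this sketch assumes a power law, intensity = beta * distance ** alpha, with illustrative default values. The function name and the toy data are hypothetical and are not taken from the published implementation.

```python
import math


def poisson_log_likelihood(coords, counts, alpha=-3.0, beta=1.0):
    """Log-likelihood of contact counts given 3D coordinates.

    coords: list of (x, y, z) positions, one per locus.
    counts: dict mapping (i, j) with i < j -> observed contact count.
    Assumed transfer function: intensity = beta * distance ** alpha,
    so the intensity decreases with distance when alpha is negative.
    """
    total = 0.0
    for (i, j), c in counts.items():
        d = math.dist(coords[i], coords[j])
        lam = beta * d**alpha
        # Log of the Poisson pmf: c * log(lam) - lam - log(c!)
        total += c * math.log(lam) - lam - math.lgamma(c + 1)
    return total


# Hypothetical toy data: three loci, with neighbours in frequent contact.
toy_counts = {(0, 1): 8, (1, 2): 8, (0, 2): 1}
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
swapped = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

ll_chain = poisson_log_likelihood(chain, toy_counts, beta=8.0)
ll_swapped = poisson_log_likelihood(swapped, toy_counts, beta=8.0)
```

The chain layout places frequently contacting loci close together and therefore scores a higher log-likelihood than the swapped layout. Inferring a structure would mean maximizing this quantity over the coordinates, and optionally over alpha and beta, with a numerical optimizer.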


Genome Research | 2012

Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

Daniel Marbach; Sushmita Roy; Ferhat Ay; Patrick E. Meyer; Rogerio Candeias; Tamer Kahveci; Christopher A. Bristow; Manolis Kellis

Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein-protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level.


Genome Biology | 2015

Bipartite structure of the inactive mouse X chromosome

Xinxian Deng; Wenxiu Ma; Vijay Ramani; Andrew J. Hill; Fan Yang; Ferhat Ay; Joel B. Berletch; Carl Anthony Blau; Jay Shendure; Zhijun Duan; William Stafford Noble; Christine M. Disteche

Background: In mammals, one of the female X chromosomes and all imprinted genes are expressed exclusively from a single allele in somatic cells. To evaluate structural changes associated with allelic silencing, we have applied a recently developed Hi-C assay that uses DNase I for chromatin fragmentation to mouse F1 hybrid systems. Results: We find radically different conformations for the two female mouse X chromosomes. The inactive X has two superdomains of frequent intrachromosomal contacts separated by a boundary region. Comparison with the recently reported two-superdomain structure of the human inactive X shows that the genomic content of the superdomains differs between species, but part of the boundary region is conserved and located near the Dxz4/DXZ4 locus. In mouse, the boundary region also contains a minisatellite, Ds-TR, and both Dxz4 and Ds-TR appear to be anchored to the nucleolus. Genes that escape X inactivation do not cluster but are located near the periphery of the 3D structure, as are regions enriched in CTCF or RNA polymerase. Fewer short-range intrachromosomal contacts are detected for the inactive alleles of genes subject to X inactivation compared with the active alleles and with genes that escape X inactivation. This pattern is also evident for imprinted genes, in which more chromatin contacts are detected for the expressed allele. Conclusions: By applying a novel Hi-C method to map allelic chromatin contacts, we discover a specific bipartite organization of the mouse inactive X chromosome that probably plays an important role in maintenance of gene silencing.


Genome Biology | 2015

Analysis methods for studying the 3D architecture of the genome

Ferhat Ay; William Stafford Noble

The rapidly increasing quantity of genome-wide chromosome conformation capture data presents great opportunities and challenges in the computational modeling and interpretation of the three-dimensional genome. In particular, with recent trends towards higher-resolution high-throughput chromosome conformation capture (Hi-C) data, the diversity and complexity of biological hypotheses that can be tested necessitates rigorous computational and statistical methods as well as scalable pipelines to interpret these datasets. Here we review computational tools to interpret Hi-C data, including pipelines for mapping, filtering, and normalization, and methods for confidence estimation, domain calling, visualization, and three-dimensional modeling.


Genome Research | 2015

Topologically associating domains and their long-range contacts are established during early G1 coincident with the establishment of the replication-timing program

Vishnu Dileep; Ferhat Ay; Jiao Sima; Daniel L. Vera; William Stafford Noble; David M. Gilbert

Mammalian genomes are partitioned into domains that replicate in a defined temporal order. These domains can replicate at similar times in all cell types (constitutive) or at cell type-specific times (developmental). Genome-wide chromatin conformation capture (Hi-C) has revealed sub-megabase topologically associating domains (TADs), which are the structural counterparts of replication domains. Hi-C also segregates inter-TAD contacts into defined 3D spatial compartments that align precisely to genome-wide replication timing profiles. Determinants of the replication-timing program are re-established during early G1 phase of each cell cycle and lost in G2 phase, but it is not known when TAD structure and inter-TAD contacts are re-established after their elimination during mitosis. Here, we use multiplexed 4C-seq to study dynamic changes in chromatin organization during early G1. We find that both establishment of TADs and their compartmentalization occur during early G1, within the same time frame as establishment of the replication-timing program. Once established, this 3D organization is preserved either after withdrawal into quiescence or for the remainder of interphase including G2 phase, implying 3D structure is not sufficient to maintain replication timing. Finally, we find that developmental domains are less well compartmentalized than constitutive domains and display chromatin properties that distinguish them from early and late constitutive domains. Overall, this study uncovers a strong connection between chromatin re-organization during G1, establishment of replication timing, and its developmental control.


PLOS ONE | 2009

Scalable Steady State Analysis of Boolean Biological Regulatory Networks

Ferhat Ay; Fei Xu; Tamer Kahveci

Background: Computing the long-term behavior of regulatory and signaling networks is critical in understanding how biological functions take place in organisms. Steady states of these networks determine the activity levels of individual entities in the long run. Identifying all the steady states of these networks is difficult due to the state space explosion problem. Methodology: In this paper, we propose a method for identifying all the steady states of Boolean regulatory and signaling networks accurately and efficiently. We build a mathematical model that allows pruning a large portion of the state space quickly without causing any false dismissals. For the remaining state space, which is typically very small compared to the whole state space, we develop a randomized traversal method that extracts the steady states. We estimate the number of steady states, and the expected behavior of individual genes and gene pairs in steady states, in an online fashion. Also, we formulate a stopping criterion that terminates the traversal as soon as a user-supplied percentage of the results is returned with high confidence. Conclusions: This method identifies the observed steady states of Boolean biological networks computationally. Our algorithm successfully reported the G1 phases of both budding and fission yeast cell cycles. In addition, the experiments suggest that this method is useful in identifying co-expressed genes as well. By analyzing the steady state profile of the Hedgehog network, we were able to find the highly co-expressed gene pair GL1-SMO together with other such pairs. Availability: Source code of this work is available at http://bioinformatics.cise.ufl.edu/palSteady.html
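To make the notion of a steady state concrete, the sketch below finds every fixed point of a tiny Boolean network by exhaustive enumeration. This is the brute-force baseline whose cost grows as 2^n with the number of genes, not the pruning and randomized traversal method proposed in the paper. The function name, the synchronous update assumption, and the three-gene toy network are all hypothetical.

```python
from itertools import product


def steady_states(update_rules):
    """Return all fixed points of a synchronous Boolean network.

    update_rules: dict mapping gene name -> function(state_dict) -> bool.
    A steady state is a state that the update rules map to itself.
    """
    genes = sorted(update_rules)
    fixed = []
    # Enumerate all 2^n assignments; feasible only for small networks.
    for values in product([False, True], repeat=len(genes)):
        state = dict(zip(genes, values))
        next_state = {g: bool(update_rules[g](state)) for g in genes}
        if next_state == state:
            fixed.append(state)
    return fixed


# Hypothetical three-gene network: A and B repress each other, C follows A.
toy_rules = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
    "C": lambda s: s["A"],
}
toy_steady_states = steady_states(toy_rules)
```

The toy network has two steady states, one with A and C on and B off, and one with only B on. The exponential loop is the state space explosion that the abstract refers to, which is why pruning matters for networks of realistic size.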

Collaboration


Dive into Ferhat Ay's collaborations.

Top Co-Authors

Abhijit Chakraborty

La Jolla Institute for Allergy and Immunology

Sushmita Roy

University of Wisconsin-Madison

Manolis Kellis

Massachusetts Institute of Technology

Christopher A. Bristow

University of Texas MD Anderson Cancer Center

Jay Shendure

University of Washington
