Publication


Featured research published by Marc S. Halfon.


Cell | 2000

Ras Pathway Specificity Is Determined by the Integration of Multiple Signal-Activated and Tissue-Restricted Transcription Factors

Marc S. Halfon; Ana Carmena; Stephen S. Gisselbrecht; Charles Sackerson; Fernando Jiménez; Mary K. Baylies; Alan M. Michelson

Ras signaling elicits diverse outputs, yet how Ras specificity is generated remains incompletely understood. We demonstrate that Wingless (Wg) and Decapentaplegic (Dpp) confer competence for receptor tyrosine kinase-mediated induction of a subset of Drosophila muscle and cardiac progenitors by acting both upstream of and in parallel to Ras. In addition to regulating the expression of proximal Ras pathway components, Wg and Dpp coordinate the direct effects of three signal-activated (dTCF, Mad, and Pointed, which function in the Wg, Dpp, and Ras/MAPK pathways, respectively) and two tissue-restricted (Twist and Tinman) transcription factors on a progenitor identity gene enhancer. The integration of Pointed with the combinatorial effects of dTCF, Mad, Twist, and Tinman determines inductive Ras signaling specificity in muscle and heart development.


Nucleic Acids Research | 2007

ORegAnno: an open-access community-driven resource for regulatory annotation

Obi L. Griffith; Stephen B. Montgomery; Bridget Bernier; Bryan Chu; Katayoon Kasaian; Stein Aerts; Shaun Mahony; Monica C. Sleumer; Mikhail Bilenky; Maximilian Haeussler; Malachi Griffith; Steven M. Gallo; Belinda Giardine; Bart Hooghe; Peter Van Loo; Enrique Blanco; Amy Ticoll; Stuart Lithwick; Elodie Portales-Casamar; Ian J. Donaldson; Gordon Robertson; Claes Wadelius; Pieter De Bleser; Dominique Vlieghe; Marc S. Halfon; Wyeth W. Wasserman; Ross C. Hardison; Casey M. Bergman; Steven J.M. Jones

ORegAnno is an open-source, open-access database and literature curation system for community-based annotation of experimentally identified DNA regulatory regions, transcription factor binding sites and regulatory variants. The current release comprises 30 145 records curated from 922 publications and describing regulatory sequences for over 3853 genes and 465 transcription factors from 19 species. A new feature called the ‘publication queue’ allows users to input relevant papers from scientific literature as targets for annotation. The queue contains 4438 gene regulation papers entered by experts and another 54 351 identified by text-mining methods. Users can enter or ‘check out’ papers from the queue for manual curation using a series of user-friendly annotation pages. A typical record entry consists of species, sequence type, sequence, target gene, binding factor, experimental outcome and one or more lines of experimental evidence. An evidence ontology was developed to describe and categorize these experiments. Records are cross-referenced to Ensembl or Entrez gene identifiers, PubMed and dbSNP and can be visualized in the Ensembl or UCSC genome browsers. All data are freely available through search pages, XML data dumps or web services at: http://www.oreganno.org.


Nucleic Acids Research | 2011

REDfly v3.0: toward a comprehensive database of transcriptional regulatory elements in Drosophila

Steven M. Gallo; Dave T. Gerrard; David Miner; Michael Simich; Benjamin Des Soye; Casey M. Bergman; Marc S. Halfon

The REDfly database of Drosophila transcriptional cis-regulatory elements provides the broadest and most comprehensive available resource for experimentally validated cis-regulatory modules and transcription factor binding sites among the metazoa. The third major release of the database extends the utility of REDfly as a powerful tool for both computational and experimental studies of transcription regulation. REDfly v3.0 includes the introduction of new data classes to expand the types of regulatory elements annotated in the database along with a roughly 40% increase in the number of records. A completely redesigned interface improves access for casual and power users alike; among other features it now automatically provides graphical views of the genome, displays images of reporter gene expression and implements improved capabilities for database searching and results filtering. REDfly is freely accessible at http://redfly.ccr.buffalo.edu.


Nucleic Acids Research | 2007

REDfly 2.0: an integrated database of cis-regulatory modules and transcription factor binding sites in Drosophila

Marc S. Halfon; Steven M. Gallo; Casey M. Bergman

The identification and study of the cis-regulatory elements that control gene expression are important areas of biological research, but few resources exist to facilitate large-scale bioinformatics studies of cis-regulation in metazoan species. Drosophila melanogaster, with its well-annotated genome, exceptional resources for comparative genomics and long history of experimental studies of transcriptional regulation, represents the ideal system for regulatory bioinformatics. We have merged two existing Drosophila resources, the REDfly database of cis-regulatory modules and the FlyReg database of transcription factor binding sites (TFBSs), into a single integrated database containing extensive annotation of empirically validated cis-regulatory modules and their constituent binding sites. With the enhanced functionality made possible through this integration of TFBS data into REDfly, together with additional improvements to the REDfly infrastructure, we have constructed a one-stop portal for Drosophila cis-regulatory data that will serve as a powerful resource for both computational and experimental studies of transcriptional regulation. REDfly is freely accessible at http://redfly.ccr.buffalo.edu.


Bioinformatics | 2006

REDfly: a Regulatory Element Database for Drosophila

Steven M. Gallo; Long Li; Zihua Hu; Marc S. Halfon

Bioinformatics studies of transcriptional regulation in the metazoa are significantly hindered by the absence of readily available data on large numbers of transcriptional cis-regulatory modules (CRMs). Even the richly annotated Drosophila melanogaster genome lacks extensive CRM information. We therefore present here a database of Drosophila CRMs curated from the literature complete with both DNA sequence and a searchable description of the gene expression pattern regulated by each CRM. This resource should greatly facilitate the development of computational approaches to CRM discovery as well as bioinformatics analyses of regulatory sequence properties and evolution.


Developmental Cell | 2008

A Combinatorial Code for Pattern Formation in Drosophila Oogenesis

Nir Yakoby; Christopher A. Bristow; Danielle Gong; Xenia Schafer; Jessica Lembong; Jeremiah J. Zartman; Marc S. Halfon; Trudi Schüpbach; Stanislav Y. Shvartsman

Two-dimensional patterning of the follicular epithelium in Drosophila oogenesis is required for the formation of three-dimensional eggshell structures. Our analysis of a large number of published gene expression patterns in the follicle cells suggests that they follow a simple combinatorial code based on six spatial building blocks and the operations of union, difference, intersection, and addition. The building blocks are related to the distribution of inductive signals, provided by the highly conserved epidermal growth factor receptor and bone morphogenetic protein signaling pathways. We demonstrate the validity of the code by testing it against a set of patterns obtained in a large-scale transcriptional profiling experiment. Using the proposed code, we distinguish 36 distinct patterns for 81 genes expressed in the follicular epithelium and characterize their joint dynamics over four stages of oogenesis. The proposed combinatorial framework allows systematic analysis of the diversity and dynamics of two-dimensional transcriptional patterns and guides future studies of gene regulation.


Genome Biology | 2007

Large-scale analysis of transcriptional cis-regulatory modules reveals both common features and distinct subclasses

Long Li; Qianqian Zhu; Xin He; Saurabh Sinha; Marc S. Halfon

Background: Transcriptional cis-regulatory modules (for example, enhancers) play a critical role in regulating gene expression. While many individual regulatory elements have been characterized, they have never been analyzed as a class.

Results: We have performed the first such large-scale study of cis-regulatory modules in order to determine whether they have common properties that might aid in their identification and contribute to our understanding of the mechanisms by which they function. A total of 280 individual, experimentally verified cis-regulatory modules from Drosophila were analyzed for a range of sequence-level and functional properties. We report here that regulatory modules do indeed share common properties, among them an elevated GC content, an increased level of interspecific sequence conservation, and a tendency to be transcribed into RNA. However, we find that dense clustering of transcription factor binding sites, especially homotypic clustering, which is commonly believed to be a general characteristic of regulatory modules, is rather a feature that belongs chiefly to a specific subclass. This has important implications for current computational approaches, many of which are biased toward this subset. We explore two new strategies to assess binding site clustering and gauge their performances with respect to their ability to detect all 280 modules and various functionally coherent subsets.

Conclusion: Our findings demonstrate that cis-regulatory modules share common features that help to define them as a class and that may lead to new insights into mechanisms of gene regulation. However, these properties alone may not be sufficient to reliably distinguish regulatory from non-regulatory sequences. We also demonstrate that there are distinct subclasses of cis-regulatory modules that are more amenable to in silico detection than others and that these differences must be taken into account when attempting genome-wide regulatory element discovery.


Analytical Chemistry | 2011

Combinatorial peptide ligand library treatment followed by a dual-enzyme, dual-activation approach on a nanoflow liquid chromatography/orbitrap/electron transfer dissociation system for comprehensive analysis of swine plasma proteome

Chengjian Tu; Jun Li; Rebeccah F. Young; Brian Page; Frank A. Engler; Marc S. Halfon; John M. Canty; Jun Qu

The plasma proteome holds enormous clinical potential, yet an in-depth analysis of the plasma proteome remains a daunting challenge due to its high complexity and the extremely wide dynamic range in protein concentrations. Furthermore, existing antibody-based approaches for depleting high-abundance proteins are not adaptable to the analysis of the animal plasma proteome, which is often essential for experimental pathology/pharmacology. Here we describe a highly comprehensive method for the investigation of the animal plasma proteome which employs an optimized combinatorial peptide ligand library (CPLL) treatment to reduce the protein concentration dynamic range and a dual-enzyme, dual-activation strategy to achieve high proteomic coverage. The CPLL treatment enriched the lower abundance proteins by >100-fold when the samples were loaded in moderately denaturing conditions with multiple loading-washing cycles. The native and the CPLL-treated plasma were digested in parallel by two enzymes (trypsin and GluC) carrying orthogonal specificities. By performing this differential proteolysis, the proteome coverage is improved where peptides produced by only one enzyme are poorly detectable. Digests were fractionated with high-resolution strong cation exchange chromatography and then resolved on a long, heated nano liquid chromatography column. MS analysis was performed on a linear triple quadrupole/orbitrap with two complementary activation methods (collisionally induced dissociation (CID) and electron transfer dissociation). We applied this optimized strategy to investigate the plasma proteome from swine, a prominent animal model for cardiovascular diseases (CVDs). This large-scale analysis results in identification of a total of 3421 unique proteins, spanning a concentration range of 9-10 orders of magnitude. 
The proteins were identified under a set of commonly accepted criteria, including a precursor mass error of <15 ppm, Xcorr cutoffs, and ≥2 unique peptides at a peptide probability of ≥95% and a protein probability of ≥99%, and the peptide false-positive rate of the data set was 1.8% as estimated by searching the reversed database. CPLL treatment resulted in 55% more identified proteins over those from native plasma; moreover, compared with using only trypsin and CID, the dual-enzyme/activation approach enabled the identification of 2.6-fold more proteins and substantially higher sequence coverage for most individual proteins. Further analysis revealed 657 proteins as significantly associated with CVDs (p < 0.05), which constitute five CVD-related pathways. This study represents the first in-depth investigation of a nonhuman plasma proteome, and the strategy developed here is adaptable to the comprehensive analysis of other highly complex proteomes.


Nucleic Acids Research | 2011

Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison

Majid Kazemian; Qiyun Zhu; Marc S. Halfon; Saurabh Sinha

Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, ‘enhancers’), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for ‘motif-blind’ CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to ‘supervise’ the search. We propose a new statistical method, based on ‘Interpolated Markov Models’, for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers.


Nature Genetics | 2006

(Re)modeling the transcriptional enhancer

Marc S. Halfon

Transcriptional cis-regulatory modules have a fundamental role in regulating eukaryotic gene expression. A new study shows how computational modeling can test hypotheses about how regulatory elements function, with results that challenge conventional views of their organization.

Collaboration


Dive into Marc S. Halfon's collaboration.

Top Co-Authors

Kushal Suryamohan

State University of New York System

Alan M. Michelson

National Institutes of Health

Beatriz Estrada

Brigham and Women's Hospital
