William Kent

University of California, Santa Cruz

Publication

Featured research published by William Kent.

Nucleic Acids Research | 2006

The UCSC genome browser database: update 2007

Robert M. Kuhn; Donna Karolchik; Ann S. Zweig; Heather Trumbower; Daryl J. Thomas; Archana Thakkapallayil; Charles W. Sugnet; Mario Stanke; Kayla E. Smith; Adam Siepel; Kate R. Rosenbloom; Brooke Rhead; Brian J. Raney; Andrew A. Pohl; Jakob Skou Pedersen; Fan Hsu; Angie S. Hinrichs; Rachel A. Harte; Mark Diekhans; Hiram Clawson; Gill Bejerano; Galt P. Barber; Robert Baertsch; David Haussler; William Kent

The UCSC Genome Browser Database (GBD, http://genome.ucsc.edu) is a publicly available collection of genome assembly sequence data and integrated annotations for a large number of organisms, including extensive comparative-genomic resources. In the past year, 13 new genome assemblies have been added, including two important primate species, orangutan and marmoset, bringing the total to 46 assemblies for 24 different vertebrates and 39 assemblies for 22 different invertebrate animals. The GBD datasets may be viewed graphically with the UCSC Genome Browser, which uses a coordinate-based display system allowing users to juxtapose a wide variety of data. These data include all mRNAs from GenBank mapped to all organisms, RefSeq alignments, gene predictions, regulatory elements, gene expression data, repeats, SNPs and other variation data, as well as pairwise and multiple-genome alignments. A variety of other bioinformatics tools are also provided, including BLAT, the Table Browser, the Gene Sorter, the Proteome Browser, VisiGene and Genome Graphs.
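As a rough illustration of the coordinate-based display described above, the sketch below collects, for each annotation track, the features that overlap a viewing window so they can be shown side by side. The names `Feature` and `features_in_window` and the sample tracks are hypothetical and are not part of any UCSC software; the sketch also assumes 0-based, half-open coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    chrom: str
    start: int  # assumed 0-based, inclusive
    end: int    # assumed 0-based, exclusive
    name: str


def features_in_window(tracks, chrom, start, end):
    """Return, per track, the features overlapping [start, end) on chrom."""
    view = {}
    for track_name, features in tracks.items():
        view[track_name] = [
            f for f in features
            # Two half-open intervals overlap when each starts before the other ends.
            if f.chrom == chrom and f.start < end and f.end > start
        ]
    return view


# Hypothetical sample data: two tracks sharing one coordinate system.
tracks = {
    "genes": [Feature("chr7", 100, 500, "geneA"), Feature("chr7", 900, 1200, "geneB")],
    "snps": [Feature("chr7", 450, 451, "snp1"), Feature("chr1", 450, 451, "snp2")],
}
view = features_in_window(tracks, "chr7", 400, 1000)
```

Because every track is keyed by the same chromosome coordinates, one window query is enough to line up otherwise unrelated data types.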

Science | 2010

Identification of functional elements and regulatory circuits by Drosophila modENCODE

Sushmita Roy; Jason Ernst; Peter V. Kharchenko; Pouya Kheradpour; Nicolas Nègre; Matthew L. Eaton; Jane M. Landolin; Christopher A. Bristow; Lijia Ma; Michael F. Lin; Stefan Washietl; Bradley I. Arshinoff; Ferhat Ay; Patrick E. Meyer; Nicolas Robine; Nicole L. Washington; Luisa Di Stefano; Eugene Berezikov; Christopher D. Brown; Rogerio Candeias; Joseph W. Carlson; Adrian Carr; Irwin Jungreis; Daniel Marbach; Rachel Sealfon; Michael Y. Tolstorukov; Sebastian Will; Artyom A. Alekseyenko; Carlo G. Artieri; Benjamin W. Booth

From Genome to Regulatory Networks

For biologists, having a genome in hand is only the beginning: much more investigation is still needed to characterize how the genome is used to help produce a functional organism (see the Perspective by Blaxter). In this vein, Gerstein et al. (p. 1775) summarize for the Caenorhabditis elegans genome, and The modENCODE Consortium (p. 1787) summarize for the Drosophila melanogaster genome, full transcriptome analyses over developmental stages, genome-wide identification of transcription factor binding sites, and high-resolution maps of chromatin organization. Both studies identified highly occupied target (HOT) regions in the nematode and fly genomes, where DNA was bound by more than 15 of the transcription factors analyzed, and characterized the expression of related genes. Overall, the studies provide insights into the organization, structure, and function of the two genomes and provide basic information needed to guide and correlate both focused and genome-wide studies.

The Drosophila modENCODE project demonstrates the functional regulatory network of flies.

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.

Nature | 2003

Comparative analyses of multi-species sequences from targeted genomic regions

James W. Thomas; Jeffrey W. Touchman; Robert W. Blakesley; Gerard G. Bouffard; Stephen M. Beckstrom-Sternberg; Elliott H. Margulies; Mathieu Blanchette; Adam Siepel; Pamela J. Thomas; Jennifer C. McDowell; Baishali Maskeri; Nancy F. Hansen; M. Schwartz; Ryan Weber; William Kent; Donna Karolchik; T. C. Bruen; R. Bevan; David J. Cutler; Scott Schwartz; Laura Elnitski; Jacquelyn R. Idol; A. B. Prasad; S. Q. Lee-Lin; Valerie Maduro; T. J. Summers; Matthew E. Portnoy; Nicole Dietrich; N. Akhter; K. Ayele

The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding and conserved non-coding regions, including regulatory elements, and provide insight into the forces that have rendered modern-day genomes. As a complement to whole-genome sequencing efforts, we are sequencing and comparing targeted genomic regions in multiple, evolutionarily diverse vertebrates. Here we report the generation and analysis of over 12 megabases (Mb) of sequence from 12 species, all derived from the genomic region orthologous to a segment of about 1.8 Mb on human chromosome 7 containing ten genes, including the gene mutated in cystic fibrosis. These sequences show conservation reflecting both functional constraints and the neutral mutational events that shaped this genomic region. In particular, we identify substantial numbers of conserved non-coding segments beyond those previously identified experimentally, most of which are not detectable by pair-wise sequence comparisons alone. Analysis of transposable element insertions highlights the variation in genome dynamics among these species and confirms the placement of rodents as a sister group to the primates.

Nucleic Acids Research | 2007

The UCSC Genome Browser Database: 2008 update

Donna Karolchik; Robert M. Kuhn; Robert Baertsch; Galt P. Barber; Hiram Clawson; Mark Diekhans; Belinda Giardine; Rachel A. Harte; Angie S. Hinrichs; Fan Hsu; K. M. Kober; Webb Miller; Jakob Skou Pedersen; Andy Pohl; Brian J. Raney; Brooke Rhead; Kate R. Rosenbloom; Kayla E. Smith; Mario Stanke; Archana Thakkapallayil; Heather Trumbower; Ting Wang; Ann S. Zweig; David Haussler; William Kent

The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrate and 21 invertebrate species as of September 2007. For each assembly, the GBD contains a collection of annotation data aligned to the genomic sequence. Highlights of this year’s additions include a 28-species human-based vertebrate conservation annotation, an enhanced UCSC Genes set, and more human variation, MGC, and ENCODE data. The database is optimized for fast interactive performance with a set of web-based tools that may be used to view, manipulate, filter and download the annotation data. New toolset features include the Genome Graphs tool for displaying genome-wide data sets, session saving and sharing, better custom track management, expanded Genome Browser configuration options and a Genome Browser wiki site. The downloadable GBD data, the companion Genome Browser toolset and links to documentation and related information can be found at: http://genome.ucsc.edu/.

INTRODUCTION

Fundamental to expanding our knowledge of how the human body works in health and in disease is the capability to access and share data produced through experimentation and computational analysis. The University of California, Santa Cruz (UCSC) Genome Browser Database (GBD) (http://genome.ucsc.edu) (1) provides a common repository for genomic annotation data (including comparative genomics, genes and gene predictions; mRNA and EST alignments; and expression, regulation, variation and assembly data) and robust, flexible tools for viewing, comparing, distributing and analyzing the information.

Produced and maintained by the Genome Bioinformatics Group at the UCSC Center for Biomolecular Science and Engineering, the GBD focuses primarily on vertebrate and model organism genomes, with an emphasis on comparative genomics analysis. As of September 2007 the GBD contains data for 11 mammalian species including human, mouse, rat, chimpanzee, rhesus macaque, horse, cow, cat, dog, opossum and platypus; 8 other vertebrates: chicken, lizard (Anolis carolinensis), frog (Xenopus tropicalis), zebrafish, fugu, tetraodon, medaka and stickleback; and 21 invertebrates including 11 flies, honeybee, Anopheles mosquito, five worms, one yeast (Saccharomyces cerevisiae) and two deuterostomes (purple sea urchin and sea squirt). For many of the organisms, more than one assembly is provided, and several older archived assemblies may be [...]

*To whom correspondence should be addressed. Tel: +1 831 459 1544; Fax: +1 831 459 1809

Bioinformatics | 2010

BigWig and BigBed

William Kent; Ann S. Zweig; Galt P. Barber; Angie S. Hinrichs; Donna Karolchik

Summary: BigWig and BigBed files are compressed binary indexed files containing data at several resolutions that allow the high-performance display of next-generation sequencing experiment results in the UCSC Genome Browser. The visualization is implemented using a multi-layered software approach that takes advantage of specific capabilities of web-based protocols and Linux and UNIX operating systems files, R trees and various indexing and compression tricks. As a result, only the data needed to support the current browser view is transmitted rather than the entire file, enabling fast remote access to large distributed data sets.

Availability and implementation: Binaries for the BigWig and BigBed creation and parsing utilities may be downloaded at http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/. Source code for the creation and visualization software is freely available for non-commercial use at http://hgdownload.cse.ucsc.edu/admin/jksrc.zip, implemented in C and supported on Linux. The UCSC Genome Browser is available at http://genome.ucsc.edu.

Supplementary information: Supplementary byte-level details of the BigWig and BigBed file formats are available at Bioinformatics online. For an in-depth description of UCSC data file formats and custom tracks, see http://genome.ucsc.edu/FAQ/FAQformat.html and http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html.
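The idea of storing data at several resolutions and returning only what the current view needs can be sketched in a few lines of Python. This is a simplified in-memory model, not the actual BigWig byte layout (which also relies on R-tree indexes and compression); the function names, the reduction factors 4 and 16, and the "at least one bin per pixel" rule are assumptions made for illustration.

```python
def build_zoom_levels(values, reductions=(4, 16)):
    """Precompute mean summaries; level r holds one mean per r consecutive values."""
    levels = {1: list(values)}  # level 1 is the full-resolution data
    for r in reductions:
        levels[r] = [
            sum(values[i:i + r]) / len(values[i:i + r])  # last bin may be partial
            for i in range(0, len(values), r)
        ]
    return levels


def pick_level(levels, start, end, pixels):
    """Choose the coarsest level that still yields at least one bin per pixel."""
    span = end - start
    usable = [r for r in levels if span / r >= pixels]
    return max(usable) if usable else 1


def fetch_view(levels, start, end, pixels):
    """Return (reduction, bins) covering [start, end) at a suitable resolution."""
    r = pick_level(levels, start, end, pixels)
    first = start // r
    last = (end + r - 1) // r  # ceiling division so the final bin is included
    return r, levels[r][first:last]


# Hypothetical signal: 64 positions with values 0..63.
levels = build_zoom_levels(list(range(64)))
wide = fetch_view(levels, 0, 64, pixels=4)    # zoomed out: coarse summary bins
narrow = fetch_view(levels, 8, 16, pixels=8)  # zoomed in: full-resolution values
```

A zoomed-out view touches only a handful of precomputed summary bins, while a zoomed-in view reads a short slice of the raw values, so the amount of data returned tracks the size of the display and not the size of the file.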

Genome Research | 2014

Alignathon: A competitive assessment of whole genome alignment methods

Dent Earl; Ngan Nguyen; Glenn Hickey; Robert S. Harris; Stephen Fitzgerald; Kathryn Beal; Igor Seledtsov; Vladimir Molodtsov; Brian J. Raney; Hiram Clawson; Jaebum Kim; Carsten Kemena; Jia-Ming Chang; Ionas Erb; Alexander Poliakov; Minmei Hou; Javier Herrero; William Kent; Victor Solovyev; Aaron E. Darling; Jian Ma; Cedric Notredame; Michael Brudno; Inna Dubchak; David Haussler; Benedict Paten

Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.

Nucleic Acids Research | 2003

The UCSC Genome Browser Database

Donna Karolchik; Robert Baertsch; Mark Diekhans; Terrence S. Furey; Angie S. Hinrichs; Yontao Lu; Krishna M. Roskin; M. Schwartz; Charles W. Sugnet; Daryl J. Thomas; Ryan Weber; David Haussler; William Kent

Pacific Symposium on Biocomputing | 2003