Walter L. Ruzzo
University of Washington
Publication
Featured research published by Walter L. Ruzzo.
Communications of The ACM | 1976
Michael A. Harrison; Walter L. Ruzzo; Jeffrey D. Ullman
A model of protection mechanisms in computing systems is presented and its appropriateness is argued. The “safety” problem for protection systems under this model is to determine in a given situation whether a subject can acquire a particular right to an object. In restricted cases, it can be shown that this problem is decidable, i.e. there is an algorithm to determine whether a system in a particular configuration is safe. In general, and under surprisingly weak assumptions, it cannot be decided if a situation is safe. Various implications of this fact are discussed.
Bioinformatics | 2001
Ka Yee Yeung; Walter L. Ruzzo
MOTIVATION: There is a great need to develop analytical methodology to analyze and to exploit the information contained in gene expression data. Because of the large number of genes and the complexity of biological networks, clustering is a useful exploratory technique for analysis of gene expression data. Other classical techniques, such as principal component analysis (PCA), have also been applied to analyze gene expression data. Using different data analysis techniques and different clustering algorithms to analyze the same data set can lead to very different conclusions. Our goal is to study the effectiveness of principal components (PCs) in capturing cluster structure. Specifically, using both real and synthetic gene expression data sets, we compared the quality of clusters obtained from the original data to the quality of clusters obtained after projecting onto subsets of the principal component axes. RESULTS: Our empirical study showed that clustering with the PCs instead of the original variables does not necessarily improve, and often degrades, cluster quality. In particular, the first few PCs (which contain most of the variation in the data) do not necessarily capture most of the cluster structure. We also showed that clustering with PCs has different impact on different algorithms and different similarity metrics. Overall, we would not recommend PCA before clustering except in special circumstances.
Bioinformatics | 2001
Ka Yee Yeung; David R. Haynor; Walter L. Ruzzo
MOTIVATION: Many clustering algorithms have been proposed for the analysis of gene expression data, but little guidance is available to help choose among them. We provide a systematic framework for assessing the results of clustering algorithms. Clustering algorithms attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. Our methodology is to apply a clustering algorithm to the data from all but one experimental condition. The remaining condition is used to assess the predictive power of the resulting clusters: meaningful clusters should exhibit less variation in the remaining condition than clusters formed by chance. RESULTS: We successfully applied our methodology to compare six clustering algorithms on four gene expression data sets. We found our quantitative measures of cluster quality to be positively correlated with external standards of cluster quality.
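The leave-one-condition-out idea in this abstract can be illustrated with a minimal sketch. It assumes the data is a list of per-gene rows (one value per condition) and that `cluster_fn` is any caller-supplied function returning one label per gene. The names `left_out_variation` and `aggregate_variation`, and the choice of root-mean-square deviation as the measure of variation, are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt


def left_out_variation(data, cluster_fn, left_out):
    """RMS within-cluster deviation measured in the left-out condition.

    data: list of rows (lists), one row per gene, one column per condition.
    cluster_fn: hypothetical callable mapping a list of rows to a list of labels.
    left_out: index of the condition withheld from clustering.
    """
    # Cluster the genes using every condition except the withheld one.
    reduced = [row[:left_out] + row[left_out + 1:] for row in data]
    labels = cluster_fn(reduced)

    # Group the withheld measurements by the cluster each gene landed in.
    groups = {}
    for row, label in zip(data, labels):
        groups.setdefault(label, []).append(row[left_out])

    # Sum squared deviations from each cluster's mean in the withheld condition.
    squared = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        squared += sum((v - mean) ** 2 for v in values)
    return sqrt(squared / len(data))


def aggregate_variation(data, cluster_fn):
    """Sum the left-out variation over every condition (lower is better)."""
    n_conditions = len(data[0])
    return sum(left_out_variation(data, cluster_fn, e) for e in range(n_conditions))
```

Under this sketch, a clustering that reflects real structure should give a lower aggregate value than an arbitrary one on the same data, which is the comparison the abstract describes.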
Journal of Computer and System Sciences | 1981
Walter L. Ruzzo
We argue that uniform circuit complexity introduced by Borodin is a reasonable model of parallel complexity. Three main results are presented. First, we show that alternating Turing machines are also a surprisingly good model of parallel complexity, by showing that simultaneous size/depth of uniform circuits is the same as space/time of alternating Turing machines, with depth and time within a constant factor and likewise log(size) and space. Second, we apply this to characterize NC, the class of polynomial size and polynomial-in-log depth circuits, in terms of tree-size bounded alternating TMs and other models. In particular, this enables us to show that context-free language recognition is in NC. Third, we investigate various definitions of uniform circuit complexity, showing that it is fairly insensitive to the choice of definition.
Developmental Cell | 2010
Yi Cao; Zizhen Yao; Deepayan Sarkar; Michael S. Lawrence; Gilson J. Sanchez; Maura H. Parker; Kyle L. MacQuarrie; Jerry Davison; Martin Morgan; Walter L. Ruzzo; Robert Gentleman; Stephen J. Tapscott
Recent studies have demonstrated that MyoD initiates a feed-forward regulation of skeletal muscle gene expression, predicting that MyoD binds directly to many genes expressed during differentiation. We have used chromatin immunoprecipitation and high-throughput sequencing to identify genome-wide binding of MyoD in several skeletal muscle cell types. As anticipated, MyoD preferentially binds to a VCASCTG sequence that resembles the in vitro-selected site for a MyoD:E-protein heterodimer, and MyoD binding increases during differentiation at many of the regulatory regions of genes expressed in skeletal muscle. Unanticipated findings were that MyoD was constitutively bound to thousands of additional sites in both myoblasts and myotubes, and that the genome-wide binding of MyoD was associated with regional histone acetylation. Therefore, in addition to regulating muscle gene expression, MyoD binds genome wide and has the ability to broadly alter the epigenome in myoblasts and myotubes.
Stem Cells | 2008
Merav Bar; Stacia K. Wyman; Brian R. Fritz; Junlin Qi; Kavita Garg; Rachael K. Parkin; Evan M. Kroh; Ausra Bendoraite; Patrick S. Mitchell; Angelique M. Nelson; Walter L. Ruzzo; Carol B. Ware; Jerald P. Radich; Robert Gentleman; Hannele Ruohola-Baker; Muneesh Tewari
We used massively parallel pyrosequencing to discover and characterize microRNAs (miRNAs) expressed in human embryonic stem cells (hESC). Sequencing of small RNA cDNA libraries derived from undifferentiated hESC and from isogenic differentiating cultures yielded a total of 425,505 high‐quality sequence reads. A custom data analysis pipeline delineated expression profiles for 191 previously annotated miRNAs, 13 novel miRNAs, and 56 candidate miRNAs. Further characterization of a subset of the novel miRNAs in Dicer‐knockdown hESC demonstrated Dicer‐dependent expression, providing additional validation of our results. A set of 14 miRNAs (9 known and 5 novel) was noted to be expressed in undifferentiated hESC and then strongly downregulated with differentiation. Functional annotation analysis of predicted targets of these miRNAs and comparison with a null model using non‐hESC‐expressed miRNAs identified statistically enriched functional categories, including chromatin remodeling and lineage‐specific differentiation annotations. Finally, integration of our data with genome‐wide chromatin immunoprecipitation data on OCT4, SOX2, and NANOG binding sites implicates these transcription factors in the regulation of nine of the novel/candidate miRNAs identified here. Comparison of our results with those of recent deep sequencing studies in mouse and human ESC shows that most of the novel/candidate miRNAs found here were not identified in the other studies. The data indicate that hESC express a larger complement of miRNAs than previously appreciated, and they provide a resource for additional studies of miRNA regulation of hESC physiology.
Journal of Computer and System Sciences | 1980
Walter L. Ruzzo
The size of an accepting computation tree of an alternating Turing machine (ATM) is introduced as a complexity measure. We present a number of applications of tree-size to the study of more traditional complexity classes. Tree-size on ATMs is shown to closely correspond to time on nondeterministic TMs and on nondeterministic auxiliary pushdown automata. One application of the latter is a useful new characterization of the class of languages log-space-reducible to context-free languages. Surprising relationships with parallel-time complexity are also demonstrated. ATM computations using at most space S(n) and tree-size Z(n) (simultaneously) can be simulated in alternating space S(n) and time S(n) · log Z(n) (simultaneously). Several well-known simulations, e.g., Savitch's theorem, are special cases of this result. It also leads to improved parallel complexity bounds for many problems in terms of both time and number of “processors.” As one example we show that context-free language recognition in time O(log² n) is possible on several parallel models. Further, this bound is achievable with only a polynomial number of processors, in contrast to all previously known sub-linear time CFL recognizers.
Bioinformatics | 2006
Zizhen Yao; Zasha Weinberg; Walter L. Ruzzo
MOTIVATION: The recent discoveries of large numbers of non-coding RNAs and computational advances in genome-scale RNA search create a need for tools for automatic, high quality identification and characterization of conserved RNA motifs that can be readily used for database search. Previous tools fall short of this goal. RESULTS: CMfinder is a new tool to predict RNA motifs in unaligned sequences. It is an expectation maximization algorithm using covariance models for motif description, featuring novel integration of multiple techniques for effective search of motif space, and a Bayesian framework that blends mutual information-based and folding energy-based approaches to predict structure in a principled way. Extensive tests show that our method works well on datasets with either low or high sequence similarity, is robust to inclusion of lengthy extraneous flanking sequence and/or completely unrelated sequences, and is reasonably fast and scalable. In testing on 19 known ncRNA families, including some difficult cases with poor sequence conservation and large indels, our method demonstrates excellent average per-base-pair accuracy: 79% compared with at most 60% for alternative methods. More importantly, the resulting probabilistic model can be directly used for homology search, allowing iterative refinement of structural models based on additional homologs. We have used this approach to obtain highly accurate covariance models of known RNA motifs based on small numbers of related sequences, which identified homologs in deeply-diverged species.
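As background for the "mutual information-based" evidence mentioned in this abstract, the sketch below computes the textbook mutual information between two columns of a multiple sequence alignment. Covarying columns, such as positions that form a conserved base pair, score high. This is a generic illustration and not CMfinder's actual scoring; the function name `column_mutual_information` and the string-per-column input format are assumptions.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i, col_j: equal-length sequences holding the residue each aligned
    sequence has at the two positions (illustrative input format).
    """
    n = len(col_i)
    freq_i = Counter(col_i)                  # residue counts at position i
    freq_j = Counter(col_j)                  # residue counts at position j
    freq_ij = Counter(zip(col_i, col_j))     # joint counts of residue pairs

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi
```

For two columns that always pair consistently (G with C, C with G, A with U, U with A, each seen equally often) the score reaches its four-letter maximum of 2 bits, whereas two fully conserved columns score 0 because conservation alone carries no covariation signal.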
ACM Transactions on Programming Languages and Systems | 1980
Susan L. Graham; Michael A. Harrison; Walter L. Ruzzo
A new algorithm for recognizing and parsing arbitrary context-free languages is presented, and several new results are given on the computational complexity of these problems. The new algorithm is of both practical and theoretical interest. It is conceptually simple and allows a variety of efficient implementations, which are worked out in detail. Two versions are given which run in faster than cubic time. Surprisingly close connections between the Cocke-Kasami-Younger and Earley algorithms are established which reveal that the two algorithms are “almost” identical.
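For readers unfamiliar with the Cocke-Kasami-Younger algorithm named in this abstract, here is a minimal sketch of the classic cubic-time CYK recognizer for a grammar in Chomsky normal form. It is the standard textbook procedure, not the new algorithm the paper presents. The rule encoding (lists of tuples) and the function name `cyk_recognize` are assumptions made for illustration.

```python
def cyk_recognize(word, unary, binary, start="S"):
    """Return True if `word` is derivable from `start`.

    unary:  list of (A, terminal) rules, meaning A -> terminal.
    binary: list of (A, B, C) rules, meaning A -> B C.
    The grammar is assumed to be in Chomsky normal form.
    """
    n = len(word)
    if n == 0:
        return False  # this sketch does not handle the empty word

    # table[i][length] holds the nonterminals deriving word[i:i + length].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]

    # Substrings of length 1 come from the unary (terminal) rules.
    for i, symbol in enumerate(word):
        table[i][1] = {a for a, terminal in unary if terminal == symbol}

    # Longer substrings are built by trying every split point.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for a, b, c in binary:
                    if b in left and c in right:
                        table[i][length].add(a)

    return start in table[0][n]
```

As a usage example, the language of strings a^n b^n (n ≥ 1) can be encoded with `unary = [("A", "a"), ("B", "b")]` and `binary = [("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")]`; `cyk_recognize("aabb", unary, binary)` then accepts, while `"aab"` is rejected.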
Developmental Cell | 2012
Linda N. Geng; Zizhen Yao; Lauren Snider; Abraham P. Fong; Jennifer N. Cech; Janet M. Young; Silvère M. van der Maarel; Walter L. Ruzzo; Robert Gentleman; Rabi Tawil; Stephen J. Tapscott
Facioscapulohumeral dystrophy (FSHD) is one of the most common inherited muscular dystrophies. The causative gene remains controversial and the mechanism of pathophysiology unknown. Here we identify genes associated with germline and early stem cell development as targets of the DUX4 transcription factor, a leading candidate gene for FSHD. The genes regulated by DUX4 are reliably detected in FSHD muscle but not in controls, providing direct support for the model that misexpression of DUX4 is a causal factor for FSHD. Additionally, we show that DUX4 binds and activates LTR elements from a class of MaLR endogenous primate retrotransposons and suppresses the innate immune response to viral infection, at least in part through the activation of DEFB103, a human defensin that can inhibit muscle differentiation. These findings suggest specific mechanisms of FSHD pathology and identify candidate biomarkers for disease diagnosis and progression.