Paul M. B. Vitányi
University of Amsterdam
Publications
Featured research published by Paul M. B. Vitányi.
International Symposium on Information Theory | 2003
Ming Li; Xin Chen; Xin Li; Bin Ma; Paul M. B. Vitányi
We present a new method for clustering based on compression. The method does not use subject-specific features or background knowledge, and works as follows: First, we determine a parameter-free, universal, similarity distance, the normalized compression distance or NCD, computed from the lengths of compressed data files (singly and in pairwise concatenation). Second, we apply a hierarchical clustering method. The NCD is not restricted to a specific application area, and works across application area boundaries. A theoretical precursor, the normalized information distance, co-developed by one of the authors, is provably optimal. However, the optimality comes at the price of using the noncomputable notion of Kolmogorov complexity. We propose axioms to capture the real-world setting, and show that the NCD approximates optimality. To extract a hierarchy of clusters from the distance matrix, we determine a dendrogram (ternary tree) by a new quartet method and a fast heuristic to implement it. The method is implemented and available as public software, and is robust under choice of different compressors. To substantiate our claims of universality and robustness, we report evidence of successful application in areas as diverse as genomics, virology, languages, literature, music, handwritten digits, astronomy, and combinations of objects from completely different domains, using statistical, dictionary, and block sorting compressors. In genomics, we presented new evidence for major questions in Mammalian evolution, based on whole-mitochondrial genomic analysis: the Eutherian orders and the Marsupionta hypothesis against the Theria hypothesis.
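The distance itself is a short formula over compressed lengths: NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed size of a file. Below is a minimal illustrative sketch in Python, using the standard zlib compressor as a stand-in for the statistical, dictionary, and block-sorting compressors reported in the paper; the function names are ours.

```python
import zlib

def clen(data: bytes) -> int:
    """Length in bytes of the zlib-compressed version of `data`."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = clen(x), clen(y), clen(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Toy usage: related strings should score closer to 0 than unrelated ones.
a = b"the quick brown fox jumps over the lazy dog" * 20
b = b"the quick brown fox jumps over the lazy cat" * 20
c = bytes(range(256)) * 5
print(ncd(a, b))  # small: the two files compress well together
print(ncd(a, c))  # closer to 1: little shared structure to exploit
```

Note that the choice of compressor matters mainly through its window size: a compressor that cannot "see" both files at once within the concatenation cannot exploit their shared structure, which is one reason block-sorting compressors often work better than zlib on larger files.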
IEEE Transactions on Information Theory | 2000
Paul M. B. Vitányi; Ming Li
The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles minimum description length (MDL) and minimum message length (MML), abstracted as the ideal MDL principle and defined from Bayes' rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the fundamental inequality, which in broad terms states that the principle is valid when the data are random, relative to every contemplated hypothesis, and these hypotheses are in turn random relative to the (universal) prior. The ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and the sum of the minus log universal probability of the model plus the minus log probability of the data given the model should be minimized. If we restrict the model class to finite sets, application of the ideal principle turns into Kolmogorov's minimal sufficient statistic. In general, we show that data compression is almost always the best strategy, both in model selection and prediction.
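In symbols, and in our notation (writing m for the algorithmic universal probability and Pr(D | H) for the likelihood), the ideal MDL hypothesis choice sketched above reads:

```latex
H_{\mathrm{MDL}} \;=\; \arg\min_{H}\,\bigl(-\log \mathbf{m}(H) \;-\; \log \Pr(D \mid H)\bigr)
```

By the coding theorem, −log m(H) equals the Kolmogorov complexity K(H) up to an additive constant, so this is a two-part code length: bits to describe the hypothesis plus bits to describe the data given the hypothesis.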
Computer Music Journal | 2004
Rudi Cilibrasi; Paul M. B. Vitányi; Ronald de Wolf
Algorithmic Clustering of Music Based on String Compression. Computer Music Journal, 28(4), pp. 49–67, Winter 2004. © 2004 Massachusetts Institute of Technology. Affiliations: Centrum voor Wiskunde en Informatica (CWI), Kruislaan 413, 1098 SJ Amsterdam, The Netherlands; Institute for Logic, Language, and Computation, University of Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands.
Foundations of Computer Science | 1986
Paul M. B. Vitányi; Baruch Awerbuch
The contribution of this paper is twofold. First, we describe two ways to construct multivalued atomic n-writer, n-reader registers. The first solution uses atomic 1-writer, 1-reader registers and unbounded tags; the other uses atomic 1-writer, n-reader registers and bounded tags. The second part of the paper develops a general methodology for proving atomicity, by identifying a set of criteria that guarantee an effective construction of the required atomic mapping. We apply the method to prove atomicity of the two implementations of atomic multiwriter, multireader registers.
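To give a flavor of the unbounded-tag idea, here is our own illustrative Python sketch (not the paper's construction verbatim): each writer publishes its value with a tag one larger than any tag it has seen, ties broken by writer id, and a reader returns the value carrying the lexicographically largest (tag, id) pair. Real atomicity of course rests on the underlying single-writer registers being atomic; a Python list is only a stand-in.

```python
from typing import Any, List, Tuple

class MultiWriterRegister:
    """Toy model of the unbounded-tag construction: writer i owns
    slot i (a stand-in for its 1-writer registers); everyone reads
    all slots, and (tag, writer_id) pairs totally order the writes."""

    def __init__(self, n_writers: int):
        # Each slot holds (tag, writer_id, value).
        self.slots: List[Tuple[int, int, Any]] = [
            (0, i, None) for i in range(n_writers)
        ]

    def write(self, writer_id: int, value: Any) -> None:
        # Collect-then-write: read every slot, pick a tag larger
        # than any seen, then publish in the writer's own slot.
        max_tag = max(tag for tag, _, _ in self.slots)
        self.slots[writer_id] = (max_tag + 1, writer_id, value)

    def read(self) -> Any:
        # Return the value with the lexicographically largest (tag, id).
        return max(self.slots)[2]

# Toy usage:
reg = MultiWriterRegister(n_writers=3)
reg.write(0, "a")
reg.write(2, "b")
print(reg.read())  # "b": writer 2 observed tag 1 and chose tag 2
```

The tags grow without bound, which is exactly the cost the paper's second construction avoids by using bounded tags at the price of stronger (1-writer, n-reader) building blocks.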
Journal of Mathematical Psychology | 2007
Nick Chater; Paul M. B. Vitányi
Within psychology, neuroscience and artificial intelligence, there has been increasing interest in the proposal that the brain builds probabilistic models of sensory and linguistic input: that is, it infers a probabilistic model from a sample. The practical problems of such inference are substantial: the brain has limited data and restricted computational resources. But there is a more fundamental question: is it even possible in principle to infer a probabilistic model from a sample? We explore this question and find some surprisingly positive and general results. First, for a broad class of probability distributions characterised by computability restrictions, we specify a learning algorithm that will almost surely identify a probability distribution in the limit given a finite i.i.d. sample of sufficient but unknown length. This is similarly shown to hold for sequences generated by a broad class of Markov chains, subject to computability assumptions. The technical tool is the strong law of large numbers. Second, for a large class of dependent sequences, we specify an algorithm which identifies in the limit a computable measure for which the sequence is typical, in the sense of Martin-Löf (there may be more than one such measure). The technical tool is the theory of Kolmogorov complexity. We analyse the associated predictions in both cases. We also briefly consider special cases, including language learning, and wider theoretical implications for psychology.
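The flavor of the first result can be conveyed by a toy version of identification in the limit (our own sketch; the paper's learner works over an effective enumeration of all suitably computable distributions, which we replace with a small hand-picked candidate list): as the i.i.d. sample grows, empirical frequencies converge almost surely to the true probabilities by the strong law of large numbers, so the learner's guess eventually stabilizes on a correct candidate.

```python
import random

# Hypothetical candidate class: distributions over {0, 1}, given by P(1).
# The real algorithm enumerates a computably-restricted class instead.
CANDIDATES = [0.1, 0.25, 0.5, 0.75, 0.9]

def guess(sample: list) -> float:
    """Return the candidate closest to the empirical frequency of 1s.
    As the sample grows, the empirical frequency converges almost
    surely to the true P(1), so the guess eventually stops changing."""
    freq = sum(sample) / len(sample)
    return min(CANDIDATES, key=lambda p: abs(p - freq))

random.seed(0)
true_p = 0.75
sample = []
for n in (10, 100, 1000, 10000):
    while len(sample) < n:
        sample.append(1 if random.random() < true_p else 0)
    print(n, guess(sample))  # settles on 0.75 as n grows
```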
SIAM Journal on Computing | 1988
Paul M. B. Vitányi
We derive a lower bound on the average interconnect (edge) length in d-dimensional embeddings of arbitrary graphs, expressed in terms of diameter and symmetry. It is optimal for all graph topologies we have examined, including the complete graph, star, binary n-cube, cube-connected cycles, complete binary tree, and mesh with wraparound (e.g., torus, ring). The lower bound is technology independent, and shows that many interconnection topologies of today’s multicomputers do not scale well in the physical world (d = 3). The new proof technique is simple, geometrical, and works for wires with zero volume, e.g., for optical (fibre) or photonic (fibreless, laser) communication networks. Apparently, while getting rid of the “von Neumann” bottleneck in the shift from sequential to nonsequential computation, a new communication bottleneck arises because of the interplay between locality of computation, communication, and the number of dimensions of physical space. As a consequence, realistic models for nonsequentia...
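To see why a bound of this kind must hold, consider a back-of-the-envelope volume argument for the binary n-cube in d = 3 dimensions (our illustrative reconstruction of the style of argument, not the paper's exact derivation):

```latex
\begin{aligned}
&N = 2^{n} \text{ unit-volume nodes occupy a region of physical radius } \Omega\bigl(N^{1/3}\bigr) = \Omega\bigl(2^{n/3}\bigr),\\
&\text{graph diameter } n \;\Rightarrow\; \text{some path of at most } n \text{ edges spans physical length } \Omega\bigl(2^{n/3}\bigr),\\
&\Rightarrow\; \text{maximum edge length } \geq \Omega\bigl(2^{n/3}/n\bigr).
\end{aligned}
```

Strengthening "maximum" to "average" is exactly where the paper's diameter-and-symmetry machinery is needed.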
SIAM Journal on Computing | 1991
Ming Li; Paul M. B. Vitányi
This paper aims at developing a learning theory in which “simple” concepts are easily learnable. In Valiant’s learning model, many concepts turn out to be too hard (e.g., NP-hard) to learn. Relatively few concept classes have been shown to be polynomially learnable. In daily life, it seems that the things we care to learn are usually learnable. To model the intuitive notion of learning more closely, the learning algorithm is not required to learn (polynomially) under all distributions, but only under all simple distributions. A distribution is simple if it is dominated by an enumerable distribution. All distributions with computable parameters that are used in statistics are simple. Simple distributions are complete in the sense that a concept class is learnable under all simple distributions if and only if it is learnable under a fixed “universal” simple distribution. This holds both for polynomial learning in the discrete case (under a modified model) and for non-time-restricted learning in the continuous case...
The Mathematical Intelligencer | 1997
Walter W. Kirchherr; Ming Li; Paul M. B. Vitányi
Journal of Mathematical Psychology | 2003
Nick Chater; Paul M. B. Vitányi
Journal of Physics A | 2001
Harry Buhrman; John Tromp; Paul M. B. Vitányi