
Publication


Featured research published by Chris S. Wallace.


IEEE Transactions on Electronic Computers | 1964

A Suggestion for a Fast Multiplier

Chris S. Wallace

It is suggested that the economics of present large-scale scientific computers could benefit from a greater investment in hardware to mechanize multiplication and division than is now common. As a move in this direction, a design is developed for a multiplier which generates the product of two numbers using purely combinational logic, i.e., in one gating step. Using straightforward diode-transistor logic, it appears presently possible to obtain products in under 1 µsec, and quotients in 3 µsec. A rapid square-root process is also outlined. Approximate component counts are given for the proposed design, and it is found that the cost of the unit would be about 10 per cent of the cost of a modern large-scale computer.
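The reduction scheme behind such a combinational multiplier can be sketched in software. The following is my own illustration (not the paper's circuit design): all partial-product bits are formed at once, then reduced in parallel layers of 3:2 compressors (full adders) until only two rows remain, which a single carry-propagate addition combines.

```python
def wallace_multiply(a, b, width=8):
    """Multiply two unsigned `width`-bit integers, Wallace-tree style."""
    # Step 1: generate all partial-product bits at once, one list per
    # column (column k holds bits of weight 2**k).
    cols = [[] for _ in range(2 * width)]
    for i in range(width):
        for j in range(width):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    # Step 2: reduce in parallel layers of 3:2 compressors (full adders)
    # until every column holds at most two bits.
    while any(len(c) > 2 for c in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for k, c in enumerate(cols):
            i = 0
            while len(c) - i >= 3:   # full adder: 3 bits in, sum + carry out
                x, y, z = c[i], c[i + 1], c[i + 2]
                nxt[k].append(x ^ y ^ z)
                nxt[k + 1].append((x & y) | (y & z) | (x & z))
                i += 3
            nxt[k].extend(c[i:])     # leftover bits pass straight through
        cols = nxt
    # Step 3: one final carry-propagate addition of the two remaining rows.
    row0 = sum((c[0] if len(c) > 0 else 0) << k for k, c in enumerate(cols))
    row1 = sum((c[1] if len(c) > 1 else 0) << k for k, c in enumerate(cols))
    return row0 + row1
```

Each pass of the `while` loop corresponds to one layer of adders working in parallel; the number of layers grows only logarithmically with the operand width, which is what makes a purely combinational product practical.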


The Computer Journal | 1999

Minimum Message Length and Kolmogorov Complexity

Chris S. Wallace; David L. Dowe

The notion of algorithmic complexity was developed by Kolmogorov (1965) and Chaitin (1966) independently of one another and of Solomonoff’s notion (1964) of algorithmic probability. Given a Turing machine T, the (prefix) algorithmic complexity of a string S is the length of the shortest input to T which would cause T to output S and stop. The Solomonoff probability of S given T is the probability that a random binary string of 0s and 1s will result in T producing an output having S as a prefix. We attempt to establish a parallel between a restricted (two-part) version of the Kolmogorov model and the minimum message length approach to statistical inference and machine learning of Wallace and Boulton (1968), in which an ‘explanation’ of a data string is modelled as a two-part message, the first part stating a general hypothesis about the data and the second encoding details of the data not implied by the hypothesis. Solomonoff’s model is tailored to prediction rather than inference in that it considers not just the most likely explanation, but it also gives weights to all explanations depending upon their posterior probability. However, as the amount of data increases, we typically expect the most likely explanation to have a dominant weighting in the prediction.
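The two-part message idea can be made concrete with a toy coin-flip example. This is a minimal sketch of the general scheme, not from the paper; the fixed 6-bit hypothesis precision and the 1/64 grid of candidate biases are my own assumptions:

```python
import math

def two_part_length(data, p, precision_bits=6):
    """Total length (bits) of a two-part message: hypothesis then data.

    Part 1 states the coin bias p to `precision_bits` bits (a uniform
    prior over 2**precision_bits candidate values); part 2 encodes each
    outcome with its Shannon code length -log2 P(outcome | p).
    """
    part1 = precision_bits
    part2 = sum(-math.log2(p if x == 1 else 1 - p) for x in data)
    return part1 + part2

data = [1] * 70 + [0] * 30       # 70 heads, 30 tails
best = min((k / 64 for k in range(1, 64)),
           key=lambda p: two_part_length(data, p))
```

The minimising hypothesis is the grid value nearest the observed frequency 0.7; in a full MML treatment the stated precision itself is optimised against the fit, rather than fixed as here.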


Machine Learning | 1993

Coding Decision Trees

Chris S. Wallace; J. D. Patrick

Quinlan and Rivest have suggested a decision-tree inference method using the Minimum Description Length idea. We show that there is an error in their derivation of message lengths, which fortunately has no effect on the final inference. We further suggest two improvements to their coding techniques, one removing an inefficiency in the description of non-binary trees, and one improving the coding of leaves. We argue that these improvements are superior to similarly motivated proposals in the original paper. Empirical tests confirm the good results reported by Quinlan and Rivest, and show our coding proposals to lead to useful improvements in the performance of the method.


Statistics and Computing | 2000

MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions

Chris S. Wallace; David L. Dowe

Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also statistically consistent and efficient. We provide a brief overview of MML inductive inference (Wallace C.S. and Boulton D.M. 1968. Computer Journal, 11: 185–194; Wallace C.S. and Freeman P.R. 1987. J. Royal Statistical Society (Series B), 49: 240–252; Wallace C.S. and Dowe D.L. (1999). Computer Journal), and how it has both an information-theoretic and a Bayesian interpretation. We then outline how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob (Wallace C.S. and Boulton D.M. 1968. Computer Journal, 11: 185–194; Wallace C.S. 1986. In: Proceedings of the Nineteenth Australian Computer Science Conference (ACSC-9), Vol. 8, Monash University, Australia, pp. 357–366; Wallace C.S. and Dowe D.L. 1994b. In: Zhang C. et al. (Eds.), Proc. 7th Australian Joint Conf. on Artif. Intelligence. World Scientific, Singapore, pp. 37–44. See http://www.csse.monash.edu.au/~dld/Snob.html) uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components and estimation of the relative abundances of the components. The message length is (to within a constant) the logarithm of the posterior probability (not a posterior density) of the theory. So, the MML theory can also be regarded as the theory with the highest posterior probability. Snob currently assumes that variables are uncorrelated within each component, and permits multi-variate data from Gaussian, discrete multi-category (or multi-state or multinomial), Poisson and von Mises circular distributions, as well as missing data. Additionally, Snob can do fully-parameterised mixture modelling, estimating the latent class assignments in addition to estimating the number of components, the relative abundances of the components and the component parameters.
We also report on extensions of Snob for data which has sequential or spatial correlations between observations, or correlations between attributes.
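How message lengths can select the number of components may be sketched as follows, in the spirit of Snob but greatly simplified: hard class assignments, a crude fixed cost per parameter, and data quantised to a fixed width. All function names and constants here are my own assumptions, not Snob's actual coding:

```python
import math
import statistics

def gaussian_nll_bits(xs, mu, sigma, eps=0.01):
    """Bits to encode xs, quantised to width eps, under N(mu, sigma^2)."""
    return sum(-math.log2(eps * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
                          / (sigma * math.sqrt(2 * math.pi))) for x in xs)

def message_length(groups, bits_per_param=8):
    """Crude two-part length: parameters, class labels, data given classes."""
    k = len(groups)
    total = 0.0
    for g in groups:
        mu = statistics.mean(g)
        sigma = max(statistics.pstdev(g), 0.05)  # floor to avoid sigma -> 0
        total += 2 * bits_per_param              # state mu and sigma
        total += len(g) * math.log2(k)           # one class label per datum
        total += gaussian_nll_bits(g, mu, sigma)
    return total

a = [i / 10 for i in range(20)]          # one cluster near 0..2
b = [10 + i / 10 for i in range(20)]     # another near 10..12
one_class = message_length([a + b])
two_class = message_length([a, b])
```

With well-separated clusters, the saving in data-encoding bits from the two-component model outweighs the cost of stating the extra parameters and class labels, so the two-component message is shorter.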


Journal of Molecular Evolution | 1992

Finite-state models in the alignment of macromolecules

Lloyd Allison; Chris S. Wallace; Chut N. Yee

Minimum message length encoding is a technique of inductive inference with theoretical and practical advantages. It allows the posterior odds-ratio of two theories or hypotheses to be calculated. Here it is applied to problems of aligning or relating two strings, in particular two biological macromolecules. We compare the r-theory, that the strings are related, with the null-theory, that they are not related. If they are related, the probabilities of the various alignments can be calculated. This is done for one-, three-, and five-state models of relation or mutation. These correspond to linear and piecewise linear cost functions on runs of insertions and deletions. We describe how to estimate parameters of a model. The validity of a model is itself an hypothesis and can be objectively tested. This is done on real DNA strings and on artificial data. The tests on artificial data indicate limits on what can be inferred in various situations. The tests on real DNA support either the three- or five-state models over the one-state model. Finally, a fast, approximate minimum message length string comparison algorithm is described.
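A much-simplified sketch of the one-state (linear gap cost) comparison: the r-theory encodes the second string given the first via match/change/insert/delete instructions with Shannon code lengths, while the null-theory encodes it with no reference to the first. The instruction probabilities, the 2-bit base cost, and the "encode t given s" framing are assumptions of this illustration, not the paper's exact model:

```python
import math

# Hypothetical instruction probabilities for a one-state model of
# relatedness; a real analysis would estimate these from the data.
P = {'match': 0.55, 'change': 0.15, 'insert': 0.15, 'delete': 0.15}
COST = {k: -math.log2(v) for k, v in P.items()}
BASE = 2.0                      # bits to state one of four DNA bases

def r_theory_bits(s, t):
    """Message length (bits) of the best alignment of t given s."""
    n, m = len(s), len(t)
    dp = [[math.inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i < n and j < m:   # match copies a base; change restates it
                step = COST['match'] if s[i] == t[j] else COST['change'] + BASE
                dp[i + 1][j + 1] = min(dp[i + 1][j + 1], dp[i][j] + step)
            if j < m:             # insert a base of t unmatched in s
                dp[i][j + 1] = min(dp[i][j + 1],
                                   dp[i][j] + COST['insert'] + BASE)
            if i < n:             # delete a base of s
                dp[i + 1][j] = min(dp[i + 1][j], dp[i][j] + COST['delete'])
    return dp[n][m]

def null_theory_bits(t):
    """Message length of t encoded with no reference to s."""
    return BASE * len(t)
```

Related strings give a shorter r-theory message than the null-theory; the posterior odds-ratio of the two hypotheses is then roughly two to the power of the difference in message lengths.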


ACM Transactions on Mathematical Software | 1996

Fast pseudorandom generators for normal and exponential variates

Chris S. Wallace

Fast algorithms for generating pseudorandom numbers from the unit-normal and unit-exponential distributions are described. The methods are unusual in that they do not rely on a source of uniform random numbers, but generate the target distributions directly by using their maximal-entropy properties. The algorithms are fast. The normal generator is faster than the commonly used Unix library uniform generator “random” when the latter is used to yield real values. Their statistical properties seem satisfactory, but only a limited suite of tests has been conducted. They are written in C and as written assume 32-bit integer arithmetic. The code is publicly available as C source and can easily be adapted for longer word lengths and/or vector processing.
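The underlying trick can be hedged into a sketch. This is not Wallace's published algorithm, only an illustration of the idea of recycling a pool of normal variates through orthogonal transformations, which map independent unit normals to independent unit normals; a production version needs extra care (pool renormalisation, stronger mixing between passes) that is omitted here:

```python
import math
import random

class PoolNormalGenerator:
    """Pool-transform normal generator sketch (NOT Wallace's published code).

    A pool of unit-normal variates is recycled by 2x2 orthogonal
    rotations; an orthogonal map sends i.i.d. unit normals to i.i.d.
    unit normals, so the pool keeps its distribution without drawing a
    uniform deviate per output (a cheap generator is still used to
    permute the pool between passes).
    """

    def __init__(self, size=1024, seed=1):
        self.rng = random.Random(seed)
        # One-off initialisation; afterwards the pool is only rotated.
        self.pool = [self.rng.gauss(0.0, 1.0) for _ in range(size)]
        self.idx = len(self.pool)            # force a refresh on first use

    def _refresh(self):
        self.rng.shuffle(self.pool)          # cheap re-pairing between passes
        c, s = math.cos(1.0), math.sin(1.0)  # any fixed rotation angle works
        for i in range(0, len(self.pool) - 1, 2):
            x, y = self.pool[i], self.pool[i + 1]
            self.pool[i], self.pool[i + 1] = c * x + s * y, c * y - s * x
        self.idx = 0

    def next(self):
        if self.idx >= len(self.pool):
            self._refresh()
        v = self.pool[self.idx]
        self.idx += 1
        return v
```

Because the rotations preserve the pool's sum of squares exactly, the output variance is pinned to the initial pool's empirical variance; correcting for that is one of the refinements a real implementation adds.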


Archive | 1999

Learning Linear Causal Models by MML Sampling

Chris S. Wallace; Kevin B. Korb

We combine Minimum Message Length (MML) evaluation of linear causal models with Monte Carlo sampling to produce a program that, given ordinary joint sample data, reports the posterior probabilities of equivalence classes of causal models and their member models. We compare our program with TETRAD II [7.15] and the Bayesian MCMC program of Madigan et al. [7.11]. Our approach differs from that of Madigan et al. [7.11] particularly in not assigning equal prior probabilities to equivalence classes of causal models and in merging models from distinct equivalence classes when the causal links are sufficiently weak that the sample data available could not be expected to distinguish between them (which we call ‘small effect equivalence’).


Journal of Applied Physics | 1958

Ettingshausen Effect and Thermomagnetic Cooling

B. J. O'Brien; Chris S. Wallace

The use of the Ettingshausen effect in refrigeration is considered. The phenomenological similarity of this cooling effect with the cooling by a cascade of Peltier couples is treated. Theoretical expressions are derived for the maximum cooling that is possible with the Ettingshausen effect, and it is shown that the optimum shape of the cooling element is an exponential of a given form. Preliminary experiments with bismuth alloys have given a cooling of about 0.25°C.


The Computer Journal | 1998

Intrinsic Classification of Spatially Correlated Data

Chris S. Wallace

Intrinsic classification, or unsupervised learning of a classification, was the earliest application of what is now termed minimum message length (MML) or minimum description length (MDL) inference. The MML algorithm ‘Snob’ and its relatives have been used successfully in many domains. These algorithms treat the ‘things’ to be classified as independent random selections from an unknown population whose class structure, if any, is to be estimated. This work extends MML classification to domains where the ‘things’ have a known spatial arrangement and it may be expected that the classes of neighbouring things are correlated. Two cases are considered. In the first, the things are arranged in a sequence and the correlation between the classes of successive things is modelled by a first-order Markov process. An algorithm for this case is constructed by combining the Snob algorithm with a simple dynamic programming algorithm. The method has been applied to the classification of protein secondary structure. In the second case, the things are arranged on a two-dimensional (2D) square grid, like the pixels of an image. Correlation is modelled by a prior over patterns of class assignments whose log probability depends on the number of adjacent mismatched pixel pairs. The algorithm uses Gibbs sampling from the pattern posterior and a thermodynamic relation to calculate message length.


Knowledge Discovery and Data Mining | 1998

Minimum Message Length Segmentation

Jonathan J. Oliver; Rohan A. Baxter; Chris S. Wallace

The segmentation problem arises in many applications in data mining, A.I. and statistics, including segmenting time series, decision tree algorithms and image processing. In this paper, we consider a range of criteria which may be applied to determine if some data should be segmented into two or more regions. We develop an information-theoretic criterion (MML) for the segmentation of univariate data with Gaussian errors. We perform simulations comparing segmentation methods (MML, AIC, MDL and BIC) and conclude that the MML criterion is the preferred criterion. We then apply the segmentation method to financial time series data.
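A simplified version of such a message-length segmentation criterion for univariate Gaussian data can be sketched as follows. The fixed parameter and cut-point costs, the quantisation width, and the restriction to at most one cut are my own assumptions, not the paper's exact coding:

```python
import math
import statistics

def segment_bits(xs, bits_per_param=10, eps=0.01):
    """Bits to state (mu, sigma), then encode xs quantised to eps."""
    mu = statistics.mean(xs)
    sigma = max(statistics.pstdev(xs), 0.05)   # floor to avoid sigma -> 0
    nll = sum(-math.log2(eps * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
                         / (sigma * math.sqrt(2 * math.pi))) for x in xs)
    return 2 * bits_per_param + nll

def best_segmentation(xs, cut_cost_bits=10):
    """Compare 'one segment' with the best single cut; return (k, cut, bits)."""
    best = (1, None, segment_bits(xs))
    for c in range(2, len(xs) - 1):
        total = cut_cost_bits + segment_bits(xs[:c]) + segment_bits(xs[c:])
        if total < best[2]:
            best = (2, c, total)
    return best

data = ([0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.1]
        + [5.0, 5.1, 4.9, 5.05, 4.95, 5.0, 5.1])
k, cut, bits = best_segmentation(data)
```

The data are segmented only when the shorter per-segment encodings repay the cost of stating the extra parameters and the cut point, which is how an MML criterion guards against over-segmentation.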
