Thomas M. Cover
Stanford University
Publications
Featured research published by Thomas M. Cover.
IEEE Transactions on Information Theory | 1967
Thomas M. Cover; Peter E. Hart
The nearest neighbor decision rule assigns to an unclassified sample point the classification of the nearest of a set of previously classified points. This rule is independent of the underlying joint distribution on the sample points and their classifications, and hence the probability of error R of such a rule must be at least as great as the Bayes probability of error R^{\ast}, the minimum probability of error over all decision rules taking the underlying probability structure into account. However, in a large sample analysis, we will show in the M-category case that R^{\ast} \leq R \leq R^{\ast}(2 - MR^{\ast}/(M-1)), where these bounds are the tightest possible, for all suitably smooth underlying distributions. Thus for any number of categories, the probability of error of the nearest neighbor rule is bounded above by twice the Bayes probability of error. In this sense, it may be said that half the classification information in an infinite sample set is contained in the nearest neighbor.
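As a quick numerical illustration of the bound above, the sketch below evaluates R^{\ast}(2 - MR^{\ast}/(M-1)) for a few Bayes risks and category counts and checks that it always lies between R^{\ast} and 2R^{\ast}. The helper name is ours, not from the paper.

```python
# Sketch: evaluating the Cover-Hart bound R* <= R <= R*(2 - M R*/(M-1)).

def nn_asymptotic_upper_bound(bayes_risk: float, num_classes: int) -> float:
    """Upper bound on the large-sample nearest neighbor risk R,
    given the Bayes risk R* and the number of categories M."""
    r, m = bayes_risk, num_classes
    return r * (2.0 - m * r / (m - 1.0))

if __name__ == "__main__":
    for r_star in (0.05, 0.10, 0.25):
        for m in (2, 5, 10):
            bound = nn_asymptotic_upper_bound(r_star, m)
            # The bound always lies between R* and 2 R* for a valid Bayes risk.
            assert r_star <= bound <= 2.0 * r_star
            print(f"M={m:2d}  R*={r_star:.2f}  NN risk upper bound = {bound:.4f}")
```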
IEEE Transactions on Information Theory | 1979
Thomas M. Cover; Abbas El Gamal
A relay channel consists of an input x_{1} , a relay output y_{1} , a channel output y , and a relay sender x_{2} (whose transmission is allowed to depend on the past symbols y_{1} ). The dependence of the received symbols upon the inputs is given by p(y,y_{1}|x_{1},x_{2}) . The channel is assumed to be memoryless. In this paper the following capacity theorems are proved. 1) If y is a degraded form of y_{1} , then C = \max_{p(x_{1},x_{2})} \min \{I(X_{1},X_{2};Y), I(X_{1};Y_{1}|X_{2})\} . 2) If y_{1} is a degraded form of y , then C = \max_{p(x_{1})} \max_{x_{2}} I(X_{1};Y|x_{2}) . 3) If p(y,y_{1}|x_{1},x_{2}) is an arbitrary relay channel with feedback from (y,y_{1}) to both x_{1} and x_{2} , then C = \max_{p(x_{1},x_{2})} \min \{I(X_{1},X_{2};Y), I(X_{1};Y,Y_{1}|X_{2})\} . 4) For a general relay channel, C \leq \max_{p(x_{1},x_{2})} \min \{I(X_{1},X_{2};Y), I(X_{1};Y,Y_{1}|X_{2})\} . Superposition block Markov encoding is used to show achievability of C , and converses are established. The capacities of the Gaussian relay channel and certain discrete relay channels are evaluated. Finally, an achievable lower bound to the capacity of the general relay channel is established.
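The max-min expression appearing in 1) and 4) can be evaluated numerically. The sketch below assumes a hypothetical binary relay channel (the relay observes x_1 through a BSC(0.05), the destination observes x_1 xor x_2 through a BSC(0.2)) and runs a coarse brute-force search over p(x_1,x_2); it illustrates the cut-set quantity only and does not reproduce the paper's examples.

```python
import itertools
import math

def entropy_from_joint(joint, keep, given=()):
    """H(keep | given) in bits, where joint maps (x1, x2, y, y1) tuples to
    probabilities and keep/given are tuples of coordinate indices."""
    num, den = {}, {}
    for k, p in joint.items():
        nk = tuple(k[i] for i in keep) + tuple(k[i] for i in given)
        dk = tuple(k[i] for i in given)
        num[nk] = num.get(nk, 0.0) + p
        den[dk] = den.get(dk, 0.0) + p
    return sum(-p * math.log2(p / den[k[len(keep):]])
               for k, p in num.items() if p > 0)

# Hypothetical binary relay channel (not from the paper).
def p_channel(y, y1, x1, x2, eps_relay=0.05, eps_dest=0.2):
    p1 = 1 - eps_relay if y1 == x1 else eps_relay
    p2 = 1 - eps_dest if y == (x1 ^ x2) else eps_dest
    return p1 * p2

def cutset_value(p_x1x2):
    """min{ I(X1,X2;Y), I(X1;Y,Y1|X2) } for one joint input pmf.
    Coordinate indices: 0 = x1, 1 = x2, 2 = y, 3 = y1."""
    joint = {(x1, x2, y, y1): p_x1x2[(x1, x2)] * p_channel(y, y1, x1, x2)
             for x1, x2, y, y1 in itertools.product((0, 1), repeat=4)}
    i_a = entropy_from_joint(joint, (2,)) - entropy_from_joint(joint, (2,), (0, 1))
    i_b = entropy_from_joint(joint, (2, 3), (1,)) - entropy_from_joint(joint, (2, 3), (0, 1))
    return min(i_a, i_b)

# Coarse brute-force maximization over p(x1, x2), approximating the bound in 4).
best = 0.0
steps = [i / 10 for i in range(11)]
for a, b, c in itertools.product(steps, repeat=3):
    rest = 1.0 - a - b - c
    if rest < 0:
        continue
    best = max(best, cutset_value({(0, 0): a, (0, 1): b, (1, 0): c, (1, 1): rest}))
print(f"approximate cut-set upper bound: {best:.3f} bits per channel use")
```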
IEEE Transactions on Electronic Computers | 1965
Thomas M. Cover
This paper develops the separating capacities of families of nonlinear decision surfaces by a direct application of a theorem in classical combinatorial geometry. It is shown that a family of surfaces having d degrees of freedom has a natural separating capacity of 2d pattern vectors, thus extending and unifying results of Winder and others on the pattern-separating capacity of hyperplanes. Applying these ideas to the vertices of a binary n-cube yields bounds on the number of spherically, quadratically, and, in general, nonlinearly separable Boolean functions of n variables. It is shown that the set of all surfaces which separate a dichotomy of an infinite, random, separable set of pattern vectors can be characterized, on the average, by a subset of only 2d extreme pattern vectors. In addition, the problem of generalizing the classifications on a labeled set of pattern points to the classification of a new point is defined, and it is found that the probability of ambiguous generalization is large unless the number of training patterns exceeds the capacity of the set of separating surfaces.
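The 2d capacity follows from counting separable dichotomies. A minimal sketch of that count and the resulting probability of separability for points in general position is given below; the fraction of separable dichotomies crosses 1/2 exactly at n = 2d.

```python
from math import comb

def num_separable_dichotomies(n: int, d: int) -> int:
    """Cover's counting function C(n, d): the number of dichotomies of n
    points in general position separable by a surface with d degrees of
    freedom."""
    return 2 * sum(comb(n - 1, k) for k in range(d))

def prob_separable(n: int, d: int) -> float:
    """Fraction of the 2^n dichotomies of n points that are separable."""
    return num_separable_dichotomies(n, d) / 2 ** n

if __name__ == "__main__":
    d = 5
    for n in (d, 2 * d - 1, 2 * d, 2 * d + 1, 4 * d):
        print(f"n={n:2d} patterns, d={d} dof:  P(separable) = {prob_separable(n, d):.3f}")
    # At n = 2d the probability is exactly 1/2: the capacity of the family.
```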
IEEE Transactions on Information Theory | 1982
Abbas El Gamal; Thomas M. Cover
Consider a sequence of independent identically distributed (i.i.d.) random variables X_{1},X_{2}, \cdots, X_{n} and a distortion measure d(X_{i},\hat{X}_{i}) on the estimates \hat{X}_{i} of X_{i} . Two descriptions i(X)\in \{1,2, \cdots ,2^{nR_{1}}\} and j(X)\in \{1,2, \cdots,2^{nR_{2}}\} are given of the sequence X=(X_{1}, X_{2}, \cdots ,X_{n}) . From these two descriptions, three estimates \hat{X}_{1}(i(X)), \hat{X}_{2}(j(X)) , and \hat{X}_{0}(i(X),j(X)) are formed, with resulting expected distortions E\frac{1}{n} \sum^{n}_{k=1} d(X_{k}, \hat{X}_{mk})=D_{m}, m=0,1,2. We find that the distortion constraints D_{0}, D_{1}, D_{2} are achievable if there exists a probability mass distribution p(x)p(\hat{x}_{1},\hat{x}_{2},\hat{x}_{0}|x) with Ed(X,\hat{X}_{m})\leq D_{m} such that R_{1}>I(X;\hat{X}_{1}), R_{2}>I(X;\hat{X}_{2}), and R_{1}+R_{2}>I(X;\hat{X}_{1},\hat{X}_{2},\hat{X}_{0})+I(\hat{X}_{1};\hat{X}_{2}), where I(\cdot) denotes Shannon mutual information. These rates are shown to be optimal for deterministic distortion measures.
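A small numeric check of the marginal-rate conditions, assuming a hypothetical Bernoulli(1/2) source with each description produced by a binary symmetric test channel under Hamming distortion; it evaluates only I(X;\hat{X}_{1}) and I(X;\hat{X}_{2}), not the full region.

```python
import math
from itertools import product

def mutual_information(p_xy):
    """I(X;Y) in bits from a joint pmf given as {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in p_xy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in p_xy.items() if p > 0)

# Hypothetical single-letter test channel: each description is the fair bit X
# passed through a BSC(q), giving expected Hamming distortion q.
def description_joint(q):
    return {(x, xh): 0.5 * ((1 - q) if x == xh else q)
            for x, xh in product((0, 1), repeat=2)}

for q1, q2 in [(0.1, 0.1), (0.25, 0.05)]:
    i1 = mutual_information(description_joint(q1))  # need R1 > I(X; Xhat1)
    i2 = mutual_information(description_joint(q2))  # need R2 > I(X; Xhat2)
    print(f"D1={q1}, D2={q2}:  I(X;Xhat1)={i1:.3f} bits,  I(X;Xhat2)={i2:.3f} bits")
```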
IEEE Transactions on Information Theory | 1991
William H. R. Equitz; Thomas M. Cover
The successive refinement of information consists of first approximating data using a few bits of information, then iteratively improving the approximation as more and more information is supplied. The goal is to achieve an optimal description at each stage; in general, an ongoing description that is rate-distortion optimal whenever it is interrupted is sought. It is shown that the necessary and sufficient condition for optimal successive refinement is that the solutions of the rate-distortion problem can be written as a Markov chain. In particular, all finite alphabet signals with Hamming distortion satisfy these requirements. It is also shown that the same is true for Gaussian signals with squared error distortion and for Laplacian signals with absolute error distortion. A simple counterexample with absolute error distortion and a symmetric source distribution which shows that successive refinement is not always achievable is presented.
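For the Gaussian squared-error case mentioned above, successive refinability can be seen directly from R(D) = (1/2)\log(\sigma^{2}/D): the two-stage rate equals the one-shot rate. A minimal sketch:

```python
import math

def gaussian_rate(var, d):
    """Rate-distortion function of a N(0, var) source with squared error,
    in bits per symbol: R(D) = 0.5 * log2(var / D) for 0 < D <= var."""
    return 0.5 * math.log2(var / d)

var, d1, d2 = 1.0, 0.25, 0.01            # coarse then fine distortion targets
stage1 = gaussian_rate(var, d1)          # bits for the first approximation
stage2 = gaussian_rate(d1, d2)           # refinement increment 0.5*log2(D1/D2)
print(f"two-stage total: {stage1 + stage2:.4f} bits")
print(f"direct R(D2):    {gaussian_rate(var, d2):.4f} bits")
# The totals coincide: the Gaussian/squared-error pair is successively refinable.
```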
IEEE Transactions on Information Theory | 1991
Amir Dembo; Thomas M. Cover; Joy A. Thomas
The role of inequalities in information theory is reviewed, and the relationship of these inequalities to inequalities in other branches of mathematics is developed. The simple inequalities for differential entropy are applied to the standard multivariate normal to furnish new and simpler proofs of the major determinant inequalities in classical mathematics. The authors discuss differential entropy inequalities for random subsets of samples. These inequalities, when specialized to multivariate normal variables, provide the determinant inequalities that are presented. The authors focus on the entropy power inequality (including the related Brunn-Minkowski, Young's, and Fisher information inequalities) and address various uncertainty principles and their interrelations.
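One of the determinant inequalities referred to above is Hadamard's, which follows from subadditivity of differential entropy applied to a multivariate normal. The sketch below (assuming NumPy is available) checks both forms on a random covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_entropy(K):
    """Differential entropy (nats) of N(0, K): 0.5 * log((2*pi*e)^n det K)."""
    n = K.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))

# Random positive definite covariance matrix (illustrative).
A = rng.standard_normal((4, 4))
K = A @ A.T + 4 * np.eye(4)

# Subadditivity h(X_1,...,X_n) <= sum_i h(X_i), specialized to the
# multivariate normal, is exactly Hadamard's inequality det K <= prod_i K_ii.
joint = gaussian_entropy(K)
sum_marginals = sum(gaussian_entropy(K[i:i + 1, i:i + 1]) for i in range(4))
assert joint <= sum_marginals + 1e-12
assert np.linalg.det(K) <= np.prod(np.diag(K)) + 1e-12
print(f"h(X) = {joint:.4f} nats  <=  sum h(X_i) = {sum_marginals:.4f} nats")
```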
IEEE Transactions on Information Theory | 1973
Thomas M. Cover
Let S be a given subset of binary n-sequences. We provide an explicit scheme for calculating the index of any sequence in S according to its position in the lexicographic ordering of S . A simple inverse algorithm is also given. Particularly nice formulas arise when S is the set of all n-sequences of weight k and also when S is the set of all sequences having a given empirical Markov property. Schalkwijk and Lynch have investigated the former case. The envisioned use of this indexing scheme is to transmit or store the index rather than the sequence, thus resulting in a data compression of (\log |S|)/n .
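A minimal sketch of such an indexing scheme for the weight-k case, ranking and unranking a sequence by summing binomial coefficients; the function names are ours, and the code follows the lexicographic-ordering idea described above.

```python
from math import comb

def index_of(bits):
    """Lexicographic index (0-based) of a binary sequence within the set of
    all sequences of the same length and weight (ordering with 0 < 1)."""
    n, r, idx = len(bits), sum(bits), 0
    for j, b in enumerate(bits):
        if b == 1:
            idx += comb(n - j - 1, r)   # sequences with a 0 here come first
            r -= 1
    return idx

def sequence_of(idx, n, k):
    """Inverse map: the weight-k length-n sequence with lexicographic index idx."""
    bits, r = [], k
    for j in range(n):
        skipped = comb(n - j - 1, r)    # sequences that put a 0 at position j
        if idx < skipped:
            bits.append(0)
        else:
            bits.append(1)
            idx -= skipped
            r -= 1
    return bits

x = [0, 1, 1, 0, 1, 0, 0, 1]            # n = 8, weight k = 4
i = index_of(x)
assert sequence_of(i, 8, 4) == x
print(f"index {i} out of C(8,4) = {comb(8, 4)} sequences; "
      f"store about {comb(8, 4).bit_length()} bits instead of 8")
```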
IEEE Transactions on Information Theory | 1991
Andrew R. Barron; Thomas M. Cover
The authors introduce an index of resolvability that is proved to bound the rate of convergence of minimum complexity density estimators as well as the information-theoretic redundancy of the corresponding total description length. The results on the index of resolvability demonstrate the statistical effectiveness of the minimum description-length principle as a method of inference. The minimum complexity estimator converges to the true density nearly as fast as an estimator based on prior knowledge of the true subclass of densities. Interpretations and basic properties of minimum complexity estimators are discussed. Some regression and classification problems that can be examined within the minimum description-length framework are considered.
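As a loose illustration of the two-part description-length idea behind minimum complexity estimation (a toy sketch, not the estimator analyzed in the paper), the code below picks the Bernoulli model on a 16-point grid that minimizes model bits plus data bits.

```python
import math

def two_part_length(data, q, model_bits):
    """Total description length in bits: L(model) + L(data | model),
    for i.i.d. binary data under a Bernoulli(q) model."""
    n1 = sum(data)
    n0 = len(data) - n1
    bits = float(model_bits)
    if n1:
        bits += n1 * math.log2(1.0 / q)
    if n0:
        bits += n0 * math.log2(1.0 / (1.0 - q))
    return bits

data = [1] * 13 + [0] * 3                # 16 samples from an unknown coin
grid = [k / 16 for k in range(1, 16)]    # candidate models, ~4 bits to name one
best_q = min(grid, key=lambda q: two_part_length(data, q, model_bits=4))
print(f"minimum two-part description length at q = {best_q}")
```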
IEEE Transactions on Information Theory | 1980
Thomas M. Cover; Abbas El Gamal; Masoud Salehi
Let \{(U_{i},V_{i})\}_{i=1}^{n} be a source of independent identically distributed (i.i.d.) discrete random variables with joint probability mass function p(u,v) and common part w=f(u)=g(v) in the sense of Witsenhausen, Gacs, and Korner. It is shown that such a source can be sent with arbitrarily small probability of error over a multiple access channel (MAC) \{\mathcal{X}_{1} \times \mathcal{X}_{2},\mathcal{Y},p(y|x_{1},x_{2})\}, with allowed codes \{x_{1}(u), x_{2}(v)\}, if there exist probability mass functions p(s), p(x_{1}|s,u), p(x_{2}|s,v) such that the source entropies H(U|V), H(V|U), H(U,V|W), and H(U,V) are each bounded by the corresponding conditional mutual information between the channel inputs and the output, where p(s,u,v,x_{1},x_{2},y)=p(s)p(u,v)p(x_{1}|u,s)p(x_{2}|v,s)p(y|x_{1},x_{2}). This region includes the multiple access channel region and the Slepian-Wolf data compression region as special cases.
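The source-side quantities H(U|V), H(V|U), H(U,V|W), and H(U,V) appearing in the sufficient conditions can be computed for a toy source with a common part; the channel-side mutual informations are not reproduced here. The source below is illustrative, not from the paper.

```python
import math
from itertools import product

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: prob}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

# Toy source with a common part: U = 2W + A, V = 2W + B, where W and A are
# fair bits and B equals A with probability 0.9. Then W = U//2 = V//2, so
# w = f(u) = g(v) is a common part. (Illustrative only.)
p_uv = {}
for w, a, b in product((0, 1), repeat=3):
    p = 0.5 * 0.5 * (0.9 if a == b else 0.1)
    key = (2 * w + a, 2 * w + b)
    p_uv[key] = p_uv.get(key, 0.0) + p

H_uv = entropy(p_uv)
H_u = entropy({u: sum(p for (uu, v), p in p_uv.items() if uu == u) for u in range(4)})
H_v = entropy({v: sum(p for (u, vv), p in p_uv.items() if vv == v) for v in range(4)})
H_w = 1.0                                    # W is a fair bit by construction
print(f"H(U,V)   = {H_uv:.3f} bits")
print(f"H(U|V)   = {H_uv - H_v:.3f} bits")
print(f"H(V|U)   = {H_uv - H_u:.3f} bits")
print(f"H(U,V|W) = {H_uv - H_w:.3f} bits")   # valid since W is a function of (U,V)
```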
IEEE Transactions on Information Theory | 1975
Thomas M. Cover
If \{(X_i, Y_i)\}_{i=1}^{\infty} is a sequence of independent identically distributed discrete random pairs with (X_i, Y_i) \sim p(x,y) , Slepian and Wolf have shown that the X process and the Y process can be separately described to a common receiver at rates R_X and R_Y bits per symbol if R_X + R_Y > H(X,Y), R_X > H(X|Y), R_Y > H(Y|X) . A simpler proof of this result will be given. As a consequence it is established that the Slepian-Wolf theorem is true without change for arbitrary ergodic processes \{(X_i,Y_i)\}_{i=1}^{\infty} and countably infinite alphabets. The extension to an arbitrary number of processes is immediate.
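For a doubly symmetric binary source (X a fair bit and Y equal to X flipped with probability p), the Slepian-Wolf constraints take a closed form; a minimal sketch, with the source chosen purely for illustration:

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# X ~ Bernoulli(1/2), Y = X xor N with N ~ Bernoulli(p).
p = 0.1
H_xy = 1.0 + h2(p)        # sum-rate constraint: R_X + R_Y > H(X,Y)
H_x_given_y = h2(p)       # R_X > H(X|Y)
H_y_given_x = h2(p)       # R_Y > H(Y|X)
print(f"Slepian-Wolf region: R_X + R_Y > {H_xy:.3f}, "
      f"R_X > {H_x_given_y:.3f}, R_Y > {H_y_given_x:.3f} bits/symbol")
# Corner point: describe Y at full rate H(Y) = 1 and X at only H(X|Y) = h2(p).
```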