Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Kartik Venkat is active.

Publication


Featured research published by Kartik Venkat.


IEEE Transactions on Information Theory | 2017

Maximum Likelihood Estimation of Functionals of Discrete Distributions

Jiantao Jiao; Kartik Venkat; Yanjun Han; Tsachy Weissman

The Dirichlet prior is widely used in estimating discrete distributions and functionals of discrete distributions. In terms of Shannon entropy estimation, one approach is to plug the Dirichlet prior smoothed distribution into the entropy functional, while the other is to calculate the Bayes estimator for entropy under the Dirichlet prior for squared error, which is the conditional expectation. We show that in general they do not improve over the maximum likelihood estimator, which plugs the empirical distribution into the entropy functional. No matter how we tune the parameters in the Dirichlet prior, this approach cannot achieve the minimax rates in entropy estimation, as recently characterized by Jiao, Venkat, Han, and Weissman [1], and Wu and Yang [2]. The performance of the minimax rate-optimal estimator with n samples is essentially at least as good as that of the Dirichlet smoothed entropy estimators with n ln n samples. We harness the theory of approximation using positive linear operators to analyze the bias of plug-in estimators for general functionals under arbitrary statistical models, thereby further consolidating the interplay between these two fields, which was thoroughly exploited by Jiao, Venkat, Han, and Weissman [3] in estimating various functionals of discrete distributions. We establish new results in approximation theory and apply them to analyze the bias of the Dirichlet prior smoothed plug-in entropy estimator. This interplay between bias analysis and approximation theory is of relevance and consequence far beyond the specific problem setting of this paper.
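
For context, the two estimators the abstract contrasts can be written in standard notation (the alphabet size S, counts n_i with n = n_1 + ... + n_S, and the Dirichlet parameter a below are our notation, not the paper's):

% MLE plug-in: insert the empirical distribution into the entropy functional
\hat{H}_{\mathrm{MLE}} \;=\; -\sum_{i=1}^{S} \frac{n_i}{n}\,\ln\frac{n_i}{n}
% Dirichlet(a)-smoothed plug-in: insert the posterior-mean distribution instead
\hat{H}_{\mathrm{Dir}(a)} \;=\; -\sum_{i=1}^{S} \frac{n_i + a}{n + Sa}\,\ln\frac{n_i + a}{n + Sa}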


IEEE Transactions on Information Theory | 2015

Justification of Logarithmic Loss via the Benefit of Side Information

Jiantao Jiao; Thomas A. Courtade; Kartik Venkat; Tsachy Weissman

We consider a natural measure of relevance: the reduction in optimal prediction risk in the presence of side information. For any given loss function, this relevance measure captures the benefit of side information for performing inference on a random variable under this loss function. When such a measure satisfies a natural data processing property, and the random variable of interest has alphabet size greater than two, we show that it is uniquely characterized by the mutual information, and the corresponding loss function coincides with logarithmic loss. In doing so, our work provides a new characterization of mutual information, and justifies its use as a measure of relevance. When the alphabet is binary, we characterize the only admissible forms the measure of relevance can assume while obeying the specified data processing property. Our results naturally extend to measuring the causal influence between stochastic processes, where we unify different causality measures in the literature as instantiations of directed information.
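
As a worked instance of this relevance measure under logarithmic loss (standard identities; the symbol Delta_log is ours), the reduction in optimal prediction risk is exactly the mutual information:

% Optimal log-loss risk without side information is H(X); with side information Y it is H(X|Y)
\Delta_{\log}(X;Y) \;=\; \min_{q}\,\mathbb{E}\big[-\log q(X)\big] \;-\; \min_{q(\cdot\mid\cdot)}\,\mathbb{E}\big[-\log q(X\mid Y)\big] \;=\; H(X) - H(X\mid Y) \;=\; I(X;Y)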


Information Theory Workshop | 2012

Reference based genome compression

Bobbie Chern; Idoia Ochoa; Alexandros Manolakos; Albert No; Kartik Venkat; Tsachy Weissman

DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target genome, and then compresses this mapping with an entropy coder. As an illustration of the performance: applying our algorithm to James Watson's genome with hg18 as a reference, we are able to reduce the 2991 megabyte (MB) genome down to 6.99 MB, whereas Gzip compresses it to 834.8 MB.
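
The two-stage pipeline described above (map the reference onto the target, then entropy-code the mapping) can be sketched as follows. This is a minimal illustrative sketch only, feasible for short sequences; the diff-based mapping and the use of zlib as a stand-in coder are our assumptions, not the paper's actual algorithm.

import difflib
import zlib

def compress_against_reference(reference: str, target: str) -> bytes:
    """Encode `target` as an edit script against `reference`, then compress the script."""
    matcher = difflib.SequenceMatcher(None, reference, target, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(f"C {i1} {i2 - i1}")      # copy a run of bases from the reference
        else:
            ops.append(f"I {target[j1:j2]}")     # insert literal bases from the target
    script = "\n".join(ops).encode()
    return zlib.compress(script, 9)              # generic coder standing in for the paper's entropy coder

def decompress_with_reference(reference: str, blob: bytes) -> str:
    """Invert compress_against_reference using the same reference."""
    out = []
    for line in zlib.decompress(blob).decode().splitlines():
        if line.startswith("C "):
            _, start, length = line.split()
            out.append(reference[int(start):int(start) + int(length)])
        else:
            out.append(line[2:])
    return "".join(out)

A real implementation would work at chromosome scale and use an entropy coder tailored to the mapping statistics, as the abstract indicates; the sketch only conveys the structure.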


IEEE Transactions on Information Theory | 2012

Pointwise Relations Between Information and Estimation in Gaussian Noise

Kartik Venkat; Tsachy Weissman

Many of the classical and recent relations between information and estimation in the presence of Gaussian noise can be viewed as identities between expectations of random quantities. These include the relationship between mutual information and minimum mean square error (I-MMSE) of Guo et al.; the relative entropy and mismatched estimation relationship of Verdú; the relationship between causal estimation and mutual information of Duncan, and its extension to the presence of feedback by Kadota et al.; and the relationship between causal and non-causal estimation of Guo et al., and its mismatched version due to Weissman. We dispense with the expectations and explore the nature of the pointwise relations between the respective random quantities. The pointwise relations that we find are as succinctly stated as, and give considerable insight into, the original expectation identities. As an illustration of our results, consider Duncan's 1970 discovery that the mutual information is equal to the causal MMSE in the additive white Gaussian noise channel, which can equivalently be expressed by saying that the difference between the input-output information density and half the causal estimation error is a zero-mean random variable (regardless of the distribution of the channel input). We characterize this random variable explicitly, rather than merely its expectation. Classical estimation and information theoretic quantities emerge with new and surprising roles. For example, the variance of this random variable turns out to be given by the causal MMSE (which, in turn, is equal to twice the mutual information by Duncan's result).
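
For reference, Duncan's expectation identity and the pointwise refinement described above can be written as follows for the unit-SNR channel dY_t = X_t dt + dW_t (the notation here is assumed for illustration):

% Duncan (1970): mutual information equals half the time-integrated causal estimation error
I(X_0^T ; Y_0^T) \;=\; \tfrac{1}{2}\,\mathbb{E}\!\int_0^T \big(X_t - \mathbb{E}[X_t \mid Y_0^t]\big)^2\,dt
% Pointwise refinement: the following difference has zero mean for every input distribution,
% and the paper shows its variance equals the causal MMSE
\imath(X_0^T ; Y_0^T) \;-\; \tfrac{1}{2}\int_0^T \big(X_t - \mathbb{E}[X_t \mid Y_0^t]\big)^2\,dt

where \imath denotes the input-output information density.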


IEEE Transactions on Information Theory | 2014

Information Measures: The Curious Case of the Binary Alphabet

Jiantao Jiao; Thomas A. Courtade; Albert No; Kartik Venkat; Tsachy Weissman

Four problems related to information divergence measures defined on finite alphabets are considered. In three of the cases we consider, we illustrate a contrast that arises between the binary-alphabet and larger-alphabet settings. This is surprising in some instances, since characterizations for the larger-alphabet settings do not generalize to their binary-alphabet counterparts. In particular, we show that f-divergences are not the unique decomposable divergences on binary alphabets that satisfy the data processing inequality, thereby clarifying claims that have previously appeared in the literature. We also show that Kullback-Leibler (KL) divergence is the unique Bregman divergence that is also an f-divergence, for any alphabet size. We further show that KL divergence is the unique Bregman divergence that is invariant to statistically sufficient transformations of the data, even when nondecomposable divergences are considered. Like some of the other problems we consider, this result holds only when the alphabet size is at least three.
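
The two divergence families contrasted above, and KL divergence as their common member, can be stated in standard form for a finite alphabet (these definitions are textbook material, not specific to the paper):

% f-divergence, for convex f with f(1) = 0
D_f(P \,\|\, Q) \;=\; \sum_{i} q_i\, f\!\Big(\frac{p_i}{q_i}\Big)
% Bregman divergence generated by a strictly convex \phi
B_\phi(P, Q) \;=\; \phi(P) - \phi(Q) - \langle \nabla\phi(Q),\, P - Q \rangle
% KL divergence arises from both, via f(t) = t \ln t and \phi(P) = \sum_i p_i \ln p_i
D(P \,\|\, Q) \;=\; \sum_{i} p_i \ln\frac{p_i}{q_i}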


BMC Genomics | 2014

CaMoDi: a new method for cancer module discovery.

Alexandros Manolakos; Idoia Ochoa; Kartik Venkat; Andrea J. Goldsmith; Olivier Gevaert

Background: Identification of genomic patterns in tumors is an important problem, which would enable the community to understand and extend effective therapies across the current tissue-based tumor boundaries. With this in mind, in this work we develop a robust and fast algorithm to discover cancer driver genes using an unsupervised clustering of similarly expressed genes across cancer patients. Specifically, we introduce CaMoDi, a new method for module discovery which demonstrates superior performance across a number of computational and statistical metrics.

Results: The proposed algorithm CaMoDi demonstrates effective statistical performance compared to the state of the art, and is algorithmically simple and scalable, which makes it suitable for tissue-independent genomic characterization of individual tumors as well as groups of tumors. We perform an extensive comparative study between CaMoDi and two previously developed methods (CONEXIC and AMARETTO), across 11 individual tumors and 8 combinations of tumors from The Cancer Genome Atlas. We demonstrate that CaMoDi is able to discover modules with better average consistency and homogeneity, with similar or better adjusted R² performance compared to CONEXIC and AMARETTO.

Conclusions: We present a novel method for Cancer Module Discovery, CaMoDi, and demonstrate through extensive simulations on the TCGA Pan-Cancer dataset that it achieves comparable or better performance than that of CONEXIC and AMARETTO, while achieving an order-of-magnitude improvement in computational run time compared to the other methods.
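
As a rough illustration of the generic workflow the abstract describes, clustering similarly expressed genes across patients into modules, consider the sketch below. It is hypothetical and not CaMoDi's actual algorithm; the k-means step, the module count, and the function name are placeholders.

import numpy as np
from sklearn.cluster import KMeans

def discover_modules(expression: np.ndarray, n_modules: int = 50, seed: int = 0):
    """Group genes with similar expression profiles across patients into modules.

    expression: array of shape (n_genes, n_patients), e.g. log-normalized TCGA data.
    Returns one array of gene indices per module. Illustrative only; see the paper
    for CaMoDi's actual procedure and its consistency/homogeneity/adjusted-R² metrics.
    """
    # Standardize each gene so clustering reflects co-expression patterns rather than scale.
    z = (expression - expression.mean(axis=1, keepdims=True)) / (
        expression.std(axis=1, keepdims=True) + 1e-8
    )
    labels = KMeans(n_clusters=n_modules, n_init=10, random_state=seed).fit_predict(z)
    return [np.where(labels == k)[0] for k in range(n_modules)]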


International Symposium on Information Theory | 2014

Relations between information and estimation in scalar Lévy channels

Jiantao Jiao; Kartik Venkat; Tsachy Weissman

Fundamental relations between information and estimation have been established in the literature for the scalar Gaussian and Poisson channels. In this work, we demonstrate that such relations hold for a much larger class of observation models. We introduce the natural family of scalar Lévy channels where the distribution of the output conditioned on the input is infinitely divisible. For Lévy channels, we establish new representations relating the mutual information between the channel input and output to an optimal estimation loss, thereby unifying and considerably extending results from the Gaussian and Poissonian settings. We demonstrate the richness of our results by working out two examples of Lévy channels, namely the Gamma channel and the Negative Binomial channel, with corresponding relations between information and estimation. Extensions to the setting of mismatched estimation are also presented.
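
The defining property of the channel family, stated in standard terms (notation assumed): the conditional output law is infinitely divisible, i.e., for every n it is an n-fold convolution of some law, with the Gaussian and Poisson channels as the classical special cases.

% Infinite divisibility of the conditional output distribution
P_{Y \mid X = x} \;=\; \underbrace{Q_{n,x} * \cdots * Q_{n,x}}_{n \text{ times}} \qquad \text{for some law } Q_{n,x}, \text{ for every } n \ge 1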


IEEE Transactions on Signal Processing | 2016

Information, Estimation, and Lookahead in the Gaussian Channel

Kartik Venkat; Tsachy Weissman; Yair Carmon; Shlomo Shamai

We consider mean squared estimation with lookahead of a continuous-time signal corrupted by additive white Gaussian noise. We show that the mutual information rate function, i.e., the mutual information rate as a function of the signal-to-noise ratio (SNR), does not, in general, determine the minimum mean squared error (MMSE) with fixed finite lookahead, in contrast to the special cases of 0 and infinite lookahead (the filtering and smoothing errors, respectively), which were previously established in the literature. Further, we investigate the simple class of continuous-time stationary Gauss-Markov processes (Ornstein-Uhlenbeck processes) as channel inputs, and explicitly characterize the behavior of the MMSE with finite lookahead as a function of the SNR. We extend our results to mixtures of Ornstein-Uhlenbeck processes, and use the insight gained to present lower and upper bounds on the MMSE with lookahead for a class of stationary Gaussian input processes whose spectrum can be expressed as a mixture of Ornstein-Uhlenbeck spectra.
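
One way to write the quantity studied here (the notation is ours; take the channel dY_t = sqrt(snr) X_t dt + dW_t with a stationary input): the MMSE with lookahead d is the time-averaged error of the estimator that sees the observation path d time units into the future,

% MMSE with lookahead d; d = 0 recovers the filtering error, d -> infinity the smoothing error
\mathrm{mmse}(\mathrm{snr}, d) \;=\; \lim_{T\to\infty}\frac{1}{T}\int_0^T \mathbb{E}\Big[\big(X_t - \mathbb{E}[X_t \mid Y_0^{\,t+d}]\big)^2\Big]\,dt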


International Symposium on Information Theory | 2013

Pointwise relations between information and estimation in the Poisson channel

Jiantao Jiao; Kartik Venkat; Tsachy Weissman

Identities yielding optimal estimation interpretations for mutual information and relative entropy, paralleling those known for minimum mean squared estimation under additive Gaussian noise, were recently discovered for the Poisson channel by Atar and Weissman. We express these identities as equalities between expectations of the associated estimation and information theoretic random variables, such as the actual estimation loss and the information density. By explicitly characterizing the relations between these random variables, we show that they are related in much stronger pointwise senses that directly imply the known expectation identities while deepening our understanding of them. As an example of the nature of our results, consider the equality between the mutual information and the mean cumulative filtering loss of the optimal filter in continuous-time estimation. We show that the difference between the information density and the cumulative filtering loss is a martingale expressible as a stochastic integral. This explicit characterization not only directly recovers the previously known expectation relation, but also allows us to characterize other distributional properties of the random variables involved, where some of the original objects of interest emerge in new and surprising roles. For example, we find that the increasing predictable part of the Doob-Meyer decomposition of the information density (which is a submartingale) is nothing but the cumulative loss of the optimal filter.
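
For orientation, the Atar-Weissman filtering identity that these pointwise results refine uses the natural Poisson-channel loss function (our notation; stated as a sketch of the setting):

% Poisson loss function between a nonnegative intensity x and an estimate \hat{x}
\ell(x, \hat{x}) \;=\; x \ln\frac{x}{\hat{x}} \;-\; x \;+\; \hat{x}
% Filtering identity: mutual information equals the mean cumulative loss of the optimal causal filter \hat{X}_t = \mathbb{E}[X_t \mid Y_0^t]
I(X_0^T ; Y_0^T) \;=\; \mathbb{E}\!\int_0^T \ell\big(X_t, \hat{X}_t\big)\,dt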


IEEE Transactions on Information Theory | 2017

Relations Between Information and Estimation in Discrete-Time Lévy Channels

Jiantao Jiao; Kartik Venkat; Tsachy Weissman

Fundamental relations between information and estimation have been established in the literature for the discrete-time Gaussian and Poisson channels. In this paper, we demonstrate that such relations hold for a much larger class of observation models. We introduce the natural family of discrete-time Lévy channels where the distribution of the output conditioned on the input is infinitely divisible. For Lévy channels, we establish new representations relating the mutual information between the channel input and output to an optimal expected estimation loss, thereby unifying and considerably extending results from the Gaussian and Poisson settings. We demonstrate the richness of our results by working out two examples of Lévy channels, namely the gamma channel and the negative binomial channel, with corresponding relations between information and estimation. Extensions to the setting of mismatched estimation are also presented.

Collaboration


Dive into Kartik Venkat's collaborations.

Top Co-Authors

Shlomo Shamai

Technion – Israel Institute of Technology


Yair Carmon

Technion – Israel Institute of Technology
