Megasthenis Asteris
University of Texas at Austin
Publications
Featured research published by Megasthenis Asteris.
very large data bases | 2013
Maheswaran Sathiamoorthy; Megasthenis Asteris; Dimitris S. Papailiopoulos; Alexandros G. Dimakis; Ramkumar Venkat Vadali; Scott Shaobing Chen; Dhruba Borthakur
Distributed storage systems for large clusters typically use replication to provide reliability. Recently, erasure codes have been used to reduce the large storage overhead of three-replicated systems. Reed-Solomon codes are the standard design choice, and their high repair cost is often considered an unavoidable price to pay for high storage efficiency and high reliability. This paper shows how to overcome this limitation. We present a novel family of erasure codes that are efficiently repairable and offer higher reliability compared to Reed-Solomon codes. We show analytically that our codes are optimal on a recently identified tradeoff between locality and minimum distance. We implement our new codes in Hadoop HDFS and compare them to a currently deployed HDFS module that uses Reed-Solomon codes. Our modified HDFS implementation shows approximately a 2× reduction in repair disk I/O and repair network traffic. The disadvantage of the new coding scheme is that it requires 14% more storage compared to Reed-Solomon codes, an overhead shown to be information-theoretically optimal for obtaining locality. Because the new codes repair failures faster, they provide reliability that is orders of magnitude higher than that of replication.
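The repair-locality benefit described in this abstract can be illustrated with a toy single-parity construction (a deliberate simplification for illustration, not the paper's actual code family): if data blocks are grouped into small local groups, each protected by an XOR parity, a single lost block is rebuilt by reading only its group, rather than k blocks as in a classic Reed-Solomon repair.

```python
import numpy as np

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = np.zeros_like(blocks[0])
    for b in blocks:
        out ^= b
    return out

# Hypothetical toy layout: 4 data blocks in 2 local groups of 2,
# each group protected by one XOR parity. Repairing a lost block
# reads 2 blocks (its group), not all 4.
rng = np.random.default_rng(0)
data = [rng.integers(0, 256, 8, dtype=np.uint8) for _ in range(4)]
group_a, group_b = data[:2], data[2:]
parity_a, parity_b = xor_blocks(group_a), xor_blocks(group_b)

# Simulate losing data[1]: recover it from its local group alone.
repaired = xor_blocks([data[0], parity_a])
assert np.array_equal(repaired, data[1])
```

The extra parity blocks are the analogue of the paper's storage overhead: locality is bought with additional redundancy.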
international symposium on information theory | 2012
Megasthenis Asteris; Alexandros G. Dimakis
We introduce a new family of Fountain codes that are systematic and also have sparse parities. Given an input of k symbols, our codes produce an unbounded number of output symbols, generating each parity independently by linearly combining a logarithmic number of randomly selected input symbols. The construction guarantees that for any ε>0, accessing a random subset of (1+ε)k encoded symbols asymptotically suffices to recover the k input symbols with high probability. Our codes have the additional benefit of logarithmic locality: a single lost symbol can be repaired by accessing a subset of O(log k) of the remaining encoded symbols. This is a desired property for distributed storage systems where symbols are spread over a network of storage nodes. Beyond recovery upon loss, local reconstruction provides an efficient alternative for reading symbols that cannot be accessed directly. In our code, a logarithmic number of disjoint local groups is associated with each systematic symbol, allowing multiple parallel reads. Our main mathematical contribution involves analyzing the rank of sparse random matrices with specific structure over finite fields. We rely on establishing that a new family of sparse random bipartite graphs has perfect matchings with high probability.
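The parity-generation process described above can be sketched over the binary field (a simplification: the paper works over general finite fields and analyzes the rank of the resulting sparse matrices): each parity is the combination of roughly log2(k) randomly chosen input symbols, and that small support is also what makes local repair cheap.

```python
import math
import random

def sparse_parity_stream(data, seed=0):
    """Yield an unbounded stream of (support, parity) pairs, each parity
    the XOR of ~log2(k) randomly chosen input symbols (binary sketch)."""
    k = len(data)
    d = max(1, int(math.log2(k)))      # logarithmic parity degree
    rng = random.Random(seed)
    while True:
        idx = rng.sample(range(k), d)  # random sparse support
        parity = 0
        for i in idx:
            parity ^= data[i]
        yield idx, parity

k = 16
data = list(range(k))
gen = sparse_parity_stream(data)
idx, p = next(gen)

# Local repair: if data[idx[0]] is lost, recover it from this parity
# and the few other symbols in the parity's small support.
recovered = p
for i in idx[1:]:
    recovered ^= data[i]
assert recovered == data[idx[0]]
```

Because each parity touches only O(log k) symbols, a repair reads O(log k) symbols instead of the full input, which is the locality property the abstract highlights.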
IEEE Journal on Selected Areas in Communications | 2014
Megasthenis Asteris; Alexandros G. Dimakis
international symposium on information theory | 2011
Megasthenis Asteris; Dimitris S. Papailiopoulos; George N. Karystinos
We consider the problem of identifying the sparse principal component of a rank-deficient matrix. We introduce auxiliary spherical variables and prove that there exists a set of candidate index-sets (that is, sets of indices to the nonzero elements of the vector argument) whose size is polynomially bounded in the rank and which contains the optimal index-set, i.e., the set of indices of the nonzero elements of the optimal solution. Finally, we develop an algorithm that computes the optimal sparse principal component in polynomial time for any sparsity degree.
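The underlying combinatorial problem can be made concrete with a brute-force baseline (illustrative only: enumerating all supports is exponential in general, whereas the paper's contribution is that for rank-deficient matrices only polynomially many candidate index-sets need to be examined):

```python
import itertools
import numpy as np

def sparse_pc_exhaustive(A, s):
    """Exact s-sparse principal component of a symmetric PSD matrix A,
    by checking every size-s support: the best s-sparse unit vector is
    the top eigenvector of the best s-by-s principal submatrix."""
    n = A.shape[0]
    best_val, best_x = -np.inf, None
    for support in itertools.combinations(range(n), s):
        sub = A[np.ix_(support, support)]
        vals, vecs = np.linalg.eigh(sub)   # ascending eigenvalues
        if vals[-1] > best_val:
            best_val = vals[-1]
            best_x = np.zeros(n)
            best_x[list(support)] = vecs[:, -1]
    return best_x, best_val

# Hypothetical rank-deficient example: a 6x6 PSD matrix of rank 2.
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 2))
A = B @ B.T
x, val = sparse_pc_exhaustive(A, 2)
assert np.isclose(np.linalg.norm(x), 1.0)
```

The candidate-index-set result in the abstract says that, when the rank is small, the outer loop over all supports can be replaced by a polynomially sized list that provably contains the optimal support.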
IEEE Transactions on Information Theory | 2014
Megasthenis Asteris; Dimitris S. Papailiopoulos; George N. Karystinos
The computation of the sparse principal component of a matrix is equivalent to the identification of its principal submatrix with the largest maximum eigenvalue. Finding this optimal submatrix is what renders the problem NP-hard. In this paper, we prove that, if the matrix is positive semidefinite and its rank is constant, then its sparse principal component is polynomially computable. Our proof utilizes the auxiliary unit vector technique that has been recently developed to identify problems that are polynomially solvable. In addition, we use this technique to design an algorithm which, for any sparsity value, computes the sparse principal component with complexity O(N^(D+1)), where N and D are the matrix size and rank, respectively. Our algorithm is fully parallelizable and memory efficient.
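The role of the rank D can be sketched as follows (a simplification of the approach: when A = V V^T has only D columns in V, candidate supports arise from directions on the D-dimensional sphere; the paper enumerates these candidates exactly, while the sketch below samples random directions as a stand-in):

```python
import numpy as np

def sparse_pc_low_rank(A, s, num_dirs=200, seed=0):
    """Heuristic sketch: factor PSD A = V V^T with D = rank(A) columns,
    then scan directions c on the D-sphere; each direction proposes the
    support holding the s largest entries of |V c|."""
    vals, vecs = np.linalg.eigh(A)            # ascending eigenvalues
    D = int(np.sum(vals > 1e-9 * vals[-1]))   # numerical rank
    V = vecs[:, -D:] * np.sqrt(vals[-D:])     # so that A = V V^T
    rng = np.random.default_rng(seed)
    best_val, best_x = -np.inf, None
    for _ in range(num_dirs):
        c = rng.standard_normal(D)
        c /= np.linalg.norm(c)
        support = np.argsort(-np.abs(V @ c))[:s]
        sub = A[np.ix_(support, support)]
        w, U = np.linalg.eigh(sub)
        if w[-1] > best_val:
            best_val = w[-1]
            best_x = np.zeros(A.shape[0])
            best_x[support] = U[:, -1]
    return best_x, best_val

# Hypothetical rank-2 instance.
rng = np.random.default_rng(2)
B = rng.standard_normal((8, 2))
A = B @ B.T
x, val = sparse_pc_low_rank(A, 3)
```

Because each candidate support costs an s-by-s eigendecomposition, the total work is governed by how many directions must be examined; the paper shows O(N^D) structured candidates suffice, giving the stated O(N^(D+1)) complexity.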
IEEE Pervasive Computing | 2013
Aggelos Bletsas; Aikaterini Vlachaki; Eleftherios Kampianakis; George Sklivanitis; John Kimionis; Konstadinos Tountas; Megasthenis Asteris; Panagiotis Markopoulos
In an interdisciplinary, semester-long class, undergraduate students learn how to build a low-cost, multihop wireless sensor network from first principles for a digital garden. This type of course better prepares electrical engineering graduates for the sensor-rich, pervasive computing era.
international symposium on information theory | 2015
Karthikeyan Shanmugam; Megasthenis Asteris; Alexandros G. Dimakis
We study upper bounds on the sum-rate of multiple-unicasts. We approximate the Generalized Network Sharing Bound (GNS cut) of the multiple-unicasts network coding problem with k independent sources. Our approximation algorithm runs in polynomial time and yields an upper bound on the joint source entropy rate, which is within an O(log^2 k) factor from the GNS cut. It further yields a vector-linear network code that achieves joint source entropy rate within an O(log^2 k) factor from the GNS cut, but not with independent sources: the code induces a correlation pattern among the sources. Our second contribution is establishing a separation result for vector-linear network codes: for any given field F there exist networks for which the optimum sum-rate supported by vector-linear codes over F for independent sources can be multiplicatively separated by a factor of k^(1-δ), for any constant δ > 0, from the optimum joint entropy rate supported by a code that allows correlation between sources. Finally, we establish a similar separation result for the asymmetric optimum vector-linear sum-rates achieved over two distinct fields F_p and F_q for independent sources, revealing that the choice of field can heavily impact the performance of a linear network code.
conference on information sciences and systems | 2014
Dimitris S. Papailiopoulos; Megasthenis Asteris; Alexandros G. Dimakis
Several large-scale data processing problems can be formulated as quadratic programs subject to combinatorial constraints. We present a novel low-rank approximation framework for problems such as sparse PCA, nonnegative PCA, or finding the k-densest submatrix. Our framework comes with provable approximation guarantees that depend on the spectrum of the dataset matrix. Our algorithm operates by solving a number of QP instances, which are randomly sampled from a low-dimensional subspace of the input matrix.
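The sampled-QP idea can be sketched for the nonnegative PCA case mentioned above (an illustrative simplification: random directions on the low-dimensional spectral subspace stand in for the paper's structured sampling, and the thresholding step is our own choice of candidate construction):

```python
import numpy as np

def nonneg_pc_low_rank(A, num_dirs=300, seed=0):
    """Sketch of a sampled low-rank approach to nonnegative PCA:
    maximize x^T A x over unit-norm x >= 0 by scanning directions in
    the span of A's top eigenvectors and projecting each candidate
    onto the nonnegative orthant."""
    vals, vecs = np.linalg.eigh(A)            # ascending eigenvalues
    D = int(np.sum(vals > 1e-9 * vals[-1]))   # numerical rank
    V = vecs[:, -D:] * np.sqrt(vals[-D:])     # A = V V^T
    rng = np.random.default_rng(seed)
    best_val, best_x = -np.inf, None
    for _ in range(num_dirs):
        c = rng.standard_normal(D)
        for v in (V @ c, -(V @ c)):
            x = np.maximum(v, 0.0)            # enforce nonnegativity
            nrm = np.linalg.norm(x)
            if nrm < 1e-12:
                continue
            x /= nrm
            val = x @ A @ x
            if val > best_val:
                best_val, best_x = val, x
    return best_x, best_val

# Hypothetical low-rank PSD instance with a nonnegative factor.
rng = np.random.default_rng(3)
B = np.abs(rng.standard_normal((7, 2)))
A = B @ B.T
x, val = nonneg_pc_low_rank(A)
```

Each sampled direction yields one cheap candidate evaluation, so the combinatorial constraint is handled by searching a low-dimensional subspace rather than the full n-dimensional problem, which is the essence of the framework's spectrum-dependent guarantees.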
international conference on machine learning | 2014
Megasthenis Asteris; Dimitris S. Papailiopoulos; Alexandros G. Dimakis
neural information processing systems | 2015
Megasthenis Asteris; Dimitris S. Papailiopoulos; Anastasios Kyrillidis; Alexandros G. Dimakis