Morteza Monemizadeh
Goethe University Frankfurt
Publication
Featured research published by Morteza Monemizadeh.
symposium on computational geometry | 2007
Dan Feldman; Morteza Monemizadeh; Christian Sohler
Given a point set P ⊆ R<sup>d</sup>, the k-means clustering problem is to find a set C = (c<sub>1</sub>, ..., c<sub>k</sub>) of k points and a partition of P into k clusters C<sub>1</sub>, ..., C<sub>k</sub> such that the sum of squared errors ∑<sub>i=1</sub><sup>k</sup> ∑<sub>p ∈ C<sub>i</sub></sub> ‖p − c<sub>i</sub>‖<sub>2</sub><sup>2</sup> is minimized. For given centers, this cost function is minimized by assigning points to the nearest center. The k-means cost function is probably the most widely used cost function in the area of clustering. In this paper we show that every unweighted point set P has a weak (ε, k)-coreset of size Poly(k, 1/ε) for the k-means clustering problem, i.e., its size is <i>independent</i> of the cardinality |P| of the point set and the dimension d of the Euclidean space R<sup>d</sup>. A weak coreset is a weighted set S ⊆ P together with a set T such that T contains a (1+ε)-approximation for the optimal cluster centers from P and for every set of k centers from T the cost of the centers for S is a (1±ε)-approximation of the cost for P. We apply our weak coreset to obtain a PTAS for the k-means clustering problem with running time O(nkd + d · Poly(k/ε) + 2<sup>Õ(k/ε)</sup>).
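The cost function in the abstract above can be made concrete with a minimal Python sketch. It only evaluates the k-means cost of a (possibly weighted) point set for given centers, assigning each point to its nearest center. It is not the coreset construction from the paper, and the function name `kmeans_cost` and the sample data are assumptions chosen for illustration.

```python
def kmeans_cost(points, centers, weights=None):
    """Sum of weighted squared Euclidean distances to the nearest center."""
    if weights is None:
        weights = [1.0] * len(points)  # unweighted point set
    total = 0.0
    for p, w in zip(points, weights):
        # For fixed centers, the cost is minimized by the nearest center.
        nearest = min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        total += w * nearest
    return total


# Hypothetical example data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
cost = kmeans_cost(points, centers)  # every point is at squared distance 1
```

Passing `weights` evaluates the same cost on a weighted set, which is how a weighted set S ⊆ P would be compared against P for a candidate set of centers.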
Archive | 2010
Morteza Monemizadeh; David P. Woodruff
For any <i>p</i> ∈ [0, 2], we give a 1-pass poly(ε<sup>−1</sup> log <i>n</i>)-space algorithm which, given a data stream of length <i>m</i> with insertions and deletions of an <i>n</i>-dimensional vector <i>a</i>, with updates in the range {−<i>M</i>, −<i>M</i> + 1, ..., <i>M</i> − 1, <i>M</i>}, outputs a sample of [<i>n</i>] = {1, 2, ..., <i>n</i>} for which for all <i>i</i> the probability that <i>i</i> is returned is (1 ± ε) |<i>a</i><sub><i>i</i></sub>|<sup><i>p</i></sup>/<i>F</i><sub><i>p</i></sub>(<i>a</i>) ± <i>n</i><sup>−<i>C</i></sup>, where <i>a</i><sub><i>i</i></sub> denotes the (possibly negative) value of coordinate <i>i</i>, <i>F</i><sub><i>p</i></sub>(<i>a</i>) = Σ<sub><i>i</i>=1</sub><sup><i>n</i></sup> |<i>a</i><sub><i>i</i></sub>|<sup><i>p</i></sup> = ||<i>a</i>||<sub><i>p</i></sub><sup><i>p</i></sup> denotes the <i>p</i>-th frequency moment (i.e., the <i>p</i>-th power of the <i>L</i><sub><i>p</i></sub> norm), and <i>C</i> > 0 is an arbitrarily large constant. Here we assume that <i>n</i>, <i>m</i>, and <i>M</i> are polynomially related. Our generic sampling framework improves and unifies algorithms for several communication and streaming problems, including cascaded norms, heavy hitters, and moment estimation. It also gives the first relative-error forward sampling algorithm in a data stream with deletions, answering an open question of Cormode <i>et al</i>.
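To make the target distribution concrete, the following Python sketch computes the exact probabilities |a_i|^p / F_p(a) offline, after applying a stream of insertions and deletions. It stores the whole vector, so it is only a reference for what the sampler approximates, not the small-space algorithm from the paper. The function names are hypothetical, indices are 0-based here (the abstract uses 1-based [n]), and treating |a_i|^0 as 1 for nonzero coordinates and 0 otherwise is an assumed convention for p = 0.

```python
def apply_updates(n, updates):
    """Apply turnstile updates (i, x): add x to coordinate i (0-based)."""
    a = [0] * n
    for i, x in updates:
        a[i] += x
    return a


def lp_distribution(a, p):
    """Exact L_p sampling distribution: index i gets |a_i|^p / F_p(a)."""
    if p == 0:
        # Assumed convention: count nonzero coordinates only.
        powers = [1.0 if v != 0 else 0.0 for v in a]
    else:
        powers = [abs(v) ** p for v in a]
    f_p = sum(powers)  # the p-th frequency moment F_p(a)
    if f_p == 0:
        raise ValueError("F_p(a) is zero; the distribution is undefined")
    return [w / f_p for w in powers]


# Hypothetical stream with insertions and deletions on a 3-dimensional vector.
a = apply_updates(3, [(0, 3), (1, -1), (0, -1), (2, 2)])
l1_probs = lp_distribution(a, 1)
l2_probs = lp_distribution(a, 2)
```

Note that coordinate 1 ends with a negative value, and only its absolute value enters the distribution.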
international colloquium on automata languages and programming | 2017
Morteza Monemizadeh; S. Muthukrishnan; Pan Peng; Christian Sohler
We study which property testing and sublinear time algorithms can be transformed into graph streaming algorithms for random order streams. Our main result is that for bounded degree graphs, any property that is constant-query testable in the adjacency list model can be tested with constant space in a single pass in random order streams. Our result is obtained by estimating the distribution of local neighborhoods of the vertices on a random order graph stream using constant space. We then show that our approach can also be applied to constant time approximation algorithms for bounded degree graphs in the adjacency list model: as an example, we obtain a constant-space single-pass random order streaming algorithm for approximating the size of a maximum matching with additive error εn (n is the number of nodes). Our result establishes for the first time that a large class of sublinear algorithms can be simulated in random order streams, while Ω(n) space is needed for many graph streaming problems for adversarial orders.
Archive | 2011
Morteza Monemizadeh
Approximating a sum without computing the summands is a classic problem in statistics and machine learning. The problem is defined as follows: assume Z is the sum of n numbers Z<sub>1</sub>, ..., Z<sub>n</sub>, i.e., Z = Z<sub>1</sub> + ... + Z<sub>n</sub>. The goal is to estimate Z by computing only a few of the n summands. With uniform sampling, we choose a number Z<sub>i</sub> with probability 1/n and assign the weight n to Z<sub>i</sub>. The number nZ<sub>i</sub> is our estimate of Z. The expectation of the random variable nZ<sub>i</sub> is Z, but its variance can be large: if only a few of the numbers are large, the random sample misses all of them with high probability, which makes the variance of nZ<sub>i</sub> large. Using non-uniform sampling we can bound the variance in terms of the expectation and therefore estimate Z within a factor of (1 ± ε) as follows: given n probabilities r<sub>i</sub> ≥ (1/γ) · Z<sub>i</sub>/Z for 1 ≤ i ≤ n corresponding to the numbers Z<sub>1</sub>, ..., Z<sub>n</sub>, we take a sample set A = {a<sub>1</sub>, ..., a<sub>j</sub>, ..., a<sub>s</sub>} ⊆ [n] of indices according to the probabilities r<sub>i</sub> and assign a weight of w(Z<sub>a<sub>j</sub></sub>) = 1/(s · r<sub>a<sub>j</sub></sub>) to a sampled number Z<sub>a<sub>j</sub></sub> for 1 ≤ j ≤ s. We then use concentration bounds to show that for s = O(γ ε<sup>−2</sup> log(1/δ)) the probability that the estimator X = ∑<sub>a<sub>j</sub> ∈ A</sub> w(Z<sub>a<sub>j</sub></sub>) · Z<sub>a<sub>j</sub></sub> deviates from Z by more than εZ is at most δ. In this thesis we study applications of this estimator in high-dimensional clustering and streaming. In particular, for the k-means and the j-subspace problems we get unbiased estimators that can (1 ± ε)-approximate the cost of the point set to an arbitrary center set. We then use these estimators to get coresets, linear-time (1 + ε)-approximation algorithms, and insertion-only streaming algorithms. In the turnstile streaming model we are given a vector a of length n, where the i-th coordinate is denoted by a<sub>i</sub>, and a stream S of m = poly(n, M) updates of the form (i, x), where i ∈ [n] and x ∈ {−M, −M + 1, ..., M − 1, M}, indicating that the i-th coordinate a<sub>i</sub> of a should be incremented by x. Let Z<sub>i</sub> = |a<sub>i</sub>|<sup>p</sup> for p ∈ [0, 2], 1 ≤ i ≤ n, and Z = F<sub>p</sub>(a) = ∑<sub>i=1</sub><sup>n</sup> |a<sub>i</sub>|<sup>p</sup>. In this model, finding n probabilities r<sub>i</sub> ≥ (1/γ) · Z<sub>i</sub>/Z using one pass and polylogarithmic space was known to be an open problem in the streaming community [CMI05]. We give a 1-pass poly(ε<sup>−1</sup> log n)-space algorithm, called the L<sub>p</sub>-sampler, that samples according to probabilities r<sub>i</sub> for γ = (1 ± ε), p ∈ [0, 2]. We show that the L<sub>p</sub>-sampler leads to many improvements and a unification of well-studied streaming problems, including cascaded norms, heavy hitters, and moment estimation. In particular, for moment estimation, using O(n<sup>1−2/k</sup> ε<sup>−2</sup>) L<sub>2</sub>-samplers in parallel for k > 2, we can (1 ± ε)-estimate F<sub>k</sub>(a) = ∑<sub>i=1</sub><sup>n</sup> |a<sub>i</sub>|<sup>k</sup> using optimal space n<sup>1−2/k</sup> · poly(ε<sup>−1</sup> log n). This algorithm is the first that does not use Nisan’s pseudorandom generator as a subroutine, potentially making it more practical.
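The weighted-sampling estimator described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, assuming the probabilities r_i are handed to us as a list that sums to 1 and that s indices are drawn independently with replacement. The function name `estimate_sum`, the seed, and the example numbers are assumptions, not taken from the thesis.

```python
import random


def estimate_sum(z, r, s, rng):
    """Estimate sum(z) from s indices drawn i.i.d. with probabilities r.

    Each sampled value z[j] gets weight 1 / (s * r[j]), so the estimator
    is unbiased whenever r[j] > 0 for every j with z[j] != 0.
    """
    sampled = rng.choices(range(len(z)), weights=r, k=s)
    return sum(z[j] / (s * r[j]) for j in sampled)


# Hypothetical input: one large summand among many small ones.
z = [1000.0] + [1.0] * 99
total = sum(z)
rng = random.Random(0)

# Uniform sampling (r_i = 1/n): unbiased, but high variance on this input.
uniform = estimate_sum(z, [1.0 / len(z)] * len(z), 10, rng)

# Sampling proportional to z_i (r_i = Z_i / Z): every term equals Z / s,
# so the estimate matches the true sum up to rounding.
proportional = estimate_sum(z, [v / total for v in z], 10, rng)
```

With uniform probabilities the result depends heavily on whether the single large summand is drawn, which is the variance problem the abstract describes. Probabilities close to Z_i / Z remove that dependence.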
Algorithmica | 2018
Marc Bury; Elena Grigorescu; Andrew McGregor; Morteza Monemizadeh; Chris Schwiegelshohn; Sofya Vorotnikova; Samson Zhou
We study the problem of estimating the size of a matching when the graph is revealed in a streaming fashion. Our results are multifold: 1. We give a tight structural result relating the size of a maximum matching to the arboricity α of a graph, which has been one of the most studied graph parameters for matching algorithms in data streams. One of the implications is an algorithm that estimates the matching size up to a factor of
symposium on discrete algorithms | 2010
Morteza Monemizadeh; David P. Woodruff
symposium on discrete algorithms | 2010
Dan Feldman; Morteza Monemizadeh; Christian Sohler; David P. Woodruff
symposium on discrete algorithms | 2015
Hossein Esfandiari; Mohammad Taghi Hajiaghayi; Vahid Liaghat; Morteza Monemizadeh; Krzysztof Onak
symposium on discrete algorithms | 2015
Rajesh Hemant Chitnis; Graham Cormode; Mohammad Taghi Hajiaghayi; Morteza Monemizadeh
symposium on discrete algorithms | 2016
Rajesh Hemant Chitnis; Graham Cormode; Hossein Esfandiari; Mohammad Taghi Hajiaghayi; Andrew McGregor; Morteza Monemizadeh; Sofya Vorotnikova