Morteza Monemizadeh | Researchain

Publication

Featured research published by Morteza Monemizadeh.

symposium on computational geometry | 2007

A PTAS for k-means clustering based on weak coresets

Dan Feldman; Morteza Monemizadeh; Christian Sohler

Given a point set $P \subseteq \mathbb{R}^d$, the k-means clustering problem is to find a set $C = (c_1, \dots, c_k)$ of $k$ points and a partition of $P$ into $k$ clusters $C_1, \dots, C_k$ such that the sum of squared errors $\sum_{i=1}^{k} \sum_{p \in C_i} \|p - c_i\|_2^2$ is minimized. For given centers this cost function is minimized by assigning points to the nearest center. The k-means cost function is probably the most widely used cost function in the area of clustering. In this paper we show that every unweighted point set $P$ has a weak $(\varepsilon, k)$-coreset of size $\mathrm{Poly}(k, 1/\varepsilon)$ for the k-means clustering problem, i.e. its size is independent of the cardinality $|P|$ of the point set and the dimension $d$ of the Euclidean space $\mathbb{R}^d$. A weak coreset is a weighted set $S \subseteq P$ together with a set $T$ such that $T$ contains a $(1+\varepsilon)$-approximation for the optimal cluster centers from $P$ and for every set of $k$ centers from $T$ the cost of the centers for $S$ is a $(1 \pm \varepsilon)$-approximation of the cost for $P$. We apply our weak coreset to obtain a PTAS for the k-means clustering problem with running time $O(nkd + d \cdot \mathrm{Poly}(k/\varepsilon) + 2^{\tilde{O}(k/\varepsilon)})$.
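As a minimal sketch of the cost function in this abstract (not code from the paper), the following Python snippet evaluates the k-means cost of a center set by assigning every point to its nearest center. The function names and the optional `weights` argument are assumptions made for illustration; the weights show how the same cost would be evaluated on a weighted set such as a coreset, and the snippet does not construct a coreset.

```python
from typing import Optional, Sequence

Point = Sequence[float]


def squared_distance(p: Point, c: Point) -> float:
    """Squared Euclidean distance between a point and a center."""
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))


def kmeans_cost(points: Sequence[Point],
                centers: Sequence[Point],
                weights: Optional[Sequence[float]] = None) -> float:
    """Sum of (weighted) squared distances to the nearest center.

    For fixed centers, assigning each point to its nearest center
    minimizes the sum of squared errors, so no explicit partition
    needs to be stored.
    """
    if weights is None:
        weights = [1.0] * len(points)  # unweighted point set
    return sum(
        w * min(squared_distance(p, c) for c in centers)
        for p, w in zip(points, weights)
    )


# Toy data (hypothetical): four points and two candidate centers.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
cost = kmeans_cost(points, centers)
```

In this toy instance every point lies at squared distance 1 from its nearest center, so `cost` is the sum of four unit terms.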

Archive | 2010

1-Pass Relative-Error Lp-Sampling with Applications

Morteza Monemizadeh; David P. Woodruff

For any $p \in [0, 2]$, we give a 1-pass $\mathrm{poly}(\varepsilon^{-1} \log n)$-space algorithm which, given a data stream of length $m$ with insertions and deletions of an $n$-dimensional vector $a$, with updates in the range $\{-M, -M+1, \dots, M-1, M\}$, outputs a sample of $[n] = \{1, 2, \dots, n\}$ for which for all $i$ the probability that $i$ is returned is $(1 \pm \varepsilon) \frac{|a_i|^p}{F_p(a)} \pm n^{-C}$, where $a_i$ denotes the (possibly negative) value of coordinate $i$, $F_p(a) = \sum_{i=1}^{n} |a_i|^p = \|a\|_p^p$ denotes the $p$-th frequency moment (i.e., the $p$-th power of the $L_p$ norm), and $C > 0$ is an arbitrarily large constant. Here we assume that $n$, $m$, and $M$ are polynomially related. Our generic sampling framework improves and unifies algorithms for several communication and streaming problems, including cascaded norms, heavy hitters, and moment estimation. It also gives the first relative-error forward sampling algorithm in a data stream with deletions, answering an open question of Cormode et al.
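To make the sampling guarantee concrete, here is a small offline Python sketch of the target distribution only. It stores the whole vector, so it is not the small-space streaming algorithm of the paper; it merely shows which probabilities an Lp-sampler is meant to approximate. The function names, the 0-based indexing, and the convention that zero coordinates get weight 0 (so that p = 0 counts nonzero coordinates) are assumptions of this illustration.

```python
from typing import List, Sequence, Tuple


def apply_updates(n: int, updates: Sequence[Tuple[int, int]]) -> List[int]:
    """Apply turnstile updates (i, x): coordinate i is incremented by x."""
    a = [0] * n
    for i, x in updates:
        a[i] += x  # x may be negative, i.e. a deletion
    return a


def lp_distribution(a: Sequence[int], p: float) -> List[float]:
    """Exact target probabilities |a_i|^p / F_p(a) for each coordinate."""
    # Zero coordinates get weight 0 (assumed convention, matters for p = 0).
    weights = [abs(v) ** p if v != 0 else 0.0 for v in a]
    f_p = sum(weights)  # the p-th frequency moment F_p(a)
    if f_p == 0:
        raise ValueError("all coordinates are zero; distribution undefined")
    return [w / f_p for w in weights]


# Hypothetical stream: coordinate 2 is inserted and then fully deleted.
updates = [(0, 3), (1, -4), (2, 5), (2, -5)]
a = apply_updates(3, updates)
probs_l2 = lp_distribution(a, 2)
probs_l0 = lp_distribution(a, 0)
```

For this stream the final vector is `[3, -4, 0]`, so the deleted coordinate receives probability 0 and the sign of a coordinate does not affect its probability.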

international colloquium on automata languages and programming | 2017

Testable Bounded Degree Graph Properties Are Random Order Streamable

Morteza Monemizadeh; S. Muthukrishnan; Pan Peng; Christian Sohler

We study which property testing and sublinear time algorithms can be transformed into graph streaming algorithms for random order streams. Our main result is that for bounded degree graphs, any property that is constant-query testable in the adjacency list model can be tested with constant space in a single pass in random order streams. Our result is obtained by estimating the distribution of local neighborhoods of the vertices on a random order graph stream using constant space. We then show that our approach can also be applied to constant time approximation algorithms for bounded degree graphs in the adjacency list model: as an example, we obtain a constant-space single-pass random order streaming algorithm for approximating the size of a maximum matching with additive error $\varepsilon n$ ($n$ is the number of nodes). Our result establishes for the first time that a large class of sublinear algorithms can be simulated in random order streams, while $\Omega(n)$ space is needed for many graph streaming problems for adversarial orders.

Archive | 2011

Non-uniform Sampling in Clustering and Streaming

Morteza Monemizadeh

Approximating a sum without computing all the summands is a classic problem in statistics and machine learning. The problem is defined as follows: assume $Z$ is the sum of $n$ numbers $Z_1, \dots, Z_n$, i.e., $Z = Z_1 + \cdots + Z_n$. The goal is to estimate $Z$ while computing only a few of the $n$ summands. With uniform sampling, we choose a number $Z_i$ with probability $1/n$ and assign the weight $n$ to $Z_i$. The number $nZ_i$ is then our estimate for $Z$. The expectation of the random variable $nZ_i$ is $Z$, but its variance can be large: if only a few of the numbers are large, the random sample is likely to miss all of them. Using non-uniform sampling we can bound the variance in terms of the expectation and therefore estimate $Z$ within a factor of $(1 \pm \varepsilon)$ as follows. Given $n$ probabilities $r_i \ge \frac{1}{\gamma} \cdot \frac{Z_i}{Z}$ for $1 \le i \le n$, corresponding to the numbers $Z_1, \dots, Z_n$, we take a sample set $A = \{a_1, \dots, a_j, \dots, a_s\} \subseteq [n]$ of indices according to the probabilities $r_i$ and assign a weight of $w(Z_{a_j}) = \frac{1}{s \cdot r_{a_j}}$ to each sampled number $Z_{a_j}$ for $1 \le j \le s$. We then use concentration bounds to show that for $s = O(\gamma \varepsilon^{-2} \log(1/\delta))$ the probability that the estimator $X = \sum_{a_j \in A} w(Z_{a_j}) \cdot Z_{a_j}$ deviates from $Z$ by more than $\varepsilon Z$ is at most $\delta$.

In this thesis we study applications of this estimator in high dimensional clustering and streaming. In particular, for the k-means and the j-subspace problems we get unbiased estimators that can $(1 \pm \varepsilon)$-approximate the cost of the point set to an arbitrary center set. We then use these estimators to get coresets, linear time $(1+\varepsilon)$-approximation algorithms, and insertion-only streaming algorithms.

In the turnstile streaming model we are given a vector $a$ of length $n$, where the $i$-th coordinate is denoted by $a_i$, and a stream $S$ of $m = \mathrm{poly}(n, M)$ updates of the form $(i, x)$, where $i \in [n]$ and $x \in \{-M, -M+1, \dots, M-1, M\}$, indicating that the $i$-th coordinate $a_i$ of $a$ should be incremented by $x$. Let $Z_i = |a_i|^p$ for $p \in [0, 2]$, $1 \le i \le n$, and $Z = F_p(a) = \sum_{i=1}^{n} |a_i|^p$. In this model, finding $n$ probabilities $r_i \ge \frac{1}{\gamma} \cdot \frac{Z_i}{Z}$ using one pass and polylog space was known to be an open problem in the streaming community [CMI05]. We give a 1-pass $\mathrm{poly}(\varepsilon^{-1} \log n)$-space algorithm called $L_p$-sampler that samples according to probabilities $r_i$ for $\gamma = (1 \pm \varepsilon)$, $p \in [0, 2]$. We show that the $L_p$-sampler leads to many improvements and a unification of well-studied streaming problems, including cascaded norms, heavy hitters, and moment estimation. In particular, for moment estimation, using $O(n^{1-2/k} \varepsilon^{-2})$ $L_2$-samplers in parallel for $k > 2$ we can $(1 \pm \varepsilon)$-estimate $F_k(a) = \sum_{i=1}^{n} |a_i|^k$ using optimal space $n^{1-2/k} \cdot \mathrm{poly}(\varepsilon^{-1} \log n)$. This algorithm is the first that does not use Nisan's pseudorandom generator as a subroutine, potentially making it more practical.
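The sum estimator described at the start of this abstract can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the thesis's implementation: the function name `estimate_sum`, the toy data, and the choice to draw indices with replacement via the standard library's `random.Random.choices` are all assumptions. The sketch also holds every summand in memory, which a real application would avoid.

```python
import random
from typing import List, Sequence


def estimate_sum(values: Sequence[float],
                 probs: Sequence[float],
                 s: int,
                 rng: random.Random) -> float:
    """Estimate sum(values) from s indices drawn according to probs.

    Each sampled value Z_j gets weight 1 / (s * r_j), so the estimator
    is unbiased whenever every nonzero value has positive probability.
    """
    indices = rng.choices(range(len(values)), weights=probs, k=s)
    return sum(values[j] / (s * probs[j]) for j in indices)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: one large number dominates the sum Z = 100.
values: List[float] = [1.0, 2.0, 3.0, 94.0]
z = sum(values)

# Uniform sampling with s = 1: the estimate is n * Z_i for a random i,
# which here is one of 4, 8, 12 or 376, all far from Z.
uniform = [1.0 / len(values)] * len(values)
uniform_estimate = estimate_sum(values, uniform, 1, rng)

# Non-uniform sampling with r_i = Z_i / Z (the case gamma = 1):
# every sampled term equals Z / s, so the estimate matches Z.
proportional = [v / z for v in values]
proportional_estimate = estimate_sum(values, proportional, 10, rng)
```

The proportional case is the idealized extreme. In general the probabilities only satisfy a lower bound with some factor gamma, and the number of samples then governs how tightly the estimate concentrates around Z.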

Algorithmica | 2018

Structural Results on Matching Estimation with Applications to Streaming

Marc Bury; Elena Grigorescu; Andrew McGregor; Morteza Monemizadeh; Chris Schwiegelshohn; Sofya Vorotnikova; Samson Zhou

We study the problem of estimating the size of a matching when the graph is revealed in a streaming fashion. Our results are multifold: 1. We give a tight structural result relating the size of a maximum matching to the arboricity α of a graph, which has been one of the most studied graph parameters for matching algorithms in data streams. One of the implications is an algorithm that estimates the matching size up to a factor of

symposium on discrete algorithms | 2010

1-pass relative-error Lp-sampling with applications

Morteza Monemizadeh; David P. Woodruff

symposium on discrete algorithms | 2010

Coresets and sketches for high dimensional subspace approximation problems

Dan Feldman; Morteza Monemizadeh; Christian Sohler; David P. Woodruff

symposium on discrete algorithms | 2015

Streaming algorithms for estimating the matching size in planar graphs and beyond

Hossein Esfandiari; Mohammad Taghi Hajiaghayi; Vahid Liaghat; Morteza Monemizadeh; Krzysztof Onak

symposium on discrete algorithms | 2015

Parameterized streaming: maximal matching and vertex cover

Rajesh Hemant Chitnis; Graham Cormode; Mohammad Taghi Hajiaghayi; Morteza Monemizadeh

symposium on discrete algorithms | 2016