Michael J Laszlo
Nova Southeastern University
Publication
Featured research published by Michael J Laszlo.
IEEE Transactions on Knowledge and Data Engineering | 2005
Michael J Laszlo; Sumitra Mukherjee
This paper presents a clustering algorithm for partitioning a minimum spanning tree with a constraint on minimum group size. The problem is motivated by microaggregation, a disclosure limitation technique in which similar records are aggregated into groups containing a minimum of k records. Heuristic clustering methods are needed since the minimum information loss microaggregation problem is NP-hard. Our MST partitioning algorithm for microaggregation is sufficiently efficient to be practical for large data sets and yields results that are comparable to the best available heuristic methods for microaggregation. For data that contain pronounced clustering effects, our method results in significantly lower information loss. Our algorithm is general enough to accommodate different measures of information loss and can be used for other clustering applications that have a constraint on minimum group size.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006
Michael J Laszlo; Sumitra Mukherjee
The k-means algorithm is widely used for clustering because of its computational efficiency. Given n points in d-dimensional space and the number of desired clusters k, k-means seeks a set of k cluster centers that minimizes the sum of the squared Euclidean distances between each point and its nearest cluster center. However, the algorithm is very sensitive to the initial selection of centers and is likely to converge to partitions that are significantly inferior to the global optimum. We present a genetic algorithm (GA) for evolving centers in the k-means algorithm that simultaneously identifies good partitions for a range of values around a specified k. The set of centers is represented using a hyper-quadtree constructed on the data. This representation is exploited in our GA to generate an initial population of good centers and to support a novel crossover operation that selectively passes good subsets of neighboring centers from parents to offspring by swapping subtrees. Experimental results indicate that our GA finds the global optimum for data sets with known optima and finds good solutions for large simulated data sets.
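The objective described in this abstract can be written down in a few lines. The following is a minimal sketch of the cost function only, not of the genetic algorithm or the hyper-quadtree representation; the function name `kmeans_cost` and the sample points are assumptions made for illustration.

```python
def kmeans_cost(points, centers):
    """Sum, over all points, of the squared Euclidean distance
    from the point to its nearest center."""
    total = 0.0
    for p in points:
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centers
        )
    return total


# Hypothetical data: two tight pairs of points far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_centers = [(0.0, 1.0), (10.0, 1.0)]  # one center per pair
poor_centers = [(5.0, 0.0), (5.0, 2.0)]   # centers between the pairs

print(kmeans_cost(points, good_centers))  # 4.0
print(kmeans_cost(points, poor_centers))  # 100.0
```

The two center sets give very different costs on the same data, which illustrates why the choice of centers matters so much for the quality of the final partition.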
Operations Research Letters | 2005
Michael J Laszlo; Sumitra Mukherjee
The constrained forest problem seeks a minimum-weight spanning forest in an undirected edge-weighted graph such that each tree spans at least a specified number of vertices. We present a greedy heuristic for this NP-hard problem, whose solutions are at least as good as, and often better than, those produced by the best-known 2-approximate heuristic.
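To make the problem statement concrete, here is a simple Kruskal-style greedy sketch. It is not claimed to be the heuristic from this paper: it is one straightforward rule (scan edges by increasing weight and join two trees only while at least one of them is still too small), and the names `greedy_constrained_forest` and `m` are assumptions made for illustration. On a connected graph with at least `m` vertices, every tree it returns spans at least `m` vertices.

```python
def greedy_constrained_forest(n, edges, m):
    """Return a list of (weight, u, v) edges forming a spanning forest
    in which every tree has at least m vertices.

    n     -- number of vertices, labeled 0..n-1
    edges -- iterable of (weight, u, v) tuples, undirected
    m     -- minimum number of vertices per tree
    """
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        # Join two different trees only if one of them is still undersized.
        if ru != rv and (size[ru] < m or size[rv] < m):
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            forest.append((w, u, v))
    return forest


# Hypothetical path graph 0-1-2-3 with a heavy middle edge.
edges = [(1, 0, 1), (5, 1, 2), (1, 2, 3)]
print(greedy_constrained_forest(4, edges, 2))  # [(1, 0, 1), (1, 2, 3)]
```

With `m = 2` the heavy edge is skipped because both trees already satisfy the size constraint when it is considered.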
IEEE Transactions on Knowledge and Data Engineering | 2009
Michael J Laszlo; Sumitra Mukherjee
The NP-hard microaggregation problem seeks a partition of data points into groups of minimum specified size k, so as to minimize the sum of the squared Euclidean distances of every point to its group's centroid. One recent heuristic provides an O(k^3) guarantee for this objective function and an O(k^2) guarantee for a version of the problem that seeks to minimize the sum of the distances of each point to its group's centroid. This paper establishes approximation bounds for another microaggregation heuristic, providing better approximation guarantees of O(k^2) for the squared distance measure and O(k) for the distance measure.
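The squared-distance objective in this abstract can be evaluated for any candidate partition as follows. This is a minimal sketch of the objective and the group-size check only, not of either heuristic; the function names and sample data are assumptions made for illustration.

```python
def group_sse(group):
    """Sum of squared Euclidean distances from each point to the group centroid."""
    dims = len(group[0])
    centroid = [sum(p[i] for p in group) / len(group) for i in range(dims)]
    return sum(
        sum((p[i] - centroid[i]) ** 2 for i in range(dims))
        for p in group
    )


def microaggregation_sse(groups, k):
    """Total objective value of a partition; rejects groups smaller than k."""
    if any(len(g) < k for g in groups):
        raise ValueError("every group must contain at least k points")
    return sum(group_sse(g) for g in groups)


# Hypothetical partition of five points into groups of size 2 and 3.
groups = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(10.0, 0.0), (12.0, 0.0), (14.0, 0.0)],
]
print(microaggregation_sse(groups, k=2))  # 10.0
```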
Discrete Applied Mathematics | 2006
Michael J Laszlo; Sumitra Mukherjee
The constrained forest problem seeks a minimum-weight spanning forest in an undirected edge-weighted graph such that each tree spans at least a specified number of vertices. We present a structured class of greedy heuristics for this NP-hard problem, and identify the best heuristic.
Journal of Systems and Software | 2015
Michael J Laszlo; Sumitra Mukherjee
Our paper presents a microaggregation method to prevent disclosure of sensitive data. We define a local search method that monotonically improves solution quality while preserving feasibility. We employ local search in an iterated local search heuristic. Our method consistently identifies better quality solutions than all extant methods on benchmark problems. Microaggregation is a disclosure control method used to protect microdata. We introduce a local search method and employ it in an iterated local search algorithm for the NP-hard minimum information loss microaggregation problem. Experimental results with benchmark data sets demonstrate that our algorithm consistently identifies better quality solutions than extant microaggregation methods.
Optimization Letters | 2008
Michael J Laszlo; Sumitra Mukherjee
Building on an existing 2-approximate algorithm for the class of network design problems with downwards-monotone demand functions, many of which are NP-hard, we present an algorithm that produces solutions that are at least as good as and typically better than solutions produced by the existing algorithm.
Journal of Systems and Software | 2013
Michael J Laszlo; Sumitra Mukherjee
Microaggregation is a disclosure limitation method that provides security through k-anonymity by modifying data before release but does not allow suppression of data. We define the microaggregation problem with suppression (MPS) to accommodate data suppression, and present a polynomial-time algorithm, based on dynamic programming, for optimal univariate microaggregation with suppression. Experimental results demonstrate the practical benefits of suppressing a few carefully selected data points during microaggregation using our method.
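As background for the dynamic-programming idea, the sketch below solves plain univariate microaggregation without suppression. It is not the MPS algorithm from this paper. It relies on the well-known property that, for one-dimensional data, an optimal partition uses groups of consecutive sorted values with between k and 2k - 1 members each; the function name and sample data are assumptions made for illustration.

```python
def optimal_univariate_microaggregation(values, k):
    """Return (minimum SSE, groups) for 1-D data with minimum group size k.
    Suppression of data points is NOT modeled here."""
    xs = sorted(values)
    n = len(xs)
    if n < k:
        raise ValueError("need at least k values")

    # Prefix sums give the SSE of any consecutive run xs[i:j] in O(1).
    s1, s2 = [0.0], [0.0]
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def sse(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    best = [inf] * (n + 1)  # best[j]: minimum SSE for the first j values
    cut = [0] * (n + 1)     # cut[j]: start index of the last group
    best[0] = 0.0
    for j in range(k, n + 1):
        for size in range(k, min(2 * k - 1, j) + 1):
            i = j - size
            if best[i] < inf and best[i] + sse(i, j) < best[j]:
                best[j] = best[i] + sse(i, j)
                cut[j] = i

    # Walk the cut points backwards to recover the groups.
    groups, j = [], n
    while j > 0:
        groups.append(xs[cut[j]:j])
        j = cut[j]
    groups.reverse()
    return best[n], groups


cost, groups = optimal_univariate_microaggregation([12, 1, 10, 3, 2, 11], k=3)
print(cost, groups)  # 4.0 [[1, 2, 3], [10, 11, 12]]
```

The double loop runs in O(nk) time after sorting. Allowing suppression, as the abstract describes, would require an extended state that this sketch does not attempt.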
IEEE High Performance Extreme Computing Conference | 2016
Newton Campbell; Michael J Laszlo; Sumitra Mukherjee
Point-to-point shortest path distance queries are a core operation in graph analytics. However, preprocessing algorithms that speed up these queries rely on large data structures for reference. In this paper, we discuss the computational challenge introduced by these data structures when using landmark-based preprocessing algorithms on large graphs. We introduce a new heuristic for the A* algorithm that references a data structure of size Θ(|L|^2 + |V|), where L represents a set of strategically chosen landmark vertices and V the set of vertices in the graph. This heuristic's benefits are made possible by an approach for computing lower bounds based on generalized polygon inequalities. In this approach, each landmark stores the distances between the landmark and vertices within its graph partition. The heuristic is experimentally compared with a previous landmark heuristic in a fixed-memory environment, as an analog to an embedded system. The new heuristic demonstrates a reduction in overall computational time and memory requirements in this environment.
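For context, the sketch below shows the standard triangle-inequality landmark bound that is commonly used as an A* heuristic on undirected graphs. It is not the polygon-inequality heuristic introduced in this paper, and the abstract does not say which earlier heuristic was used for comparison. The function names and the toy graph are assumptions made for illustration. Note that this baseline stores one distance per landmark per vertex, that is, |L|·|V| values.

```python
import heapq


def dijkstra(adj, source):
    """Shortest-path distances from source; adj maps vertex -> [(neighbor, weight)]."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def landmark_lower_bound(tables, v, t):
    """Lower bound on dist(v, t): max over landmarks of |d(L, v) - d(L, t)|.
    Valid for undirected graphs by the triangle inequality."""
    return max(abs(table[v] - table[t]) for table in tables)


# Hypothetical undirected path graph 0 - 1 - 2 - 3 with unit weights.
adj = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(1, 1.0), (3, 1.0)],
    3: [(2, 1.0)],
}
tables = [dijkstra(adj, landmark) for landmark in (0,)]
print(landmark_lower_bound(tables, 1, 3))  # 2.0
```

Because the bound never exceeds the true distance, A* guided by it still returns shortest paths.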
Archive | 1995
Michael J Laszlo