Michael A. Palis
University of Pennsylvania
Publication
Featured research published by Michael A. Palis.
IEEE Transactions on Parallel and Distributed Systems | 1996
Michael A. Palis; Jing-Chiou Liou; David S. L. Wei
This paper addresses the problem of scheduling parallel programs represented as directed acyclic task graphs for execution on distributed memory parallel architectures. Because of the high communication overhead in existing parallel machines, a crucial step in scheduling is task clustering, the process of coalescing fine grain tasks into single coarser ones so that the overall execution time is minimized. The task clustering problem is NP-hard, even when the number of processors is unbounded and task duplication is allowed. A simple greedy algorithm is presented for this problem which, for a task graph with arbitrary granularity, produces a schedule whose makespan is at most twice optimal. Indeed, the quality of the schedule improves as the granularity of the task graph becomes larger. For example, if the granularity is at least 1/2, the makespan of the schedule is at most 5/3 times optimal. For a task graph with n tasks and e inter-task communication constraints, the algorithm runs in O(n(n lg n+e)) time, which is n times faster than the currently best known algorithm for this problem. Similar algorithms are developed that produce: (1) optimal schedules for coarse grain graphs; (2) 2-optimal schedules for trees with no task duplication; and (3) optimal schedules for coarse grain trees with no task duplication.
IEEE Transactions on Computers | 1987
Jik Hyun Chang; Oscar H. Ibarra; Michael A. Palis
We show that a one-way two-dimensional iterative array of finite-state machines (2-DIA) can recognize and parse strings of any context-free language in linear time. What makes this result interesting and rather surprising is the fact that each processor of the array holds only a fixed amount of information (independent of the size of the input) and communicates with its neighbors in only one direction. This makes for a simple VLSI implementation. Although it is known that recognition can be done on a 2-DIA, previous parsing algorithms require the processors to have unbounded memory, even when the communication is two-way. We also consider the problem of finding approximate patterns in strings, the string-to-string correction problem, and the longest common subsequence problem, and show that they can be solved in linear time on a 2-DIA.
IEEE Transactions on Acoustics, Speech, and Signal Processing | 1987
Oscar H. Ibarra; Michael A. Palis
Optimal linear-time algorithms for solving recurrence equations on simple systolic arrays are presented. The systolic arrays use only one-way communication between processors and communicate with the external environment through only one I/O port. Because of their architectural simplicity, the arrays are well suited for direct VLSI implementation. Applications to some pattern recognition and sequence comparison problems are given. For example, it is shown that the set of (k + 2)-tuples of strings (x_1, ..., x_{k+1}, y) such that y is a shuffle of x_1, ..., x_{k+1} can be recognized by a one-way k-dimensional systolic array in (k + 1)n - k time. The longest common subsequence (LCS) problem and the string-to-string correction problem are also considered: the length of an LCS of k + 1 sequences can be computed by a one-way k-dimensional systolic array in (k + 1)n - k time; the edit distance between two strings can be computed by a one-way one-dimensional systolic array in 2n - 1 time. Applications to other related problems, e.g., dynamic time warping and optimum generalized alignment, as well as optimal-time simulations of multihead acceptors and multitape transducers are also given.
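The 2n - 1 figure for edit distance can be made plausible with a small sequential sketch. It is not the systolic construction from the paper: it simply fills the standard dynamic-programming table one anti-diagonal at a time and counts the sweeps. Unit costs for insertion, deletion, and substitution are an assumption here, and the function name `edit_distance_wavefront` is hypothetical.

```python
def edit_distance_wavefront(x: str, y: str) -> tuple[int, int]:
    """Return (edit distance, number of anti-diagonal sweeps).

    Cells on one anti-diagonal depend only on the two previous
    anti-diagonals, so each sweep could in principle be done in parallel.
    """
    m, n = len(x), len(y)
    # d[i][j] = edit distance between x[:i] and y[:j] (unit costs assumed)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j

    sweeps = 0
    # Interior cells (i, j) with i + j == s form one anti-diagonal.
    for s in range(2, m + n + 1):
        lo, hi = max(1, s - n), min(m, s - 1)
        if lo > hi:
            continue  # no interior cells (one string is empty)
        for i in range(lo, hi + 1):
            j = s - i
            cost = 0 if x[i - 1] == y[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
        sweeps += 1
    return d[m][n], sweeps
```

For two strings of equal length n, the interior of the table has 2n - 1 anti-diagonals, which matches the time bound quoted in the abstract; how the paper maps these wavefronts onto a one-way array is not shown here.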
Mathematical Foundations of Computer Science | 1988
Oscar H. Ibarra; Michael A. Palis
We analyse some properties of two-dimensional iterative and cellular arrays. For example, we show that arrays operating in T(n) time can be sped up to operate in time n + (T(n) − n)/k. Thus, a running time of the form n + R(n), where R(n) is sublinear (e.g., log n, log* n, etc.), can still be sped up to n + R(n)/k. This type of speed-up is stronger than any previously known speed-up for any type of device. Even for Turing machines, the speed-up is only from T(n) to n + T(n)/k. Another interesting result is that simultaneous space-reduction and speed-up is possible, i.e., the number of processors of the array can be reduced while simultaneously speeding up its computation. Unlike previous approaches, we carry out our analyses using sequential machine characterizations of the iterative and cellular arrays. Consequently, we are able to prove our results on the much simpler sequential machine models.
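A worked instance may help show why this speed-up is the stronger one. It reads the two bounds as n + (T(n) − n)/k for the arrays and n + T(n)/k for Turing machines; the choices T(n) = n + log n and k = 4 are purely illustrative and do not come from the paper.

```latex
\begin{align*}
T(n) &= n + \log n, \qquad k = 4 \\
\text{array speed-up:}\quad
  n + \frac{T(n) - n}{k} &= n + \frac{\log n}{4} \\
\text{Turing-machine speed-up:}\quad
  n + \frac{T(n)}{k} &= n + \frac{n + \log n}{4}
  = \frac{5n}{4} + \frac{\log n}{4}
\end{align*}
```

Under these assumptions the array bound shrinks only the sublinear excess over n, while the Turing-machine bound is larger than the original T(n) for all sufficiently large n.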
Journal of Parallel and Distributed Computing | 1994
Michael A. Palis; Sanguthevar Rajasekaran
We consider the problem of permutation routing on a star graph, an interconnection network which has better properties than the hypercube. In particular, its degree and diameter are sublogarithmic in the network size. We present optimal randomized routing algorithms that run in O(D) steps (where D is the network diameter) for the worst-case input with high probability. We also show that for the n-way shuffle network with N = n^n nodes, there exists a randomized routing algorithm which runs in O(n) time with high probability. Another contribution of this paper is a universal randomized routing algorithm that could do optimal routing for a large class of networks (called leveled networks) which includes the star graph. The associated analysis is also network-independent. In addition, we present a deterministic routing algorithm, for the star graph, which is near optimal. All the algorithms we give are oblivious. As an application of our routing algorithms, we also show how to emulate a PRAM optimally on this class of networks.
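The claim about degree and diameter can be checked on small instances. The sketch below uses the standard definition of the n-star graph: nodes are the permutations of n symbols, and two nodes are adjacent when one is obtained from the other by swapping the first symbol with the symbol at some other position. It measures the diameter by breadth-first search from the identity permutation, which suffices because the graph is vertex-transitive. The function names are hypothetical, and none of the routing algorithms from the paper are reproduced here.

```python
from collections import deque


def star_graph_neighbors(p: tuple[int, ...]):
    """Yield the neighbors of permutation p in the star graph."""
    for i in range(1, len(p)):
        q = list(p)
        q[0], q[i] = q[i], q[0]  # swap first symbol with position i
        yield tuple(q)


def star_graph_stats(n: int) -> tuple[int, int, int]:
    """Return (number of nodes, degree, diameter) of the n-star graph.

    Brute-force BFS over all n! permutations; only practical for small n.
    """
    start = tuple(range(n))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for q in star_graph_neighbors(p):
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    # Every node has one neighbor per non-first position, hence degree n - 1.
    return len(dist), n - 1, max(dist.values())
```

For example, `star_graph_stats(5)` reports 120 nodes with degree 4 and diameter 6, whereas a hypercube of comparable size (128 nodes) has degree 7 and diameter 7.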
Theoretical Computer Science | 1992
Michael A. Palis; Sunil M. Shende
Control grammars, a generalization of context-free grammars recently introduced for use in natural language recognition, are investigated. In particular, it is shown that a hierarchy of non-context-free languages, called control language hierarchy (CLH), generated by control grammars can be recognized in polynomial time. Previously, the best-known upper bound was exponential time. It is also shown that CLH is in NC(2), the class of languages recognizable by uniform boolean circuits of polynomial size and O(log^2 n) depth.
Theory of Computing Systems / Mathematical Systems Theory | 1995
Michael A. Palis; Sunil M. Shende
We investigate a progression of grammatically defined language families, the control language hierarchy. This hierarchy has been studied recently from the perspective of providing a linguistic framework for natural language syntax. We exhibit a progression of pumping lemmas, one for each family in the hierarchy, thereby showing that the hierarchy is strictly separable.
Parallel Processing Letters | 1995
Michael A. Palis; Jing-Chiou Liou; Sanguthevar Rajasekaran; Sunil M. Shende; David S. L. Wei
The scheduling problem for dynamic tree-structured task graphs is studied and is shown to be inherently more difficult than the static case. It is shown that any online scheduling algorithm, deterministic or randomized, has competitive ratio Ω((1/g)/log_d(1/g)) for trees with granularity g and degree at most d. On the other hand, it is known that static trees with arbitrary granularity can be scheduled to within twice the optimal schedule. It is also shown that the lower bound is tight: there is a deterministic online tree scheduling algorithm that has competitive ratio O((1/g)/log_d(1/g)). Thus, randomization does not help.
Symposium on Frontiers of Massively Parallel Computation | 1990
Michael A. Palis; Donald K. Krecker
A parallel algorithm for square-root Kalman filtering has been developed and implemented on the Connection Machine (CM). Performance measurements show that the CM filter runs in time linear in the state vector size. This represents a great improvement over serial implementations, which run in cubic time. A specific multiple-target-tracking application in which several targets are to be tracked simultaneously, each requiring one or more filters, is considered. A parallel algorithm that, for fixed-size filters, runs in constant time, independently of the number of filters simultaneously processed, has been developed.
Foundations of Computer Science | 1984
Oscar H. Ibarra; Michael A. Palis; Sam M. Kim
We offer a methodology for simplifying the design and analysis of systolic systems. Specifically, we give a characterization of systolic arrays in terms of (single processor) sequential machines which are easier to analyze and to program. We give several examples to illustrate the design methodology. In particular, we show how systolic arrays can be easily designed to implement priority queues, integer bitwise multiplication, dynamic programming, etc. Because the designs are based on the sequential machine, the constructions we obtain are much simpler than those that have appeared in the literature. We also give some results concerning the properties and computational power (e.g., speed-up, hierarchy, etc.) of systolic arrays.