Publication


Featured research published by Allan Gottlieb.


ACM International Conference on Digital Libraries | 1999

A prototype implementation of archival Intermemory

Yuan Chen; Jan Edler; Andrew V. Goldberg; Allan Gottlieb; Sumeet Sobti; Peter N. Yianilos

An Archival Intermemory solves the problem of highly survivable digital data storage in the spirit of the Internet. In this paper we describe a prototype implementation of Intermemory, including an overall system architecture and implementations of key system components. The result is a working Intermemory that tolerates up to 17 simultaneous node failures, and includes a Web gateway for browser-based access to data. Our work demonstrates the basic feasibility of Intermemory and represents significant progress towards a deployable system.
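
The survivability here comes from dispersing each data block into many fragments so that any sufficiently large subset can rebuild the original. The following is a minimal sketch of that k-of-n idea, not the prototype's actual coding scheme; the function names and the choice of n = 24, k = 7 (so that 17 fragment losses are tolerated) are illustrative assumptions, using Reed-Solomon-style polynomial interpolation over the prime field GF(257).

```python
# Toy k-of-n information dispersal: k data bytes become n fragments,
# and any k surviving fragments reconstruct the original exactly.
P = 257  # prime field large enough to hold one byte per symbol

def lagrange_at(x, pts):
    """Evaluate the unique degree < len(pts) polynomial through pts at x."""
    total = 0
    for xi, yi in pts:
        num, den = 1, 1
        for xj, _ in pts:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def disperse(block, n):
    """Encode k = len(block) bytes as n fragments (x, y)."""
    pts = list(enumerate(block, start=1))   # data = polynomial values at 1..k
    return [(x, lagrange_at(x, pts)) for x in range(1, n + 1)]

def recover(fragments, k):
    """Rebuild the original k bytes from any k surviving fragments."""
    pts = fragments[:k]
    return bytes(lagrange_at(x, pts) for x in range(1, k + 1))

data = b"archive"                # k = 7 bytes
frags = disperse(data, n=24)     # n - k = 17 fragment losses tolerated
assert recover(frags[17:], k=len(data)) == data   # first 17 fragments lost
```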


International Symposium on Computer Architecture | 1998

The NYU Ultracomputer—designing a MIMD, shared-memory parallel machine

Allan Gottlieb; Ralph Grishman; Clyde P. Kruskal; Kevin P. McAuliffe; Larry Rudolph; Marc Snir

The design for the NYU Ultracomputer, a shared-memory MIMD parallel machine composed of thousands of autonomous processing elements, is presented. This machine uses an enhanced message switching network with the geometry of an omega-network to approximate the ideal behaviour of Schwartz's paracomputer model of computation and to implement efficiently the important fetch-and-add synchronisation primitive. The hardware which would be required to build a 4096-processor system using 1990s technology is outlined. System software issues are discussed and analytic studies of the network performance are presented. A sample of efforts to implement and simulate parallel variants of important scientific programs is included.
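
Fetch-and-add(V, e) atomically returns the old value of V while adding e to it, which lets many processors claim disjoint work without funneling through a lock. The sketch below (which also covers the extended abstract of the same paper that follows) emulates the primitive in software with a lock, since the hardware atomicity, and the Ultracomputer network's ability to combine concurrent fetch-and-adds, is not available here; all names are illustrative.

```python
# N threads claim disjoint work items via fetch-and-add: each index is
# handed out exactly once, with no critical section around the work list.
import threading

class FetchAndAddCell:
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_add(self, increment):
        """Atomically return the old value while adding `increment`."""
        with self._lock:
            old = self._value
            self._value += increment
            return old

tail = FetchAndAddCell()
work = list(range(100))
claimed = [[] for _ in range(4)]

def worker(tid):
    while True:
        i = tail.fetch_and_add(1)    # claim the next unclaimed index
        if i >= len(work):
            return
        claimed[tid].append(work[i])

threads = [threading.Thread(target=worker, args=(t,)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert sorted(x for lst in claimed for x in lst) == work  # nothing lost or duplicated
```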


International Symposium on Computer Architecture | 1982

The NYU Ultracomputer—designing a MIMD, shared-memory parallel machine (Extended Abstract)

Allan Gottlieb; Ralph Grishman; Clyde P. Kruskal; Kevin P. McAuliffe; Larry Rudolph; Marc Snir

We present the design for the NYU Ultracomputer, a shared-memory MIMD parallel machine composed of thousands of autonomous processing elements. This machine uses an enhanced message switching network with the geometry of an Omega-network to approximate the ideal behavior of Schwartz's paracomputer model of computation and to implement efficiently the important fetch-and-add synchronization primitive. We outline the hardware that would be required to build a 4096-processor system using 1990s technology. We also discuss system software issues, and present analytic studies of the network performance. Finally, we include a sample of our effort to implement and simulate parallel variants of important scientific programs.


Journal of Chemical Physics | 2002

Fast tree search for enumeration of a lattice model of protein folding

Henry Cejtin; Jan Edler; Allan Gottlieb; Robert Helling; Hao Li; James Philbin; Ned S. Wingreen; Chao Tang

We enumerated all sequences and compact structures for a lattice protein folding model on a 4×3×3 cubic lattice of 36 sites. We used two types of amino acids, hydrophobic (H) and polar (P), to make up the sequences, so there were 2^36 ≈ 6.87×10^10 different sequences. The total number of distinct structures was 84,731,192. We made use of a simple solvation model in which the energy of a sequence folded into a structure is minus the number of hydrophobic amino acids in the "core" of the structure. For every sequence, we found its ground state or ground states, i.e., the structure or structures for which its energy is lowest. About 0.3% of the sequences have a unique ground state. The number of structures that are unique ground states of at least one sequence is 2,662,050, about 3% of the total number of structures. However, these "designable" structures differ drastically in their designability, defined as the number of sequences whose unique ground state is that structure. To understand this variation in designability, we studied the distribution of structures in a high-dimensional space in which each structure is represented by a string of 1's and 0's, denoting core and surface sites, respectively.
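
The solvation model is simple enough to demonstrate at toy scale. The sketch below uses assumed parameters (a 6-site chain and four hand-picked core masks, not the paper's 4×3×3 enumeration): it scores every H/P sequence against every structure, finds unique ground states, and tallies each structure's designability.

```python
# A structure is a string of 1s (core sites) and 0s (surface sites); the
# energy of a sequence folded into it is minus the number of H residues
# sitting on core sites. Masks below are illustrative toy data.
from itertools import product

structures = ["110100", "101010", "011001", "111000"]  # toy core masks

def energy(seq, struct):
    """Minus the number of hydrophobic residues in the core."""
    return -sum(s == "H" and c == "1" for s, c in zip(seq, struct))

designability = {s: 0 for s in structures}
for seq in product("HP", repeat=6):          # all 2^6 sequences
    energies = {s: energy(seq, s) for s in structures}
    ground = min(energies.values())
    winners = [s for s, e in energies.items() if e == ground]
    if len(winners) == 1:                    # unique ground state
        designability[winners[0]] += 1

# Structures sorted from most to least designable.
for s, d in sorted(designability.items(), key=lambda kv: -kv[1]):
    print(s, d)
```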


Hawaii International Conference on System Sciences | 1994

Locating multiprocessor TLBs at memory

Patricia J. Teller; Allan Gottlieb

Compares the performance, in shared-memory multiprocessors, of locating translation-lookaside buffers (TLBs) at processors with that of locating TLBs at memory. The comparison is based on trace-driven simulations of multiprocessors with log N-stage networks interconnecting N processors and N memory modules. For the systems and workloads studied, memory-based TLBs perform noticeably better than processor-based TLBs, provided that memory is organized as multiple paging arenas, i.e., multiple clusters of memory modules where the mapping of a page to a cluster is fixed. The cost of a processor-based TLB reload is at least log N because of network transit. In contrast, the cost of a memory-based TLB reload can be smaller, since network transits are not required. Furthermore, with multiple paging arenas, the number of reloads is smaller with memory-based TLBs.
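
A back-of-the-envelope model makes the cost argument concrete. In the sketch below, the local_reload constant and the function shape are illustrative assumptions, not figures from the paper; the point is only that a processor-based reload pays round-trip network transits that a memory-based TLB colocated with its paging arena avoids.

```python
# Compare reload cost as the machine scales: the processor-based TLB
# crosses the log N-stage network twice per reload, the memory-based
# TLB not at all.
import math

def reload_cost(n, at_memory, local_reload=4):
    transit = math.log2(n)              # one-way transit, log N stages
    if at_memory:
        return local_reload             # page table reachable locally
    return 2 * transit + local_reload   # request and reply both cross

for n in (64, 256, 1024):
    print(f"N={n:5d}  processor-based={reload_cost(n, False):5.1f}"
          f"  memory-based={reload_cost(n, True):5.1f}")
```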


Circuits, Systems and Signal Processing | 1987

Designing VLSI network nodes to reduce memory traffic in a shared memory parallel computer

Susan R. Dickey; Allan Gottlieb; Richard Kenner; Yue Sheng Liu

Serialization of memory access can be a critical bottleneck in shared memory parallel computers. The NYU Ultracomputer, a large-scale MIMD (multiple instruction stream, multiple data stream) shared memory architecture, may be viewed as a column of processors and a column of memory modules connected by a rectangular network of enhanced 2×2 buffered crossbars. These VLSI nodes enable the network to combine multiple requests directed at the same memory location. Such requests include a new coordination primitive, fetch-and-add, which permits task coordination to be achieved in a highly parallel manner. Processing within the network is used to reduce serialization at the memory modules. To avoid large network latency, the VLSI network nodes must be high-performance components. Design tradeoffs between architectural features, asymptotic performance requirements, cycle time, and packaging limitations are complex. This report sketches the Ultracomputer architecture and discusses the issues involved in the design of the VLSI enhanced buffered crossbars which are the key element in reducing serialization.
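
The bookkeeping behind combining is easy to state in miniature: when two fetch-and-add requests for the same location meet at a switch, the switch forwards their sum as a single request, remembers one increment, and splits the eventual reply so both requesters see values consistent with some serial order. The sketch below illustrates that logic; the dictionaries and function names are illustrative, not the VLSI node design.

```python
# Two F&A requests meet at a switch, travel onward as one, and the
# memory module sees a single access instead of two.

def combine(req_a, req_b):
    """Merge two F&A requests aimed at the same address into one."""
    assert req_a["addr"] == req_b["addr"]
    merged = {"addr": req_a["addr"], "inc": req_a["inc"] + req_b["inc"]}
    return merged, req_a["inc"]       # the switch buffers req_a's increment

def decombine(old_value, buffered_inc):
    """Split the memory's single reply into the two original replies."""
    return old_value, old_value + buffered_inc

memory = {0x10: 100}
a = {"addr": 0x10, "inc": 3}
b = {"addr": 0x10, "inc": 5}

merged, saved = combine(a, b)
old = memory[merged["addr"]]                   # one access at the module
memory[merged["addr"]] = old + merged["inc"]
reply_a, reply_b = decombine(old, saved)

assert (reply_a, reply_b) == (100, 103)  # as if a ran, then b
assert memory[0x10] == 108               # net effect of both increments
```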


International Symposium on Computer Architecture | 1998

Retrospective: a personal retrospective on the NYU Ultracomputer

Allan Gottlieb

The NYU Ultracomputer project, a long-lasting research endeavor, was started in 1979 by Jack Schwartz and was very active throughout the 80s and the early 90s. This project was an early attempt to explore the possibilities of large-scale, shared-memory parallel computers. The project was broad in scope: we developed hardware primitives for coordination and implemented full-custom VLSI chips to speed their concurrent execution, we built prototype multiprocessors incorporating these chips, we implemented a scalable, symmetric Unix operating system, we ported compilers to several microprocessors, we contributed to network design and analysis, we implemented a parallel Lisp system and worked on Prolog, and we developed algorithms and software for scientific applications.

The 1982 ISCA paper you have before you, a revision of which appeared in the Feb. 1983 IEEE Transactions on Computers, includes contributions in three areas: first, the coordination primitive fetch-and-add and its generalization to fetch-and-phi; second, a technique inspired by Larry Rudolph for combining in hardware concurrent memory references, including concurrent fetch-and-adds, directed at the same memory location, and a high-level VLSI design, inspired by Marc Snir and Jon Solworth, for network switches implementing combining; and third, analytic results primarily due to Clyde Kruskal and Marc Snir on the performance of buffered multistage networks. Fetch-and-add is now present on commercial processors, including models from SGI-Cray and Intel. Hardware combining inspired a mini-industry of research results from a number of institutions, but has not been realized commercially in anything like the generality we proposed. The analytic network results form part of a well-developed theory with many contributors.

In the mid 80s, we cooperated closely with a team from IBM Research in the development of their RP3 system. This cooperation raised our fame (and funding level) significantly and had several other very positive results. There were also some imperfections in the cooperation, as discussed below; one in particular highlighted a weakness in the NYU software team in general and Allan Gottlieb in particular.

Our first compilers were PCC-based and targeted the 68K microprocessor used in our early hardware. When we decided to use the AMD 29K for the Ultra III prototype, we chose GCC as the compiler and we, primarily Richard Kenner, retargeted it to the 29K and to the IBM ROMP used in the IBM RT/PC workstations that constituted our development environment (and that were used in the IBM RP3). Kenner became very interested in GCC, retargeted it to other microprocessors, and has been the lead developer of its machine-independent portion for the last several years, an important "spin-off" of our research efforts.

I summarize a few accomplishments in the next section. Since we have bragged before and many of these boasts can be found starting at my home page, I have kept the next section short. Less easily found in the literature is an account of our shortcomings, which is the subject of the longer section thereafter. Finally, I offer some thoughts on doing it again.


Proceedings of SPIE - The International Society for Optical Engineering | 1982

New York University (NYU) Ultracomputer — a general-purpose parallel processor

Allan Gottlieb; Ralph Grishman; Clyde P. Kruskal; Kevin P. McAuliffe; Larry Rudolph; Marc Snir

We present the design for the NYU Ultracomputer, a general-purpose MIMD parallel processor composed of thousands of autonomous processing elements. This machine uses an enhanced omega-network to approximate the ideal behavior of Schwartz's paracomputer model of computation and to efficiently implement the important replace-add synchronization primitive. The novelty of the design lies in the enhanced network, in particular in the constituent switches and interfaces. We also present the results of analytic and simulation studies of the network, as well as a sample of our efforts to implement parallel variants of important scientific codes.
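
The replace-add named here is the earlier form of the primitive the later papers above call fetch-and-add: replace-add(V, e) returns the new value V + e, while fetch-and-add returns the old one, so each is a one-line wrapper around the other. A minimal sketch, assuming a lock-based software stand-in for the hardware cell:

```python
# Illustrative Cell class showing only the relationship between the
# two primitives; the lock emulates the hardware atomicity.
import threading

class Cell:
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_add(self, inc):   # returns the OLD value
        with self._lock:
            old = self._value
            self._value += inc
            return old

    def replace_add(self, inc):     # returns the NEW value
        return self.fetch_and_add(inc) + inc

v = Cell(10)
assert v.replace_add(5) == 15
assert v.fetch_and_add(5) == 15   # old value after the replace_add
```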


Archive | 1989

Highly parallel computing

George S. Almasi; Allan Gottlieb


ACM Transactions on Programming Languages and Systems | 1983

Basic Techniques for the Efficient Coordination of Very Large Numbers of Cooperating Sequential Processors

Allan Gottlieb; Boris Lubachevsky; Larry Rudolph

Collaboration


Dive into Allan Gottlieb's collaboration.

Top Co-Authors

Larry Rudolph (Carnegie Mellon University)
Kevin P. McAuliffe (Courant Institute of Mathematical Sciences)
Jan Edler (Courant Institute of Mathematical Sciences)
Patricia J. Teller (University of Texas at El Paso)
Boris Lubachevsky (Courant Institute of Mathematical Sciences)
Eric Freudenthal (University of Texas at El Paso)