Nils Gura
Sun Microsystems
Publications
Featured research published by Nils Gura.
international symposium on microarchitecture | 2005
Hans Eberle; Sheueling Chang Shantz; Vipul Gupta; Nils Gura; Leonard D. Rarick; Lawrence Spracklen
This article describes low-cost techniques for accelerating the ECC and RSA public-key cryptosystems on general-purpose processor architectures. We focus on hardware acceleration of public-key cryptosystems on 64-bit server machines. A prototype based on a SPARC CPU data path shows a clear performance advantage of ECC over RSA.
ieee international conference on high performance computing data and analytics | 2008
Hans Eberle; Pedro Javier García; Jose Flich; José Duato; Robert J. Drost; Nils Gura; David Hopkins; Wladek Olesinski
We describe a novel way to implement high-radix crossbar switches. Our work is enabled by a new chip interconnect technology called proximity communication (PxC) that offers unparalleled chip IO density. First, we show how a crossbar architecture is topologically mapped onto a PxC-enabled multi-chip module (MCM). Then, we describe a first prototype implementation of a small-scale switch based on a PxC MCM. Finally, we present a performance analysis of two large-scale switch configurations with 288 ports and 1,728 ports, respectively, contrasting a 1-stage PxC-enabled switch and a multi-stage switch using conventional technology. Our simulation results show that (a) arbitration delays in a large 1-stage switch can be considerable, and (b) multi-stage switches are extremely susceptible to saturation under non-uniform traffic, a problem that becomes worse for higher radices. 1-stage switches, in contrast, are not affected by this problem.
network on chip architectures | 2009
Jesús Camacho Villanueva; Jose Flich; José Duato; Hans Eberle; Nils Gura; Wladek Olesinski
As the number of processing nodes on chip multi-processors (CMPs) keeps increasing, providing efficient communication over the on-chip interconnect becomes increasingly critical. With 32-core CMP designs on the drawing board, there is a demand for accurate simulation models that capture all the complexities and interactions of the different design layers, including the application, operating system, cache hierarchy, coherency protocol, and other on-chip resources. These components can no longer be modeled in isolation, as unpredicted performance anomalies may arise once all the system variables are taken into account. In this paper, we present a simulation framework for CMP systems, focusing our attention on the on-chip network. We show preliminary results for the choice of key network parameters (topology, flit size) with respect to the behavior and performance of applications running on top of different network configurations. This paper tries to convey the need for an overall CMP system simulator as a way to accurately characterize the actual behavior of the on-chip network.
high performance interconnects | 2009
Wladek Olesinski; Hans Eberle; Nils Gura; Bob Dickson; Aron J. Silverton; Sumti Jairath; Peter Yakutis
In this paper, we focus on fair access to bandwidth in daisy-chained interconnects. We first analyze the original HyperTransport fairness protocol and show that it is not perfectly fair. We then propose a new, fair scheme whose complexity is similar to that of the HyperTransport protocol.
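To illustrate why fairness is hard in a daisy chain, the sketch below computes per-node bandwidth shares under a deliberately naive arbitration rule: every hop splits its outgoing link 50/50 between its own traffic and forwarded traffic, with all nodes saturated. This rule is an assumption chosen for illustration. It is not the HyperTransport fairness protocol, nor the scheme proposed in the paper, and the function name `naive_chain_shares` is hypothetical.

```python
from fractions import Fraction

def naive_chain_shares(n_nodes):
    """Share of the final link each node gets under 50/50 per-hop arbitration.

    Index 0 is the node closest to the destination. Assumes every node
    always has traffic to send (illustrative model only).
    """
    shares = []
    remaining = Fraction(1)  # bandwidth still available to nodes farther away
    for i in range(n_nodes):
        if i == n_nodes - 1:
            # The farthest node forwards nothing, so it keeps what is left.
            shares.append(remaining)
        else:
            # Half goes to local traffic, half to everything upstream.
            shares.append(remaining / 2)
            remaining /= 2
    return shares

shares = naive_chain_shares(4)
print([str(s) for s in shares])
```

Under this model the node nearest the destination gets half of the link while distant nodes get exponentially less, whereas a perfectly fair scheme would give each of the four nodes one quarter.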
international conference on communications | 2008
Aditya Dua; Benjamin Yolken; Nicholas Bambos; Wladek Olesinski; Hans Eberle; Nils Gura
A novel architecture was proposed in [1] to address scalability issues in large, high speed packet switches. The architecture proposed in [1], namely OBIG (output buffers with input groups), distributes the switch fabric across multiple chips, which communicate via high speed interconnects enabled by proximity communication (PC), a recently developed circuit technology [2]. An OBIG switch aggregates multiple input flows inside the switch fabric, thereby significantly reducing the amount of memory required for internal buffers, vis-a-vis a conventional buffered crossbar, which has buffers at every crosspoint. Thus, the OBIG architecture is promising for realizing terabit switches with hundreds of ports. This paper studies packet scheduling algorithms which help realize the potential of OBIG-like switch architectures. The emphasis here is on designing backlog aware scheduling algorithms, while ensuring desirable traits such as low computational complexity and scalability. The efficacy of the proposed scheduling algorithms with respect to performance metrics such as average delay and fairness is demonstrated via simulations under a variety of scenarios.
high performance switching and routing | 2008
Wladek Olesinski; Hans Eberle; Nils Gura; Nikos Chrysos
We have recently proposed a novel architecture called Output Buffered Switch with Input Groups (OBIG) together with a scheduler called Parallel Wrapped Wave Front Arbiter with Fast Scheduler (PWWFA-FS) for building large, fast switches. In this paper we continue this work and tackle the issue of flow control. We show how on/off and credit flow control schemes can be applied in OBIG; we further introduce implicit flow control that does not use any explicit control information. Simulation results show how the proposed flow control schemes perform in our architecture.
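As a rough illustration of the credit scheme mentioned above, the sketch below models generic credit-based flow control on a single link: the sender holds one credit per free downstream buffer slot and may only send while it has credits. This is a textbook-style model under simplifying assumptions, not the OBIG implementation, and the class name `CreditLink` and its methods are hypothetical.

```python
from collections import deque

class CreditLink:
    """One sender and one receiver buffer with credit-based flow control."""

    def __init__(self, buffer_slots):
        self.capacity = buffer_slots
        self.credits = buffer_slots   # one credit per free downstream slot
        self.rx_buffer = deque()      # downstream buffer

    def send(self, packet):
        """Send only if a credit is available; return True on success."""
        if self.credits == 0:
            return False              # sender must wait; buffer cannot overflow
        self.credits -= 1
        self.rx_buffer.append(packet)
        return True

    def drain(self):
        """Receiver forwards one packet and returns a credit to the sender."""
        if not self.rx_buffer:
            return None
        packet = self.rx_buffer.popleft()
        self.credits += 1
        return packet

link = CreditLink(buffer_slots=2)
results = [link.send(p) for p in ("a", "b", "c")]  # third send is refused
freed = link.drain()                               # frees a slot, returns a credit
retry = link.send("c")                             # now succeeds
```

An on/off scheme would instead have the receiver signal "stop" and "go" as its buffer crosses thresholds; the credit variant shown here never sends without a guaranteed free slot.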
asian solid state circuits conference | 2007
M. Shah; J. Barreh; J. Brooks; Robert T. Golla; G. Grohoski; Nils Gura; R. Hetherington; P. Jordan; M. Luttrell; C. Olson; B. Saha; Denis Sheahan; Lawrence Spracklen; A. Wynn
network and distributed system security symposium | 2004
Vipul Gupta; Douglas Stebila; Stephen Fung; Sheueling Chang Shantz; Nils Gura; Hans Eberle
Archive | 2003
Hans Eberle; Nils Gura; Daniel Finchelstein; Sheueling Chang-Shantz; Vipul Gupta
Archive | 2006
Hans Eberle; Nils Gura; Wladyslaw Olesinski