Giuseppe Ciaccio
University of Genoa
Publications
Featured research published by Giuseppe Ciaccio.
parallel computing | 2000
Giovanni Chiola; Giuseppe Ciaccio
The Genoa Active Message MAchine (GAMMA) is an efficient communication layer for 100base-T clusters of Personal Computers under the Linux operating system (OS). It is based on Active Ports, a communication mechanism derived from Active Messages. Active Ports share most of the low-level optimization opportunities with Generic Active Messages while offering a higher-level programming interface, not only in the SPMD but also in the MIMD and client/server paradigms. In addition to point-to-point communications, multicast, barrier synchronization, scatter, and gather primitives have also been developed on top of Active Ports, exploiting shared 100base-T LAN technology in an optimal way. GAMMA Active Ports deliver excellent communication performance at the user level (latency 13 μs, maximum throughput 12.2 MByte/s, half-power point reached with 200-byte messages), thus enabling cost-effective cluster computing on 100base-T. Despite being implemented at the kernel level in the Linux OS, the performance of GAMMA Active Ports is much better than that of many other LAN-oriented communication layers, including so-called "user-level" ones (e.g. U-Net). Code porting efforts have already shown that several applications are reasonably easy to develop on top of GAMMA and that they can actually take advantage of the efficient point-to-point as well as collective communication primitives offered by our prototype library implementation. A porting of the MPICH higher-level interface atop GAMMA is currently under way.
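To give a flavour of the programming model described above, the following C sketch shows what an Active Ports-style interface could look like; all names and signatures (ap_bind, ap_send, ap_broadcast, ap_barrier) are hypothetical illustrations and do not reproduce GAMMA's actual API.

/* Hypothetical sketch of an Active Ports-style interface; names and
 * signatures are illustrative only and are NOT GAMMA's real API. */
#include <stddef.h>

typedef void (*ap_handler_t)(void *msg, size_t len); /* upcall run on message arrival */

/* Bind a receive handler to a local port: messages addressed to this port
 * trigger the handler at the receiver instead of being buffered. */
int ap_bind(unsigned port, ap_handler_t handler);

/* Point-to-point send towards a (rank, port) pair; the bound handler runs
 * on the receiving node. */
int ap_send(unsigned dest_rank, unsigned port, const void *buf, size_t len);

/* Collective primitives layered on the same mechanism. */
int ap_broadcast(unsigned root, unsigned port, void *buf, size_t len);
int ap_barrier(void);

/* Example handler: accumulate partial sums arriving from other processes. */
static double global_sum;
static void sum_handler(void *msg, size_t len)
{
    (void)len;
    global_sum += *(const double *)msg;
}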
privacy enhancing technologies | 2006
Giuseppe Ciaccio
In the framework of peer-to-peer distributed systems, the problem of anonymity in structured overlay networks remains a quite elusive one. It is especially unclear how to evaluate and improve sender anonymity, that is, untraceability of the peers who issue messages to other participants in the overlay. In a structured overlay organized as a chordal ring, we have found that a technique originally developed for recipient anonymity also improves sender anonymity. The technique is based on the use of imprecise entries in the routing tables of each participating peer. Simulations show that sender anonymity, as measured in terms of average size of the anonymity set, decreases slightly if the peers use imprecise routing; yet the anonymity takes on a better distribution, with good anonymity levels becoming more likely at the expense of very high and very low levels. A better quality of anonymity service is thus provided to participants.
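The abstract does not spell out how the imprecise entries are built; purely as one plausible illustration, the C fragment below fills a Chord-style finger table with entries chosen at random inside each arc instead of the exact successor, under a hypothetical exact-lookup helper (successor_of) and an assumed 32-bit identifier space.

/* Illustrative only: filling a finger table with "imprecise" entries in a
 * chordal ring with 2^32 identifiers.  Instead of the exact successor of
 * n + 2^i, each entry points to a peer chosen at random inside the arc
 * [n + 2^i, n + 2^(i+1)).  successor_of() is a hypothetical exact lookup. */
#include <stdint.h>
#include <stdlib.h>

#define M 32                                         /* identifier bits */

extern uint32_t successor_of(uint32_t key);          /* assumed available */

void fill_imprecise_fingers(uint32_t n, uint32_t finger[M])
{
    for (int i = 0; i < M; i++) {
        uint32_t span   = 1u << i;                          /* length of the i-th arc */
        uint32_t offset = span + (uint32_t)rand() % span;   /* somewhere in the arc   */
        finger[i] = successor_of(n + offset);               /* wraps mod 2^32         */
    }
}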
international parallel processing symposium | 1998
Giuseppe Ciaccio
The current prototype of the Genoa Active Message MAchine (GAMMA) is a low-overhead, Active Messages-based inter-process communication layer implemented mainly at kernel level in the Linux Operating System. It runs on a pool of low-cost Pentium-based Personal Computers (PCs) networked by a low-cost 100base-TX Ethernet hub to form a low-cost message-passing parallel platform. In this paper we describe in detail how GAMMA achieves unprecedented communication performance (less than 13 μs one-way user-to-user latency and up to 98% of the communication throughput of the raw interconnection hardware) on this kind of low-cost parallel architecture.
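As a rough aid to interpreting the latency and throughput figures quoted here and in the entry above, the small C program below evaluates the idealized linear cost model T(n) = t0 + n/B; it is not the paper's measurement methodology, and the model's half-power point (about 160 bytes for 13 μs and 12.2 MByte/s) only roughly matches the reported value.

/* Idealized linear cost model T(n) = t0 + n / B, evaluated with the figures
 * reported in these abstracts; illustrative only, not the measurement method. */
#include <stdio.h>

int main(void)
{
    const double t0 = 13e-6;       /* one-way latency in seconds (~13 us)         */
    const double b  = 12.2e6;      /* asymptotic user-level throughput in bytes/s */

    /* In this model the half-power message size is simply t0 * b. */
    printf("model half-power size: %.0f bytes\n", t0 * b);

    for (int n = 64; n <= 65536; n *= 4) {
        double bw = n / (t0 + n / b);   /* delivered bandwidth for an n-byte message */
        printf("n = %6d bytes -> %.2f MByte/s\n", n, bw / 1e6);
    }
    return 0;
}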
european pvm mpi users group meeting on recent advances in parallel virtual machine and message passing interface | 2000
Giuseppe Ciaccio; Giovanni Chiola
The Genoa Active Message MAchine (GAMMA) is a lightweight communication system based on the Active Ports paradigm, originally designed for efficient implementation over low-cost Fast Ethernet interconnects. A very efficient porting of MPICH atop GAMMA has recently been completed, providing unprecedented messaging performance over the cheapest cluster computing technology currently available. In this paper we describe the recently completed porting of GAMMA to the GNIC-II Gigabit Ethernet adapters by Packet Engines. A combination of less than 10 μs latency and more than 93 MByte/s throughput demonstrates the possibility for Gigabit Ethernet and GAMMA to yield messaging performance comparable to that of many lightweight protocols running on Myrinet. This result is of interest, given the envisaged drop in cost of Gigabit Ethernet due to the forthcoming transition from fiber optic to UTP cabling and the ever-increasing mass-market production of this standard interconnect.
Cluster Computing | 2003
Giuseppe Ciaccio
The Genoa Active Message MAchine (GAMMA) is a lightweight communication system based on the Active Ports paradigm, originally designed for efficient implementation over low-cost Fast Ethernet interconnects. In this paper we report on the recently completed porting of GAMMA to the Packet Engines GNIC-II and the Netgear GA620 Gigabit Ethernet adapters, and provide a comparison among GAMMA, MPI/GAMMA, TCP/IP, and MPICH on such commodity interconnects, using different performance metrics. With a combination of low end-to-end latency (9.5 μs with GNIC-II, 32 μs with GA620) and high transmission throughput (almost 97 MByte/s with GNIC-II and 125 MByte/s with GA620, the latter obtained without changing the firmware of the adapter), GAMMA demonstrates the potential for Gigabit Ethernet lightweight protocols to yield messaging performance comparable to the best Myrinet-based messaging systems. This result is of interest, given the envisaged drop in cost of Gigabit Ethernet due to the transition from fiber optic to UTP cabling and the ever-increasing mass-market production of this standard interconnect. We also report on a technique for message fragmentation that is commonly exploited to increase throughput with short messages. When a different, though more widely used, performance metric is considered, this technique results in a performance loss rather than an improvement.
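The fragmentation technique itself is not detailed in the abstract; the sketch below only illustrates the general idea of splitting a message into MTU-sized frames so that host-side copying can overlap with wire transmission, using a hypothetical nic_post_tx helper that is assumed to queue a private copy of each frame.

/* Illustrative sender-side fragmentation: a long message is split into
 * MTU-sized frames so that copying fragment i can overlap with the wire
 * transmission of fragment i-1.  nic_post_tx() is a hypothetical helper
 * assumed to queue a private copy of each frame.  Not GAMMA's code. */
#include <stddef.h>
#include <string.h>

#define FRAME_PAYLOAD 1500                 /* Ethernet-MTU-sized fragments (assumption) */

extern void nic_post_tx(const void *frame, size_t len);

void send_fragmented(const char *msg, size_t len)
{
    char frame[FRAME_PAYLOAD];

    while (len > 0) {
        size_t chunk = len < FRAME_PAYLOAD ? len : FRAME_PAYLOAD;
        memcpy(frame, msg, chunk);         /* staging copy of the current fragment      */
        nic_post_tx(frame, chunk);         /* previous fragment may still be on the wire */
        msg += chunk;
        len -= chunk;
    }
}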
local computer networks | 2003
A. Di Marco; Giovanni Chiola; Giuseppe Ciaccio
A cluster of PCs can be seen as a collection of networked low-cost disks; such a collection can be operated by suitable software so as to provide the abstraction of a single, larger block device. By adding suitable data redundancy, the disk collection as a whole could act as a single, highly fault-tolerant, distributed RAID device, providing capacity and reliability along with the convenient price/performance typical of commodity clusters. We report on the design and performance of DRAID, a distributed RAID prototype running on a Gigabit Ethernet cluster of PCs. DRAID offers storage services under a single I/O space (SIOS) block device abstraction. The SIOS feature implies that the storage space is accessible by each of the stations in the cluster, rather than through one or a few end-points, with a potentially higher aggregate I/O bandwidth and better suitability for parallel I/O.
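The abstract does not specify DRAID's block layout; purely as an illustration of how a single I/O space with redundancy can be mapped onto cluster nodes, the C sketch below uses a generic RAID-5-style placement with a rotating parity node.

/* Illustrative RAID-5-style placement for a single-I/O-space block device
 * striped across cluster nodes with a rotating parity node.  The actual
 * DRAID layout is not given in the abstract; this is only a generic sketch. */
#include <stdint.h>

struct placement {
    unsigned data_node;       /* node holding the data block           */
    unsigned parity_node;     /* node holding this stripe's parity     */
    uint64_t local_block;     /* block index on the owning node's disk */
};

struct placement map_block(uint64_t logical_block, unsigned nodes)
{
    struct placement p;
    uint64_t stripe = logical_block / (nodes - 1);        /* nodes-1 data blocks per stripe */
    unsigned slot   = (unsigned)(logical_block % (nodes - 1));

    p.parity_node = (unsigned)(stripe % nodes);           /* parity rotates across nodes    */
    p.data_node   = (slot + p.parity_node + 1) % nodes;   /* any node except the parity one */
    p.local_block = stripe;                               /* one block per node per stripe  */
    return p;
}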
local computer networks | 2002
Giuseppe Ciaccio; Marco Ehlert; Bettina Schnor
In this paper we report on the recently completed porting of GAMMA to the Netgear GA621 Gigabit Ethernet adapter, and provide a comparison among GAMMA, MPI/GAMMA, TCP/IP, and MPICH/TCP, based on the Netgear GA621 and the older Netgear GA620 network adapters and using different device drivers, in a Gigabit Ethernet cluster of PCs running Linux 2.4. GAMMA (the Genoa Active Message MAchine) is a lightweight messaging system based on an Active Messages-like paradigm, originally designed for efficient exploitation of Fast Ethernet interconnects. The comparison includes simple latency/bandwidth evaluation of the messaging systems on both adapters, as well as performance comparisons based on the NAS Parallel Benchmarks and an end-user fluid dynamics application, the Modular Ocean Model (MOM). The analysis of results provides useful hints concerning the efficient use of Gigabit Ethernet with clusters of PCs. In particular, it emerges that GAMMA on the GA621 adapter, with a combination of low end-to-end latency (8.5 μs) and high throughput (118.4 MByte/s), provides a high-performance, cost-effective alternative to proprietary high-speed networks such as Myrinet for a wide range of cluster computing applications.
international parallel and distributed processing symposium | 2001
Giuseppe Ciaccio
The Genoa Active Message MAchine (GAMMA) is a lightweight communication system based on the Active Ports paradigm, originally designed for efficient implementation over low-cost Fast Ethernet interconnects. In this paper we report on the recently completed porting of GAMMA to the Packet Engines GNIC-II and the Netgear GA620 Gigabit Ethernet adapters, and provide a comparison among GAMMA, MPI/GAMMA, TCP/IP, and MPICH on such commodity interconnects, using different performance metrics. With a combination of low end-to-end latency (9.5 μs with GNIC-II, 32 μs with GA620) and high transmission throughput (almost 97 MByte/s with GNIC-II and 125 MByte/s with GA620, the latter obtained without changing the firmware of the adapter), GAMMA demonstrates the potential for Gigabit Ethernet lightweight protocols to yield messaging performance comparable to the best Myrinet-based messaging systems. This result is of interest, given the envisaged drop in cost of Gigabit Ethernet due to the transition from fiber optic to UTP cabling and the ever-increasing mass-market production of this standard interconnect. We also report on a technique for message fragmentation that is commonly exploited to increase throughput with short messages. When a different, though more widely used, performance metric is considered, this technique results in a performance loss rather than an improvement.
international parallel and distributed processing symposium | 2007
Giuseppe Ciaccio
We propose and motivate an API for programming distributed applications using a structured overlay network of peers as infrastructure. The API offers simple primitives and powerful mechanisms, in a way that is independent of the underlying overlay. The dynamic set of participants is abstracted by providing a flat space of keys, transparently scattered across all participants in the overlay. The API primitives allow application instances to send messages towards individual keys. Two different kinds of messages can be exchanged, namely unidirectional and request-response; the latter takes place in a split-phase, non-blocking way, so that the application can be made latency-tolerant and thus achieve higher performance. The request-response pattern is also shown to be crucial for applications demanding a degree of user anonymity. The semantics of messages is not defined by the API itself. Rather, the API offers a mechanism allowing the application to set up handlers, which are upcalls run upon message arrival at each peer. The overall behaviour of the application is thus shaped by the handlers. The API also allows the application to define handlers for two other typical tasks of any dynamic peer-to-peer system, namely the migration of keys across peers after new peer arrivals, and the regeneration of missing keys after peer departures.
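As an illustration of the kind of interface described (not the paper's actual API; all names are hypothetical), a C sketch of key-addressed messaging with installable handlers and split-phase request-response might look as follows.

/* Illustrative sketch of a key-addressed overlay API with installable
 * handlers and split-phase request-response.  All names are hypothetical;
 * this is not the API proposed in the paper. */
#include <stddef.h>
#include <stdint.h>

typedef uint64_t okey_t;                                  /* flat key space */

/* Upcall run at the peer responsible for the destination key; for a request
 * it may fill the reply buffer and return the reply length. */
typedef size_t (*msg_handler_t)(okey_t key, const void *msg, size_t len,
                                void *reply, size_t reply_cap);

/* Upcalls for key migration after a peer joins and key regeneration after
 * a peer departs. */
typedef void (*migrate_handler_t)(okey_t first, okey_t last, int new_owner);
typedef void (*regen_handler_t)(okey_t first, okey_t last);

int overlay_set_handlers(msg_handler_t on_msg,
                         migrate_handler_t on_migrate,
                         regen_handler_t on_regen);

/* Unidirectional message towards a key. */
int overlay_send(okey_t key, const void *msg, size_t len);

/* Split-phase request-response: the call returns a handle immediately, and
 * the reply is collected later, overlapping routing latency with other work. */
typedef int oreq_t;
oreq_t overlay_request(okey_t key, const void *msg, size_t len);
int    overlay_reply_test(oreq_t req, void *reply, size_t cap, size_t *got);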
Lecture Notes in Computer Science | 2003
Giuseppe Ciaccio
Memory copies in messaging systems can be a major source of performance degradation in cluster computing. In this paper we discuss a system which can offload a host CPU from most of the overhead of copying data between distinct regions of the host physical memory. The system is implemented as a special-purpose Linux device driver operating a generic, non-programmable Gigabit Ethernet adapter connected to itself. Whenever the descriptor-based DMA engines of the adapter are instructed to start a data communication, the data are read from the host memory and written back to the memory itself through the loopback cable; this is semantically equivalent to a non-blocking memory copy operation performed by the two DMA engines. Suitable completion test/wait routines are also implemented, in order to provide traditional blocking semantics in a split-phase fashion. An implementation of MPI using this system in place of traditional memcpy() calls on receive shows a significantly lower receive overhead.
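The driver's real interface is not given in the abstract; the following C sketch, with hypothetical names, illustrates the split-phase copy semantics described: a copy is posted to the looped-back DMA engines and completed later by a test or wait call.

/* Illustrative split-phase copy interface of the kind described above: a copy
 * between two host-memory regions is posted to the looped-back DMA engines
 * and completed later.  Names are hypothetical, not the driver's real API. */
#include <stddef.h>

typedef int dmacpy_t;                         /* handle identifying a posted copy */

/* Post a copy: the adapter reads src via its TX DMA engine and, through the
 * loopback cable, writes it to dst via its RX DMA engine; returns at once. */
dmacpy_t dmacpy_post(void *dst, const void *src, size_t len);

int  dmacpy_test(dmacpy_t h);                 /* nonblocking completion test  */
void dmacpy_wait(dmacpy_t h);                 /* blocking wait for completion */

/* Typical use on the receive path of a messaging library: start the copy,
 * overlap protocol bookkeeping with it, then wait before delivery. */
static void deliver(void *user_buf, const void *staging, size_t len)
{
    dmacpy_t h = dmacpy_post(user_buf, staging, len);
    /* ... other receive-side work overlaps with the DMA-driven copy ... */
    dmacpy_wait(h);
}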