Lorenzo Rota

Karlsruhe Institute of Technology

Publication

Featured research published by Lorenzo Rota.

IEEE Transactions on Nuclear Science | 2015

A PCIe DMA Architecture for Multi-Gigabyte Per Second Data Transmission

Lorenzo Rota; Michele Caselle; Suren Chilingaryan; Andreas Kopmann; M. Weber

We developed a direct memory access (DMA) engine compatible with the Xilinx PCI Express (PCIe) core to provide a high-performance and low-occupancy alternative to commercial solutions. In order to maximize the PCIe throughput while minimizing FPGA resource utilization, the DMA engine adopts a novel strategy where the DMA address list is stored inside the FPGA and not in the central memory of the host CPU. The FPGA design package is complemented with simple register access to control the DMA engine by a Linux driver. The design is compatible with Xilinx FPGA Families 6 and 7, and operates with the Xilinx PCIe endpoint Generation 1 and 2 with all lane configurations (x1, x2, x4, x8). A multi-engine architecture is also presented, where two x8 cores are used in parallel together with a PCIe bridge to fully exploit the capabilities of a PCIe Gen2 x16 link. A data throughput of 3461 MBytes/s has been achieved with a single PCIe Gen2 x8 endpoint. If the dual-engine architecture is used, the throughput increases to 6920 MBytes/s. The presented DMA is currently used in several experiments at the ANKA synchrotron light source.

IEEE-NPSS Real Time Conference | 2014

A new DMA PCIe architecture for Gigabyte data transmission

Lorenzo Rota; Michele Caselle; Suren Chilingaryan; Andreas Kopmann; Marc Weber

PCI Express (PCIe) is a high-speed serial point-to-point interconnect that delivers high-performance data throughput. KIT has developed a Direct Memory Access (DMA) engine compatible with the Xilinx PCIe core to provide a smart and low-occupancy alternative to expensive commercial solutions. In order to maximize the PCIe throughput, the DMA engine adopts a new strategy where the DMA descriptor list is stored inside the FPGA and not in the central memory system. The FPGA design package is complemented with a simple register access to control the DMA engine by a Linux driver. A handshaking sequence between the DMA engine and the Linux driver ensures that no errors occur, even in data transfers of several hundred gigabytes. The design has been tested with Xilinx FPGA Families 6 and 7, and operates with the Xilinx PCIe endpoint generation 1 and 2 with all lane configurations (x1, x2, x4, x8, x16). Data throughput of more than 3.4 GB/s has been achieved with a PCIe Gen2 x8 endpoint. The proposed DMA is currently used in several experiments at the ANKA synchrotron light source.

Journal of Instrumentation | 2016

A high-throughput readout architecture based on PCI-Express Gen3 and DirectGMA technology

Lorenzo Rota; Matthias Vogelgesang; L.E. Ardila Perez; Michele Caselle; Suren Chilingaryan; T. Dritschler; N. Zilio; Andreas Kopmann; M. Balzer; M. Weber

Modern physics experiments produce multi-GB/s data rates. Fast data links and high-performance computing stages are required for continuous data acquisition and processing. Because of their intrinsic parallelism and computational power, GPUs have emerged as an ideal solution to process this data in high-performance computing applications. In this paper we present a high-throughput platform based on direct FPGA-GPU communication. The architecture consists of a Direct Memory Access (DMA) engine compatible with the Xilinx PCI-Express core, a Linux driver for register access, and high-level software to manage direct memory transfers using AMD's DirectGMA technology. Measurements with a Gen3 x8 link show a throughput of 6.4 GB/s for transfers to GPU memory and 6.6 GB/s to system memory. We also assess the possibility of using the architecture in low-latency systems: preliminary measurements show a round-trip latency as low as 1 μs for data transfers to system memory, while the additional latency introduced by OpenCL scheduling is the current limitation for GPU-based systems. Our implementation is suitable for real-time DAQ system applications ranging from photon science and medical imaging to High Energy Physics (HEP) systems.

Physical Review Accelerators and Beams | 2016

Fast Mapping of Terahertz Bursting Thresholds and Characteristics at Synchrotron Light Sources

Miriam Brosi; Johannes Steinmann; Edmund Blomley; Erik Bründermann; Michele Caselle; N. Hiller; Benjamin Kehrer; Y.-L. Mathis; Michael J. Nasse; Lorenzo Rota; Manuel Schedler; Patrik Schönfeldt; Marcel Schuh; Markus Schwarz; Marc Weber; Anke-Susanne Müller

Dedicated optics with extremely short electron bunches enable synchrotron light sources to generate intense coherent THz radiation. The high degree of spatial compression in this so-called low-αc optics entails complex longitudinal dynamics of the electron bunches, which can be probed by studying the fluctuations in the emitted terahertz radiation caused by the micro-bunching instability (“bursting”). This article presents a “quasi-instantaneous” method for measuring the bursting characteristics by simultaneously collecting and evaluating the information from all bunches in a multi-bunch fill, reducing the measurement time from hours to seconds. This speed-up allows systematic studies of the bursting characteristics for various accelerator settings within a single fill of the machine, enabling a comprehensive comparison of the measured bursting thresholds with theoretical predictions by the bunched-beam theory. This paper introduces the method and presents first results obtained at the ANKA synchrotron radiation facility.

Proceedings of Topical Workshop on Electronics for Particle Physics — PoS(TWEPP-17) | 2018

Development of a Front-End ASIC for 1D Detectors with 12 MHz Frame-Rate

Lorenzo Rota; Michele Caselle; M. Balzer; Marc Weber; A. Mozzanica; B. Schmitt

We present a front-end readout ASIC developed for a new family of ultra-fast 1D imaging detectors operating at frame rates of up to 12 MHz. The ASIC, realized in 110 nm CMOS technology, is designed to be compatible with different semiconductor sensors. The final chip will contain up to 128 channels, each consisting of a Charge-Sensitive Amplifier, a noise shaper based on a fully-differential Correlated Double Sampling stage and a Sample-and-Hold buffer. The differential channels are connected through 8:1 analog multiplexers to the output drivers, which directly interface external analog-to-digital converters. A first prototype with a limited number of channels has been characterized with a Si microstrip detector. When operated at the maximum frame rate of 12 MHz, the ASIC exhibits an Equivalent Noise Charge of 417 electrons with a detector capacitance of 1.3 pF.
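The channel count, multiplexing ratio and frame rate quoted above imply a number of analog outputs and a minimum ADC sampling rate. The sketch below works through that arithmetic. It assumes that each multiplexer output carries exactly one sample per channel per frame with no dead time, which the abstract does not state; the function name is hypothetical.

```python
def readout_requirements(channels, mux_ratio, frame_rate_hz):
    """Return (analog outputs, samples per second needed on each output).

    Assumes every multiplexer serializes `mux_ratio` channels once per frame,
    with no idle slots between frames (an assumption of this sketch).
    """
    if channels % mux_ratio != 0:
        raise ValueError("channel count must be a multiple of the mux ratio")
    outputs = channels // mux_ratio
    samples_per_second = mux_ratio * frame_rate_hz
    return outputs, samples_per_second


# Values from the abstract: 128 channels, 8:1 multiplexers, 12 MHz frame rate.
outputs, rate = readout_requirements(128, 8, 12_000_000)
print(f"{outputs} analog outputs, {rate / 1e6:.0f} MS/s per external ADC")
```

With these inputs the sketch gives 16 analog outputs, each requiring an external ADC running at 96 MS/s or faster.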

Journal of Instrumentation | 2017

KAPTURE-2. A picosecond sampling system for individual THz pulses with high repetition rate

Michele Caselle; L.E. Ardila Perez; M. Balzer; Andreas Kopmann; Lorenzo Rota; M. Weber; Miriam Brosi; Johannes Steinmann; Erik Bründermann; Anke-Susanne Müller

This paper presents a novel data acquisition system for continuous sampling of ultra-short pulses generated by terahertz (THz) detectors. Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to digitize pulse shapes with a sampling time down to 3 ps and pulse repetition rates up to 500 MHz. KAPTURE has been integrated as a permanent diagnostic device at ANKA and is used for investigating the emitted coherent synchrotron radiation in the THz range. A second version of KAPTURE has been developed to improve the performance and flexibility. The new version offers a better sampling accuracy for a pulse repetition rate up to 2 GHz. The higher data rate produced by the sampling system is processed in real time by a heterogeneous FPGA and GPU architecture operating continuously at up to 6.5 GB/s. Results in accelerator physics will be reported and the new design of KAPTURE will be discussed.

Proceedings of the International Particle Accelerator Conference (IPAC’17), Copenhagen, DK, May 14-19, 2017 | 2017

4-Channel Single Shot and Turn-by-Turn Spectral Measurements of Bursting CSR

Johannes Steinmann; Edmund Blomley; Miriam Brosi; Erik Bründermann; Michele Caselle; Benjamin Kehrer; Anke-Susanne Müller; Lorenzo Rota; Marcel Schuh; Patrik Schönfeldt; M. Siegel; Marc Weber

The test facility and synchrotron radiation source ANKA at the Karlsruhe Institute of Technology (KIT) in Karlsruhe, Germany, can be operated in a short-pulse mode. Above a threshold current, the high charge density leads to microwave instabilities and the formation of sub-structures. These time-varying sub-structures on bunches of picosecond duration lead to the observation of bursting coherent synchrotron radiation (CSR) in the terahertz (THz) frequency range. The spectral information in this range contains valuable information about the bunch length, shape and sub-structures. We present recent measurements of a spectrometer setup that consists of 4 ultra-fast THz detectors, sensitive in different frequency bands, combined with the KAPTURE readout system developed at KIT for studies requiring high data throughput. This setup allows continuous recording of the spectral information on a bunch-by-bunch and turn-by-turn basis. This contribution describes the potential of time-resolved spectral measurements of the short-bunch beam dynamics.

Journal of Instrumentation | 2017

Evaluation of GPUs as a level-1 track trigger for the High-Luminosity LHC

H. Mohr; T. Dritschler; L. E. Ardila; M. Balzer; Michele Caselle; Suren Chilingaryan; Andreas Kopmann; Lorenzo Rota; T. Schuh; Matthias Vogelgesang; M. Weber

In this work, we investigate the use of GPUs as a way of realizing a low-latency, high-throughput track trigger, using CMS as a showcase example. The CMS detector at the Large Hadron Collider (LHC) will undergo a major upgrade after the long shutdown from 2024 to 2026 when it will enter the high luminosity era. During this upgrade, the silicon tracker will have to be completely replaced. In the High Luminosity operation mode, luminosities of 5–7 × 10^34 cm^-2 s^-1 and pileups averaging at 140 events, with a maximum of up to 200 events, will be reached. These changes will require a major update of the triggering system. The demonstrated systems rely on dedicated hardware such as associative memory ASICs and FPGAs. We investigate the use of GPUs as an alternative way of realizing the requirements of the L1 track trigger. To this end we implemented a Hough transformation track finding step on GPUs and established a low-latency RDMA connection using the PCIe bus. To showcase the benefits of floating point operations, made possible by the use of GPUs, we present a modified algorithm. It uses hexagonal bins for the parameter space and leads to a more truthful representation of the possible track parameters of the individual hits in Hough space. This leads to fewer duplicate candidates and reduces fake track candidates compared to the regular approach. With data-transfer latencies of 2 μs and processing times for the Hough transformation as low as 3.6 μs, we can show that latencies are not as critical as expected. However, computing throughput proves to be challenging due to hardware limitations.
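As an illustration of the Hough transformation step mentioned above, here is a minimal CPU sketch with rectangular bins (the "regular approach", not the hexagonal binning of the paper). The abstract does not give the track parametrization, so the sketch assumes hits at (r, phi) lying on a line phi = phi0 + k * r. All names, bin ranges and hit values are hypothetical.

```python
import math


def hough_track_candidates(hits, k_values, phi0_min, phi0_max, n_phi0, threshold):
    """Vote in a rectangular (k, phi0) accumulator and return populated cells.

    hits     -- iterable of (r, phi) hit coordinates
    k_values -- slope values to scan, one accumulator row per value
    Returns a list of (k, phi0 bin centre, votes) for cells with votes >= threshold.
    """
    width = (phi0_max - phi0_min) / n_phi0
    accumulator = [[0] * n_phi0 for _ in k_values]
    for r, phi in hits:
        for i, k in enumerate(k_values):
            # Each hit votes for the phi0 that would explain it at slope k.
            phi0 = phi - k * r
            j = math.floor((phi0 - phi0_min) / width)
            if 0 <= j < n_phi0:
                accumulator[i][j] += 1
    return [
        (k, phi0_min + (j + 0.5) * width, votes)
        for i, k in enumerate(k_values)
        for j, votes in enumerate(accumulator[i])
        if votes >= threshold
    ]


# Six synthetic hits generated from one track with k = 0.002 and phi0 = 0.52.
radii = [20, 40, 60, 80, 100, 120]
hits = [(r, 0.52 + 0.002 * r) for r in radii]
candidates = hough_track_candidates(
    hits, [-0.004, -0.002, 0.0, 0.002, 0.004], 0.0, 1.0, 20, threshold=5
)
print(candidates)
```

Only the accumulator row with k = 0.002 collects all six votes in a single phi0 bin, so one candidate is returned. On a GPU the per-hit voting loop is the natural unit to parallelize, although the paper's actual kernel design is not described in the abstract.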

IEEE-NPSS Real Time Conference | 2016

An ultra-fast linear array detector for MHz line repetition rate spectroscopy

Lorenzo Rota; M. Balzer; Michele Caselle; Simon Kudella; M. Weber; A. Mozzanica; N. Hiller; Michael J. Nasse; G. Niehues; Patrik Schönfeldt; C. Gerth; Bernd Steffen; S. Walther; Dariusz Makowski; Aleksander Mielczarek

We developed a fast linear array detector to improve the acquisition rate and the resolution of Electro-Optical Spectral Decoding (EOSD) experimental setups currently installed at several light sources. The system consists of a detector board, an FPGA readout board and a high-throughput data link. InGaAs or Si sensors are used to detect near-infrared (NIR) or visible light. The data acquisition, the operation of the detector board and its synchronization with synchrotron machines are handled by the FPGA. The readout architecture is based on a high-throughput PCI-Express data link. In this paper we describe the system and we present preliminary measurements taken at the ANKA storage ring. A line-rate of 2.7 Mlps (lines per second) has been demonstrated.

Proceedings of SPIE | 2016

High-throughput data acquisition and processing for real-time X-ray imaging

Matthias Vogelgesang; Lorenzo Rota; Luis Eduardo Ardila Perez; Michele Caselle; Suren Chilingaryan; Andreas Kopmann

With ever-increasing data rates due to stronger light sources and better detectors, X-ray imaging experiments conducted at synchrotron beamlines face bandwidth and processing limitations that inhibit efficient workflows and prevent real-time operations. We propose an experiment platform comprising programmable hardware and optimized software to lift these limitations and make beamline setups future-proof. The hardware consists of an FPGA-based data acquisition system with custom logic for data pre-processing and a PCIe data connection for transmission of currently up to 6.6 GB/s. Moreover, the accompanying firmware supports pushing data directly into GPU memory using AMD’s DirectGMA technology without crossing system memory first. The GPUs are used to pre-process projection data and reconstruct final volumetric data with OpenCL faster than possible with CPUs alone. Besides more efficient use of resources, this enables a real-time preview of a reconstruction for early quality assessment of both the experiment setup and the investigated sample. The entire system is designed in a modular way and allows swapping all components, e.g. replacing our custom FPGA camera with a commercial system while still reconstructing data with GPUs. Moreover, every component is accessible using a low-level C library or a high-level Python interface in order to integrate these components in any legacy environment.