GPU-based reconstruction and data compression at ALICE during LHC Run 3
David Rohr∗, on behalf of the ALICE collaboration
European Organization for Nuclear Research (CERN), Geneva, Switzerland
∗ e-mail: [email protected]
Abstract.
In LHC Run 3, ALICE will increase the data taking rate significantly to 50 kHz continuous read-out of minimum bias Pb–Pb collisions. The reconstruction strategy of the online-offline computing upgrade foresees a first synchronous online reconstruction stage during data taking, enabling detector calibration, and a posterior calibrated asynchronous reconstruction stage. The significant increase in the data rate poses challenges for online and offline reconstruction as well as for data compression. Compared to Run 2, the online farm must process 50 times more events per second and achieve a higher data compression factor. ALICE will rely on GPUs to perform the processing and data compression of the Time Projection Chamber (TPC), the biggest contributor to the data rate, in real time. With GPUs available in the online farm, we are evaluating their usage also for the full tracking chain during the asynchronous reconstruction for the silicon Inner Tracking System (ITS) and the Transition Radiation Detector (TRD). The software is written in a generic way, such that it can also run on processors on the WLCG with the same reconstruction output. We give an overview of the status and the current performance of the reconstruction and the data compression implementations on the GPU for the TPC and for the global reconstruction.

ALICE (A Large Ion Collider Experiment [1]) is one of the four major experiments at the LHC (Large Hadron Collider) at CERN. It is a dedicated heavy-ion experiment studying lead collisions at the LHC at unprecedented energies. During the second long LHC shutdown in 2019 and 2020, the LHC upgrade will provide a higher Pb–Pb collision rate, and ALICE will upgrade many of its detectors and systems [2]. In particular, the main tracking detectors TPC (Time Projection Chamber) and ITS (Inner Tracking System) will be upgraded [3], and the computing scheme will change with the O2 online-offline computing upgrade [4].

ALICE will upgrade the detectors for LHC Run 3 and switch from the current triggered read-out of up to 1 kHz of Pb–Pb events to a continuous read-out of 50 kHz minimum bias Pb–Pb events. The continuous read-out of pp collisions will happen at rates between 200 kHz and 1 MHz. ALICE is abandoning the hardware triggers and will switch to full online processing in software. During data taking, the synchronous processing will serve two main objectives: detector calibration and data compression. With a flat budget and the yearly increases of storage capacity, recording and storing raw data as today is prohibitively expensive at 50 to 100 times the data rate.
[Figure 1 shows the data flow: data links from the detectors (> 3.5 TB/s) feed the readout nodes (> 600 GB/s); the synchronous processing on the online farm (local processing, event/time frame building, calibration and compression) runs during data taking and writes compressed raw data to a disk buffer (< 100 GB/s) and to permanent storage; the asynchronous processing (reprocessing with full calibration, full reconstruction) runs during periods without beam and produces the reconstructed data.]
Figure 1. Illustration of the ALICE computing strategy for Run 3, with synchronous processing during data taking, and asynchronous processing in periods without beam.

ALICE aims at a compression of the TPC data, the largest contributor to the raw data size, by a factor of 20 compared to the zero-suppressed raw data size of Run 2. By producing the calibration during data taking, ALICE will reduce the number of offline reconstruction passes over the data, where today the first two passes serve the calibration. The output of the synchronous data processing will be compressed time frames, which are stored to an on-site disk buffer and from there written to tape. When the computing farm is not fully used for the synchronous processing, e.g. in periods without beam or during pp data taking, it will perform a part of the asynchronous reconstruction, which reprocesses the data and generates the final reconstruction output. The part of the asynchronous processing that exceeds the capacity of the farm will be done on the grid. This asynchronous stage will employ the same algorithms and software as the synchronous stage, but with different settings, additional reconstruction steps, and the final calibration. Figure 1 gives an overview of the O2 computing.

The reconstruction of the central barrel detectors of ALICE, foremost the TPC, is the most computing-intense part of event reconstruction and the focus of this paper. Therefore, ALICE foresees the usage of graphics cards (GPUs) to accelerate these steps. In parallel, a similar effort on a smaller scale has started to investigate whether the forward detector reconstruction could leverage GPUs in the same way. The core part is the tracking of the TPC, which was adapted from the ALICE High Level Trigger [5] and improved to match the Run 2 offline reconstruction in terms of efficiency and resolution. Several new algorithms have been implemented for the GPU reconstruction, in particular for the Inner Tracking System (ITS) [6]. Another addition is the data compression of the TPC, which consists of a track model compression step [7] and an entropy encoding step, which will most likely use ANS [8] encoding. We foresee two GPU processing scenarios:

• Baseline scenario: This contains the minimum set of reconstruction steps on the GPU required to perform the synchronous reconstruction on the online processing farm at the peak data rate assumed for LHC Run 3. This scenario defines the size of the online processing farm, in particular the number of processor cores and GPUs.

• Optimistic scenario: The asynchronous reconstruction will perform many processing steps of the synchronous reconstruction (except for the calibration and the data compression) one more time, thus it can leverage the available GPU algorithms. Since there are many more steps in the asynchronous reconstruction, it will inevitably be CPU bound if all these steps are processed by the processor while there are no additional steps on the GPU. Therefore we aim to offload more processing steps onto the GPU, and a promising candidate is the complete central barrel tracking chain.

Figure 2 gives an overview of the corresponding reconstruction steps.
[Figure 2 shows the chain: TPC Cluster Finder, TPC Track Finding, TPC Track Fit, TPC Track Merging, TPC dE/dx, TPC Track Model Compression, TPC Entropy Compression, TPC Calibration, TPC cluster removal and TPC < 10 MeV/c identification, ITS Track Finding, ITS Track Fit, ITS Vertexing, ITS Afterburner, TPC-ITS Matching, TRD Tracking, TOF Matching, Global Fit and V0 Finding; common GPU components are sorting, material lookup, memory reuse and the GPU API framework; each step is marked as in operation, nearly ready, being studied, or development not started, and the GPU barrel tracking chain is split into the baseline and the optimistic scenario.]
Figure 2. Overview of the relevant processing steps in the central barrel tracking and reconstruction chain.
We show steps of the baseline scenario with a yellow label, and the additional ones of the optimistic scenario with a white label. Green boxes indicate steps that are already fully integrated and tested on the GPU, and blue boxes those where a GPU implementation is principally ready but not fully deployed, with no significant risks left that could prevent an eventual GPU processing.

For the baseline scenario, most steps are essentially ready, except for the TPC Cluster Finder and the identification of TPC tracks below 10 MeV/c. The TPC Cluster Finder is a last-minute addition to the baseline GPU processing. It was originally designated to run on the FPGA in the readout servers, but it has become likely that the FPGA resources will be insufficient to house the full TPC cluster finding. Therefore, it was moved to the GPU reconstruction. Consequently, the GPU implementation started late, but it is already in considerably good shape, with only minor features, like the propagation of Monte Carlo labels, missing. The situation is different for the identification of tracks with low transverse momentum, for which we do not yet have a working prototype that achieves the required efficiency and performance. As will be discussed in section 4, it is not clear whether this step will be needed.

The work plan foresees to consolidate the baseline steps first, facilitating a full system test, and then to integrate the steps of the optimistic scenario following the order defined by the reconstruction chain graph. This will make sure that a consecutive set of steps runs on the GPU, avoiding unnecessary intermediate data transfers back and forth. The blocking part for now is the matching of TPC to ITS tracks, which is required for many posterior steps. The entropy compression, which belongs to the synchronous baseline scenario, is not required to run on the GPU since we already have a sufficiently fast CPU implementation, but it could free up CPU resources for other tasks. From the estimates of the CPU processing times based on the Run 2 offline reconstruction, the vertex finding represents a considerable CPU load, while the task seems to be parallel and suited for GPUs. Therefore, it will make sense to follow the barrel tracking graph up to the V0 finding.

The ALICE Run 3 processing will be based on time frames, which consist of the data recorded over a period of time. The current design foresees 10 to 20 milliseconds, which translates to around 500 to 1000 heavy-ion events at the peak interaction rate of 50 kHz. Reconstruction of the time frames is performed independently. This means that tracks from drift detectors like the TPC might extend from one time frame into the next one. Such tracks will not be reconstructible, which will lead to a loss of statistics of less than 1%, but it simplifies the reconstruction significantly. However, it also means that time frames must be reconstructed as a whole and cannot be split further into parts. The TPC is the largest contributor to the data volume. This means the GPU must either be able to hold the required TPC data for a full time frame, or the processing must happen in an approach similar to a ring buffer, with the data streamed in and out. The first approach would be preferable, since the latter makes the software more complicated.
In the end, however, a mixture of the two is required: at least the cluster finder will use a ring buffer for its input, since the GPU cannot store the full TPC raw data at once.

Many of the steps also use a large scratch memory, which is used temporarily by individual processing steps to store transient results. These steps must run consecutively on the GPU and reuse the memory of the previous steps. Therefore we manage the GPU memory manually. A large buffer is allocated ahead of time. Memory is given to certain reconstruction steps, and then reused for the following steps. For the TPC cluster finding (and also for the TPC track finding [5]), the TPC volume is split into 36 sectors, which are processed in a pipeline. The raw data is needed only for the cluster finding step, so once a sector is finished, its raw data can be removed from the GPU.

Figure 3 gives an overview of the memory allocation. The large buffer is split into a left and a right part. The left part aggregates data that will persist (e.g. the clusters obtained from the TPC cluster finder, which are used by several subsequent steps). The right part houses transient data, which is used only by one reconstruction step and will be overwritten for the next one. In addition, segments in the middle can be handed out as scratch buffers temporarily. The illustration shows from the first to the second row how TPC clusters are aggregated in the persistent region, while the input buffers in the non-persistent region are reused. Multiple kernels can run in parallel. They can belong to the same reconstruction step when it runs in a pipeline, or to independent reconstruction steps. Input data can also be persistent if used many times (like ITS hits), and there may be gaps in the persistent region, because in some cases only upper bounds for buffer sizes are available, resulting in a gap to the next buffer. Over time the size of the persistent region increases, leaving less scratch space, but this does not create a shortage, since the most memory-intense tasks are the TPC cluster finding and track finding, which run at the beginning. At the end of the processing of one time frame, a special optimization can already preload the first TPC raw data buffers for the next time frame, which minimizes GPU idle time between time frames.

Figure 4 shows that the processing time of the individual steps depends essentially linearly on the input data size, and thus on the length of the time frame, with only minor fluctuations.
[Figure 3 shows three snapshots of the GPU memory: persistent data (TPC hits per sector, ITS hits, TPC tracks, ITS tracks, matches) grows from one end of the buffer, while non-persisting input data (TPC raw data per sector) and non-persistent scratch data for the algorithms are reused at the other end; the last snapshot includes the preloaded raw data of the next time frame.]
Figure 3. Illustration of the memory allocation strategy during the processing of a time frame.
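The two-ended allocation described above (persistent data growing from one side, transient scratch data from the other, inside one large preallocated buffer) can be pictured as a simple arena. The following is a minimal C++ sketch under that assumption; the class and method names (TimeFrameArena, allocPersistent, allocScratch) are hypothetical and not the actual O2 memory manager.

// Minimal sketch of a two-ended arena inside one large, preallocated GPU buffer
// (hypothetical names, not the actual O2 memory manager): persistent buffers
// (e.g. TPC clusters, ITS hits, tracks) grow from the left and live until the
// end of the time frame; scratch buffers grow from the right and are released
// wholesale before the next reconstruction step reuses the space.
#include <cassert>
#include <cstddef>

class TimeFrameArena {
 public:
  // base and capacity are assumed to be aligned to the largest alignment used below.
  TimeFrameArena(void* base, std::size_t capacity)
      : mBase(static_cast<char*>(base)), mCapacity(capacity) {}

  // Persistent allocation: bump the left pointer; never freed individually.
  void* allocPersistent(std::size_t size, std::size_t align = 256) {
    mLeft = alignUp(mLeft, align);
    assert(mLeft + size <= mCapacity - mRightUsed && "time frame buffer exhausted");
    void* ptr = mBase + mLeft;
    mLeft += size;
    return ptr;
  }

  // Scratch allocation: bump from the right end.
  void* allocScratch(std::size_t size, std::size_t align = 256) {
    const std::size_t needed = alignUp(size, align);
    assert(mRightUsed + needed <= mCapacity - mLeft && "time frame buffer exhausted");
    mRightUsed += needed;
    return mBase + mCapacity - mRightUsed;
  }

  // Drop all scratch buffers of a finished step, so the next step reuses the space.
  void releaseScratch() { mRightUsed = 0; }

  // Drop everything at the end of the time frame.
  void reset() { mLeft = 0; mRightUsed = 0; }

 private:
  static std::size_t alignUp(std::size_t x, std::size_t a) { return (x + a - 1) / a * a; }
  char* mBase = nullptr;
  std::size_t mCapacity = 0;
  std::size_t mLeft = 0;       // bytes used by persistent buffers (left end)
  std::size_t mRightUsed = 0;  // bytes used by scratch buffers (right end)
};

In such a scheme the persistent region only grows over the time frame while the scratch region is recycled between consecutive steps; the real implementation additionally has to deal with upper-bound buffer sizes, per-sector pipelining, and the preloading of raw data for the next time frame.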
Figure 4. Performance of several GPU reconstruction steps versus the size of the input data.
This is important since it proves that the approach of processing long time frames at once does not produce an overhead in processing time. With the current implementation, the ALICE reconstruction will need 16 GB of GPU memory for the processing of a 10 ms time frame. Processing longer time frames will require GPUs with more memory, which will probably be much more expensive. While some optimizations are possible, it is not clear whether the memory requirements can be reduced sufficiently to operate on GPUs with less than 16 GB of memory without a ring buffer.
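The ring-buffer alternative, which at least the TPC cluster finder will use for its raw-data input, can be pictured as a small pool of sector-sized slots that are refilled as soon as the cluster finder has consumed them. The following is a hedged C++ sketch with hypothetical names and stub bodies (uploadSectorRawData, runClusterFinder, the slot count), not the actual O2 code.

// Hedged sketch of ring-buffered raw-data staging for the TPC cluster finder
// (hypothetical names and stub bodies, not the actual O2 implementation).
// Only a few sector-sized slots of raw data reside on the GPU at any time;
// a slot is reused as soon as the cluster finder is done with it.
#include <array>
#include <vector>

constexpr int kNSectors = 36;   // TPC sectors processed in a pipeline
constexpr int kNSlots = 4;      // raw-data slots resident on the GPU at once

struct RawSlot { std::vector<char> data; };   // stands in for a GPU raw-data buffer
struct SectorClusters { /* persistent output of the cluster finder */ };

// Stub for the host-to-GPU copy of one sector's raw data.
void uploadSectorRawData(int sector, RawSlot& slot) { slot.data.assign(1, char(sector)); }
// Stub for the GPU cluster finder kernel(s) on one sector.
SectorClusters runClusterFinder(const RawSlot&) { return {}; }

std::array<SectorClusters, kNSectors> processTimeFrame() {
  std::array<RawSlot, kNSlots> slots;          // the "ring" of raw-data buffers
  std::array<SectorClusters, kNSectors> clusters;
  for (int sector = 0; sector < kNSectors; ++sector) {
    RawSlot& slot = slots[sector % kNSlots];   // reuses the slot of sector - kNSlots
    // In a real pipeline this upload would be overlapped (e.g. via asynchronous
    // copies) with the cluster finding of the previous sectors.
    uploadSectorRawData(sector, slot);
    clusters[sector] = runClusterFinder(slot);
    // From here on the raw data of this sector is no longer needed; only the
    // clusters are kept (in the persistent region of the time-frame buffer).
  }
  return clusters;
}

With such a scheme only a few slots of raw data need to reside on the GPU at any time, at the price of a more complicated software design, which is why holding the full time frame is preferred where the memory allows it.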
TPC Cluster Removal

The compression of the TPC data involves multiple steps:

• Clusterization of the raw data into clusters (lossy) [5].
• Track model and entropy compression (lossless) [5, 7].
• Removal of clusters of tracks that will not be used for physics analysis (lossy) [9].

The third point, the removal of clusters, can be realized in two ways:

• Strategy A: positive identification of the clusters and tracks to be removed.
• Strategy B: identification of good tracks to be kept, and removal of everything else.

In both cases, the clusters in a tube around the good tracks are protected from removal to ensure the optimal tracking resolution with the final calibration, which can modify the cluster attachment. Strategy B will be faster and remove more clusters than strategy A, but it bears the risk of removing clusters of good tracks if the synchronous reconstruction was unable to reconstruct a good track, or reconstructed it incompletely. Since the tracking algorithm is the same, the difference will depend to the largest extent on the calibration. Currently, both strategies are developed in parallel until the implications of strategy B are fully understood. So far, strategy A lacks the identification of tracks below 10 MeV/c as shown in Fig. 2, which reduces the achievable reduction factors significantly. Currently, strategy A yields a final data rate of 87.7 to 118.1 GB/s of compressed raw data transferred to the storage for 50 kHz Pb–Pb, while strategy B achieves 71.7 to 89.9 GB/s. The O2 Technical Design Report (TDR) assumes an output rate of 88 GB/s. In these ranges, the higher bound represents the current state of the software, while the lower bound uses Monte Carlo information to estimate the highest achievable reduction if track merging and the protection of clusters in the tube around good tracks were 100% efficient.
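To make the difference between the two strategies concrete, the following is a minimal sketch of a strategy-B style selection: only clusters attached to good tracks, or lying inside the protection tube around them, are kept, and everything else is dropped. All types, names, and the tube test are hypothetical illustrations, not the actual O2 compression code.

// Hedged sketch of a strategy-B style cluster selection (hypothetical types and
// names, not the actual O2 implementation): keep clusters attached to good
// tracks or inside a protection tube around them, drop everything else.
#include <cstdint>
#include <functional>
#include <vector>

struct Cluster { float x, y, z; bool keep = false; };
struct Track   { std::vector<uint32_t> clusterIds; bool isGood = false; };

void markClustersToKeep(std::vector<Cluster>& clusters, const std::vector<Track>& tracks,
                        const std::function<bool(const Track&, const Cluster&)>& insideTube) {
  for (const Track& trk : tracks) {
    if (!trk.isGood) continue;            // strategy B: only good tracks protect clusters
    for (uint32_t id : trk.clusterIds) {
      clusters[id].keep = true;           // clusters attached to a good track are kept
    }
  }
  // Also protect unattached clusters inside the tube around good tracks, so that a
  // refit with the final calibration can still pick them up.
  for (Cluster& cl : clusters) {
    if (cl.keep) continue;
    for (const Track& trk : tracks) {
      if (trk.isGood && insideTube(trk, cl)) { cl.keep = true; break; }
    }
  }
  // Clusters with keep == false are removed before the entropy compression.
}

Strategy A would instead positively mark the clusters and tracks to be removed (for instance those of tracks below 10 MeV/c), which avoids the risk of dropping clusters of good tracks but currently achieves a smaller reduction.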
We have presented the online computing strategy of ALICE for Run 3 and the central barrel tracking chain, which is the most promising candidate for GPU usage. The implementation of the software for the baseline scenario, which will run the computing-intense synchronous reconstruction steps on GPUs, is nearly finished, while the offloading of additional steps for the optimistic scenario is ongoing. By reusing the same memory for consecutive processing steps, GPUs with 16 GB of memory can process time frames of around 10 ms. Longer time frames will require more GPU memory, and the usage of GPUs with less memory would necessitate significant software changes and the inclusion of a ring buffer. TPC data reduction strategy B yields the data rates foreseen in the TDR today, but bears the risk of losing good tracks. Strategy A does not bear this risk but does not yet achieve the desired data reduction factors. Its implementation and optimization are ongoing.

References

[1] ALICE Collaboration, "The ALICE experiment at the CERN LHC", J. Inst. 3, S08002 (2008)
[2] ALICE Collaboration, "Upgrade of the ALICE Experiment: Letter of Intent", CERN-LHCC-2012-012 (2012)
[3] ALICE Collaboration, "Technical Design Report for the Upgrade of the ALICE Time Projection Chamber", CERN-LHCC-2013-020 (2013)
[4] ALICE Collaboration, "Technical Design Report for the Upgrade of the Online-Offline Computing System", CERN-LHCC-2015-006, ALICE-TDR-019 (2015)
[5] ALICE Collaboration, "Real-time data processing in the ALICE High Level Trigger at the LHC", Comput. Phys. Commun. 242, 25 (2019), arXiv:1812.08036
[6] M. Puccio for the ALICE Collaboration, "Tracking in high-multiplicity events", Proceedings of Science 287