The ATLAS TDAQ system as implemented for LHC Run~2. The RoIB component was integrated into the HLT Supervisor (HLTSV) early in the run, so only the combined entity is shown. FE refers to front-end electronics and SCT refers to the Semiconductor Tracker detector. Requests from the HLT nodes to the ROS travel over the same network as the data sent in response. Note that the rates and throughput to permanent storage increased by approximately 50\% over the run period, so the values quoted indicate the initial expectations.
A block diagram of the complete RobinNP firmware, showing a SubRob (of which there are two copies) servicing six input links through common logic before data are transferred across the PCIe interface via DMA.
RobinNP memory access and arbitration schematic. If the number of items stored is equal to or larger than a programmable level (MemFifoFill), the full flag is raised. The test Input Channel is not shown.
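The threshold mechanism described in this caption can be illustrated with a short software model. The sketch below is illustrative only, under assumed semantics (a count of stored items compared against a programmable MemFifoFill level); the class and member names are hypothetical, and the real RobinNP logic is implemented in firmware, not C++.
\begin{verbatim}
// Illustrative software model of the programmable "full" threshold; the real
// RobinNP logic is implemented in firmware, and all names here are hypothetical.
#include <cstddef>
#include <queue>

class ThresholdFifo {
public:
    explicit ThresholdFifo(std::size_t memFifoFill) : memFifoFill_(memFifoFill) {}

    // Store an item; the full flag is raised as soon as the number of stored
    // items is equal to or larger than the programmable MemFifoFill level.
    void push(unsigned item) {
        items_.push(item);
        fullFlag_ = items_.size() >= memFifoFill_;
    }

    // Remove an item and re-evaluate the flag.
    bool pop(unsigned& item) {
        if (items_.empty()) return false;
        item = items_.front();
        items_.pop();
        fullFlag_ = items_.size() >= memFifoFill_;
        return true;
    }

    bool full() const { return fullFlag_; }

private:
    std::queue<unsigned> items_;
    std::size_t memFifoFill_;  // programmable fill level
    bool fullFlag_ = false;    // raised when fill >= MemFifoFill
};
\end{verbatim}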
Schematic of the FIFO Duplicator Protocol.
Run~2 ROS firmware/software interaction.
RobinNP core firmware interface threads. Each firmware SubRob has its own set of threads and FIFOs; thus, for each RobinNP card there are two copies of the structure shown in the diagram.
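As a rough sketch of the per-SubRob duplication described above, the code below creates one set of placeholder interface threads and FIFOs per SubRob, so a card with two SubRobs carries two copies of the structure. The layout and all names are assumptions for illustration and do not reproduce the actual ROS software.
\begin{verbatim}
// Illustrative only: one set of interface threads and FIFOs per firmware SubRob,
// so a RobinNP card with two SubRobs carries two copies of this structure.
#include <queue>
#include <thread>
#include <vector>

struct SubRobInterface {
    std::queue<unsigned> requestFifo;     // placeholder for the firmware-facing FIFOs
    std::queue<unsigned> completionFifo;
    std::thread worker;                   // placeholder interface thread
};

int main() {
    constexpr int kSubRobsPerCard = 2;    // two SubRobs per RobinNP card
    std::vector<SubRobInterface> interfaces(kSubRobsPerCard);
    for (auto& itf : interfaces) {
        itf.worker = std::thread([] {
            // The real interface thread would service its SubRob's FIFOs here.
        });
    }
    for (auto& itf : interfaces) itf.worker.join();
    return 0;
}
\end{verbatim}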
RobinNP request management threads, including Async I/O, Pending Handler and Collector threads. Also shown is the flow of commands and associated descriptors throughout the system.
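The command/descriptor flow between the Async I/O, Pending Handler and Collector threads can be pictured as a three-stage pipeline connected by thread-safe queues. The sketch below is a simplified assumption of such a structure (descriptor contents, queue types and thread bodies are invented for illustration) and is not the actual ROS implementation.
\begin{verbatim}
// Simplified sketch (not the actual ROS code) of descriptors flowing from an
// Async I/O stage to a Pending Handler and on to a Collector thread.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

struct RequestDescriptor { unsigned eventId; };  // hypothetical descriptor

template <typename T>
class BlockingQueue {
public:
    void push(T v) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    T pop() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !q_.empty(); });
        T v = std::move(q_.front());
        q_.pop();
        return v;
    }
private:
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
};

int main() {
    BlockingQueue<RequestDescriptor> toPending, toCollector;

    std::thread asyncIo([&] {           // receives requests and builds descriptors
        for (unsigned id = 0; id < 5; ++id) toPending.push({id});
    });
    std::thread pendingHandler([&] {    // would issue DMA and track pending requests
        for (int i = 0; i < 5; ++i) toCollector.push(toPending.pop());
    });
    std::thread collector([&] {         // assembles and dispatches the responses
        for (int i = 0; i < 5; ++i)
            std::cout << "completed event " << toCollector.pop().eventId << "\n";
    });

    asyncIo.join();
    pendingHandler.join();
    collector.join();
    return 0;
}
\end{verbatim}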
The inverse delete rate (effectively the average time between deletes) observed as a function of the readout fraction for the EB request fractions and fragment sizes indicated, for an RoI size (R) of 4. The coloured dots represent the measurement results and the grey crosses represent results for the same readout fractions, calculated from a fit to these and other measurement results, with the measured inverse delete rate required to be larger than 8~$\mu$s for a fragment size of 450 words and larger than 7~$\mu$s for the other fragment sizes. The lines connecting the dots and crosses have been drawn to guide the eye.
Performance plots based on operational data taken from an ATLAS run from October 2017, with a peak luminosity at start of run of 1.7$\times10^{34}\textrm{cm}^{-2}\textrm{s}^{-1}$ and peak average pileup of 66.5. Such luminosity and pileup conditions are expected to be replicated in Run~3. (left) Maximum ROS buffer occupancy against the number of HLT application instances in use in the HLT farm. (middle) Buffer occupancy versus pileup. (right) The evolution of the maximum buffer occupancy over the course of the run. The dashed line indicates the 64 MB of buffer space available per ROL for Run~1.
(left) Maximum request rates for the most heavily loaded ROS systems for particular detectors during an ATLAS run from October 2017, with a peak luminosity at start of run of 1.7$\times10^{34}\textrm{cm}^{-2}\textrm{s}^{-1}$ and peak average pileup of 66.5. (centre) Maximum request rates for the same ROS servers in the same run, this time as a function of pileup. The rates depend linearly on the pileup, with the Pixel detector most frequently requested across the range. In both left and centre plots the maximum expected request rate (Section~\ref{subsec:requirements}) is shown as a horizontal dot-dashed line labelled 'Req'. The discontinuity near pileup of 38 is due to a temporary pause in data taking during the run and can be ignored. (right) Maximum request rates as a fraction of the L1 rate for the same ROS servers in the same run, once again as a function of time. As can be seen, sections of the Pixel detector readout receive requests at a rate significantly above the L1 rate for the first half of the run. The sharp changes in rate at 5.5 hours (left) and at a pileup of 38 and 47 (right) were due to prescale changes made during the run to optimise HLT farm occupancy, accounting for the fact that the L1 rate will otherwise decay over time. In all plots the order of entries in the legend matches the order of appearance of the corresponding lines.
Distributions of fragment sizes for different pileup regimes for readout slices from the inner tracker (Pixel, SCT and TRT) nearing the limit of S-LINK occupancy. Data are obtained by scaling the distribution from an ATLAS run in mid-2018 with peak average pileup of 54.9 and peak instantaneous luminosity of 2$\times10^{34}\textrm{cm}^{-2}\textrm{s}^{-1}$. The maximum occupancy of the S-LINK implementation for the given systems at 100 kHz L1 rate is shown by the horizontal dashed line. The data series for a pileup value of 70 corresponds to the highest average fragment size for each readout slice, with the series for a pileup value of 50 corresponding to the lowest fragment size and the series for a pileup value of 60 in between. As can be seen, for parts of the Pixel and TRT detectors the average fragment size is such that it will not be possible to operate at 100 kHz in very high pileup conditions without further adaptations. For all systems not included in this plot the distributions show that significant margin remains before the S-LINK occupancy limit is reached, and thus no further action is required. Variations in fragment size within a given detector type are typically explained by differences in geometrical acceptance.
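For orientation, the relation between average fragment size and S-LINK occupancy at a fixed L1 rate can be written as a simple estimate. The bandwidth figure of 160~MB/s used in the example below is an assumed nominal value for the S-LINK implementation and is not taken from the caption.
\[
\textrm{occupancy} \;\approx\; \frac{\langle\textrm{fragment size}\rangle \times f_{\mathrm{L1}}}{B_{\textrm{S-LINK}}},
\qquad \textrm{e.g.}\quad
\frac{300~\textrm{words} \times 4~\textrm{B/word} \times 100~\textrm{kHz}}{160~\textrm{MB/s}} = 0.75 .
\]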
Predicted utilisation (see equation~\ref{eqn:utilisation}) for Run~2 ROS server hardware based on estimated request rates for the Run~3 trigger configuration, assuming luminosity and pileup conditions similar to those of Run~2, but with the updated versions of the server operating system and TDAQ software that were deployed during Run~2. The x-axis shows all servers in the system enumerated from 0 to 101 and grouped by detector system. The maximum utilisation above which saturation is typically experienced is estimated to be in the region of 0.8. For each ROS two points are shown: one for the nominal configuration (i.e. without pre-fetching) and the other with pre-fetching enabled. As can be seen, for two groups of servers the predicted utilisation significantly exceeds the 0.8 limit in both pre-fetch and non-pre-fetch regimes. The left-hand group with the largest utilisation values corresponds to the Liquid Argon (LAr) Calorimeter and the right-hand group corresponds to the Tile Calorimeter. In many cases the pre-fetch case has a higher utilisation value than the non-pre-fetch case. This is expected to be due to the pre-fetch implementation proposed for the start of Run~3 requesting more of the data at once for each event than in Run~2 (see main text for a more detailed description).
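The exact utilisation definition is given by equation~\ref{eqn:utilisation} in the main text and is not reproduced here. As a loose reading of this caption only, and assuming utilisation is the ratio of predicted load to the maximum load a server can sustain, one plausible simplified form would be
\[
U \;\sim\; \frac{\textrm{predicted request load}}{\textrm{maximum sustainable load of the server}},
\]
with saturation typically setting in around $U \approx 0.8$, as quoted in the caption; this form is an assumption for orientation and may differ from the definition in the main text.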
Schematic of the multi-threaded processes of the PC-RoIB and the memory structures that are accessed.
Diagram of the logical firmware blocks making up the QuestNP. The PCIe interface and DMA engine, as well as the S-LINK and TLK blocks, were re-used from the RobinNP codebase.
Comparison of estimated ROS utilisation (see equation~\ref{eqn:utilisation}) for individual servers (same data as Figure~\ref{fig:l2_saturation}), for Run~2 and Run~3 ROS servers with 32 GB RAM configuration. Note that the Run~2 server was operated using the Run~2 operating system and software conditions, whereas the Run~3 server was operated with the operating system and software conditions optimised for the start of Run~3.