Author(s)
| Cárdenas-Montes, M (Madrid, CIEMAT) ; Delgado Peris, A (Madrid, CIEMAT) ; Flix, J (Madrid, CIEMAT ; Barcelona, IFAE) ; Hernández, J M (Madrid, CIEMAT) ; León Holgado, J (Madrid, CIEMAT) ; Morcillo Pérez, C (Madrid, CIEMAT) ; Pérez-Calero Yzquierdo, A (Madrid, CIEMAT ; Barcelona, IFAE) ; Rodríguez Calonge, F J (Madrid, CIEMAT) |
Abstract
| The increasingly large data volumes that the LHC experiments will accumulate in the coming years, especially in the High-Luminosity LHC era, call for a paradigm shift in the way experimental datasets are accessed and analyzed. The current model, based on data reduction on the Grid infrastructure followed by interactive analysis of manageable-size samples on physicists’ individual computers, will be superseded by the adoption of Analysis Facilities. This rapidly evolving concept is converging on dedicated hardware infrastructures and computing services optimized for the effective analysis of large HEP data samples. This paper describes the implementation of this new analysis facility model at the CIEMAT institute in Spain to support the local CMS experiment community. Our work details the deployment of dedicated, high-performance hardware, the operation of data staging and caching services ensuring prompt and efficient access to CMS physics analysis datasets, and the integration and optimization of a custom analysis framework based on ROOT’s RDataFrame and the CMS NanoAOD format. Finally, performance results obtained by benchmarking the deployed infrastructure and software against a CMS analysis workflow are summarized. |
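The abstract only names the analysis framework; as an illustration, the sketch below shows what a minimal ROOT RDataFrame analysis over CMS NanoAOD typically looks like in Python. It is not the paper's framework: the input file URL and the muon selection are hypothetical, while the "Events" tree name and the nMuon/Muon_pt branches follow standard NanoAOD conventions.

```python
import ROOT

# Let RDataFrame parallelize the event loop across available cores
ROOT.EnableImplicitMT()

# Hypothetical NanoAOD file served via XRootD; "Events" is the standard NanoAOD tree
df = ROOT.RDataFrame("Events", "root://xrootd.example.org//store/data/nanoaod_sample.root")

# Example selection: events with at least two muons, then the leading-muon pT
df_sel = (df.Filter("nMuon >= 2", "At least two muons")
            .Define("leading_mu_pt", "Muon_pt[0]"))

# Booking a histogram is lazy; the event loop runs when the result is first accessed
h = df_sel.Histo1D(
    ("leading_mu_pt", "Leading muon p_{T};p_{T} [GeV];Events", 100, 0.0, 200.0),
    "leading_mu_pt",
)

h.Draw()
```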