002848667 001__ 2848667
002848667 005__ 20240912041744.0
002848667 0248_ $$aoai:cds.cern.ch:2848667$$pcerncds:FULLTEXT$$pcerncds:CERN:FULLTEXT$$pcerncds:CERN
002848667 0247_ $$2DOI$$9Springer$$a10.1140/epjc/s10052-023-11885-1
002848667 037__ $$9arXiv$$aarXiv:2302.03583$$chep-ex
002848667 037__ $$9arXiv:reportnumber$$aDPHEP-2023-01
002848667 035__ $$9arXiv$$aoai:arXiv.org:2302.03583
002848667 035__ $$9Inspire$$aoai:inspirehep.net:2630153$$d2024-09-11T17:20:33Z$$h2024-09-12T02:00:18Z$$mmarcxml$$ttrue$$uhttps://inspirehep.net/api/oai2d
002848667 035__ $$9Inspire$$a2630153
002848667 041__ $$aeng
002848667 100__ $$aBasaglia, T.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 245__ $$9Springer$$aData preservation in high energy physics
002848667 269__ $$c2023-02-07
002848667 260__ $$c2023-09-08
002848667 300__ $$a41 p
002848667 520__ $$9Springer$$aData preservation is a mandatory specification for any present and future experimental facility and it is a cost-effective way of doing fundamental research by exploiting unique data sets in the light of the continuously increasing theoretical understanding. This document summarizes the status of data preservation in high energy physics. The paradigms and the methodological advances are discussed from a perspective of more than ten years of experience with a structured effort at international level. The status and the scientific return related to the preservation of data accumulated at large collider experiments are presented, together with an account of ongoing efforts to ensure long-term analysis capabilities for ongoing and future experiments. Transverse projects aimed at generic solutions, most of which are specifically inspired by open science and FAIR principles, are presented as well. A prospective and an action plan are also indicated.
002848667 520__ $$9arXiv$$aData preservation is a mandatory specification for any present and future experimental facility and it is a cost-effective way of doing fundamental research by exploiting unique data sets in the light of the continuously increasing theoretical understanding. This document summarizes the status of data preservation in high energy physics. The paradigms and the methodological advances are discussed from a perspective of more than ten years of experience with a structured effort at international level. The status and the scientific return related to the preservation of data accumulated at large collider experiments are presented, together with an account of ongoing efforts to ensure long-term analysis capabilities for ongoing and future experiments. Transverse projects aimed at generic solutions, most of which are specifically inspired by open science and FAIR principles, are presented as well. A prospective and an action plan are also indicated.
002848667 540__ $$3preprint$$aCC-BY-4.0$$uhttp://creativecommons.org/licenses/by/4.0/
002848667 540__ $$3publication$$aCC-BY-4.0$$fSCOAP3$$uhttps://creativecommons.org/licenses/by/4.0/
002848667 542__ $$3publication$$bSpringer$$dThe author(s)$$g2023
002848667 595__ $$cHAL
002848667 595_D $$aG$$d2023-02-10$$sfullabs
002848667 595_D $$aG$$d2023-02-13$$sprinted
002848667 65017 $$2arXiv$$ahep-ex
002848667 65017 $$2SzGeCERN$$aParticle Physics - Experiment
002848667 690C_ $$aCERN
002848667 690C_ $$aARTICLE
002848667 700__ $$aBellis, M.$$tGRID:grid.5386.8$$tGRID:grid.263614.4$$uSokendai, Tsukuba$$uKEK, Tsukuba$$uCornell U.$$vCornell University, Ithaca, USA$$vSiena College, New York, USA
002848667 700__ $$aBlomer, J.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aBoyd, J.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aBozzi, C.$$tGRID:grid.470200.1$$uINFN, Ferrara$$uFerrara U.$$vINFN Ferrara, Ferrara, Italy
002848667 700__ $$aBritzger, D.$$tGRID:grid.435824.c$$uMunich, Max Planck Inst.$$vMax-Planck-Institut für Physik, Munich, Germany
002848667 700__ $$aCampana, S.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aCartaro, C.$$tGRID:grid.445003.6$$uSLAC$$vSLAC National Accelerator Laboratory, Menlo Park, USA
002848667 700__ $$aChen, G.$$tGRID:grid.418741.f$$uCAS, IHEP, Dongguan$$vInstitute of High Energy Physics, IHEP, CAS, Beijing, China
002848667 700__ $$aCouturier, B.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aDavid, G.$$tGRID:grid.202665.5$$tGRID:grid.36425.36$$uSUNY, Stony Brook$$uBNL, NSLS$$vBrookhaven National Laboratory, BNL, Upton, USA$$vStony Brook University, Stony Brook, USA
002848667 700__ $$aDiaconu, C.$$mdiaconu@cppm.in2p3.fr$$tGRID:grid.470046.1$$uMarseille, CPPM$$vAix Marseille Univ, CNRS/IN2P3, CPPM, Marseille, France
002848667 700__ $$aDobrin, A.$$tGRID:grid.450283.8$$uBucharest, Inst. Space Science$$vInstitute of Space Science, ISS, Bucharest, Măgurele, Romania
002848667 700__ $$aDuellmann, D.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aEbert, M.$$tGRID:grid.143640.4$$uVictoria U.$$vHEP Research Computing, University of Victoria, Victoria, BC, Canada
002848667 700__ $$aElmer, P.$$tGRID:grid.16750.35$$uPrinceton U.$$vPrinceton University, Princeton, USA
002848667 700__ $$aFernandes, J.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aFields, L.$$tGRID:grid.131063.6$$uU. Notre Dame (main)$$vUniversity of Notre Dame, Notre Dame, USA
002848667 700__ $$aFokianos, P.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aGanis, G.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aGeiser, A.$$tGRID:grid.7683.a$$uDESY$$vDeutsches Elektronen Synchrotron, DESY, Hamburg, Germany
002848667 700__ $$aGheata, M.$$tGRID:grid.450283.8$$uBucharest, Inst. Space Science$$vInstitute of Space Science, ISS, Bucharest, Măgurele, Romania
002848667 700__ $$aLopez, J.B. Gonzalez$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aHara, T.$$tGRID:grid.410794.f$$uKEK, Tsukuba$$vHigh Energy Accelerator Research Organization, KEK, Tsukuba, Japan
002848667 700__ $$aHeinrich, L.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aHerner, K.$$tGRID:grid.417851.e$$uFermilab$$vFermi National Accelerator Laboratory, Batavia, USA
002848667 700__ $$aHildreth, M.$$tGRID:grid.131063.6$$uU. Notre Dame (main)$$vUniversity of Notre Dame, Notre Dame, USA
002848667 700__ $$aJayatilaka, B.$$tGRID:grid.417851.e$$uFermilab$$vFermi National Accelerator Laboratory, Batavia, USA
002848667 700__ $$aKado, M.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aKeeble, O.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aKohls, A.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aNaim, K.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aLange, C.$$tGRID:grid.5991.4$$uPSI, Villigen$$vPaul Scherrer Institut, Villigen, Switzerland
002848667 700__ $$aLassila-Perini, K.$$tGRID:grid.470106.4$$uHelsinki Inst. of Phys.$$vHelsinki Institute of Physics, Helsinki, Finland
002848667 700__ $$aLevonian, S.$$tGRID:grid.7683.a$$uDESY$$vDeutsches Elektronen Synchrotron, DESY, Hamburg, Germany
002848667 700__ $$aMaggi, M.$$uINFN, Bari$$vINFN Bari, Bari, Italy
002848667 700__ $$aMarshall, Z.$$tGRID:grid.184769.5$$uLBL, Berkeley$$vLawrence Berkeley National Laboratory, Berkeley, USA
002848667 700__ $$aVila, P. Mato$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aMečionis, A.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aMorris, A.$$tGRID:grid.10388.32$$uBonn U.$$vUniversity of Bonn, Bonn, Germany
002848667 700__ $$aPiano, S.$$uINFN, Trieste$$vINFN Trieste, Trieste, Italy
002848667 700__ $$aPotekhin, M.$$tGRID:grid.202665.5$$uBNL, NSLS$$vBrookhaven National Laboratory, BNL, Upton, USA
002848667 700__ $$aSchröder, M.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aSchwickerath, U.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aSexton-Kennedy, E.$$tGRID:grid.417851.e$$uFermilab$$vFermi National Accelerator Laboratory, Batavia, USA
002848667 700__ $$aŠimko, T.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aSmith, T.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aSouth, D.$$tGRID:grid.7683.a$$uDESY$$vDeutsches Elektronen Synchrotron, DESY, Hamburg, Germany
002848667 700__ $$aVerbytskyi, A.$$tGRID:grid.435824.c$$uMunich, Max Planck Inst.$$vMax-Planck-Institut für Physik, Munich, Germany
002848667 700__ $$aVidal, M.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aVivace, A.$$tGRID:grid.9132.9$$uCERN$$vCERN, Geneva, Switzerland
002848667 700__ $$aWang, L.$$tGRID:grid.418741.f$$uCAS, IHEP, Dongguan$$vInstitute of High Energy Physics, IHEP, CAS, Beijing, China
002848667 700__ $$aWatt, G.$$tGRID:grid.8250.f$$uDurham U., IPPP$$vIPPP, Durham University, Durham, UK
002848667 700__ $$aWenaus, T.$$tGRID:grid.202665.5$$uBNL, NSLS$$vBrookhaven National Laboratory, BNL, Upton, USA
002848667 710__ $$gDPHEP Collaboration
002848667 773__ $$c795$$n9$$pEur. Phys. J. C$$v83$$y2023
002848667 8564_ $$uhttps://lss.fnal.gov/archive/test-fn/1000/fermilab-fn-1223-csaid-ppd.pdf$$yFermilab Library Server
002848667 8564_ $$82425867$$s50521$$uhttp://cds.cern.ch/record/2848667/files/cms_data_timeline.png$$y00011 CMS data release timeline.
002848667 8564_ $$82425868$$s50266$$uhttp://cds.cern.ch/record/2848667/files/zeuspapers2022only.png$$y00008 Number of ZEUS papers published and anticipated to be published per year. Original 2012 version (left), compared to current 2023 version (right).
002848667 8564_ $$82425869$$s148631$$uhttp://cds.cern.ch/record/2848667/files/REANA.png$$y00013 Example of Higgs-to-four-leptons analysis of CMS open data running on REANA reproducible analysis platform. The analysis consists of four steps and is expressed in the Snakemake workflow specification language. The specification for one of the steps is illustrated on the right.
002848667 8564_ $$82425870$$s18085$$uhttp://cds.cern.ch/record/2848667/files/H1auth_1.png$$y00006 Left: Number of Monte Carlo events produced centrally by the H1 Collaboration. The years without MC production are related to a change of the computing environment, or no MC requests. Right: The number of H1 authors has been increasing since 2019 due to retained analysis capabilities and new interest in $ep$ physics. The colored areas indicate the data taking period (green), the period with active funding (yellow) and the period under the new collaboration agreement in \emph{data preservation mode} (cyan). The number of corresponding publications is also indicated.
002848667 8564_ $$82425871$$s149792$$uhttp://cds.cern.ch/record/2848667/files/babar2new.png$$y00009 BaBar submitted (green) and published (red) papers per year. In 2012 predictions for submissions (yellow) were made for the years 2013 to 2018. In 2012 it was predicted that no analysis would run after 2018.
002848667 8564_ $$82425872$$s99151$$uhttp://cds.cern.ch/record/2848667/files/BellePublishedPaperStatistics.png$$y00010 Number of papers published per year by Belle.
002848667 8564_ $$82425873$$s582780$$uhttp://cds.cern.ch/record/2848667/files/t19905_001_r109187_e003066-eps-converted-to.png$$y00009 Example of a DELPHI event, reconstructed from raw data using the revised software stack.
002848667 8564_ $$82425874$$s20850$$uhttp://cds.cern.ch/record/2848667/files/lep.png$$y00003 LEP publications per experiment. Note that those publications only reflect the work done by the collaborations themselves. Further usage of those publications, also enhancing the impact of the preserved data, is not accounted for by this figure.
002848667 8564_ $$82425875$$s32269$$uhttp://cds.cern.ch/record/2848667/files/CERNVMswgrowth.png$$y00013 Growth of the CMS software stack in approximately 5 years.
002848667 8564_ $$82425876$$s3428943$$uhttp://cds.cern.ch/record/2848667/files/2302.03583.pdf$$yFulltext
002848667 8564_ $$82425877$$s97479$$uhttp://cds.cern.ch/record/2848667/files/PUBLICATIONS_EXPERIMENTS_DPHEP_FIGURE_JAN2023.png$$y00000 The publication record for four major experimental facilities. The end of the data taking is indicated by the red ``stop'' symbol. The coverage by a dedicated data preservation system is also shown as an arrow.
002848667 8564_ $$82425878$$s228680$$uhttp://cds.cern.ch/record/2848667/files/CODP.png$$y00012 Example of a CMS simulated data set released on the CERN Open Data portal. The full data set provenance information is captured and made available to users.
002848667 8564_ $$82425879$$s44936$$uhttp://cds.cern.ch/record/2848667/files/zeuspapers.png$$y00007 Number of ZEUS papers published and anticipated to be published per year. Original 2012 version (left), compared to current 2023 version (right).
002848667 8564_ $$82425880$$s294233$$uhttp://cds.cern.ch/record/2848667/files/atlas_recast.png$$y00001 ATLAS plot.
002848667 8564_ $$82425881$$s12721$$uhttp://cds.cern.ch/record/2848667/files/mcprod_210915.png$$y00005 Left: Number of Monte Carlo events produced centrally by the H1 Collaboration. The years without MC production are related to a change of the computing environment, or no MC requests. Right: The number of H1 authors has been increasing since 2019 due to retained analysis capabilities and new interest in $ep$ physics. The colored areas indicate the data taking period (green), the period with active funding (yellow) and the period under the new collaboration agreement in \emph{data preservation mode} (cyan). The number of corresponding publications is also indicated.
002848667 8564_ $$82425882$$s770791$$uhttp://cds.cern.ch/record/2848667/files/JADEDet.png$$y00002 Longitudinal cross-section of the JADE detector. The diameter of the Jet Chamber is about 1~m.
002848667 8564_ $$82429804$$s1709770$$uhttp://cds.cern.ch/record/2848667/files/FERMILAB-FN-1223-CSAID-PPD.pdf$$yFulltext
002848667 8564_ $$82474949$$s3082153$$uhttp://cds.cern.ch/record/2848667/files/Publication.pdf$$yFulltext from Publisher
002848667 8564_ $$82476703$$s987640$$uhttp://cds.cern.ch/record/2848667/files/PUBLICATIONS_EXPERIMENTS_DPHEP_FIGURE_AUG2023.png$$y00000 The publication record for four major experimental facilities. The period after the data taking is indicated by the shaded area. The coverage by a \textit{dedicated} data preservation project (i.e. pursued in addition to the regular computing activities) is also shown as an arrow labelled DP.
002848667 8564_ $$82476704$$s335769$$uhttp://cds.cern.ch/record/2848667/files/t209724_002_r109187_e003066.png$$y00004 Example of a DELPHI event, reconstructed from raw data using the revised software stack.
002848667 8564_ $$82476705$$s491308$$uhttp://cds.cern.ch/record/2848667/files/CODP_20230514.png$$y00012 Example of a CMS simulated data set released on the CERN Open Data portal. The full data set provenance information is captured and made available to users.
002848667 8564_ $$82429804$$s10131$$uhttp://cds.cern.ch/record/2848667/files/FERMILAB-FN-1223-CSAID-PPD.gif?subformat=icon$$xicon$$yFulltext
002848667 8564_ $$82429804$$s62008$$uhttp://cds.cern.ch/record/2848667/files/FERMILAB-FN-1223-CSAID-PPD.jpg?subformat=icon-1440$$xicon-1440$$yFulltext
002848667 8564_ $$82429804$$s62008$$uhttp://cds.cern.ch/record/2848667/files/FERMILAB-FN-1223-CSAID-PPD.jpg?subformat=icon-640$$xicon-640$$yFulltext
002848667 8564_ $$82429804$$s62008$$uhttp://cds.cern.ch/record/2848667/files/FERMILAB-FN-1223-CSAID-PPD.jpg?subformat=icon-700$$xicon-700$$yFulltext
002848667 8564_ $$82429804$$s9472$$uhttp://cds.cern.ch/record/2848667/files/FERMILAB-FN-1223-CSAID-PPD.jpg?subformat=icon-180$$xicon-180$$yFulltext
002848667 8564_ $$82474949$$s125570$$uhttp://cds.cern.ch/record/2848667/files/Publication.jpg?subformat=icon-1440$$xicon-1440$$yFulltext
002848667 8564_ $$82474949$$s125570$$uhttp://cds.cern.ch/record/2848667/files/Publication.jpg?subformat=icon-640$$xicon-640$$yFulltext
002848667 8564_ $$82474949$$s125570$$uhttp://cds.cern.ch/record/2848667/files/Publication.jpg?subformat=icon-700$$xicon-700$$yFulltext
002848667 8564_ $$82474949$$s15856$$uhttp://cds.cern.ch/record/2848667/files/Publication.jpg?subformat=icon-180$$xicon-180$$yFulltext
002848667 8564_ $$82474949$$s9666$$uhttp://cds.cern.ch/record/2848667/files/Publication.gif?subformat=icon$$xicon$$yFulltext
002848667 960__ $$a13
002848667 980__ $$aARTICLE