Author(s)
Barrass, T A (West England U.) ; Maroney, O (West England U.) ; Metson, S (West England U.) ; Newbold, D (West England U.) ; Jank, W (CERN) ; García-Abia, P (Madrid, CIEMAT) ; Hernández, J M (Madrid, CIEMAT) ; Afaq, A (Fermilab) ; Ernst, M (Fermilab) ; Fisk, I (Fermilab) ; Wu, Y (Fermilab) ; Charlot, C (Ecole Polytechnique) ; Semeniouk, I N (Ecole Polytechnique) ; Bonacorsi, D (INFN, CNAF) ; Fanfani, A (INFN, CNAF) ; Grandi, C (INFN, CNAF) ; De Filippis, N (INFN, CNAF) ; Rabbertz, K (Karlsruhe U.) ; Rehn, J (Karlsruhe U.) ; Tuura, L (Northeastern U.) ; Wildish, T (Princeton U.) ; Newbold, D (Rutherford)
Abstract
CMS currently uses a number of tools to transfer data which, taken together, form the basis of a heterogeneous datagrid. The range of tools used, and the directed rather than optimized nature of the recent CMS large-scale data challenge, required the creation of a simple infrastructure that allowed a range of tools to operate in a complementary way. The system created comprises a hierarchy of simple processes (named agents) that propagate files through a number of transfer states. File locations and some application metadata were stored in POOL file catalogues, with LCG LRC or MySQL back-ends. Agents were assigned limited responsibilities, and were restricted to communicating state in a well-defined, indirect fashion through a central transfer management database. In this way, the task of distributing data was easily divided between different groups for implementation. The prototype system was developed rapidly and achieved the required sustained transfer rate of 10 MBps, with O(10^6) files distributed from CERN to 6 sites. Experience with the system during the data challenge raised issues with the underlying technology (MSS write/read, stability of the LRC, maintenance of file catalogues, synchronization of filespaces), all of which were successfully identified and handled. The development of this prototype infrastructure allows us to plan the evolution of backbone CMS data distribution from a simple hierarchy to a more autonomous, scalable model drawing on emerging agent and grid technology.
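The following is a minimal sketch of the agent/state-propagation pattern outlined in the abstract, assuming a central relational database that records a transfer state per file; the table, column, state and function names (transfer_state, guid, pfn, "staged", "transferring", etc.) are illustrative assumptions, not the actual CMS schema or tools. Chaining several such agents, each watching a different pair of states, reproduces the indirect, database-mediated coordination described above.

import sqlite3
import time

DB_PATH = "transfer_mgmt.db"  # stand-in for the central transfer management database

def claim_files(conn, from_state, to_state, limit=10):
    # Fetch a batch of files currently in `from_state` and advance them to
    # `to_state`; agents never talk to each other directly, only through
    # rows in this shared table.
    rows = conn.execute(
        "SELECT guid, pfn FROM transfer_state WHERE state = ? LIMIT ?",
        (from_state, limit),
    ).fetchall()
    for guid, _pfn in rows:
        conn.execute(
            "UPDATE transfer_state SET state = ? WHERE guid = ? AND state = ?",
            (to_state, guid, from_state),
        )
    conn.commit()
    return rows

def run_agent(from_state, to_state, do_work, poll_seconds=30):
    # Each agent has a single, limited responsibility: move files forward by
    # one transfer state, performing whatever external work (e.g. invoking a
    # transfer tool) that step requires.
    conn = sqlite3.connect(DB_PATH)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transfer_state "
        "(guid TEXT PRIMARY KEY, pfn TEXT, state TEXT)"
    )
    while True:
        for guid, pfn in claim_files(conn, from_state, to_state):
            do_work(guid, pfn)
        time.sleep(poll_seconds)

if __name__ == "__main__":
    # Hypothetical example: an agent that picks up files staged for export
    # and hands them to a transfer tool (here just printed).
    run_agent("staged", "transferring",
              lambda guid, pfn: print("transferring", guid, pfn))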