Research article, DOI: 10.1145/2542050.2542063

Stretch optimization for virtual screening on multi-user pilot-agent platforms on grid/cloud

Published: 05 December 2013

Abstract

Virtual screening has proven very effective on grid infrastructures, where large-scale deployments have led to the identification of active inhibitors for biological targets of interest against malaria, SARS, and diabetes. Operating a dedicated virtual screening platform on grid resources requires optimizing the scheduling policy. Scheduling can be done at two levels: the site level and the platform level. Site scheduling is done independently at each site; each site is autonomous in its choice of job scheduling and allocates time slots to different groups of users. Platform scheduling is done at the group level: inside a time slot, jobs from many users are allocated. Pilot agents are sent to sites and act as containers for the actual users' jobs. They pick up users' jobs from a central queue, where the second stage of scheduling is done. In this paper, we focus on a pilot-agent platform shared by many virtual screening users, who need a suitable scheduling algorithm to ensure a certain fairness between users. We have studied the scheduling of users' jobs inside the central queue and examined the relevance and impact of different scheduling policies (FIFO, Shortest Processing Time (SPT), Longest Processing Time (LPT), and Round Robin) on the user experience. The optimization criterion used in our research is the stretch, a measure of user experience on the platform. In a first step, we simulated the operation of virtual screening applications on the pilot-agent platform in order to compare the scheduling policies. In simulation, the SPT algorithm was shown to significantly improve scheduling performance. In a second step, the SPT and LPT scheduling policies were implemented on a DIRAC pilot-agent platform at IFI in Hanoi and tested on the EGI Biomed Virtual Organization. The experimental results are in good agreement with the simulation and confirm that the SPT algorithm significantly improves user experience.
The relevance of our conclusions also extends to cloud computing, since cloud infrastructures are also characterized by limited machine availability.
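
The comparison of queue policies described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' simulator (the paper reports using a dedicated simulation and a DIRAC deployment); it is a toy model under stated assumptions: identical pilot agents, non-preemptive jobs, and the per-job definition of stretch that is common in the scheduling literature, (completion time - release time) / processing time. The abstract does not give the exact formula used in the paper. The names `Job`, `pick`, and `simulate` are hypothetical, and the round-robin rule shown (cycle over users in alphabetical order) is only one possible interpretation.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    user: str
    release: float  # time the job enters the central queue
    length: float   # processing time on one pilot agent


def pick(queue, policy, last_user):
    """Return the index in `queue` of the job to start next."""
    if policy == "RR":
        # Assumed round-robin rule: serve users in cyclic alphabetical order,
        # taking the oldest queued job of the chosen user.
        users = sorted({j.user for j in queue})
        nxt = next((u for u in users if u > last_user), users[0])
        candidates = [i for i, j in enumerate(queue) if j.user == nxt]
        return min(candidates, key=lambda i: queue[i].release)
    keys = {
        "FIFO": lambda i: queue[i].release,   # oldest job first
        "SPT": lambda i: queue[i].length,     # shortest job first
        "LPT": lambda i: -queue[i].length,    # longest job first
    }
    return min(range(len(queue)), key=keys[policy])


def simulate(jobs, n_agents, policy):
    """Run all jobs on `n_agents` identical agents; return the mean stretch."""
    pending = sorted(jobs, key=lambda j: j.release)  # not yet released
    queue = []                                       # released and waiting
    agents = [0.0] * n_agents                        # time each agent is free
    heapq.heapify(agents)
    stretches, last_user, clock, i = [], "", 0.0, 0
    while i < len(pending) or queue:
        # The next decision happens when the earliest agent is free,
        # but never before the previous decision.
        now = max(heapq.heappop(agents), clock)
        if not queue:
            # Idle agent: wait for the next job to arrive.
            now = max(now, pending[i].release)
        clock = now
        while i < len(pending) and pending[i].release <= now:
            queue.append(pending[i])
            i += 1
        job = queue.pop(pick(queue, policy, last_user))
        last_user = job.user
        finish = now + job.length
        heapq.heappush(agents, finish)
        stretches.append((finish - job.release) / job.length)
    return sum(stretches) / len(stretches)


if __name__ == "__main__":
    demo = [Job("alice", 0, 10), Job("bob", 0, 1), Job("carol", 0, 2)]
    for policy in ("FIFO", "SPT", "LPT", "RR"):
        print(policy, round(simulate(demo, 1, policy), 3))
```

In this toy workload a long job arriving first delays two short ones, so SPT yields a lower mean stretch than FIFO or LPT. This matches the direction of the result reported in the abstract, but it says nothing about its magnitude on a real platform.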


Cited By

  • (2017) Towards effective scheduling policies for many-task applications: Practice and experience based on HTCaaS. Concurrency and Computation: Practice and Experience, 29:21 (e4242). DOI: 10.1002/cpe.4242. Online publication date: 24-Aug-2017


    Published In

    SoICT '13: Proceedings of the 4th Symposium on Information and Communication Technology
    December 2013
    345 pages
    ISBN: 9781450324540
    DOI: 10.1145/2542050

    Sponsors

    • SOICT: School of Information and Communication Technology - HUST
    • NAFOSTED: The National Foundation for Science and Technology Development
    • ACM Vietnam Chapter: ACM Vietnam Chapter
    • Danang Univ. of Technol.: Danang University of Technology

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. SimGrid
    2. cloud computing
    3. fairness
    4. grid computing
    5. online-algorithm
    6. scheduling
    7. stretch
    8. virtual screening

    Qualifiers

    • Research-article


    Acceptance Rates

    SoICT '13 paper acceptance rate: 40 of 80 submissions, 50%
    Overall acceptance rate: 147 of 318 submissions, 46%
