DOI: 10.5555/1413370.1413416
Research article

BitDew: a programmable environment for large-scale data management and distribution

Published: 15 November 2008

Abstract

Desktop Grids use the computing, network and storage resources of idle desktop PCs distributed over multiple LANs or the Internet to run a wide variety of resource-demanding distributed applications. While these applications need to access, compute, store and circulate large volumes of data, little attention has been paid to data management in such large-scale, dynamic, heterogeneous, volatile and highly distributed Grids. In most cases, data management relies on ad hoc solutions, and providing a general approach remains a challenging issue.
To address this problem, we propose the BitDew framework, a programmable environment for automatic and transparent data management on computational Desktop Grids. This paper describes the BitDew programming interface, its architecture, and the performance evaluation of its runtime components. BitDew relies on a specific set of metadata to drive key data management operations, namely life cycle, distribution, placement, replication and fault tolerance, with a high level of abstraction. The BitDew runtime environment is a flexible distributed service architecture that integrates modular P2P components such as DHTs for a distributed data catalog and collaborative transport protocols for data distribution. Through several examples, we describe how application programmers and BitDew users can exploit BitDew's features. The performance evaluation demonstrates that the high level of abstraction and transparency is obtained with a reasonable overhead, while offering the benefits of scalability, performance and fault tolerance at little programming cost.
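To make the idea of metadata-driven data management concrete, the following is a minimal Python sketch of how a scheduler might use per-datum attributes to decide where copies should live. It is not BitDew's programming interface: the names `DataAttributes`, `Datum`, `schedule`, and the fields `replica`, `fault_tolerant` and `lifetime_s` are hypothetical, chosen only to mirror the life cycle, replication and fault-tolerance operations listed in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class DataAttributes:
    """Hypothetical per-datum metadata; field names are illustrative only."""
    replica: int = 1              # desired number of live copies
    fault_tolerant: bool = False  # replace copies lost when a host leaves
    lifetime_s: float = 3600.0    # seconds after creation before expiry


@dataclass
class Datum:
    name: str
    attrs: DataAttributes
    created_at: float = 0.0
    holders: set = field(default_factory=set)  # hosts storing a copy


def schedule(datum, alive_hosts, now):
    """Return the set of hosts that should hold `datum` at time `now`."""
    # Life cycle: an expired datum is dropped everywhere.
    if now - datum.created_at >= datum.attrs.lifetime_s:
        return set()
    # Copies on hosts that have left the grid are considered lost.
    kept = datum.holders & set(alive_hosts)
    # Without fault tolerance, an already placed datum is never re-replicated.
    if datum.holders and not datum.attrs.fault_tolerant:
        return kept
    # Replication: top up to the target count, picking hosts in sorted order
    # (an arbitrary but deterministic placement policy for this sketch).
    candidates = sorted(set(alive_hosts) - kept)
    missing = max(0, datum.attrs.replica - len(kept))
    return kept | set(candidates[:missing])
```

A caller would invoke `schedule` whenever the set of live hosts changes and store the result back into `datum.holders`. The application only declares attributes, and the placement decisions follow from them, which is the kind of transparency the abstract describes.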

Published In

SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing
November 2008
739 pages
ISBN:9781424428359

Publisher

IEEE Press

Acceptance Rates

SC '08 Paper Acceptance Rate 59 of 277 submissions, 21%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Cited By

  • (2018) Reliable MapReduce computing on opportunistic resources. Cluster Computing 15:2 (145-161). DOI: 10.1007/s10586-011-0158-7. Online publication date: 24-Dec-2018
  • (2016) DCCP. The Journal of Supercomputing 72:7 (2537-2564). DOI: 10.1007/s11227-015-1511-z. Online publication date: 1-Jul-2016
  • (2016) Enabling collaborative MapReduce on the Cloud with a single-sign-on mechanism. Computing 98:1-2 (55-72). DOI: 10.1007/s00607-014-0390-0. Online publication date: 1-Jan-2016
  • (2015) SCOLARS-DV. Proceedings of the 5th International Workshop on Cloud Data and Platforms (1-6). DOI: 10.1145/2744210.2744214. Online publication date: 21-Apr-2015
  • (2013) Active data. Proceedings of the 8th Parallel Data Storage Workshop (39-44). DOI: 10.1145/2538542.2538566. Online publication date: 17-Nov-2013
  • (2012) VMR. Proceedings of the 10th International Workshop on Middleware for Grids, Clouds and e-Science (1-6). DOI: 10.1145/2405136.2405137. Online publication date: 3-Dec-2012
  • (2012) Assessing MapReduce for Internet Computing. Proceedings of the 2012 ACM/IEEE 13th International Conference on Grid Computing (76-84). DOI: 10.1109/Grid.2012.31. Online publication date: 20-Sep-2012
  • (2011) Graph-Cut Based Coscheduling Strategy Towards Efficient Execution of Scientific Workflows in Collaborative Cloud Environments. Proceedings of the 2011 IEEE/ACM 12th International Conference on Grid Computing (34-41). DOI: 10.1109/Grid.2011.14. Online publication date: 21-Sep-2011
  • (2010) MOON. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (95-106). DOI: 10.1145/1851476.1851489. Online publication date: 21-Jun-2010
  • (2009) BLAST Application with Data-Aware Desktop Grid Middleware. Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid (284-291). DOI: 10.1109/CCGRID.2009.91. Online publication date: 18-May-2009
