Streaming Data Reorganization at Scale with DeltaFS Indexed Massive Directories

Authors:

Charles D. Cranor,

Gregory R. Ganger,

Garth A. Gibson,

George Amvrosiadis,

Bradley W. Settlemyer,

Gary Grider

ACM Transactions on Storage (TOS), Volume 16, Issue 4

Article No.: 23, Pages 1 - 31

https://doi.org/10.1145/3415581

Published: 24 September 2020

Abstract

Complex storage stacks providing data compression, indexing, and analytics help leverage the massive amounts of data generated today to derive insights. It is challenging to perform this computation, however, while fully utilizing the underlying storage media. This is because, while storage servers with large core counts are widely available, single-core performance and memory bandwidth per core grow slower than the core count per die. Computational storage offers a promising solution to this problem by utilizing dedicated compute resources along the storage processing path. We present DeltaFS Indexed Massive Directories (IMDs), a new approach to computational storage. DeltaFS IMDs harvest available (i.e., not dedicated) compute, memory, and network resources on the compute nodes of an application to perform computation on data. We demonstrate the efficiency of DeltaFS IMDs by using them to dynamically reorganize the output of a real-world simulation application across 131,072 CPU cores. DeltaFS IMDs speed up reads by 1,740× while only slightly slowing down the writing of data during simulation I/O for in situ data processing.

Index Terms

Streaming Data Reorganization at Scale with DeltaFS Indexed Massive Directories
1. Information systems
  1. Data management systems
    1. Data structures
      1. Data access methods
        Point lookups
      2. Data layout
        Record and block layout
    2. Database management system engines
      1. Stream management
  2. Information storage systems
    1. Storage architectures
      1. Distributed storage

Published In

ACM Transactions on Storage Volume 16, Issue 4

Special Section on Computational Storage and Regular Papers

November 2020

185 pages

ISSN:1553-3077

EISSN:1553-3093

DOI:10.1145/3426401

Editor:
Sam H. Noh
Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea

Copyright © 2020 ACM.

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 September 2020

Accepted: 01 August 2020

Revised: 01 July 2020

Received: 01 February 2020

Published in TOS Volume 16, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

Los Alamos National Laboratory
