Vernik et al., 2018 - Google Patents
Stocator: Providing high performance and fault tolerance for apache spark over object storageVernik et al., 2018
View PDF- Document ID
- 3529582268699726904
- Author
- Vernik G
- Factor M
- Kolodner E
- Michiardi P
- Ofer E
- Pace F
- Publication year
- Publication venue
- 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)
External Links
Snippet
Until now object storage has not been a first-class citizen of the Apache Hadoop ecosystem including Apache Spark. Hadoop connectors to object storage have been based on file semantics, an impedance mismatch, which leads to low performance and the need for an …
- 238000003860 storage 0 title abstract description 145
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/30144—Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30575—Replication, distribution or synchronisation of data between databases or within a distributed database; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30182—File system types
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/65—Update
- G06F8/68—Incremental; Differential
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
- G06F3/0601—Dedicated interfaces to storage systems
- G06F3/0628—Dedicated interfaces to storage systems making use of a particular technique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Harter et al. | Slacker: Fast distribution with lazy docker containers | |
US10089183B2 (en) | Method and apparatus for reconstructing and checking the consistency of deduplication metadata of a deduplication file system | |
US9703890B2 (en) | Method and system that determine whether or not two graph-like representations of two systems describe equivalent systems | |
US10430100B2 (en) | Transactional operations in multi-master distributed data management systems | |
Samadi et al. | Performance comparison between Hadoop and Spark frameworks using HiBench benchmarks | |
Bhatotia et al. | Slider: Incremental sliding window analytics | |
US20180253338A1 (en) | Operation efficiency management with respect to application compile-time | |
US20170277556A1 (en) | Distribution system, computer, and arrangement method for virtual machine | |
US11487727B2 (en) | Resolving versions in an append-only large-scale data store in distributed data management systems | |
Tan et al. | Hadoop framework: impact of data organization on performance | |
Miceli et al. | Programming abstractions for data intensive computing on clouds and grids | |
Moise et al. | Optimizing intermediate data management in MapReduce computations | |
Vernik et al. | Stocator: Providing high performance and fault tolerance for apache spark over object storage | |
Venner et al. | Pro apache hadoop | |
Pineda-Morales et al. | Managing hot metadata for scientific workflows on multisite clouds | |
Pan | The performance comparison of hadoop and spark | |
CN112753028B (en) | Client-side file system for remote repository | |
Saurabh et al. | Expelliarmus: Semantic-centric virtual machine image management in IaaS Clouds | |
Knob et al. | An unikernels provisioning architecture for OpenStack | |
Dong | Extending starfish to support the growing hadoop ecosystem | |
Blöcher et al. | ROME: All Overlays Lead to Aggregation, but Some Are Faster than Others | |
Merenstein | Techniques for Storage Performance Measurement and Data Management in Container-Native and Serverless Environments | |
Ren et al. | File system performance tuning for standard big data benchmarks | |
Sim et al. | An Analysis Workflow-Aware Storage System for Multi-Core Active Flash Arrays | |
Dory | Study and Comparison of Elastic Cloud Databases: Myth or Reality? |