Nothing Special   »   [go: up one dir, main page]

Wang et al., 2015 - Google Patents

Phase-reconfigurable shuffle optimization for Hadoop MapReduce

Wang et al., 2015

Document ID
2629380707928123039
Author
Wang J
Qiu M
Guo B
Zong Z
Publication year
Publication venue
IEEE Transactions on Cloud Computing

External Links

Snippet

Hadoop MapReduce is a leading open source framework that supports the realization of the Big Data revolution and serves as a pioneering platform in ultra large amount of information storing and processing. However, tuning a MapReduce system has become a difficult task …
Continue reading at ieeexplore.ieee.org (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Programme initiating; Programme switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067File systems; File servers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30946Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass

Similar Documents

Publication Publication Date Title
Wang et al. Phase-reconfigurable shuffle optimization for Hadoop MapReduce
Zheng et al. PreDatA–preparatory data analytics on peta-scale machines
Hu et al. Flutter: Scheduling tasks closer to data across geo-distributed datacenters
Jha et al. A tale of two data-intensive paradigms: Applications, abstractions, and architectures
Li et al. MapReduce parallel programming model: a state-of-the-art survey
Slagter et al. An improved partitioning mechanism for optimizing massive data analysis using MapReduce
Gharaibeh et al. Efficient large-scale graph processing on hybrid CPU and GPU systems
Fu et al. An optimal locality-aware task scheduling algorithm based on bipartite graph modelling for spark applications
Zhang et al. Accelerate large-scale iterative computation through asynchronous accumulative updates
Alshammari et al. H2hadoop: Improving hadoop performance using the metadata of related jobs
Huang et al. Achieving load balance for parallel data access on distributed file systems
Tian et al. HScheduler: an optimal approach to minimize the makespan of multiple MapReduce jobs
US20210390405A1 (en) Microservice-based training systems in heterogeneous graphic processor unit (gpu) cluster and operating method thereof
Sun et al. Survey of distributed computing frameworks for supporting big data analysis
Song et al. Modulo based data placement algorithm for energy consumption optimization of MapReduce system
Senthilkumar et al. A survey on job scheduling in big data
Jarachanthan et al. Astrea: Auto-serverless analytics towards cost-efficiency and qos-awareness
dos Anjos et al. Smart: An application framework for real time big data analysis on heterogeneous cloud environments
Mohamed et al. Hadoop-MapReduce job scheduling algorithms survey
Es-Sabery et al. Big data solutions proposed for cluster computing systems challenges: A survey
Jing et al. MaMR: High-performance MapReduce programming model for material cloud applications
Fan et al. An evaluation model and benchmark for parallel computing frameworks
Chen et al. Pisces: optimizing multi-job application execution in mapreduce
Sreedhar et al. A survey on big data management and job scheduling
Tang et al. A network load perception based task scheduler for parallel distributed data processing systems