Proceeding Downloads
Reconstructing hardware transactional memory for workload optimized systems
Workload optimized systems consisting of a large number of general and special purpose cores, and with support for shared memory programming, are slowly becoming prevalent. One of the major impediments to effective parallel programming on these systems ...
Enhanced adaptive insertion policy for shared caches
The LRU replacement policy is commonly used in the last-level caches of multiprocessors. However, the LRU policy does not work well for memory-intensive workloads whose working sets are larger than the available cache size. When a new arrival cache block is ...
A read-write aware replacement policy for phase change memory
Scaling DRAM will be increasingly difficult due to power and cost constraints. Phase Change Memory (PCM) is an emerging memory technology that can increase main memory capacity in a cost-effective and power-efficient manner. However, PCM incurs ...
Evaluating the performance and scalability of MapReduce applications on X10
MapReduce has been shown to be a simple and efficient way to harness the massive resources of clusters. Recently, researchers have proposed using partitioned global address space (PGAS) based languages and runtimes to ease the programming of large-scale ...
Comparing high level MapReduce query languages
The MapReduce parallel computational model is of increasing importance. A number of High Level Query Languages (HLQLs) have been constructed on top of the Hadoop MapReduce realization, primarily Pig, Hive, and JAQL. This paper makes a systematic ...
A semi-automatic scratchpad memory management framework for CMP
Previous research has demonstrated that scratchpad memory (SPM) consumes far less power and on-chip area than a traditional cache. As a software-managed memory, SPM has been widely adopted in today's mainstream embedded processors. Traditional SPM ...
Parallel binomial valuation of American options with proportional transaction costs
We present a multi-threaded parallel algorithm that computes the ask and bid prices of American options while taking asset transaction costs into consideration. The parallel algorithm is based on the recombining binomial tree model, and is ...
A parallel analysis on scale invariant feature transform (SIFT) algorithm
With the explosive growth of multimedia data on the Internet, effective information retrieval from large-scale multimedia data is becoming more and more important. To retrieve these multimedia data automatically, some features in them must be extracted. ...
Modality conflict discovery for SOA security policies
This paper considers the problem of modality conflicts in security policies for Service-Oriented Architecture (SOA) environments. We describe the importance of this problem and present an algorithm for discovering modality conflicts with low overhead. ...
FPGA implementation of variable-precision floating-point arithmetic
This paper explores the capability of FPGA solutions to accelerate scientific applications with variable-precision floating-point (VP) arithmetic. First, we present a special-purpose Very Long Instruction Word (VLIW) architecture for VP arithmetic (VV-...
Optimization of N-queens solvers on graphics processors
While graphics processing units (GPUs) show high performance for problems with regular structures, they do not perform well for irregular tasks due to the mismatches between irregular problem structures and SIMD-like GPU architectures. In this paper, we ...
Partool: a feedback-directed parallelizer
We present a tool which gives detailed feedback to application developers on how their programs can be made amenable to parallelization. Also, the tool automatically parallelizes the code for a large number of constructs. Since the tool outputs a ...
MT-Profiler: a parallel dynamic analysis framework based on two-stage sampling
Dynamic instrumentation systems offer a valuable solution for program profiling and analysis, architectural simulation, and bug detection. However, the performance of target programs suffers greatly when they are instrumented by these systems. This ...