Volume 20, Issue 3, September 2023
Editor:
Publisher: Association for Computing Machinery, New York, NY, United States
ISSN: 1544-3566
EISSN: 1544-3973
research-article
Open Access
ASM: An Adaptive Secure Multicore for Co-located Mutually Distrusting Processes
Article No.: 32, Pages 1–24, https://doi.org/10.1145/3587480

With the ever-increasing virtualization of software and hardware, the privacy of user-sensitive data is a fundamental concern in computation outsourcing. Secure processors enable a trusted execution environment to guarantee security properties based on ...

research-article
Open Access
Turn-based Spatiotemporal Coherence for GPUs
Article No.: 33, Pages 1–27, https://doi.org/10.1145/3593054

This article introduces turn-based spatiotemporal coherence. Spatiotemporal coherence is a novel coherence implementation that assigns write permission to epochs (or turns) as opposed to a processor core. This paradigm shift in the assignment of write ...

research-article
Open Access
Jointly Optimizing Job Assignment and Resource Partitioning for Improving System Throughput in Cloud Datacenters
Article No.: 34, Pages 1–24, https://doi.org/10.1145/3593055

Colocating multiple jobs on the same server has been widely applied for improving resource utilization in cloud datacenters. However, the colocated jobs would contend for the shared resources, which could lead to significant performance degradation. An ...

research-article
Open Access
TNT: A Modular Approach to Traversing Physically Heterogeneous NOCs at Bare-wire Latency
Article No.: 35, Pages 1–25, https://doi.org/10.1145/3597611

The ideal latency for on-chip network traversal would be the delay incurred from wire traversal alone. Unfortunately, in a realistic modular network, the latency for a packet to traverse the network is significantly higher than this wire delay. The main ...

research-article
Open Access
Accelerating Convolutional Neural Network by Exploiting Sparsity on GPUs
Article No.: 36, Pages 1–26, https://doi.org/10.1145/3600092

The convolutional neural network (CNN) is an important deep learning method, which is widely used in many fields. However, it is very time-consuming to implement the CNN where convolution usually takes most of the time. There are many zero values in ...

research-article
Open Access
GraphTune: An Efficient Dependency-Aware Substrate to Alleviate Irregularity in Concurrent Graph Processing
Article No.: 37, Pages 1–24, https://doi.org/10.1145/3600091

With the increasing need for graph analysis, massive Concurrent iterative Graph Processing (CGP) jobs are usually performed on the common large-scale real-world graph. Although several solutions have been proposed, these CGP jobs are not coordinated with ...

research-article
Open Access
The Impact of Page Size and Microarchitecture on Instruction Address Translation Overhead
Article No.: 38, Pages 1–25, https://doi.org/10.1145/3600089

As the volume of data processed by applications has increased, considerable attention has been paid to data address translation overheads, leading to the widespread use of larger page sizes (“superpages”) and multi-level translation lookaside buffers (...

research-article
Open Access
Cache Programming for Scientific Loops Using Leases
Article No.: 39, Pages 1–25, https://doi.org/10.1145/3600090

Cache management is important in exploiting locality and reducing data movement. This article studies a new type of programmable cache called the lease cache. By assigning leases, software exerts the primary control on when and how long data stays in the ...

research-article
Open Access
MPU: Memory-centric SIMT Processor via In-DRAM Near-bank Computing
Article No.: 40, Pages 1–26, https://doi.org/10.1145/3603113

With the growing number of data-intensive workloads, GPU, which is the state-of-the-art single-instruction-multiple-thread (SIMT) processor, is hindered by the memory bandwidth wall. To alleviate this bottleneck, previously proposed 3D-stacking near-bank ...

research-article
Open Access
rNdN: Fast Query Compilation for NVIDIA GPUs
Article No.: 41, Pages 1–25, https://doi.org/10.1145/3603503

GPU database systems are an effective solution to query optimization, particularly with compilation and data caching. They fall short, however, in end-to-end workloads, as existing compiler toolchains are too expensive for use with short-running queries. ...

research-article
Open Access
Hierarchical Model Parallelism for Optimizing Inference on Many-core Processor via Decoupled 3D-CNN Structure
Article No.: 42, Pages 1–21, https://doi.org/10.1145/3605149

The tremendous success of convolutional neural network (CNN) has made it ubiquitous in many fields of human endeavor. Many applications such as biomedical analysis and scientific data analysis involve analyzing volumetric data. This spawns huge demand for ...

research-article
Open Access
MFFT: A GPU Accelerated Highly Efficient Mixed-Precision Large-Scale FFT Framework
Article No.: 43, Pages 1–23, https://doi.org/10.1145/3605148

Fast Fourier transform (FFT) is widely used in computing applications in large-scale parallel programs, and data communication is the main performance bottleneck of FFT and seriously affects its parallel efficiency. To tackle this problem, we propose a ...

research-article
Open Access
Approx-RM: Reducing Energy on Heterogeneous Multicore Processors under Accuracy and Timing Constraints
Article No.: 44, Pages 1–25, https://doi.org/10.1145/3605214

Reducing energy consumption while providing performance and quality guarantees is crucial for computing systems ranging from battery-powered embedded systems to data centers. This article considers approximate iterative applications executing on ...

research-article
Open Access
SplitZNS: Towards an Efficient LSM-Tree on Zoned Namespace SSDs
Article No.: 45, Pages 1–26, https://doi.org/10.1145/3608476

The Zoned Namespace (ZNS) Solid State Drive (SSD) is a nascent form of storage device that offers novel prospects for the Log Structured Merge Tree (LSM-tree). ZNS exposes erase blocks in SSD as append-only zones, enabling the LSM-tree to gain awareness ...
