- research-article, September 2024
Intermediate Address Space: virtual memory optimization of heterogeneous architectures for cache-resident workloads
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 3, Article No.: 50, Pages 1–23
https://doi.org/10.1145/3659207
The increasing demand for computing power and the emergence of heterogeneous computing architectures have driven the exploration of innovative techniques to address current limitations in both the compute and memory subsystems. One such solution is the ...
- research-article, July 2024
Non-Fusion Based Coherent Cache Randomization Using Cross-Domain Accesses
ASIA CCS '24: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, Pages 186–202
https://doi.org/10.1145/3634737.3645011
Randomization has proven to be an effective defense against conflict-based side-channel attacks in a shared cache. It improves security by assigning a unique randomization scheme to each security domain, e.g., through a different hashing function. However, ...
- research-article, May 2024
Cache-Aware Reinforcement Learning in Large-Scale Recommender Systems
WWW '24: Companion Proceedings of the ACM Web Conference 2024, Pages 284–291
https://doi.org/10.1145/3589335.3648326
Modern large-scale recommender systems are built upon computation-intensive infrastructure and usually suffer from a huge difference in traffic between peak and off-peak periods. In peak periods, it is challenging to perform real-time computation for ...
- research-article, May 2024
No Clash on Cache: Observations from a Multi-tenant Ecommerce Platform
ICPE '24: Proceedings of the 15th ACM/SPEC International Conference on Performance Engineering, Pages 258–266
https://doi.org/10.1145/3629526.3645039
Caching is a classic technique for improving system performance by reducing client-perceived latency and server load. However, cache management still needs to be improved and is even more difficult in multi-tenant systems. To shed light on these problems ...
- research-article, January 2024
CFP: A Coherence-Free Processor Design
Journal of Computer Science and Technology (JCST), Volume 39, Issue 1, Pages 99–102
https://doi.org/10.1007/s11390-023-3964-5
This paper presents the design of a Coherence-Free Processor (CFP) that enables a scalable multiprocessor by eliminating cache coherence operations in both hardware and software. The CFP uses a coherence-free cache (CFC) that can improve the cost-...
- research-article, September 2023
ZGaming: Zero-Latency 3D Cloud Gaming by Image Prediction
ACM SIGCOMM '23: Proceedings of the ACM SIGCOMM 2023 Conference, Pages 710–723
https://doi.org/10.1145/3603269.3604819
In cloud gaming, interactive latency is one of the most important factors in users' experience. Although the interactive latency can be reduced through typical network infrastructures like edge caching and congestion control, the interactive latency of ...
- research-article, June 2023
DUCATI: A Dual-Cache Training System for Graph Neural Networks on Giant Graphs with the GPU
Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 2, Article No.: 166, Pages 1–24
https://doi.org/10.1145/3589311
Recently Graph Neural Networks (GNNs) have achieved great success in many applications. The mini-batch training has become the de-facto way to train GNNs on giant graphs. However, the mini-batch generation task is extremely expensive which slows down the ...
- short-paper, June 2023
RBGC: Repurpose the Buffer of Fixed Graphics Pipeline to Enhance GPU Cache
GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023, Pages 173–177
https://doi.org/10.1145/3583781.3590305
The limited cache size of GPU in general-purpose computing hinders the execution efficiency of thousands of concurrent threads. Several techniques have been proposed to increase the cache size per thread, such as repurposing shared memory and register ...
- research-article, February 2023
A Study of Early Aggregation in Database Query Processing on FPGAs
FPGA '23: Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Pages 55–65
https://doi.org/10.1145/3543622.3573194
In database query processing, aggregation is an operator by which data with a common property is grouped and expressed in a summary form. Early aggregation is a popular method for improving the performance of the aggregation operator. In this paper, we ...
- research-article, November 2022
Predicting reuse interval for optimized web caching: an LSTM-based machine learning approach
SC '22: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, Article No.: 86, Pages 1–15
Caching techniques are widely used in the era of cloud computing from applications, such as Web caches to infrastructures, Memcached and memory caches in computer architectures. Prediction of cached data can greatly help improve cache management and hit ...
- research-article, December 2023
Merging Similar Patterns for Hardware Prefetching
MICRO '22: Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, Pages 1012–1026
https://doi.org/10.1109/MICRO56248.2022.00071
One critical challenge of designing an efficient prefetcher is to strike a balance between performance and hardware overhead. Some state-of-the-art prefetchers achieve very high performance at the price of a very large storage requirement, which makes ...
- research-article, August 2022
VMIFresh: Efficient and Fresh Caches for Virtual Machine Introspection
ARES '22: Proceedings of the 17th International Conference on Availability, Reliability and Security, Article No.: 1, Pages 1–9
https://doi.org/10.1145/3538969.3539002
Virtual machine introspection (VMI) is the process of extracting knowledge about the inner state of a virtual machine from the outside. Traditional passive introspection mechanisms have proved themselves ineffective in many application domains due to ...
- research-article, August 2022
Evolving Skyrmion Racetrack Memory as Energy-Efficient Last-Level Cache Devices
ISLPED '22: Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design, Article No.: 8, Pages 1–6
https://doi.org/10.1145/3531437.3539709
Skyrmion racetrack memory (SK-RM) has been regarded as a promising alternative to replace static random-access memory (SRAM) as a large-size on-chip cache device with high memory density. Different from other nonvolatile random-access memories (NVRAMs),...
- research-article, July 2022
Performance Analysis and Modelling of Concurrent Multi-access Data Structures
SPAA '22: Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures, Pages 333–344
https://doi.org/10.1145/3490148.3538578
The major impediment to scaling concurrent data structures is memory contention when accessing shared data structure access-points, leading to thread serialisation, hindering parallelism. Aiming to address this challenge, a significant amount of work in ...
- rfc, June 2022
RFC 9211: The Cache-Status HTTP Response Header Field
To aid debugging, HTTP caches often append header fields to a response, explaining how they handled the request in an ad hoc manner. This specification defines a standard mechanism to do so that is aligned with HTTP's caching model.
- research-article, May 2022
Building a Fast and Efficient LSM-tree Store by Integrating Local Storage with Cloud Storage
- research-article, May 2022
REMOC: efficient request managements for on-chip memories of GPUs
CF '22: Proceedings of the 19th ACM International Conference on Computing Frontiers, Pages 1–11
https://doi.org/10.1145/3528416.3530229
The on-chip memories of GPUs, including the register file, shared memory and L1 cache, can provide high bandwidth and low latency access for the temporary storage of data. The capacity of L1 cache can be increased by using the registers/shared memory ...
- research-article, April 2022
MagNet: Cooperative Edge Caching by Automatic Content Congregating
WWW '22: Proceedings of the ACM Web Conference 2022, Pages 3280–3288
https://doi.org/10.1145/3485447.3512146
Nowadays, the surge of Internet contents and the need for high Quality of Experience (QoE) put the backbone network under unprecedented pressure. The emerging edge caching solutions help ease the pressure by caching contents closer to users. However, ...
- research-article, February 2022
Accelerating SSSP for Power-Law Graphs
FPGA '22: Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Pages 190–200
https://doi.org/10.1145/3490422.3502358
The single-source shortest path (SSSP) problem is one of the most important and well-studied graph problems widely used in many application domains, such as road navigation, neural image reconstruction, and social network analysis. Although we have ...