Search Results (58)

Search Parameters:
Keywords = coded cache

22 pages, 762 KiB  
Article
BTIP: Branch Triggered Instruction Prefetcher Ensuring Timeliness
by Wenhai Lin, Yiquan Lin, Yiquan Chen, Shishun Cai, Zhen Jin, Jiexiong Xu, Yuzhong Zhang and Wenzhi Chen
Electronics 2024, 13(21), 4323; https://doi.org/10.3390/electronics13214323 - 4 Nov 2024
Viewed by 557
Abstract
In CPU microarchitecture, caches store frequently accessed instructions and data by exploiting their locality, reducing memory access latency and improving application performance. However, contemporary applications with large code footprints often experience frequent Icache misses, which significantly degrade performance. Although Fetch-Directed Instruction Prefetching (FDIP) has been widely adopted in commercial processors to reduce Icache misses, our analysis reveals that FDIP still suffers from Icache misses caused by branch mispredictions and late prefetches, leaving considerable opportunity for performance optimization. Priority-Directed Instruction Prefetching (PDIP) has been proposed to reduce Icache misses caused by branch mispredictions in FDIP, but it neglects misses due to late prefetches and suffers from high storage overhead. In this paper, we propose a branch-triggered instruction prefetcher (BTIP), which prefetches the Icache lines that FDIP cannot handle efficiently, covering misses due to both branch misprediction and late prefetch. We also introduce a novel Branch Target Buffer (BTB) organization, BTIP BTB, which stores prefetch metadata and reuses information from existing BTB entries, effectively reducing storage overhead. We implemented BTIP in the ChampSim simulator and evaluated it in detail using traces from the 1st Instruction Prefetching Championship (IPC-1). Our evaluation shows that BTIP outperforms both FDIP and PDIP: it reduces Icache misses by 38.0% and improves performance by 5.1% compared to FDIP, and it outperforms PDIP by 1.6% while using only 41.9% of the storage space required by PDIP. Full article
(This article belongs to the Special Issue Computer Architecture & Parallel and Distributed Computing)
Figure 1. Composition of an entry in conventional BTB.
Figure 2. The process of BTB predicting a branch target.
Figure 3. Overview of decoupled front-end microarchitecture design.
Figure 4. Icache load misses per kilo instructions with FDIP enabled.
Figure 5. The IPC speedup of ideal Icache with FDIP enabled and double-sized Icache. Applications are sorted by the speedup value of the ideal Icache. Avg_all represents the average speedup across all applications, while Avg_top10 indicates the average speedup for the top 10 applications with the highest speedup using the ideal Icache.
Figure 6. The proportion of Icache misses caused by branch mispredictions and late prefetches.
Figure 7. The probability that metadata in the PDIP table are also present in the BTB.
Figure 8. The overall microarchitecture of BTIP, where the gray blocks are newly added or modified components.
Figure 9. Steps for building and storing prefetch metadata.
Figure 10. The modified BTB structure and entry composition.
Figure 11. The IPC improvement of all prefetchers over no prefetch.
Figure 12. Icache misses per kilo instructions.
Figure 13. Prefetch accuracy.
Figure 14. Breakdown of improvements on IPC. FDIP is the baseline.
Figure 15. IPC improvement of BTIP over FDIP with different Icache sizes.
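The BTIP abstract describes reusing BTB entries to hold prefetch metadata. As background, here is a minimal sketch of how a set-associative BTB maps a branch PC to a predicted target — illustrative only, with an assumed geometry (256 sets × 4 ways) and a naive replacement policy, not the BTIP design:

```python
from dataclasses import dataclass

@dataclass
class BTBEntry:
    tag: int = 0
    target: int = 0
    valid: bool = False

class SetAssociativeBTB:
    """Minimal set-associative branch target buffer (illustrative geometry)."""
    def __init__(self, sets=256, ways=4):
        self.sets = sets
        self.table = [[BTBEntry() for _ in range(ways)] for _ in range(sets)]

    def _index(self, pc):
        return (pc >> 2) % self.sets      # drop byte offset, take set bits

    def _tag(self, pc):
        return (pc >> 2) // self.sets     # remaining high bits disambiguate

    def insert(self, pc, target):
        ways = self.table[self._index(pc)]
        # naive replacement: first invalid way, otherwise way 0
        victim = next((w for w in ways if not w.valid), ways[0])
        victim.tag, victim.target, victim.valid = self._tag(pc), target, True

    def lookup(self, pc):
        for w in self.table[self._index(pc)]:
            if w.valid and w.tag == self._tag(pc):
                return w.target           # predicted branch target
        return None                       # BTB miss

btb = SetAssociativeBTB()
btb.insert(0x4000, 0x5000)
print(hex(btb.lookup(0x4000)))
```

A real BTB also stores branch type and uses an LRU-like replacement policy; both are omitted here for brevity.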
20 pages, 1346 KiB  
Article
MNCATM: A Multi-Layer Non-Uniform Coding-Based Adaptive Transmission Method for 360° Video
by Xiang Li, Junfeng Nie, Xinmiao Zhang, Chengrui Li, Yichen Zhu, Yang Liu, Kun Tian and Jia Guo
Electronics 2024, 13(21), 4200; https://doi.org/10.3390/electronics13214200 - 26 Oct 2024
Viewed by 659
Abstract
With the rapid development of multimedia services and smart devices, 360-degree video has enhanced the user viewing experience, ushering in a new era of immersive human–computer interaction. These technologies are increasingly integrated into everyday life, including gaming, education, and healthcare. However, the uneven spatiotemporal distribution of wireless resources presents significant challenges for the transmission of ultra-high-definition 360-degree video streams. To address this issue, this paper proposes a multi-layer non-uniform coding-based adaptive transmission method for 360° video (MNCATM). This method optimizes video caching and transmission by dividing the video into non-uniform tiles and leveraging users’ dynamic field of view (FoV) information and the multi-bitrate characteristics of video content. First, the video transmission process is formalized and modeled, and an adaptive transmission optimization framework for non-uniform video is proposed. Based on this framework, the optimization problem is formulated, and an algorithm is proposed to solve it. Simulation experiments demonstrate that MNCATM outperforms existing transmission schemes in terms of bandwidth utilization and user quality of experience (QoE). MNCATM can effectively utilize network bandwidth, reduce latency, improve transmission efficiency, and maximize user experience quality. Full article
Figure 1. Non-uniformly encoded 360-degree video.
Figure 2. A 360° video service scenario architecture.
Figure 3. QoE comparison of different methods under normal bandwidth.
Figure 4. QoE comparison of different methods under good bandwidth.
Figure 5. QoE comparison of different methods under poor bandwidth.
Figure 6. PSNR comparison of different methods under normal bandwidth.
Figure 7. PSNR comparison of different methods under good bandwidth.
Figure 8. PSNR comparison of different methods under poor bandwidth.
Figure 9. SSIM comparison of different methods under normal bandwidth.
Figure 10. SSIM comparison of different methods under good bandwidth.
Figure 11. SSIM comparison of different methods under poor bandwidth.
Figure 12. MOS comparison of different methods under normal bandwidth.
Figure 13. MOS comparison of different methods under good bandwidth.
Figure 14. MOS comparison of different methods under poor bandwidth.
20 pages, 305 KiB  
Article
Revisiting Database Indexing for Parallel and Accelerated Computing: A Comprehensive Study and Novel Approaches
by Maryam Abbasi, Marco V. Bernardo, Paulo Váz, José Silva and Pedro Martins
Information 2024, 15(8), 429; https://doi.org/10.3390/info15080429 - 24 Jul 2024
Viewed by 1566
Abstract
While the importance of indexing strategies for optimizing query performance in database systems is widely acknowledged, the impact of rapidly evolving hardware architectures on indexing techniques has been an underexplored area. As modern computing systems increasingly leverage parallel processing capabilities, multi-core CPUs, and specialized hardware accelerators, traditional indexing approaches may not fully capitalize on these advancements. This comprehensive experimental study investigates the effects of hardware-conscious indexing strategies tailored for contemporary and emerging hardware platforms. Through rigorous experimentation on a real-world database environment using the industry-standard TPC-H benchmark, this research evaluates the performance implications of indexing techniques specifically designed to exploit parallelism, vectorization, and hardware-accelerated operations. By examining approaches such as cache-conscious B-Tree variants, SIMD-optimized hash indexes, and GPU-accelerated spatial indexing, the study provides valuable insights into the potential performance gains and trade-offs associated with these hardware-aware indexing methods. The findings reveal that hardware-conscious indexing strategies can significantly outperform their traditional counterparts, particularly in data-intensive workloads and large-scale database deployments. Our experiments show improvements ranging from 32.4% to 48.6% in query execution time, depending on the specific technique and hardware configuration. However, the study also highlights the complexity of implementing and tuning these techniques, as they often require intricate code optimizations and a deep understanding of the underlying hardware architecture. Additionally, this research explores the potential of machine learning-based indexing approaches, including reinforcement learning for index selection and neural network-based index advisors. 
While these techniques show promise, with performance improvements of up to 48.6% in certain scenarios, their effectiveness varies across different query types and data distributions. By offering a comprehensive analysis and practical recommendations, this research contributes to the ongoing pursuit of database performance optimization in the era of heterogeneous computing. The findings inform database administrators, developers, and system architects on effective indexing practices tailored for modern hardware, while also paving the way for future research into adaptive indexing techniques that can dynamically leverage hardware capabilities based on workload characteristics and resource availability. Full article
(This article belongs to the Special Issue Advances in High Performance Computing and Scalable Software)
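The abstract above mentions cache-conscious B-Tree variants; one common design point is sizing a node so it fits exactly in a cache line or page. A back-of-envelope sketch of that sizing arithmetic, assuming 8-byte keys and pointers and a 16-byte node header (all assumed values, not taken from the study):

```python
def btree_fanout(node_bytes, key_bytes=8, ptr_bytes=8, header_bytes=16):
    """Max keys per node when a node must fit in `node_bytes`.

    A node with k keys holds k keys, k+1 child pointers, and a header:
        header + k*key + (k+1)*ptr <= node_bytes
    """
    usable = node_bytes - header_bytes - ptr_bytes  # reserve the extra pointer
    return usable // (key_bytes + ptr_bytes)

# Sizing a node to one 64-byte cache line vs. a 4 KiB page (assumed sizes):
print(btree_fanout(64))    # cache-line-sized node
print(btree_fanout(4096))  # page-sized node
```

The tradeoff the calculation exposes: line-sized nodes maximize per-node cache hits but deepen the tree, while page-sized nodes do the opposite — which is why cache-conscious variants often nest line-sized groups inside page-sized nodes.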
23 pages, 931 KiB  
Article
Genetic Algorithm-Based Cooperative Coding and Caching Data Dissemination Scheme in Multi-UAV-Enabled Internet of Vehicles
by Ke Xiao, Jie Hu, Chunlin Li, Wenjie Ji, Jinkun Xu and Huang Du
Sensors 2024, 24(14), 4443; https://doi.org/10.3390/s24144443 - 9 Jul 2024
Viewed by 701
Abstract
Unmanned Aerial Vehicles (UAVs) have emerged as effective tools in disaster-stricken areas, facilitating efficient data dissemination for post-disaster rescue operations. However, the limited onboard energy of UAVs imposes significant constraints on their operational lifespan, presenting substantial challenges for efficient data dissemination. Therefore, this work investigates a data dissemination scheme that enhances the UAVs’ bandwidth efficiency in the multi-UAV-enabled Internet of Vehicles (IoV), thereby reducing UAVs’ energy consumption and improving overall system performance when UAVs hover along designated flight trajectories for data dissemination. Specifically, we first present a software-defined network-based framework for data dissemination in multi-UAV-enabled IoV. Based on this framework, we formulate a problem called C2BS (Coding-based Cooperative Broadcast Scheduling) that optimizes the UAVs’ bandwidth efficiency by leveraging the combined benefits of coding and caching. We demonstrate the NP-hardness of the C2BS problem via a polynomial-time reduction from the simultaneous matrix completion problem. Then, inspired by the benefits offered by genetic algorithms, we propose the Genetic algorithm-based Cooperative Scheduling (GCS) algorithm to address the C2BS problem. This approach encompasses a coding scheme for representing individuals, a fitness function for assessing individuals, crossover and mutation operators for generating offspring, a local search technique to enhance search performance, and a repair operator to rectify infeasible solutions. We also analyze the time complexity of the GCS algorithm. Finally, we present a simulation model to evaluate performance. Experimental results demonstrate the effectiveness of the proposed scheme. Full article
(This article belongs to the Section Intelligent Sensors)
Figure 1. SDN-based data dissemination framework in multi-UAV-enabled IoV.
Figure 2. Flow chart of the GCS algorithm.
Figure 3. Illustration of the individual λ_i^1 in the 1st generation Λ_1.
Figure 4. Simulation scenario.
Figure 5. ASD across varying cache ratios.
Figure 6. ASP across varying cache ratios.
Figure 7. SR across varying cache ratios.
Figure 8. ASD across varying database sizes.
Figure 9. ASP across varying database sizes.
Figure 10. SR across varying database sizes.
Figure 11. ASD across varying data sizes.
Figure 12. ASP across varying data sizes.
Figure 13. SR across varying data sizes.
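The GCS abstract enumerates the standard genetic-algorithm ingredients: an encoding, a fitness function, and crossover and mutation operators. A bare-bones sketch of that loop on bit strings — illustrative only; the paper's GCS algorithm additionally uses a problem-specific encoding, a local search step, and a repair operator for infeasible solutions, none of which are modeled here:

```python
import random

def genetic_search(fitness, length=12, pop_size=20, generations=60,
                   cx_rate=0.8, mut_rate=0.05, seed=1):
    """Bare-bones genetic algorithm over bit-string individuals."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():                               # tournament selection, size 2
            a, b = rng.sample(pop, 2)
            return max(a, b, key=fitness)
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick()[:], pick()[:]
            if rng.random() < cx_rate:            # one-point crossover
                cut = rng.randrange(1, length)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                for i in range(length):           # bit-flip mutation
                    if rng.random() < mut_rate:
                        child[i] ^= 1
                nxt.append(child)
        pop = nxt[:pop_size]
    return max(pop, key=fitness)

best = genetic_search(sum)   # toy fitness: count of ones in the bit string
print(sum(best))
```

In the scheduling setting, the fitness function would instead score a broadcast schedule by its bandwidth efficiency, and the repair operator would fix schedules that violate coding or caching constraints.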
21 pages, 2048 KiB  
Article
GvdsSQL: Heterogeneous Database Unified Access Technology for Wide-Area Environments
by Jing Shang, Limin Xiao, Zhihui Wu, Jinqian Yang, Zhiwen Xiao, Jinquan Wang, Yifei Zhang, Xuguang Chen, Jibin Wang and Huiyang Li
Electronics 2024, 13(8), 1521; https://doi.org/10.3390/electronics13081521 - 17 Apr 2024
Cited by 1 | Viewed by 894
Abstract
In a wide-area environment, leveraging a unified interface for the management of diverse databases is appealing. Nonetheless, variations in access and operation across heterogeneous databases pose challenges in abstracting a unified access model while preserving database-specific operations. Simultaneously, intricate deployment and network conditions in wide-area environments create obstacles for forwarding database requests and achieving high-performance access. To address these challenges, this paper introduces a technology for unified access to heterogeneous databases in wide-area environments, termed Global Virtual Data Space SQL (GvdsSQL). First, this paper implements a unified data access mechanism for heterogeneous databases through metadata extraction, abstracts the unified access model, and accomplishes identification and forwarding of fundamental database operations. Second, the paper introduces a mechanism for expanding database operations through code generation, which achieves compatibility with special database operations by injecting rules to generate code. Finally, this paper implements a multilevel caching mechanism for query results in wide-area databases utilizing semantic analysis; through intelligent analysis of operation statements, it achieves precise management of cache items, enhancing wide-area access performance. Performance improves by approximately 35% and 240% compared to similar methods. Full article
(This article belongs to the Section Artificial Intelligence)
Figure 1. Architecture of GvdsSQL, a unified access technology for heterogeneous databases in wide-area environments. GvdsSQL consists of three main components: metadata-extraction-based data access, code-generation-based operation expansion, and semantic-analysis-based multilevel caching.
Figure 2. Overview of the code generation process. First, user input is formatted and appropriate code snippets are filtered from the code database; then a code wrapper completes the code packaging.
Figure 3. Overview of the semantic analysis multilevel cache system, comprising an SQL statement recognition module, multilevel cache queues, a cache item positioning index, and a publish–subscribe module.
Figure 4. Overview of the multilevel cache, which includes four levels: database name, entity database name, table name, and column name prefix tree.
Figure 5. Database connection information management.
Figure 6. Database information visualization.
Figure 7. System extension plugin generation.
Figure 8. Throughput comparison of GvdsSQL, FDW, and QuickSQL.
Figure 9. Operation latency comparison of GvdsSQL, FDW, and QuickSQL.
Figure 10. Operation latency comparison of GvdsSQL and FDW under different write operation ratios.
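The GvdsSQL abstract describes precise management of cache items through semantic analysis of operation statements. A toy sketch of the general idea — invalidating cached query results by the table names a write statement touches. The regex-based parsing and flat dictionary here are deliberate simplifications, not GvdsSQL's prefix-tree multilevel design:

```python
import re

class SqlResultCache:
    """Toy query-result cache invalidated by table name (illustrative only)."""
    def __init__(self):
        self.cache = {}          # sql text -> (tables touched, result)

    @staticmethod
    def _tables(sql):
        # crude extraction of table names following FROM/JOIN/INTO/UPDATE
        return set(re.findall(r'\b(?:from|join|into|update)\s+(\w+)',
                              sql, re.IGNORECASE))

    def get_or_run(self, sql, run):
        # serve a cached result, or execute `run` and remember the result
        if sql not in self.cache:
            self.cache[sql] = (self._tables(sql), run())
        return self.cache[sql][1]

    def on_write(self, sql):
        # drop every cached query that reads a table this write modifies
        touched = self._tables(sql)
        self.cache = {q: v for q, v in self.cache.items()
                      if not (v[0] & touched)}

c = SqlResultCache()
c.get_or_run("SELECT * FROM users", lambda: ["alice"])
c.on_write("UPDATE users SET name='bob'")
print(len(c.cache))   # the cached query over `users` was dropped
```

A production design would parse statements with a real SQL grammar and, as in the paper's multilevel scheme, index cache items down to column granularity so unrelated writes do not evict them.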
14 pages, 2738 KiB  
Article
Two-Dimensional Protection Code for Virtual Page Information in Translation Lookaside Buffers
by Xin Gao, Naiyuan Cui, Jiawei Nian, Hongjin Liu and Mengfei Yang
Electronics 2024, 13(7), 1320; https://doi.org/10.3390/electronics13071320 - 1 Apr 2024
Viewed by 832
Abstract
Severe conditions such as high-energy particle strikes may induce soft errors in on-chip memory, like cache and translation lookaside buffers (TLBs). As the key component of virtual-to-physical address translation, TLB directly affects processor performance. To protect the virtual page information stored in TLB, several studies have introduced error detection or correction codes. However, most schemes proposed for data TLB cannot support the detection of multi-bit upsets, which is reported as a serious issue in modern fault-tolerant processors due to the downscaling of CMOS process technology. In this paper, we propose a new two-dimensional protection technique, called the matrix protection tag (MaP-Tag), to provide stronger error detection and correction capability to TLB virtual page information. Our proposal reorganizes virtual page information into a matrix and employs adjacent check bits and interleaved check bits for error detection. The two-dimensional design offers one-bit error detection, burst multi-bit error detection, and even multi-bit error correction in some cases. The simulation results show that our proposal can detect almost all error patterns when injected with up to eight bit-flips. Furthermore, the technique provides a better error correction rate than conventional single error correction (SEC) codes. The reliability calculation shows that our proposal is powerful in both error detection and correction with affordable storage overhead. Full article
(This article belongs to the Section Computer Science & Engineering)
Figure 1. Illustration of soft errors in VPN information stored in TLB: (a) correct TLB access, (b) false positive case, and (c) false negative case. The bits and corresponding outputs marked in red exemplify the problems caused by bit flips.
Figure 2. Normalized IPC results with incremental error rates in TLBs. Injected soft errors can incur significant performance loss and even system crashes.
Figure 3. The architectural design of the two-dimensional MaP-Tag scheme.
Figure 4. A practical design of the MaP-Tag scheme for a RISC-V Sv39 virtual memory system.
Figure 5. Detection rate results for single-/multi-bit error patterns.
Figure 6. Correction rate results for single-/multi-bit error patterns.
Figure 7. Reliability of detection results. MaP-Tag has the best fault detection capability among all tested schemes. Values of Parity-1 and SEC-1 at 800 days are extracted for comparison.
Figure 8. Reliability of correction results. MaP-Tag has error correction capability close to that of conventional SEC protection schemes. Values of SEC-1 and MaP-Tag at 500 days are extracted for comparison.
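The MaP-Tag abstract describes reorganizing the virtual page number into a matrix and checking parities in two dimensions. A minimal sketch of plain two-dimensional parity conveys the core property — a single flipped bit fails exactly one row and one column check, which locates and corrects it, while wider corruption is still detected. This shows the general idea only; MaP-Tag's actual layout uses adjacent and interleaved check bits and is more elaborate:

```python
def parities(matrix):
    """Row and column parity bits of a 0/1 matrix."""
    rows = [sum(r) % 2 for r in matrix]
    cols = [sum(c) % 2 for c in zip(*matrix)]
    return rows, cols

def check_and_correct(matrix, rows, cols):
    """Recompute parities; correct a lone bit flip, otherwise just detect."""
    bad_r = [i for i, r in enumerate(matrix) if sum(r) % 2 != rows[i]]
    bad_c = [j for j, c in enumerate(zip(*matrix)) if sum(c) % 2 != cols[j]]
    if not bad_r and not bad_c:
        return "ok"
    if len(bad_r) == 1 and len(bad_c) == 1:
        matrix[bad_r[0]][bad_c[0]] ^= 1       # intersection pinpoints the flip
        return "corrected"
    return "detected"

m = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
r, c = parities(m)
m[1][2] ^= 1                                  # inject a single-bit upset
print(check_and_correct(m, r, c))             # corrected
```

Note the scheme's blind spot, which motivates interleaving: four flips at the corners of a rectangle leave every parity unchanged and escape detection entirely.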
33 pages, 506 KiB  
Article
Fundamental Limits of Coded Caching in Request-Robust D2D Communication Networks
by Wuqu Wang, Zhe Tao, Nan Liu and Wei Kang
Entropy 2024, 26(3), 250; https://doi.org/10.3390/e26030250 - 12 Mar 2024
Viewed by 1202
Abstract
D2D coded caching, originally introduced by Ji, Caire, and Molisch, significantly improves communication efficiency by applying the multi-cast technology proposed by Maddah-Ali and Niesen to the D2D network. Most prior works on D2D coded caching are based on the assumption that all users will request content at the beginning of the delivery phase. However, in practice, this is often not the case. Motivated by this consideration, this paper formulates a new problem called request-robust D2D coded caching. The considered problem includes K users and a content server with access to N files. Only r users, known as requesters, request a file each at the beginning of the delivery phase. The objective is to minimize the average and worst-case delivery rate, i.e., the average and worst-case number of broadcast bits from all users among all possible demands. For this novel D2D coded caching problem, we propose a scheme based on uncoded cache placement and exploiting common demands and one-shot delivery. We also propose information-theoretic converse results under the assumption of uncoded cache placement. Furthermore, we adapt the scheme proposed by Yapar et al. for uncoded cache placement and one-shot delivery to the request-robust D2D coded caching problem and prove that the performance of the adapted scheme is order optimal within a factor of two under uncoded cache placement and within a factor of four in general. Finally, through numerical evaluations, we show that the proposed scheme outperforms known D2D coded caching schemes applied to the request-robust scenario for most cache size ranges. Full article
(This article belongs to the Special Issue Information Theory and Network Coding II)
Figure 1. System model for the request-robust D2D coded caching problem with 3 users. In this realization, User 2 does not request. Solid and dotted lines indicate the placement and delivery phases, respectively.
Figure 2. The request-robust D2D coded caching problem from Section 2.1 with N = 10 and K = 30. The upper figure shows the tradeoff between memory size and the maximum worst-case delivery rate for different requester numbers; the lower figure shows the tradeoff between memory size and the maximum average delivery rate under uniform demand for different requester numbers. The scheme and converse proposed by Ji et al. [8] are both adapted to this request-robust D2D scenario.
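For context on the delivery rates discussed above, the worst-case rate of the original Maddah-Ali–Niesen shared-link coded caching scheme can be evaluated at the memory points M = tN/K (t an integer). A small sketch — background only, since the request-robust D2D setting in the paper has its own rate expressions:

```python
def man_rate(K, N, M):
    """Worst-case delivery rate of the Maddah-Ali--Niesen shared-link
    coded caching scheme at memory M (valid at M = t*N/K, t integer):

        R(M) = K * (1 - M/N) / (1 + K*M/N)
    """
    return K * (1 - M / N) / (1 + K * M / N)

def uncoded_rate(K, N, M):
    """Conventional uncoded delivery: each user's miss is sent separately."""
    return K * (1 - M / N)

# N = 10 files, K = 30 users, as in the numerical setting of Figure 2:
K, N = 30, 10
for M in (1, 2, 5):
    print(f"M={M}: coded {man_rate(K, N, M):.3f} vs uncoded {uncoded_rate(K, N, M):.1f}")
```

The factor 1/(1 + KM/N) is the multicast gain: one coded broadcast simultaneously serves KM/N + 1 users, which is the saving that D2D variants port to device-to-device links.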
13 pages, 251 KiB  
Article
Efficiency of Various Tiling Strategies for the Zuker Algorithm Optimization
by Piotr Blaszynski, Marek Palkowski, Wlodzimierz Bielecki and Maciej Poliwoda
Mathematics 2024, 12(5), 728; https://doi.org/10.3390/math12050728 - 29 Feb 2024
Viewed by 917
Abstract
This paper focuses on optimizing the Zuker RNA folding algorithm, a bioinformatics task with non-serial polyadic dynamic programming and non-uniform loop dependencies. The intricate dependence pattern is represented using affine formulas, enabling the automatic application of tiling strategies via the polyhedral method. Three source-to-source compilers—PLUTO, TRACO, and DAPT—are employed, utilizing techniques such as affine transformations, the transitive closure of dependence relation graphs, and space–time tiling to generate cache-efficient codes, respectively. A dedicated transpose code technique for non-serial polyadic dynamic programming codes is also examined. The study evaluates the performance of these optimized codes for speed-up and scalability on multi-core machines and explores energy efficiency using RAPL. The paper provides insights into related approaches and outlines future research directions within the context of bioinformatics algorithm optimization. Full article
(This article belongs to the Special Issue Numerical Algorithms: Computer Aspects and Related Topics)
Figure 1. Speed-up for AMD Epyc 7542 and 64 threads.
Figure 2. Speed-up for Intel Xeon Gold 6240 and 36 threads.
Figure 3. Execution times for different thread configurations and size = 3000 for AMD Epyc.
Figure 4. Execution times for different thread configurations and size = 3000 for Intel Xeon Gold 6240.
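The Zuker abstract centers on tiling loop nests for cache efficiency. The restructuring itself can be shown on the textbook example of blocked matrix multiplication — illustrative only: the Zuker kernel's non-uniform, non-serial polyadic dependencies make its tiling considerably harder, and Python is used here just to show the loop transformation, not to realize the cache gains (those appear in compiled code):

```python
def matmul_naive(A, B):
    """Reference triple loop over n x n lists of lists."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_tiled(A, B, T=4):
    """Same computation with loops blocked into T x T tiles, so each tile
    of A, B, and C is reused while it would be cache-resident."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, T):
        for jj in range(0, n, T):
            for kk in range(0, n, T):
                for i in range(ii, min(ii + T, n)):
                    for j in range(jj, min(jj + T, n)):
                        for k in range(kk, min(kk + T, n)):
                            C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_naive(A, B) == matmul_tiled(A, B, T=1))  # True
```

Polyhedral tools such as PLUTO, TRACO, and DAPT derive this kind of blocking automatically from the loop nest's affine dependence relations, which is what makes them applicable to the irregular Zuker recurrences.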
13 pages, 612 KiB  
Article
Hierarchical Cache-Aided Networks for Linear Function Retrieval
by Lingyu Zhang, Yun Kong, Youlong Wu and Minquan Cheng
Entropy 2024, 26(3), 195; https://doi.org/10.3390/e26030195 - 25 Feb 2024
Viewed by 1138
Abstract
In a hierarchical caching system, a server is connected to multiple mirrors, each of which is connected to a different set of users, and both the mirrors and the users are equipped with caching memories. All the existing schemes focus on single file retrieval, i.e., each user requests one file. In this paper, we consider the linear function retrieval problem, i.e., each user requests a linear combination of files, which includes single file retrieval as a special case. We propose a new scheme that reduces the transmission load of the first hop by jointly utilizing the two layers’ cache memories, and we show that our scheme achieves the optimal load for the second hop in some cases. Full article
(This article belongs to the Special Issue Advances in Multiuser Information Theory)
Figure 1. The (K_1, K_2; M_1, M_2; N) hierarchical caching system with N = 4, K_1 = K_2 = 2, M_1 = 2, and M_2 = 1.
Figure 2. R_1 for N = 200, K_1 = 20, K_2 = 10.
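Linear function retrieval lets each user demand a linear combination of files, which over GF(2) is a bitwise XOR. A toy illustration of the XOR algebra coded delivery relies on (not the paper's hierarchical scheme): a user holding side information S recovers its demanded combination L from the single coded message S ⊕ L, whether L is a single file or an XOR of several:

```python
def xor_bytes(x, y):
    """Bitwise XOR of two equal-length byte strings (GF(2) addition)."""
    return bytes(a ^ b for a, b in zip(x, y))

# Files as equal-length byte strings (toy sizes).
A, B, C = b"aaaa", b"bbbb", b"cccc"

# The user demands the linear combination A ^ B and has cached C.
demand = xor_bytes(A, B)
cached = C

# The server multicasts cached ^ demand; XOR-ing the cache back out
# recovers the demanded combination, exactly as in single-file retrieval.
multicast = xor_bytes(cached, demand)
recovered = xor_bytes(cached, multicast)
print(recovered == demand)   # True
```

This is why single file retrieval is a special case of linear function retrieval: setting the combination to a single file reduces the demand to the classical one without changing the delivery mechanics.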
23 pages, 663 KiB  
Article
Joint Trajectory Design and Resource Optimization in UAV-Assisted Caching-Enabled Networks with Finite Blocklength Transmissions
by Yang Yang and Mustafa Cenk Gursoy
Drones 2024, 8(1), 12; https://doi.org/10.3390/drones8010012 - 4 Jan 2024
Cited by 2 | Viewed by 1808
Abstract
In this study, we design and analyze a reliability-oriented downlink wireless network assisted by unmanned aerial vehicles (UAVs). This network employs non-orthogonal multiple access (NOMA) transmission and finite blocklength (FBL) codes. In the network, ground user equipments (GUEs) request content from a remote base station (BS), and there are no direct connections between the BS and the GUEs. To address this, we employ a UAV with a limited caching capacity to assist the BS in completing the communication. The UAV can either request uncached content from the BS and then serve the GUEs or directly transmit cached content to the GUEs. In this paper, we first introduce the decoding error rate within the FBL regime and explore caching policies for the UAV. Subsequently, we formulate an optimization problem aimed at minimizing the average maximum end-to-end decoding error rate across all GUEs while considering the coding length and maximum UAV transmission power constraints. We propose a two-step alternating optimization scheme embedded within a deep deterministic policy gradient (DDPG) algorithm to jointly determine the UAV trajectory and transmission power allocations, as well as blocklength of downloading phase, and our numerical results show that the combined learning-optimization algorithm efficiently addresses the considered problem. In particular, it is shown that a well-designed UAV trajectory, relaxing the FBL constraint, increasing the cache size, and providing a higher UAV transmission power budget all lead to improved performance. Full article
Show Figures

Figure 1: An illustration of the considered network.
Figure 2: System topology and frame structure.
Figure 3: An illustration of the caching list.
Figure 4: An example of the caching list with popularity.
Figure 5: Influence of blocklength constraint.
Figure 6: Influence of cache size limitation.
Figure 7: Optimized UAV trajectory.
Figure 8: Convergence.
Figure 9: Hybrid FDMA-NOMA.
Figure 10: m_1 and m_2.
13 pages, 4668 KiB  
Article
A Universal-Verification-Methodology-Based Testbench for the Coverage-Driven Functional Verification of an Instruction Cache Controller
by Cong Liu, Xinyu Xu, Zhenjiao Chen and Binghao Wang
Electronics 2023, 12(18), 3821; https://doi.org/10.3390/electronics12183821 - 9 Sep 2023
Cited by 1 | Viewed by 2544
Abstract
The Cache plays an important role in computer architecture by reducing the processor's access time and improving its performance. The hardware design of the Cache is complex and its functions are challenging to verify, so the traditional Verilog-based verification method is no longer applicable. This paper proposes a comprehensive and efficient verification testbench based on the SystemVerilog language and the universal verification methodology (UVM) for an instruction Cache (I-Cache) controller. Corresponding testcases are designed for each feature of the I-Cache controller and automatically executed with a Python script on an electronic design automation (EDA) tool. After simulating a large number of testcases, the statistics reveal that the module's code coverage is 99.13%, while both the function coverage and the assertion coverage reach 100%. These coverage metrics meet the requirements and ensure the thoroughness of the function verification. Furthermore, the established verification testbench exhibits excellent scalability and reusability, making it easily applicable to higher-level verification scenarios. Full article
(This article belongs to the Special Issue Feature Papers in Circuit and Signal Processing)
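UVM testbenches are written in SystemVerilog, so the following is only a language-neutral analogue (all names hypothetical, not from the paper) of the coverage-driven loop the abstract describes: random stimulus, a golden reference model checked by a scoreboard, and coverage bins that gate completion.

```python
import random

# Stand-in "DUT" and reference model: a trivial lookup, not the
# I-Cache controller itself. The point is the verification flow:
# drive random transactions, compare DUT vs. reference (scoreboard),
# and keep running until every functional-coverage bin is hit.

def dut(addr, cache):          # stand-in design under test
    return ("hit", cache[addr]) if addr in cache else ("miss", None)

def ref_model(addr, cache):    # golden reference model
    return ("hit", cache[addr]) if addr in cache else ("miss", None)

bins = {"hit": 0, "miss": 0}   # functional-coverage bins
cache = {0x10: 0xAB, 0x20: 0xCD}

while not all(bins.values()):  # run until every bin is covered
    addr = random.choice([0x10, 0x20, 0x30])
    got, exp = dut(addr, cache), ref_model(addr, cache)
    assert got == exp          # scoreboard check
    bins[got[0]] += 1

coverage = sum(v > 0 for v in bins.values()) / len(bins)
assert coverage == 1.0
```

In a real UVM testbench the driver, monitor, scoreboard, and covergroups are separate components; collapsing them into one loop here is purely for illustration.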
Show Figures

Figure 1: Relationship between CPU, Cache, and main memory.
Figure 2: Structural block diagram of the I-Cache controller.
Figure 3: The structure of a basic UVM verification testbench.
Figure 4: The architecture of the proposed verification testbench.
Figure 5: The workflow of the reference model.
Figure 6: The inheritance relationship between two classes.
Figure 7: The execution sequence of the UVM phase.
Figure 8: Waveform diagram of sequential fetch instruction.
Figure 9: Waveform diagram of freeze mode.
Figure 10: Results of code coverage.
Figure 11: Results of function coverage.
Figure 12: Results of assertion coverage.
16 pages, 1152 KiB  
Article
NDN-Based Coded Caching Strategy for Satellite Networks
by Zhiguo Liu, Xiaoyong Jin, Yifan Li and Luxi Zhang
Electronics 2023, 12(18), 3756; https://doi.org/10.3390/electronics12183756 - 6 Sep 2023
Cited by 2 | Viewed by 1140
Abstract
Traditional named data networking (NDN) caching segments content into blocks, which introduces a transmission correlation issue: users must obtain the specific data blocks corresponding to the requested file blocks to recover the source data. To solve this issue, in this paper we propose an NDN-based coded caching strategy for low Earth orbit (LEO) satellite networks that uses coding operations to remove the transmission correlation of the data. To achieve efficient content distribution, the satellite nodes' coded packages are strategically placed with the objectives of minimizing the backhaul link load and the content acquisition latency. The optimization problem is solved with a multi-colony ant colony algorithm, enabling fast content retrieval. We also design a cluster cooperative distribution mechanism that simplifies the satellite network's management and reduces the load of the satellite links during content distribution when users are covered by multiple satellite nodes. Finally, we compare the proposed multi-objective coded caching (CCMO) strategy with the most popular (MP) strategy and the random-based coded caching (CCR) strategy in terms of the average cache hit ratio, backhaul link load, and content acquisition delay. The results show that CCMO performs better. Full article
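The "transmission correlation" that coding removes can be illustrated with a toy erasure code (my own minimal sketch, not the paper's scheme, which additionally optimizes placement with a multi-colony ant colony algorithm): with a single XOR parity over k source blocks, any k of the k+1 coded packets recover the file, so a consumer no longer needs one specific block.

```python
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(blocks):
    """k source blocks -> k+1 coded packets (the last is the XOR parity)."""
    return list(blocks) + [reduce(xor, blocks)]

def decode(received, k):
    """Recover the k source blocks from ANY k of the k+1 packets.

    received: dict mapping packet index (0..k) to its payload;
    index k is the parity packet.
    """
    data = {i: p for i, p in received.items() if i < k}
    if len(data) == k:                       # all data packets arrived
        return [data[i] for i in range(k)]
    missing = next(i for i in range(k) if i not in data)
    # XOR of the parity with the surviving data blocks rebuilds the gap.
    data[missing] = reduce(xor, list(data.values()) + [received[k]])
    return [data[i] for i in range(k)]

blocks = [b"aa", b"bb", b"cc"]
pkts = encode(blocks)
# Drop packet 1; any 3 of the 4 coded packets still recover the source.
subset = {i: p for i, p in enumerate(pkts) if i != 1}
assert decode(subset, k=3) == blocks
```

A real deployment would use an MDS or fountain code so that more than one loss can be tolerated; the single-parity case is just the smallest example of the idea.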
Show Figures

Figure 1: Architecture of the satellite networks.
Figure 2: The structure of the cluster head node's cache table.
Figure 3: The structure of the naming dictionary tree.
Figure 4: The structure of the original interest package and coded interest package.
Figure 5: Distribution diagram of Pareto solutions.
Figure 6: The relationship between the average content acquisition delay and the number of iterations.
Figure 7: Content acquisition delay diagram under the number of satellite nodes accessed by different users.
Figure 8: Backhaul link load diagram under the number of satellite nodes accessed by different users.
Figure 9: Comparison of the average cache hit rates of different cache strategies.
Figure A1: Flow chart of the distributed coded packages.
16 pages, 463 KiB  
Article
Multi-User PIR with Cyclic Wraparound Multi-Access Caches
by Kanishak Vaidya and Balaji Sundar Rajan
Entropy 2023, 25(8), 1228; https://doi.org/10.3390/e25081228 - 18 Aug 2023
Viewed by 1270
Abstract
We consider the problem of multi-access cache-aided multi-user Private Information Retrieval (MACAMuPIR) with cyclic wraparound cache access. In MACAMuPIR, several files are replicated across multiple servers. There are multiple users and multiple cache nodes. When the network is not congested, servers fill these cache nodes with the content of the files. During peak network traffic, each user accesses several cache nodes. Every user wants to retrieve one file from the servers but does not want the servers to know their demands. This paper proposes a private retrieval scheme for MACAMuPIR and characterizes the transmission cost for multi-access systems with cyclic wraparound cache access. We formalize privacy and correctness constraints and analyze transmission costs. The scheme outperforms the previously known dedicated cache setup, offering efficient and private retrieval. Results demonstrate the effectiveness of the multi-access approach. Our research contributes an efficient, privacy-preserving solution for multi-user PIR, advancing secure data retrieval from distributed servers. Full article
(This article belongs to the Special Issue Information Theory for Distributed Systems)
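The cyclic wraparound access pattern described above can be sketched in a few lines (the 0-based indexing convention is an assumption made for illustration, not taken from the paper): each user reads L adjacent cache nodes starting at its own index, wrapping around modulo the number of caches.

```python
def accessed_caches(user, L, K):
    """Cache nodes read by `user` under cyclic wraparound multi-access:
    L adjacent caches starting at the user's own index, modulo K caches."""
    return [(user + i) % K for i in range(L)]

# The setting of Figure 1: four users, four helper caches, two servers,
# each user accessing two adjacent helper caches.
for u in range(4):
    print("user", u, "-> caches", accessed_caches(u, L=2, K=4))
```

Note how user 3 wraps around to cache 0, which is exactly what distinguishes this setup from a dedicated-cache system where each user sees only its own node.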
Show Figures

Figure 1: Multi-access coded caching setup with cyclic wraparound cache access with four users, four helper caches, and two servers. Each user accesses two adjacent helper caches.
Figure 2: Comparison of transmission costs for dedicated cache (dotted lines) and multi-access (solid lines) with cyclic wraparound cache access. Here, we take K = 8 users and cache nodes.
Figure 3: Transmission cost for K = 8, L = 2, S = 2, N = 3. The multi-access setup with cyclic wraparound cache access incurs a transmission cost only as high as the dedicated cache setup with equal total memory in both systems.
19 pages, 631 KiB  
Article
Research on Cache Coherence Protocol Verification Method Based on Model Checking
by Yiqiang Zhao, Boning Shi, Qizhi Zhang, Yidong Yuan and Jiaji He
Electronics 2023, 12(16), 3420; https://doi.org/10.3390/electronics12163420 - 11 Aug 2023
Viewed by 2098
Abstract
This paper analyzes the underlying logic of the processor's behavior-level code and proposes an automatic model construction and formal verification method for the cache coherence protocol, with the aim of ensuring data consistency in the processor and the correctness of the cache function. The main idea of this method is to analyze the register transfer level (RTL) code directly at the module level and variable level and to extract the key modules and key variables according to the code information. Then, based on the key variables, conditional behavior statements are retrieved from the code and unnecessary statements are deleted. The construction and simplification of the model's core states are completed automatically, while the library of attributes to be verified is generated simultaneously, using a "white list" as the construction strategy. Finally, complete cache coherence protocol verification is implemented in the model checker UPPAAL. Ultimately, this mechanism reduces the 142 state-transition-path-guided global states of the cache module under verification to 4 core functional states driven by the coherence protocol implementation, effectively reducing the complexity of the formal model, and condenses the 32 verification attributes into 6, reducing the verification time cost by 76.19%. Full article
(This article belongs to the Special Issue Computer-Aided Design for Hardware Security and Trust)
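To make the model-checking idea concrete, here is a minimal explicit-state sketch (my own toy, far simpler than the paper's UPPAAL model): a textbook MESI transition function, exhaustive exploration of the reachable global states, and a coherence safety invariant checked on every one of them.

```python
from itertools import product

M, E, S, I = "M", "E", "S", "I"

def step(states, core, op):
    """Next global MESI state after `core` performs `op` ("read"/"write").
    A textbook simplification (snooping bus, instant writebacks)."""
    states = list(states)
    others_have_copy = any(s != I for j, s in enumerate(states) if j != core)
    if op == "read":
        for j, s in enumerate(states):
            if j != core and s in (M, E):
                states[j] = S              # snooped read downgrades M/E to S
        if states[core] == I:
            states[core] = S if others_have_copy else E
    else:                                  # write: invalidate other copies
        for j in range(len(states)):
            states[j] = I
        states[core] = M
    return tuple(states)

def reachable(n_cores=2):
    """Explicit-state exploration from the all-Invalid start state."""
    frontier, seen = {(I,) * n_cores}, set()
    while frontier:
        st = frontier.pop()
        if st in seen:
            continue
        seen.add(st)
        for core, op in product(range(n_cores), ("read", "write")):
            frontier.add(step(st, core, op))
    return seen

def coherent(st):
    """Safety invariant: an M or E line is the only valid copy."""
    if M in st or E in st:
        return sum(s != I for s in st) == 1
    return True

assert all(coherent(st) for st in reachable(2))
```

The paper's contribution is extracting such a model (and its properties) automatically from RTL; this sketch only shows what "check an invariant over all reachable protocol states" means.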
Show Figures

Figure 1: Multicore processor architecture.
Figure 2: MESI cache coherence protocol.
Figure 3: MESI cache coherence protocol.
Figure 4: State transfer diagram.
Figure 5: Automated formal modeling process.
Figure 6: Tracking of key variables.
Figure 7: The whole process automation.
Figure 8: Cache consistency model.
Figure 9: Comparison of time before and after simplification.
Full article ">
13 pages, 407 KiB  
Communication
Optimal Designs of SVC-Based Content Placement and Delivery in Wireless Caching Networks
by Xuewei Zhang, Lin Zhang, Yuan Ren, Jing Jiang and Junxuan Wang
Sensors 2023, 23(10), 4823; https://doi.org/10.3390/s23104823 - 17 May 2023
Cited by 1 | Viewed by 1330
Abstract
To alleviate the heavy traffic burden over backhaul links and improve the user's quality of service (QoS), edge caching plays an important role in wireless networks. This paper investigates the optimal designs of content placement and transmission in wireless caching networks. The contents to be cached and requested are encoded into individual layers by scalable video coding (SVC), and different sets of layers provide different viewing qualities to end users. The demanded contents are provided by helpers caching the requested layers, or by the macro-cell base station (MBS) otherwise. In the content placement phase, we formulate and solve the delay minimization problem. In the content transmission phase, we establish the sum rate optimization problem. To effectively solve this nonconvex problem, the methods of semi-definite relaxation (SDR), successive convex approximation (SCA), and the arithmetic-geometric mean (AGM) inequality are adopted, after which the original problem is transformed into a convex form. The numerical results show that the transmission delay is reduced by caching contents at helpers. Moreover, the fast convergence of the proposed algorithm for solving the sum rate maximization problem is presented, and the sum rate gain of edge caching is also revealed, as compared to the benchmark scheme without content caching. Full article
(This article belongs to the Special Issue 6G Space-Air-Ground Communication Networks and Key Technologies)
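The layered-delivery idea behind the delay results can be illustrated with a deliberately simple model (everything here, including the function name, rates, and per-layer sizes, is a hypothetical sketch, not the paper's formulation): a user requesting quality level q needs SVC layers 0..q; layers cached at a helper are served at the helper's rate, and the rest come from the MBS over the slower backhaul path.

```python
def delivery_delay(q, cached_layers, layer_bits, r_helper, r_mbs):
    """Toy end-to-end delay for fetching SVC layers 0..q:
    cached layers come from the helper, the rest from the MBS."""
    delay = 0.0
    for layer in range(q + 1):
        rate = r_helper if layer in cached_layers else r_mbs
        delay += layer_bits[layer] / rate
    return delay

bits = [4e6, 2e6, 2e6]   # bits per SVC layer (base + two enhancements)
# Caching the base and first enhancement layer at the helper cuts the delay:
d_cached = delivery_delay(2, {0, 1}, bits, r_helper=20e6, r_mbs=5e6)
d_uncached = delivery_delay(2, set(), bits, r_helper=20e6, r_mbs=5e6)
assert d_cached < d_uncached
```

The paper's actual placement problem chooses which layers to cache under a cache-size budget so as to minimize such a delay across all users; this snippet only shows why caching more layers locally reduces it.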
Show Figures

Figure 1: The overall framework of this paper.
Figure 2: In the considered network, we investigated the downlink scenario for content transmissions, including M helpers and K users. Assume all helpers cache the same files.
Figure 3: The delay performance with varying cache sizes of helpers.
Figure 4: The delay performance with varying R_b.
Figure 5: Convergence behavior of Algorithm 1.
Figure 6: The sum rate performance with varying P_max^B.