- Sponsor: SIGOPS
Effective Performance Issue Diagnosis with Value-Assisted Cost Profiling
Diagnosing performance issues is often difficult, especially when they occur only during some program executions. Profilers can help with performance debugging, but are ineffective when the most costly functions are not the root causes of performance ...
Foxhound: Server-Grade Observability for Network-Augmented Applications
There is a growing move to offload functionality, e.g., TCP or key-value stores, into programmable networks - either on SmartNICs or programmable switches. While offloading promises significant performance boosts, these programmable devices often ...
OFence: Pairing Barriers to Find Concurrency Bugs in the Linux Kernel
Knowing which functions may execute concurrently is key to finding concurrency-related bugs. Existing tools infer the possibility of concurrency using dynamic analysis or by pairing functions that use the same locks. Code that relies on more relaxed ...
Pocket: ML Serving from the Edge
One of the major challenges in serving ML applications is the resource pressure introduced by the underlying ML frameworks. This becomes a bigger problem at resource-constrained, multi-tenant edge server locations, where it is necessary to scale to a ...
Efficient and Safe I/O Operations for Intermittent Systems
Task-based intermittent software systems always re-execute peripheral input/output (I/O) operations upon power failures since tasks have all-or-nothing semantics. Re-executed I/O wastes significant time and energy and risks memory inconsistency. This ...
ICE: Collaborating Memory and Process Management for User Experience on Resource-limited Mobile Devices
Mobile devices with limited resources are prevalent as they have a relatively low price. Providing a good user experience with limited resources has been a big challenge. This paper found that foreground applications are often unexpectedly interfered ...
Diagnosing Kernel Concurrency Failures with AITIA
The root cause of a kernel concurrency failure is notoriously difficult to identify and diagnose. Kernel concurrency bugs frequently involve challenging patterns such as multi-variable races, data races with asynchronous kernel ...
WAFFLE: Exposing Memory Ordering Bugs Efficiently with Active Delay Injection
Concurrency bugs are difficult to detect, reproduce, and diagnose, as they manifest under rare timing conditions. Recently, active delay injection has proven efficient for exposing one such type of bug, thread-safety violations, with low ...
Model Checking Guided Testing for Distributed Systems
Distributed systems have become the backbone of cloud computing. Incorrect system designs and implementations can greatly impair the reliability of distributed systems. Although a distributed system design modelled in the formal specification can be ...
MariusGNN: Resource-Efficient Out-of-Core Training of Graph Neural Networks
We study training of Graph Neural Networks (GNNs) for large-scale graphs. We revisit the premise of using distributed training for billion-scale graphs and show that for graphs that fit in main memory or the SSD of a single machine, out-of-core ...
Accelerating Graph Mining Systems with Subgraph Morphing
Graph mining applications analyze the structural properties of large graphs. These applications are computationally expensive because finding structural patterns requires checking subgraph isomorphism, which is NP-complete. This paper exploits the sub-...
TEA: A General-Purpose Temporal Graph Random Walk Engine
- Chengying Huan,
- Shuaiwen Leon Song,
- Santosh Pandey,
- Hang Liu,
- Yongchao Liu,
- Baptiste Lepers,
- Changhua He,
- Kang Chen,
- Jinlei Jiang,
- Yongwei Wu
Many real-world graphs are temporal in nature, where the temporal information indicates when a particular edge is changed (e.g., edge insertion and deletion). Performing random walks on such temporal graphs is of paramount value. The state-of-the-art ...
ALT: Breaking the Wall between Data Layout and Loop Optimizations for Deep Learning Compilation
- Zhiying Xu,
- Jiafan Xu,
- Hongding Peng,
- Wei Wang,
- Xiaoliang Wang,
- Haoran Wan,
- Haipeng Dai,
- Yixu Xu,
- Hao Cheng,
- Kun Wang,
- Guihai Chen
Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware. Current deep compilers typically predetermine layouts of tensors and then optimize loops of operators. However, such unidirectional and one-...
REFL: Resource-Efficient Federated Learning
Federated Learning (FL) enables distributed training by learners using local data, thereby enhancing privacy and reducing communication. However, it presents numerous challenges relating to the heterogeneity of the data distribution, device ...
Tabi: An Efficient Multi-Level Inference System for Large Language Models
Today's trend of building ever larger language models (LLMs), while pushing the performance of natural language processing, adds significant latency to the inference stage. We observe that due to the diminishing returns of adding parameters to LLMs, a ...
Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access
As deep learning (DL) inference has been widely adopted for building user-facing applications in many domains, it is increasingly important for DL inference servers to achieve high throughput while preserving bounded latency. DL inference requests can ...
DiLOS: Do Not Trade Compatibility for Performance in Memory Disaggregation
Memory disaggregation has replaced the landscape of datacenters by physically separating compute and memory nodes, achieving improved utilization. As early efforts, kernel paging-based approaches offer transparent virtual memory abstraction for remote ...
vTMM: Tiered Memory Management for Virtual Machines
The memory demand of virtual machines (VMs) is increasing, while the traditional DRAM-only memory system has limited capacity and high power consumption. The tiered memory system can effectively expand the memory capacity and increase the cost ...
Making Dynamic Page Coalescing Effective on Virtualized Clouds
Using huge pages has become a mainstream method to reduce address translation overhead for big memory workloads in modern computer systems. To create huge pages, system software usually uses page coalescing methods to dynamically combine contiguous ...
Omni-Paxos: Breaking the Barriers of Partial Connectivity
Omni-Paxos is a system for state machine replication that is completely resilient to partial network partitions, a major source of service disruptions in recent years. Omni-Paxos achieves its resilience through a decoupled design that separates the ...
CFS: Scaling Metadata Service for Distributed File System via Pruned Scope of Critical Sections
- Yiduo Wang,
- Yufei Wu,
- Cheng Li,
- Pengfei Zheng,
- Biao Cao,
- Yan Sun,
- Fei Zhou,
- Yinlong Xu,
- Yao Wang,
- Guangjun Xie
There is a fundamental tension between metadata scalability and POSIX semantics within distributed file systems. The bottleneck lies in the coordination, mainly locking, used for ensuring strong metadata consistency, namely, atomicity and isolation. ...
OLPart: Online Learning based Resource Partitioning for Colocating Multiple Latency-Critical Jobs on Commodity Computers
Colocating multiple jobs on the same server has been a commonly used approach for improving resource utilization in cloud environments. However, performance interference due to the contention over shared resources makes resource partitioning an ...
Palette Load Balancing: Locality Hints for Serverless Functions
- Mania Abdi,
- Samuel Ginzburg,
- Xiayue Charles Lin,
- Jose Faleiro,
- Gohar Irfan Chaudhry,
- Inigo Goiri,
- Ricardo Bianchini,
- Daniel S Berger,
- Rodrigo Fonseca
Function-as-a-Service (FaaS) serverless computing enables a simple programming model with almost unbounded elasticity. Unfortunately, current FaaS platforms achieve this flexibility at the cost of lower performance for data-intensive applications ...
With Great Freedom Comes Great Opportunity: Rethinking Resource Allocation for Serverless Functions
Current serverless offerings give users limited flexibility for configuring the resources allocated to their function invocations. This simplifies the interface for users to deploy serverless computations but creates deployments that are resource ...
Groundhog: Efficient Request Isolation in FaaS
Security is a core responsibility for Function-as-a-Service (FaaS) providers. The prevailing approach isolates concurrent executions of functions in separate containers. However, successive invocations of the same function commonly reuse the runtime ...
Understanding and Optimizing Workloads for Unified Resource Management in Large Cloud Platforms
To fully utilize computing resources, cloud providers such as Google and Alibaba choose to co-locate online services with batch processing applications in their data centers. By implementing unified resource management policies, different types of ...
Fail through the Cracks: Cross-System Interaction Failures in Modern Cloud Systems
Modern cloud systems are orchestrations of independent and interacting (sub-)systems, each specializing in important services (e.g., data processing, storage, resource management, etc.). Hence, cloud system reliability is affected not only by the ...
LogGrep: Fast and Cheap Cloud Log Storage by Exploiting both Static and Runtime Patterns
In cloud systems, near-line logs are mainly used for debugging, which means they prefer a low query latency for a better user experience, and like any other logs, they also prefer a low overall cost including storage cost to store compressed logs and ...
Aggregate VM: Why Reduce or Evict VM's Resources When You Can Borrow Them From Other Nodes?
- Ho-Ren Chuang,
- Karim Manaouil,
- Tong Xing,
- Antonio Barbalace,
- Pierre Olivier,
- Balvansh Heerekar,
- Binoy Ravindran
Hardware resource fragmentation is a common issue in data centers. Traditional solutions based on migration or overcommitment are unacceptably slow, and modern commercial or research solutions like Spot VM may reduce or evict VM's resources anytime. ...
R2C: AOCR-Resilient Diversity with Reactive and Reflective Camouflage
Address-oblivious code reuse, AOCR for short, poses a substantial security risk, as it remains unchallenged. If neglected, adversaries have a reliable way to attack systems, offering an operational and profitable strategy. AOCR's authors conclude that ...
Acceptance Rates
| Year | Submitted | Accepted | Acceptance rate |
|---|---|---|---|
| EuroSys '21 | 181 | 38 | 21% |
| EuroSys '20 | 234 | 43 | 18% |
| EuroSys '18 | 262 | 43 | 16% |
| EuroSys '16 | 180 | 38 | 21% |
| EuroSys '14 | 147 | 27 | 18% |
| EuroSys '13 | 143 | 28 | 20% |
| EuroSys '11 | 161 | 24 | 15% |
| Overall | 1,308 | 241 | 18% |