- research-article, August 2024
Parallel Iterative Mistake Minimization (IMM) clustering algorithm for shared-memory systems
ICPP '24: Proceedings of the 53rd International Conference on Parallel Processing, Pages 1–10, https://doi.org/10.1145/3673038.3673057
This paper addresses the problem of deriving explanations in the form of compact decision trees for cluster assignments made by the well-known K-means method. It introduces two versions of the Iterative Mistake Minimization (IMM) algorithm, both ...
- research-article, January 2024
Scalable High-Quality Hypergraph Partitioning
ACM Transactions on Algorithms (TALG), Volume 20, Issue 1, Article No.: 9, Pages 1–54, https://doi.org/10.1145/3626527
Balanced hypergraph partitioning is an NP-hard problem with many applications, e.g., optimizing communication in distributed data placement problems. The goal is to place all nodes across k different blocks of bounded size, such that hyperedges span as ...
- research-article, January 2024, Just Accepted
Spatial/Temporal Locality-based Load-sharing in Speculative Discrete Event Simulation on Multi-core Machines
ACM Transactions on Modeling and Computer Simulation (TOMACS), Just Accepted, https://doi.org/10.1145/3639703
Shared-memory multi-processor/multi-core machines have become a reference for many application contexts. In particular, the recent literature on speculative parallel discrete event simulation has reshuffled the architectural organization of simulation ...
- Article, August 2023
Transactional-Turn Causal Consistency
Function-as-a-Service (FaaS, serverless) computing systems use an actor-like model that executes a function asynchronously, atomically and in an isolated context. However, a function must often also access state, e.g., memory or a database. This ...
- research-article, June 2023
Effective Access to the Committed Global State in Speculative Parallel Discrete Event Simulation on Multi-core Machines
SIGSIM-PADS '23: Proceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, Pages 107–117, https://doi.org/10.1145/3573900.3591117
Output production and predicate detection are critical in speculative parallel discrete event simulation, since they need to take place accessing past state values—which have become committed—rather than the current state of the simulation objects, ...
- research-article, June 2022
Spatial/Temporal Locality-based Load-sharing in Speculative Discrete Event Simulation on Multi-core Machines
SIGSIM-PADS '22: Proceedings of the 2022 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, Pages 81–92, https://doi.org/10.1145/3518997.3531026
The recent literature has reshuffled the architectural organization of speculative parallel discrete event simulation systems for shared-memory multi-core machines. A core aspect has been the full sharing of the workload at the level of individual ...
- research-article, April 2022
What’s Decidable About Causally Consistent Shared Memory?
ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 44, Issue 2, Article No.: 8, Pages 1–55, https://doi.org/10.1145/3505273
While causal consistency is one of the most fundamental consistency models weaker than sequential consistency, the decidability of safety verification for (finite-state) concurrent programs running under causally consistent shared memories is still ...
- research-article, December 2021
Efficient computation of Hash Hirschberg protein alignment utilizing hyper threading multi‐core sharing technology
CAAI Transactions on Intelligence Technology (CIT2), Volume 7, Issue 2, Pages 278–291, https://doi.org/10.1049/cit2.12070
Due to current technology enhancement, molecular databases have exponentially grown requesting faster efficient methods that can handle these amounts of huge data. Therefore, Multi‐processing CPUs technology can be used including physical and ...
- research-article, June 2020
SB-Fetch: synchronization aware hardware prefetching for chip multiprocessors
ICS '20: Proceedings of the 34th ACM International Conference on Supercomputing, Article No.: 15, Pages 1–12, https://doi.org/10.1145/3392717.3392735
Shared-memory, multi-threaded applications often require programmers to insert thread synchronization primitives (i.e. locks, barriers, and condition variables) in critical sections to synchronize data access between processes. Scaling performance ...
Decidable verification under a causally consistent shared memory
PLDI 2020: Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, Pages 211–226, https://doi.org/10.1145/3385412.3385966
Causal consistency is one of the most fundamental and widely used consistency models weaker than sequential consistency. In this paper, we study the verification of safety properties for finite-state concurrent programs running under a causally ...
- research-article, September 2019
Scalable Kernelization for Maximum Independent Sets
ACM Journal of Experimental Algorithmics (JEA), Volume 24, Article No.: 1.16, Pages 1–22, https://doi.org/10.1145/3355502
The most efficient algorithms for finding maximum independent sets in both theory and practice use reduction rules to obtain a much smaller problem instance called a kernel. The kernel can then be solved quickly using exact or heuristic algorithms—or by ...
- research-article, June 2019
Persistent Non-Blocking Binary Search Trees Supporting Wait-Free Range Queries
SPAA '19: The 31st ACM Symposium on Parallelism in Algorithms and Architectures, Pages 275–286, https://doi.org/10.1145/3323165.3323197
This paper presents the first implementation of a search tree data structure in an asynchronous shared-memory system that provides a wait-free algorithm for executing range queries on the tree, in addition to non-blocking algorithms for Insert, Delete ...
- announcement, July 2018
Brief Announcement: 2D-Stack -- A Scalable Lock-Free Stack Design that Continuously Relaxes Semantics for Better Performance
PODC '18: Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, Pages 407–409, https://doi.org/10.1145/3212734.3212794
We briefly describe an efficient lock-free concurrent stack design with tunable and tenable relaxed semantics to allow for better performance. The design is tunable and allows for a continuous monotonic trade of weaker semantics for better throughput ...
- research-article, July 2018
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
SPAA '18: Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, Pages 393–404, https://doi.org/10.1145/3210377.3210414
There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest ...
- research-article, April 2016
Shared-memory parallelization of the fast marching method using an overlapping domain-decomposition approach
HPC '16: Proceedings of the 24th High Performance Computing Symposium, Article No.: 18, Pages 1–8, https://doi.org/10.22360/SpringSim.2016.HPC.052
The fast marching method is used to compute a monotone front propagation of anisotropic nature by solving the eikonal equation. Due to the sequential nature of the original algorithm, parallel approaches presented so far were unconvincing. In this work, ...
- abstract, February 2016
Teaching Parallel Computing Concepts with OpenMP (Abstract Only)
SIGCSE '16: Proceedings of the 47th ACM Technical Symposium on Computing Science Education, Pages 712–713, https://doi.org/10.1145/2839509.2844681
OpenMP is an industry-standard, platform-independent parallel programming library built into all modern C and C++ compilers. Unlike complex parallel platforms, OpenMP is designed to make it relatively easy to add parallelism to existing sequential ...
- research-article, July 2015
The Price of being Adaptive
PODC '15: Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, Pages 183–192, https://doi.org/10.1145/2767386.2767428
Mutual exclusion is a fundamental distributed coordination problem. Shared-memory mutual exclusion research focuses on local-spin algorithms and uses the remote memory references (RMRs) metric. To ensure the correctness of concurrent algorithms in ...
Cache-Efficient Aggregation: Hashing Is Sorting
SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Pages 1123–1136, https://doi.org/10.1145/2723372.2747644
For decades researchers have studied the duality of hashing and sorting for the implementation of the relational operators, especially for efficient aggregation. Depending on the underlying hardware and software architecture, the specifically ...
- research-article, July 2014
Computing Petaflops over Terabytes of Data: The Case of Genome-Wide Association Studies
ACM Transactions on Mathematical Software (TOMS), Volume 40, Issue 4, Article No.: 27, Pages 1–22, https://doi.org/10.1145/2560421
In many scientific and engineering applications, one has to solve not one but multiple instances of the same problem. Often times, these problems are linked in a way that allows intermediate results to be reused. A characteristic example for this class ...
- research-article, March 2014
Large-scale network simulation: leveraging the strengths of modern SMP-based compute clusters
SIMUTools '14: Proceedings of the 7th International ICST Conference on Simulation Tools and Techniques, Pages 31–40, https://doi.org/10.4108/icst.simutools.2014.254622
Parallelization is crucial for efficient execution of large-scale network simulation. Today's computing clusters commonly used for that purpose are built from a large amount of multi-processor machines. The traditional approach to utilize all CPU cores ...