Real-time multi-criteria decision-making applications in fields such as high-speed algorithmic trading, emergency response, and disaster management have driven the development of new types of preference queries, of which the skyline query is a prominent example. Multi-criteria decision-making uses the skyline operator to extract the most significant tuples from large multi-dimensional databases. The result, which depends on the user's preferences, consists of all tuples whose attribute vector is not dominated by that of any other tuple. These tuples are commonly known as the skyline set. Recently, a growing number of studies have addressed skyline queries in data stream applications. Such queries extract the desired records from sliding windows and discard outdated records from the incoming data that no longer meet user requirements. The datasets in these applications are extremely large and span a wide range of dimensions that vary over time. Consequently, the skyline query is a computationally demanding task, and delivering a real-time response within an acceptable duration is a challenge. Enormous quantities of data must be transported and processed, and limitations in data transmission bandwidth and latency pose new challenges for traditional skyline algorithms. Transferring such volumes of data affects performance, power efficiency, and reliability. A change in the computing paradigm is therefore imperative. Parallel skyline queries have attracted the attention of both academia and industry. Research on skyline queries has focused on sequential algorithms and on parallel implementations for multicore processors, primarily because of their widespread use. Although previous research has concentrated on sequential algorithms, comprehensive studies that specifically address modern parallel processors remain limited. 
While numerous articles have been published on the parallelization of regular skyline queries, little research has been dedicated specifically to the parallel processing of continuous skyline queries. This study introduces PRSS, a continuous skyline technique for multicore processors designed for sliding window-based data streams. The efficacy of the proposed parallel implementation is demonstrated through tests on both real-world and synthetic datasets, covering various point distributions, arrival rates, and window sizes. The experimental results for datasets with high dimensionality and large cardinality show a significant speedup.
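As an illustration of the definitions used above, the following is a minimal sequential sketch of a skyline over a count-based sliding window. It is not the PRSS algorithm: it recomputes the skyline naively after every arrival and uses no parallelism. It assumes that smaller values are preferred in every dimension, and the names `dominates`, `skyline`, and `sliding_window_skylines` are hypothetical, chosen only for this example.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a dominates b.

    Assumption: smaller is better in every dimension. a dominates b when it
    is no worse in all dimensions and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: Iterable[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    pts = list(points)
    return [p for p in pts if not any(dominates(q, p) for q in pts)]


def sliding_window_skylines(stream: Iterable[Point], window_size: int) -> List[List[Point]]:
    """Return the skyline of the most recent `window_size` points after each arrival."""
    window: Deque[Point] = deque(maxlen=window_size)  # oldest point expires automatically
    results: List[List[Point]] = []
    for point in stream:
        window.append(point)
        results.append(skyline(window))  # full recomputation, for clarity only
    return results


if __name__ == "__main__":
    stream = [(3, 3), (1, 4), (2, 2), (4, 1)]
    for step, sky in enumerate(sliding_window_skylines(stream, window_size=3), start=1):
        print(step, sky)
```

With a window of three points, the point (3, 3) leaves the skyline as soon as (2, 2) arrives, because (2, 2) is better in both dimensions. A continuous skyline method avoids the full recomputation shown here by maintaining the result incrementally as points arrive and expire.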
Data availability
Synthetic data: generated with the standard skyline data generator. Real data: available from a public repository under a Creative Commons license. We have also uploaded the C++ source code of our PRSS algorithm to GitHub.
Our research was entirely self-funded, and we did not receive any financial support or funding from external parties.
PhD student Khames Walid designed the algorithm, implemented the program, analyzed the results, and prepared the manuscript. HadjAli Allel and Lagha Mohand, as PhD supervisors, reviewed the results and contributed to the manuscript corrections.
Ethics declarations
Conflict of interest
We declare that the authors have no conflict of interest as defined by Springer, nor any other interests that might be perceived to influence the results and/or discussion reported in this paper.
for PRSS Algorithm Development and Validation Using Synthetic and Open-Source Real-World Data. Research Objective: The objective of this study is to develop and validate a skyline algorithm for processing streaming data. The research does not involve human or animal participants or the use of sensitive data. Instead, the algorithm was developed and validated using both synthetic data generated with the open-source skyline data generator software (http://pgfoundry.org/projects/randdataset) and real-world data available in a public repository under a Creative Commons license.
