- research-article, June 2021
Compliant Geo-distributed Query Processing
SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data, Pages 181–193, https://doi.org/10.1145/3448016.3453687
In this paper, we address the problem of compliant geo-distributed query processing. In particular, we focus on dataflow policies that impose restrictions on movement of data across geographical or institutional borders. Traditional ways to distributed ...
- research-article, May 2020
Prompt: Dynamic Data-Partitioning for Distributed Micro-batch Stream Processing Systems
SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, Pages 2455–2469, https://doi.org/10.1145/3318464.3389713
Advances in real-world applications require high-throughput processing over large data streams. Micro-batching has been proposed to support the needs of these applications. In micro-batching, the processing and batching of the data are interleaved, ...
- research-article, January 2020
A distributed architecture for large scale news and social media processing
International Journal of Web Engineering and Technology (IJWET), Volume 15, Issue 4, Pages 383–406, https://doi.org/10.1504/ijwet.2020.114029
When designing a data processing and analytics pipeline for data streams, it is important to provide the data load and be able to successfully balance it over the available resources. This can be achieved more easily if small processing modules, which ...
- research-article, October 2018
Spark-parSketch: A Massively Distributed Indexing of Time Series Datasets
- Oleksandra Levchenko,
- Djamel-Edine Yagoubi,
- Reza Akbarinia,
- Florent Masseglia,
- Boyan Kolev,
- Dennis Shasha
CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Pages 1951–1954, https://doi.org/10.1145/3269206.3269226
A growing number of domains (finance, seismology, internet-of-things, etc.) collect massive time series. When the number of series grows to the hundreds of millions or even billions, similarity queries become intractable on a single machine. Further, ...
- research-article, May 2018
Meta-Dataflows: Efficient Exploratory Dataflow Jobs
- Raul Castro Fernandez,
- William Culhane,
- Pijika Watcharapichat,
- Matthias Weidlich,
- Victoria Lopez Morales,
- Peter Pietzuch
SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data, Pages 1157–1172, https://doi.org/10.1145/3183713.3183760
Distributed dataflow systems such as Apache Spark and Apache Flink are used to derive new insights from large datasets. While they efficiently execute concrete data processing workflows, expressed as dataflow graphs, they lack generic support for ...
- research-article, December 2015
GPSInsights: Towards an efficient framework for storing and mining massive vehicle location data
SoICT '15: Proceedings of the 6th International Symposium on Information and Communication Technology, Pages 25–31, https://doi.org/10.1145/2833258.2833282
Intelligent Transport System (ITS) has seen growing interest in collecting vehicle location data in order to build up real-time traffic monitoring and analytic systems. However handling these data creates challenges, as they are massive in volume and ...
- research-article, May 2015
Apache Tez: A Unifying Framework for Modeling and Building Data Processing Applications
SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Pages 1357–1369, https://doi.org/10.1145/2723372.2742790
The broad success of Hadoop has led to a fast-evolving and diverse ecosystem of application engines that are building upon the YARN resource management layer. The open-source implementation of MapReduce is being slowly replaced by a collection of ...
- research-article, November 2014
Distributed Graph Summarization
CIKM '14: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, Pages 799–808, https://doi.org/10.1145/2661829.2661862
Graph has been a ubiquitous and essential data representation to model real world objects and their relationships. Today, large amounts of graph data have been generated by various applications. Graph summarization techniques are crucial in uncovering ...
- research-article, October 2014
Stochastic models for cloud computing performance evaluation
CEE-SECR '14: Proceedings of the 10th Central and Eastern European Software Engineering Conference in Russia, Article No.: 20, Pages 1–6, https://doi.org/10.1145/2687233.2687256
For cloud systems a set of stochastic models taking account of system context actualization is proposed. These models are based on queueing theory and extend its applications by studying the multichannel systems with warm-up and hyperexponential phase-...
- short-paper, July 2013
Energy-efficient fault-tolerant data storage & processing in dynamic networks
MobiHoc '13: Proceedings of the fourteenth ACM international symposium on Mobile ad hoc networking and computing, Pages 281–286, https://doi.org/10.1145/2491288.2491325
With the advance of mobile devices, cloud computing has enabled people to access data and computing resources without spatiotemporal constraints. A common assumption is that mobile devices are well connected to remote data centers and the data centers ...
- Article, November 2011
Introducing a novel data management approach for distributed large scale data processing in future computer clouds
ICONIP'11: Proceedings of the 18th international conference on Neural Information Processing - Volume Part II, Pages 391–398, https://doi.org/10.1007/978-3-642-24958-7_46
Deployment of pattern recognition applications for large-scale data sets is an open issue that needs to be addressed. In this paper, an attempt is made to explore new methods of partitioning and distributing data, that is, resource virtualization in the ...
- poster, September 2011
Distributed human activity data processing using HASC tool
UbiComp '11: Proceedings of the 13th international conference on Ubiquitous computing, Pages 603–604, https://doi.org/10.1145/2030112.2030234
To accelerate and simplify human activity recognition research, we have been developing a data processing tool named "HASC Tool." As the activity corpus becomes huge, it is not simple to handle the large number of files because it takes a lot of time to ...
- Article, July 2010
Advanced Bio-inspired Plausibility Checking in a Wireless Sensor Network Using Neuro-immune Systems: Autonomous Fault Diagnosis in an Intelligent Transportation System
SENSORCOMM '10: Proceedings of the 2010 Fourth International Conference on Sensor Technologies and Applications, Pages 108–114, https://doi.org/10.1109/SENSORCOMM.2010.24
Recent developments in wireless sensing technology lead to implement advanced algorithms for distributed data processing in various applications; intelligent transportation system is one of the main applications of the advanced networked sensing ...
- article, July 2006
Secure multiparty computation of approximations
ACM Transactions on Algorithms (TALG), Volume 2, Issue 3, Pages 435–472, https://doi.org/10.1145/1159892.1159900
Approximation algorithms can sometimes provide efficient solutions when no efficient exact computation is known. In particular, approximations are often useful in a distributed setting where the inputs are held by different parties and may be extremely ...
- Article, June 2005
PXI-based architecture for real time data acquisition and distributed dynamical data processing
This paper describes an architecture model for data acquisition systems based on Compact PCI platforms. The aim is to increase real time data processing capabilities in experimental environments such as nuclear fusion devices (e.g. ITER). The model has ...
- article, July 1989
Measurement of computer system/information system performance
ACM SIGMIS Database: the DATABASE for Advances in Information Systems (SIGMIS), Volume 20, Issue 2, Pages 27–30, https://doi.org/10.1145/1017914.1017919
This paper reports the results of a field study conducted to obtain managers' perceptions of their Management Information Systems (MIS). The basic study involved responses from over 100 MIS/DP managers, users, and other business professionals. The ...
- article, December 1979
An analysis of the impact of distributed data processing on organizations in the 1980's
"Distributed Data Processing" (DDP) is examined from a technological, organizational, and economic perspective. DDP projections are made for the 1980's as are recommendations for capitalizing on and minimizing the risks of this emerging technology.
- article, December 1979
End users as application developers
The demand for new or expanded computer-based information systems far exceeds the capacity of present DP organizations to meet this demand. Assuming that a massive expansion of DP personnel is not feasible, one solution is to make existing computer ...