DOI:
10.1007/978-3-031-68323-7
Big Data Analytics and Knowledge Discovery: 26th International Conference, DaWaK 2024, Naples, Italy, August 26–28, 2024, Proceedings
2024 Proceeding
  • Editors:
  • Robert Wrembel,
  • Silvia Chiusano,
  • Gabriele Kotsis,
  • A Min Tjoa,
  • Ismail Khalil
Publisher:
  • Springer-Verlag
  • Berlin, Heidelberg
Conference:
International Conference on Big Data Analytics and Knowledge Discovery, Naples, Italy, 26 August 2024
ISBN:
978-3-031-68322-0
Published:
18 September 2024

Front Matter
Pages i–xxii
Back Matter
Article
Front Matter
Page 1
Article
LiteSelect: A Lightweight Adaptive Learning Algorithm for Online Index Selection
Abstract

Using appropriately selected indexes can dramatically improve the performance of query workloads in database systems. Typically, the access patterns of the workloads in real-world applications change frequently. This poses the challenge of ...

Article
IDAGEmb: An Incremental Data Alignment Based on Graph Embedding
Abstract

In evolving digital environments, information systems face a myriad of challenges such as data heterogeneity, the dynamic nature of data, and integration complexities. These challenges impact decision-making and data integration ...

Article
Learning Paradigms and Modelling Methodologies for Digital Twins in Process Industry
Abstract

Central to the digital transformation of the process industry are Digital Twins (DTs), virtual replicas of physical manufacturing systems that combine sensor data with sophisticated data-based or physics-based models, or a combination thereof, to ...

Article
Front Matter
Page 49
Article
MultiMatch: Low-Resource Generalized Entity Matching Using Task-Conditioned Hyperadapters in Multitask Learning
Abstract

Generalized Entity Matching (GEM) is a variant of entity matching that identifies whether entity descriptions from diverse data sources with heterogeneous data formats refer to the same real-world entity. State-of-the-art single-task fine-tuning ...

Article
Embedding-Based Data Matching for Disparate Data Sources
Abstract

Dealing with heterogeneous sources is an important challenge in the field of knowledge discovery and management. Schema matching methods are employed to solve this problem using three approaches: schema-based, instance-based, or a combination. ...

Article
Subtree Similarity Search Based on Structure and Text
Abstract

Given a query tree, the subtree similarity search problem is to find all subtrees in a document tree that are similar to the query tree. The previous scan-based method extracts candidate subtrees based on the size difference, which only considers ...

Article
Front Matter
Page 89
Article
Towards Hybrid Embedded Feature Selection and Classification Approach with Slim-TSF
Abstract

Traditional solar flare forecasting approaches have mostly relied on physics-based or data-driven models using solar magnetograms, treating flare predictions as a point-in-time classification problem. This approach has limitations, particularly in ...

Article
Evaluation of High Sparsity Strategies for Efficient Binary Classification
Abstract

In the dynamic landscape of Artificial Intelligence (AI) advancements, particularly in the development of compact and highly efficient models for space-constrained environments, the strategic sparsification of neural networks takes center stage. ...

Article
Incremental SMOTE with Control Coefficient for Classifiers in Data Starved Medical Applications
Abstract

Prediction models for data-starved medical applications lag behind general machine learning solutions, despite their potential to improve early interventions. This is largely due to the assumption that optimization approaches are applied on a ...

Article
Exploring Evaluation Metrics for Binary Classification in Data Analysis: the Worthiness Benchmark Concept
Abstract

In binary data classification, the main goal is to determine if elements belong to one of two classes. Various metrics assess the efficacy of classification models, making it essential to analyze and compare these metrics to select the most ...
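
As a rough illustration of the kind of metrics this abstract refers to, the sketch below computes four standard confusion-matrix measures for binary labels. The choice of metrics and the function name `binary_metrics` are assumptions made for illustration; the truncated abstract does not say which metrics the paper compares, and this is not its Worthiness Benchmark.

```python
def binary_metrics(y_true, y_pred):
    """Compute standard binary classification metrics from 0/1 labels.

    Hypothetical helper for illustration only; 1 is the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def safe_div(num, den):
        # Return 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

The four measures can disagree on imbalanced data, which is one reason a systematic comparison of metrics is useful.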

Article
Front Matter
Page 127
Article
Exploring Causal Chain Identification: Comprehensive Insights from Text and Knowledge Graphs
Abstract

During real-world reasoning, the logic path is generally not explicitly articulated. An appropriate causal chain can offer abundant informative details to depict a logical pathway, which is also beneficial in preventing ambiguity problems during ...

Article
Towards Regional Explanations with Validity Domains for Local Explanations
Abstract

The field of explainability in machine learning has become very prolific and numerous explanation methods have emerged during the last decade. Local explanations are of major interest because they are intelligible and claim to be locally faithful ...

Article
Analyzing a Decade of Evolution: Trends in Natural Language Processing
Abstract

Natural Language Processing (NLP) stands at the forefront of the rapidly evolving landscape of Machine Learning, witnessing the emergence and evolution of diverse methodologies over the past decade. This study delves into the dynamic trends within ...

Article
Improving Serendipity for Collaborative Metric Learning Based on Mutual Proximity
Abstract

Today, in web space, where content is constantly expanding, recommendation systems that enable users to explore information passively have become essential technologies, and their accuracy is significantly improving. However, recent studies have ...

Article
Ada2vec: Adaptive Representation Learning for Large-Scale Dynamic Heterogeneous Networks
Abstract

Representation learning generates the embedding vector of an object based on its relationships with others in a network. The generated vectors are inputs to various downstream machine learning tasks, such as classification, clustering and ...

Article
Differentially-Private Neural Network Training with Private Features and Public Labels
Abstract

Training neural networks (NN) with differential privacy (DP) protection has been extensively studied in the past decade, with the DP-SGD (stochastic gradient descent) mechanism representing the benchmark approach. Conventional DP-SGD assumes that ...

Article
Front Matter
Page 223
Article
Series2Graph++: Distributed Detection of Correlation Anomalies in Multivariate Time Series
Abstract

Multivariate time series are a form of real-valued sequence data that simultaneously record different time-dependent variables. They originate mostly from multi-sensor setups and serve a variety of important analytical purposes, including the ...

Article
Anomaly Detection from Time Series Under Uncertainty
Abstract

Anomalies in data can cause potential issues in downstream tasks, making their detection critical. Data collection processes for continuous data are often defective and imprecise. For example, sensors are resource-constrained devices, raising ...

Article
Comparison of Measures for Characterizing the Difficulty of Time Series Classification
Abstract

The performance of machine learning algorithms is influenced both by their characteristics and parameterization as well as by the properties of the data they are trained and evaluated on. The latter aspect is often neglected. In this paper, we ...

Article
Dynamic Time Warping for Phase Recognition in Tribological Sensor Data
Abstract

This paper analyzes the potential of dynamic time warping (DTW) for recognizing phases of tribological sensor data. The three classes in these time series—run-in, constant wear, and divergent wear—are distinguished by their long-term trend and ...
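
For readers unfamiliar with DTW, a minimal sketch of the classic dynamic-programming formulation follows. It assumes univariate series and an absolute-difference local cost; the function name `dtw_distance` is hypothetical, and the paper's actual variant, cost function, and phase-recognition pipeline are not given in the truncated abstract.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Illustrative sketch: univariate sequences, absolute-difference cost,
    no warping-window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second series repeats one sample; warping absorbs the stretch.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the alignment may repeat samples, two series with the same shape but different local speeds get a small distance, which is the property that makes DTW a candidate for comparing wear phases of unequal duration.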

Article
Front Matter
Page 251
Article
Putting Co-Design-Supporting Data Lakes to the Test: An Evaluation on AEC Case Studies
Abstract

Leveraging data from various stakeholders in the architecture, engineering, and construction (AEC) industry is an essential prerequisite to harness the potential of digitization and Artificial Intelligence (AI) in addressing major challenges such ...

Article
Creating and Querying Data Cubes in Python Using PyCube
Abstract

Data cubes are used for analyzing large data sets usually contained in data warehouses. The most popular data cube tools use graphical user interfaces (GUI) to do the data analysis. Traditionally this was necessary since data analysts were not ...

Contributors
  • Poznan University of Technology
  • Polytechnic of Turin
  • Johannes Kepler University Linz
  • Vienna University of Technology
  • Johannes Kepler University Linz