Keyword: dataset : Search

abstract

From Ground to Sky: Flying-motion Generation via Motion Dataset Adaptation

VRST '24: Proceedings of the 30th ACM Symposium on Virtual Reality Software and Technology, Article No.: 59, Pages 1–2, https://doi.org/10.1145/3641825.3689507

We conducted a study utilizing a lightweight generative network to create flying motions. The existing datasets used for training did not include any data on flying motions. Therefore, we selected certain classes from the existing motion datasets and ...

poster

Brief Introduction of the OpenPack Dataset and Lessons Learned from Organizing Activity Recognition Challenge Using the Dataset

UbiComp '24: Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing, Pages 116–120, https://doi.org/10.1145/3675094.3677597

This paper presents an overview of OpenPack, a comprehensive dataset developed for recognizing packaging work activities, and discusses an activity recognition competition using this dataset. The availability of sensor datasets for recognizing work ...

research-article

Toward Attribute-Controlled Fashion Image Captioning

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 20, Issue 9, Article No.: 280, Pages 1–18, https://doi.org/10.1145/3671000

Fashion image captioning is a critical task in the fashion industry that aims to automatically generate product descriptions for fashion items. However, existing fashion image captioning models predict a fixed caption for a particular fashion item once ...

short-paper

Free

JUST ACCEPTED

Social-sum-Mal: A Dataset for Abstractive Text Summarization in Malayalam

ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Just Accepted https://doi.org/10.1145/3696107

Abstractive text summarization for the Malayalam language is still in its infancy. The lack of benchmarked datasets for this task is one of the constraints in developing and testing good models. Malayalam has seven nominal case forms, two nominal ...

research-article

Free

UniTSyn: A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing

ISSTA 2024: Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, Pages 1061–1072, https://doi.org/10.1145/3650212.3680342

The remarkable capability of large language models (LLMs) in generating high-quality code has drawn increasing attention in the software testing community. However, existing code LLMs often demonstrate unsatisfactory capabilities in generating accurate, ...

Article

Introducing LCC’s NavProc 1.0 Corpus: Annotated Procedural Texts in the Naval Domain

Text, Speech, and Dialogue, Pages 252–266, https://doi.org/10.1007/978-3-031-70563-2_20

In this work, we introduce the NavProc 1.0 Corpus – a medium-scale, annotated corpus of procedural texts within the naval domain – for use as a first step in modeling procedural structures derived from real-world data sources. In particular, we ...

research-article

Open Access

AutoTherm: A Dataset and Benchmark for Thermal Comfort Estimation Indoors and in Vehicles

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Volume 8, Issue 3, Article No.: 96, Pages 1–49, https://doi.org/10.1145/3678503

Thermal comfort inside buildings is a well-studied field where human judgment for thermal comfort is collected and may be used for automatic thermal comfort estimation. However, indoor scenarios are rather static in terms of thermal state changes and, ...

Article

BRESSAY: A Brazilian Portuguese Dataset for Offline Handwritten Text Recognition

Document Analysis and Recognition - ICDAR 2024, Pages 315–333, https://doi.org/10.1007/978-3-031-70536-6_19

This work introduces the BRESSAY dataset, a novel contribution to the field of offline Handwritten Text Recognition (HTR), specifically targeting Brazilian Portuguese. Despite significant advancements in HTR, challenges remain due to the ...

research-article

Open Access

Image Similarity Using an Ensemble of Context-Sensitive Models

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages 1758–1769, https://doi.org/10.1145/3637528.3672004

Image similarity has been extensively studied in computer vision. In recent years, machine-learned models have shown their ability to encode more semantics than traditional multivariate metrics. However, in labelling semantic similarity, assigning a ...

research-article

Free

LaDe: The First Comprehensive Last-mile Express Dataset from Industry

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages 5991–6002, https://doi.org/10.1145/3637528.3671548

Real-world last-mile express datasets are crucial for research in logistics, supply chain management, and spatio-temporal data mining. Despite a plethora of algorithms developed to date, no widely accepted, publicly available last-mile express dataset ...

research-article

Open Access

GothX: a generator of customizable, legitimate and malicious IoT network traffic

CSET '24: Proceedings of the 17th Cyber Security Experimentation and Test Workshop, Pages 65–73, https://doi.org/10.1145/3675741.3675753

In recent years, machine learning-based anomaly detection (AD) has become an important measure against security threats from Internet of Things (IoT) networks. Machine learning (ML) models for network traffic AD require datasets to be trained, evaluated ...

survey

Free

JUST ACCEPTED

How to Improve Video Analytics with Action Recognition: A Survey

ACM Computing Surveys (CSUR), Just Accepted https://doi.org/10.1145/3679011

Action recognition refers to the process of categorizing a video by identifying and classifying the specific actions it encompasses. Videos originate from several domains, and within each domain of video analysis, comprehending actions holds paramount ...

research-article

Unveiling the 5G Mid-Band Landscape: From Network Deployment to Performance and Application QoE

ACM SIGCOMM '24: Proceedings of the ACM SIGCOMM 2024 Conference, Pages 358–372, https://doi.org/10.1145/3651890.3672269

5G in mid-bands has become the dominant deployment of choice in the world. We present - to the best of our knowledge - the first comprehensive and comparative cross-country measurement study of commercial mid-band 5G deployments in Europe and the U.S., ...

research-article

Building a Production-Ready Keyword Detection System on a Real-World Audio

Automatic Control and Computer Sciences (ACCS), Volume 58, Issue 4, Pages 454–458, https://doi.org/10.3103/S0146411624700561

This paper deals with the problem of creating a keyword spotting (KWS) system with real-world audio data. The paper describes the different methods used to build KWS systems, deep learning models such as convolutional neural networks (CNN), ...

research-article

Open Access

JUST ACCEPTED

B-TTDb: A Database of Turkish Tweets for Predicting the Top One Hundred Emojis

Yiltan Bitirim

ACM Transactions on the Web (TWEB), Just Accepted https://doi.org/10.1145/3681783

Emoji prediction is an important research task that focuses on finding the most appropriate emoji(s) quickly and effortlessly for a specific text. Now that Turkish is on the list of the top 20 most spoken languages in the world and there are a ...

short-paper

Open Access

What do Users Really Ask Large Language Models? An Initial Log Analysis of Google Bard Interactions in the Wild

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 2703–2707, https://doi.org/10.1145/3626772.3657914

Advancements in large language models (LLMs) have changed information retrieval, offering users a more personalised and natural search experience with technologies like OpenAI ChatGPT, Google Bard (Gemini), or Microsoft Copilot. Despite these ...

research-article

JDivPS: A Diversified Product Search Dataset

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1152–1161, https://doi.org/10.1145/3626772.3657888

The diversification of product search aims to offer diverse products to satisfy different user intents. Existing diversified product search approaches mainly relied on datasets sourced from online platforms. However, these datasets often present ...

research-article

Open Access

OpenSiteRec: An Open Dataset for Site Recommendation

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1483–1493, https://doi.org/10.1145/3626772.3657875

As a representative information retrieval task, site recommendation, which aims at predicting the optimal sites for a brand or an institution to open new branches in an automatic data-driven way, is beneficial and crucial for brand development in modern ...

research-article

Open Access

Exploring Multi-Scenario Multi-Modal CTR Prediction with a Large Scale Dataset

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1232–1241, https://doi.org/10.1145/3626772.3657865

Click-through rate (CTR) prediction plays a crucial role in recommendation systems, with significant impact on user experience and platform revenue generation. Despite the various public CTR datasets available due to increasing interest from both ...

research-article

Open Access

CivilSum: A Dataset for Abstractive Summarization of Indian Court Decisions

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 2241–2250, https://doi.org/10.1145/3626772.3657859

Extracting relevant information from legal documents is a challenging task due to the technical complexity and volume of their content. These factors also increase the costs of annotating large datasets, which are required to train state-of-the-art ...
