- abstract, October 2024
From Ground to Sky: Flying-motion Generation via Motion Dataset Adaptation
VRST '24: Proceedings of the 30th ACM Symposium on Virtual Reality Software and Technology, Article No.: 59, Pages 1–2
https://doi.org/10.1145/3641825.3689507
We conducted a study utilizing a lightweight generative network to create flying motions. The existing datasets used for training did not include any data on flying motions. Therefore, we selected certain classes from the existing motion datasets and ...
- poster, October 2024
Brief Introduction of the OpenPack Dataset and Lessons Learned from Organizing Activity Recognition Challenge Using the Dataset
UbiComp '24: Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing, Pages 116–120
https://doi.org/10.1145/3675094.3677597
This paper presents an overview of OpenPack, a comprehensive dataset developed for recognizing packaging work activities, and discusses an activity recognition competition using this dataset. The availability of sensor datasets for recognizing work ...
- research-article, September 2024
Toward Attribute-Controlled Fashion Image Captioning
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 20, Issue 9, Article No.: 280, Pages 1–18
https://doi.org/10.1145/3671000
Fashion image captioning is a critical task in the fashion industry that aims to automatically generate product descriptions for fashion items. However, existing fashion image captioning models predict a fixed caption for a particular fashion item once ...
- short-paper, September 2024, Just Accepted
Social-sum-Mal: A Dataset for Abstractive Text Summarization in Malayalam
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Just Accepted
https://doi.org/10.1145/3696107
Abstractive text summarization techniques for Malayalam language is still in its infancy. The lack of benchmarked datasets for this task is one of the constraints in developing and testing good models. Malayalam has seven nominal case forms, two nominal ...
UniTSyn: A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing
ISSTA 2024: Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, Pages 1061–1072
https://doi.org/10.1145/3650212.3680342
The remarkable capability of large language models (LLMs) in generating high-quality code has drawn increasing attention in the software testing community. However, existing code LLMs often demonstrate unsatisfactory capabilities in generating accurate, ...
- Article, September 2024
Introducing LCC’s NavProc 1.0 Corpus: Annotated Procedural Texts in the Naval Domain
Abstract: In this work, we introduce the NavProc 1.0 Corpus – a medium-scale, annotated corpus of procedural texts within the naval domain – for use as a first step in modeling procedural structures derived from real-world data sources. In particular, we ...
- research-article, September 2024
AutoTherm: A Dataset and Benchmark for Thermal Comfort Estimation Indoors and in Vehicles
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Volume 8, Issue 3, Article No.: 96, Pages 1–49
https://doi.org/10.1145/3678503
Thermal comfort inside buildings is a well-studied field where human judgment for thermal comfort is collected and may be used for automatic thermal comfort estimation. However, indoor scenarios are rather static in terms of thermal state changes and, ...
- Article, September 2024
BRESSAY: A Brazilian Portuguese Dataset for Offline Handwritten Text Recognition
- Arthur F. S. Neto,
- Byron L. D. Bezerra,
- Sávio S. Araújo,
- Wiliane M. A. S. Souza,
- Kléberson F. Alves,
- Macileide F. Oliveira,
- Samara V. S. Lins,
- Hugo J. F. Hazin,
- Pedro H. V. Rocha,
- Alejandro H. Toselli
Document Analysis and Recognition - ICDAR 2024, Pages 315–333
https://doi.org/10.1007/978-3-031-70536-6_19
Abstract: This work introduces the BRESSAY dataset, a novel contribution to the field of offline Handwritten Text Recognition (HTR), specifically targeting Brazilian Portuguese. Despite significant advancements in HTR, challenges remain due to the ...
- research-article, August 2024
Image Similarity Using an Ensemble of Context-Sensitive Models
KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages 1758–1769
https://doi.org/10.1145/3637528.3672004
Image similarity has been extensively studied in computer vision. In recent years, machine-learned models have shown their ability to encode more semantics than traditional multivariate metrics. However, in labelling semantic similarity, assigning a ...
- research-article, August 2024
LaDe: The First Comprehensive Last-mile Express Dataset from Industry
- Lixia Wu,
- Haomin Wen,
- Haoyuan Hu,
- Xiaowei Mao,
- Yutong Xia,
- Ergang Shan,
- Jianbin Zheng,
- Junhong Lou,
- Yuxuan Liang,
- Liuqing Yang,
- Roger Zimmermann,
- Youfang Lin,
- Huaiyu Wan
KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages 5991–6002
https://doi.org/10.1145/3637528.3671548
Real-world last-mile express datasets are crucial for research in logistics, supply chain management, and spatio-temporal data mining. Despite a plethora of algorithms developed to date, no widely accepted, publicly available last-mile express dataset ...
- research-article, August 2024
GothX: a generator of customizable, legitimate and malicious IoT network traffic
CSET '24: Proceedings of the 17th Cyber Security Experimentation and Test Workshop, Pages 65–73
https://doi.org/10.1145/3675741.3675753
In recent years, machine learning-based anomaly detection (AD) has become an important measure against security threats from Internet of Things (IoT) networks. Machine learning (ML) models for network traffic AD require datasets to be trained, evaluated ...
- survey, August 2024, Just Accepted
How to Improve Video Analytics with Action Recognition: A Survey
Action recognition refers to the process of categorizing a video by identifying and classifying the specific actions it encompasses. Videos originate from several domains, and within each domain of video analysis, comprehending actions holds paramount ...
- research-article, August 2024
Unveiling the 5G Mid-Band Landscape: From Network Deployment to Performance and Application QoE
- Rostand A. K. Fezeu,
- Claudio Fiandrino,
- Eman Ramadan,
- Jason Carpenter,
- Lilian Coelho de Freitas,
- Faaiq Bilal,
- Wei Ye,
- Joerg Widmer,
- Feng Qian,
- Zhi-Li Zhang
ACM SIGCOMM '24: Proceedings of the ACM SIGCOMM 2024 Conference, Pages 358–372
https://doi.org/10.1145/3651890.3672269
5G in mid-bands has become the dominant deployment of choice in the world. We present - to the best of our knowledge - the first comprehensive and comparative cross-country measurement study of commercial mid-band 5G deployments in Europe and the U.S., ...
- research-article, August 2024
Building a Production-Ready Keyword Detection System on a Real-World Audio
Automatic Control and Computer Sciences (ACCS), Volume 58, Issue 4, Pages 454–458
https://doi.org/10.3103/S0146411624700561
Abstract: This paper deals with the problem of creating a keyword spotting (KWS) system with real-world audio data. The paper describes the different methods used to build KWS systems, deep learning models such as convolutional neural networks (CNN), ...
- research-article, July 2024, Just Accepted
B-TTDb: A Database of Turkish Tweets for Predicting the Top One Hundred Emojis
Emoji prediction is an important research task that focuses on finding the most appropriate emoji(s) quickly and effortlessly for a specific text. Now that Turkish is on the list of the top 20 most spoken languages in the world and there are a ...
- short-paper, July 2024
What do Users Really Ask Large Language Models? An Initial Log Analysis of Google Bard Interactions in the Wild
SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 2703–2707
https://doi.org/10.1145/3626772.3657914
Advancements in large language models (LLMs) have changed information retrieval, offering users a more personalised and natural search experience with technologies like OpenAI ChatGPT, Google Bard (Gemini), or Microsoft Copilot. Despite these ...
- research-article, July 2024
JDivPS: A Diversified Product Search Dataset
SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1152–1161
https://doi.org/10.1145/3626772.3657888
The diversification of product search aims to offer diverse products to satisfy different user intents. Existing diversified product search approaches mainly relied on datasets sourced from online platforms. However, these datasets often present ...
- research-article, July 2024
OpenSiteRec: An Open Dataset for Site Recommendation
SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1483–1493
https://doi.org/10.1145/3626772.3657875
As a representative information retrieval task, site recommendation, which aims at predicting the optimal sites for a brand or an institution to open new branches in an automatic data-driven way, is beneficial and crucial for brand development in modern ...
- research-article, July 2024
Exploring Multi-Scenario Multi-Modal CTR Prediction with a Large Scale Dataset
- Zhaoxin Huan,
- Ke Ding,
- Ang Li,
- Xiaolu Zhang,
- Xu Min,
- Yong He,
- Liang Zhang,
- Jun Zhou,
- Linjian Mo,
- Jinjie Gu,
- Zhongyi Liu,
- Wenliang Zhong,
- Guannan Zhang,
- Chenliang Li,
- Fajie Yuan
SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1232–1241
https://doi.org/10.1145/3626772.3657865
Click-through rate (CTR) prediction plays a crucial role in recommendation systems, with significant impact on user experience and platform revenue generation. Despite the various public CTR datasets available due to increasing interest from both ...
- research-article, July 2024
CivilSum: A Dataset for Abstractive Summarization of Indian Court Decisions
SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 2241–2250
https://doi.org/10.1145/3626772.3657859
Extracting relevant information from legal documents is a challenging task due to the technical complexity and volume of their content. These factors also increase the costs of annotating large datasets, which are required to train state-of-the-art ...