ACM SIGMOD Conference 2024: Santiago, Chile - Companion Volume
- Pablo Barceló, Nayat Sánchez-Pi, Alexandra Meliou, S. Sudarshan:
Companion of the 2024 International Conference on Management of Data, SIGMOD/PODS 2024, Santiago AA, Chile, June 9-15, 2024. ACM 2024
Keynotes
- Ricardo Baeza-Yates:
The Limitations of Data, Machine Learning and Us. 1-2
- Xin Luna Dong:
The Journey to a Knowledgeable Assistant with Retrieval-Augmented Generation (RAG). 3
- Peter A. Boncz:
Making Data Management Better with Vectorized Query Processing. 4
Industry Session 1: Query Engines
- Andrew Lamb, Yijie Shen, Daniël Heres, Jayjeet Chakraborty, Mehmet Ozan Kabak, Liang-Chi Hsieh, Chao Sun:
Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine. 5-17
- Nicolas Bruno, César A. Galindo-Legaria, Milind Joshi, Esteban Calvo Vargas, Kabita Mahapatra, Sharon Ravindran, Guoheng Chen, Ernesto Cervantes Juárez, Beysim Sezgin:
Unified Query Optimization in the Fabric Data Warehouse. 18-30
- Julian Hyde, John Fremlin:
Measures in SQL. 31-40
- Yuxing Han, Haoyu Wang, Lixiang Chen, Yifeng Dong, Xing Chen, Benquan Yu, Chengcheng Yang, Weining Qian:
ByteCard: Enhancing ByteDance's Data Warehouse with Learned Cardinality Estimation. 41-54
- Jialin Ding, Matt Abrams, Sanghita Bandyopadhyay, Luciano Di Palma, Yanzhu Ji, Davide Pagano, Gopal Paliwal, Panos Parchas, Pascal Pfeil, Orestis Polychroniou, Gaurav Saxena, Aamer Shah, Amina Voloder, Sherry Xiao, Davis Zhang, Tim Kraska:
Automated Multidimensional Data Layouts in Amazon Redshift. 55-67
- Suratna Budalakoti, Mohamed Ziauddin, Andrew Witkowski, You Jung Kim, Ramarajan Krishnamachari, Alan Wood:
Automated Clustering Recommendation With Database Zone Maps. 68-79
Industry Session 2: LLMs and ML Applications
- Ahmed Metwally, Michael Shum:
Similarity Joins of Sparse Features. 80-92
- Chao Zhang, Yuren Mao, Yijiang Fan, Yu Mi, Yunjun Gao, Lu Chen, Dongfang Lou, Jinshu Lin:
FinSQL: Model-Agnostic LLMs-based Text-to-SQL Framework for Financial Analysis. 93-105
- Xianchun Bao, Zian Bao, Bie Binbin, Qingsong Duan, Wenfei Fan, Hui Lei, Daji Li, Wei Lin, Peng Liu, Zhicong Lv, Mingliang Ouyang, Shuai Tang, Yaoshu Wang, Qiyuan Wei, Min Xie, Jing Zhang, Xin Zhang, Runxiao Zhao, Shuping Zhou:
Rock: Cleaning Data by Embedding ML in Logic Rules. 106-119
- Daoyuan Chen, Yilun Huang, Zhijian Ma, Hesen Chen, Xuchen Pan, Ce Ge, Dawei Gao, Yuexiang Xie, Zhaoyang Liu, Jinyang Gao, Yaliang Li, Bolin Ding, Jingren Zhou:
Data-Juicer: A One-Stop Data Processing System for Large Language Models. 120-134
- Javier de la Rúa Martínez, Fabio Buso, Antonios Kouzoupis, Alexandru A. Ormenisan, Salman Niazi, Davit Bzhalava, Kenneth Mak, Victor Jouffrey, Mikael Ronström, Raymond Cunningham, Ralfs Zangis, Dhananjay Mukhedkar, Ayushman Khazanchi, Vladimir Vlassov, Jim Dowling:
The Hopsworks Feature Store for Machine Learning. 135-147
- Changlong Yu, Xin Liu, Jefferson Maia, Yang Li, Tianyu Cao, Yifan Gao, Yangqiu Song, Rahul Goutam, Haiyang Zhang, Bing Yin, Zheng Li:
COSMO: A Large-Scale E-commerce Common Sense Knowledge Generation and Serving System at Amazon. 148-160
Industry Session 3: Cloud Storage
- Shikun Tian, Zhonghao Lu, Haizhen Zhuo, Xiaojing Tang, Peiyi Hong, Shenglong Chen, Dayi Yang, Ying Yan, Zhiyong Jiang, Hui Zhang, Guofei Jiang:
LETUS: A Log-Structured Efficient Trusted Universal BlockChain Storage. 161-174
- Pavan Edara, Jonathan Forbesj, Bigang Li:
Vortex: A Stream-oriented Storage Engine For Big Data Analytics. 175-187
- David Kalmuk, Christian Garcia-Arellano, Ronald Barber, Richard Sidle, Kostas Rakopoulos, Hamdi Roumani, William Minor, Alexander Cheung, Robert C. Hooper, Matthew Emmerton, Zach Hoggard, Scott Walkty, Patrick Pérez, Aleksandrs Santars, Michael Chen, Matthew Olan, Daniel C. Zilio, Imran Sayyid, Humphrey Li, Ketan Rampurkar, Krishna K. Ramachandran, Yiren Shen:
Native Cloud Object Storage in Db2 Warehouse: Implementing a Fast and Cost-Efficient Cloud Storage Architecture. 188-200
- Yupu Zhang, Guanglin Cong, Jihan Qu, Ran Xu, Yuan Fu, Weiqi Li, Feiran Hu, Jing Liu, Wenliang Zhang, Kai Zheng:
ESTELLE: An Efficient and Cost-effective Cloud Log Engine. 201-213
- Jianjun Deng, Jianan Lu, Hua Fan, Chaoyang Liu, Shi Cheng, Cuiyun Fu, Wenchao Zhou:
TimeCloth: Fast Point-in-Time Database Recovery in The Cloud. 214-226
Industry Session 4: Cloud Databases
- Olga Poppe, Pankaj Arora, Sakshi Sharma, Jie Chen, Sachin Pandit, Rahul Sawhney, Vaishali Jhalani, Willis Lang, Qun Guo, Anupriya Inumella, Sanjana Dulipeta Sridhar, Dheren Gala, Nilesh Rathi, Morgan Oslake, Alexandru Chirica, Sarika Iyer, Prateek Goel, Ajay Kalhan:
Proactive Resume and Pause of Resources for Microsoft Azure SQL Database Serverless. 227-240
- Anna Pavlenko, Joyce Cahoon, Yiwen Zhu, Brian Kroth, Michael Nelson, Andrew Carter, David Liao, Travis Wright, Jesús Camacho-Rodríguez, Karla Saur:
Vertically Autoscaling Monolithic Applications with CaaSPER: Scalable Container-as-a-Service Performance Enhanced Resizing Algorithm for the Cloud. 241-254
- Wei Li, Jiachi Zhang, Ye Yin, Yan Li, Zhanyang Zhu, Wenchao Zhou, Liang Lin, Feifei Li:
Flux: Decoupled Auto-Scaling for Heterogeneous Query Workload in Alibaba AnalyticDB. 255-268
- Vikram Nathan, Vikramank Y. Singh, Zhengchun Liu, Mohammad Rahman, Andreas Kipf, Dominik Horn, Davide Pagano, Gaurav Saxena, Balakrishnan Narayanaswamy, Tim Kraska:
Intelligent Scaling in Amazon Redshift. 269-279
- Ziniu Wu, Ryan Marcus, Zhengchun Liu, Parimarjan Negi, Vikram Nathan, Pascal Pfeil, Gaurav Saxena, Mohammad Rahman, Balakrishnan Narayanaswamy, Tim Kraska:
Stage: Query Execution Time Prediction in Amazon Redshift. 280-294
Industry Session 5: Cloud Database Architecture
- Xinjun Yang, Yingqiang Zhang, Hao Chen, Feifei Li, Bo Wang, Jing Fang, Chuan Sun, Yuhui Wang:
PolarDB-MP: A Multi-Primary Cloud-Native Database via Disaggregated Shared Memory. 295-308
- Yacine Taleb, Kevin McGehee, Nan Yan, Shawn Wang, Stefan C. Müller, Allen Samuels:
Amazon MemoryDB: A Fast and Durable Memory-First Cloud Database. 309-320
- Josep Aguilar-Saborit, Raghu Ramakrishnan, Kevin Bocksrocker, Alan Halverson, Konstantin Kosinsky, Ryan O'Connor, Nadejda Poliakova, Moe Shafiei, Haris Mahmood Ansari, Bogdan Crivat, Conor Cunningham, Taewoo Kim, Phil Kon-Kim, Ishan Rajesh Madan, Blazej Matuszyk, Matt Miles, Sumin Mohanan, Cristian Petculescu, Emma Rose-Wirshing, Elias Yousefi, Amin Abadi:
Extending Polaris to Support Transactions. 321-333
- Justin J. Levandoski, Garrett Casto, Mingge Deng, Rushabh Desai, Pavan Edara, Thibaud Hottelier, Amir Hormati, Anoop Johnson, Jeff Johnson, Dawid Kurzyniec, Sam McVeety, Prem Ramanathan, Gaurav Saxena, Vidya Shanmugan, Yuri Volobuev:
BigLake: BigQuery's Evolution toward a Multi-Cloud Lakehouse. 334-346
- Tobias Schmidt, Andreas Kipf, Dominik Horn, Gaurav Saxena, Tim Kraska:
Predicate Caching: Query-Driven Secondary Indexing for Cloud Data Warehouses. 347-359
Industry Session 6: Graph Data Management
- Wei Zhang, Cheng Chen, Qiange Wang, Wei Wang, Shijiao Yang, Bingyu Zhou, Huiming Zhu, Chao Chen, Yongjun Zhao, Yingqian Hu, Miaomiao Cheng, Meng Li, Hongfei Tan, Mengjin Liu, Hexiang Lin, Shuai Zhang, Lei Zhang:
BG3: A Cost Effective and I/O Efficient Graph Database in Bytedance. 360-372
- Stefano Ceri, Anna Bernasconi, Alessia Gagliardi, Davide Martinenghi, Luigi Bellomarini, Davide Magnanimi:
PG-Triggers: Triggers for Property Graphs. 373-385
- Tao He, Shuxian Hu, Longbin Lai, Dongze Li, Neng Li, Xue Li, Lexiao Liu, Xiaojian Luo, Bingqing Lyu, Ke Meng, Sijie Shen, Li Su, Lei Wang, Jingbo Xu, Wenyuan Yu, Weibin Zeng, Lei Zhang, Siyuan Zhang, Jingren Zhou, Xiaoli Zhou, Diwen Zhu:
GraphScope Flex: LEGO-like Graph Computing Stack. 386-399
- Hao Xu, Juan A. Colmenares:
Bouncer: Admission Control with Response Time Objectives for Low-latency Online Data Systems. 400-413
- Wentao Zhang, Guochen Yan, Yu Shen, Ling Yang, Yangyu Tao, Bin Cui, Jian Tang:
NPA: Improving Large-scale Graph Neural Networks with Non-parametric Attention. 414-427
Demonstrations Group A
- Kevin Dharmawan, Chirag A. Kawediya, Yue Gong, Zaki Indra Yudhistira, Zhiru Zhu, Sainyam Galhotra, Adila Alfa Krisnadhi, Raul Castro Fernandez:
Demonstration of Ver: View Discovery in the Wild. 428-431
- Zhijia Chen, Lihong He, Arjun Mukherjee, Eduard C. Dragut:
Comquest: Large Scale User Comment Crawling and Integration. 432-435
- Ethan Seow, Yan Tong, Eli Baum, Sam Buxbaum, Muhammad Faisal, John Liagouris, Vasiliki Kalavri, Mayank Varia:
QueryShield: Cryptographically Secure Analytics in the Cloud. 436-439
- Jiebing Ma, Sourav S. Bhowmick, Lester Tay, Byron Choi:
SIERRA: A Counterfactual Thinking-based Visual Interface for Property Graph Query Construction. 440-443
- Markos Markakis, An Bo Chen, Brit Youngmann, Trinity Gao, Ziyu Zhang, Rana Shahout, Peter Baile Chen, Chunwei Liu, Ibrahim Sabek, Michael J. Cafarella:
Sawmill: From Logs to Causal Diagnosis of Large Systems. 444-447
- Kyle Bossonney, Vicente Calisto, Cristian Riveros, Gustavo Toro, Nicolás Van Sint Jan, Domagoj Vrgoc:
Demonstrating REmatch: A Novel RegEx Engine for Finding all Matches. 448-451
- Susan B. Davidson, Tova Milo, Kathy Razmadze, Gal Zeevi:
ASQP-RL Demo: Learning Approximation Sets for Exploratory Queries. 452-455
- Chenyang Zhang, Junxiong Peng, Chen Xu, Quanqing Xu, Chuanhui Yang:
IMBridge: Impedance Mismatch Mitigation between Database Engine and Prediction Query Execution. 456-459
- Sangoh Lee, Kyoungmin Kim, Wook-Shin Han:
ASM in Action: Fast and Practical Learned Cardinality Estimation. 460-463
- Andrew Bell, João Fonseca, Julia Stoyanovich:
The Game Of Recourse: Simulating Algorithmic Recourse over Time to Improve Its Reliability and Fairness. 464-467
- Amin Kamali, Verena Kantere, Calisto Zuzarte, Vincent Corvinelli:
RobOpt: A Tool for Robust Workload Optimization Based on Uncertainty-Aware Machine Learning. 468-471
- Matthias Urban, Carsten Binnig:
Demonstrating CAESURA: Language Models as Multi-Modal Query Planners. 472-475
- Yicong Huang, Zuozhi Wang, Chen Li:
Demonstration of Udon: Line-by-line Debugging of User-Defined Functions in Data Workflows. 476-479
- Zhiyu Liang, Chen Liang, Zheng Liang, Hongzhi Wang, Bo Zheng:
UniTS: A Universal Time Series Analysis Framework Powered by Self-Supervised Representation Learning. 480-483
- Sibei Chen, Hanbing Liu, Waiting Jin, Xiangyu Sun, Xiaoyao Feng, Ju Fan, Xiaoyong Du, Nan Tang:
ChatPipe: Orchestrating Data Preparation Pipelines by Optimizing Human-ChatGPT Interactions. 484-487
Demonstrations Group B
- Denys Herasymuk, Falaah Arif Khan, Julia Stoyanovich:
Responsible Model Selection with Virny and VirnyView. 488-491
- Riccardo Tommasini, Christopher Rost, Angela Bonifati, Emanuele Della Valle, Erhard Rahm, Keith W. Hare, Stefan Plantikow, Petra Selmer, Hannes Voigt:
Property Graph Stream Processing In Action with Seraph. 492-495
- Domagoj Vrgoc, Carlos Rojas, Renzo Angles, Marcelo Arenas, Vicente Calisto, Benjamín Farias, Sebastián Ferrada, Tristan Heuer, Aidan Hogan, Gonzalo Navarro, Alexander Pinto, Juan L. Reutter, Henry Rosales-Méndez, Etienne Toussaint:
MillenniumDB: A Multi-modal, Multi-model Graph Database. 496-499
- Yuhao Deng, Deng Qiyan, Chengliang Chai, Lei Cao, Nan Tang, Ju Fan, Jiayi Wang, Ye Yuan, Guoren Wang:
IDE: A System for Iterative Mislabel Detection. 500-503
- Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Yuanchun Zhou, Mingjie Tang, Jianguo Wang:
A Demonstration of GPTuner: A GPT-Based Manual-Reading Database Tuning System. 504-507
- Victor Giannakouris, Immanuel Trummer:
Demonstrating λ-Tune: Exploiting Large Language Models for Workload-Adaptive Database System Tuning. 508-511
- Tingyang Chen, Dazhuo Qiu, Yinghui Wu, Arijit Khan, Xiangyu Ke, Yunjun Gao:
User-friendly, Interactive, and Configurable Explanations for Graph Neural Networks with Graph Views. 512-515
- Ilaria Battiston, Kriti Kathuria, Peter A. Boncz:
OpenIVM: a SQL-to-SQL Compiler for Incremental Computations. 516-519
- Shreya Shankar, Aditya G. Parameswaran:
Building Reactive Large Language Model Pipelines with Motion. 520-523
- Yue Gong, Raul Castro Fernandez:
Demonstrating Nexus for Correlation Discovery over Collections of Spatio-Temporal Tabular Data. 524-527
- Jiwon Chang, Christina Dionysio, Fatemeh Nargesian, Matthias Boehm:
PLUTUS: Understanding Data Distribution Tailoring for Machine Learning. 528-531
- Gereon Dusella, Haralampos Gavriilidis, Laert Nuhu, Volker Markl, Eleni Tzirita Zacharatou:
Multi-Backend Zonal Statistics Execution with Raven. 532-535
- Sanad Saha, Nischal Aryal, Leilani Battle, Arash Termehchy:
ShiftScope: Adapting Visualization Recommendations to Users' Dynamic Data Focus. 536-539
- Zhaoheng Li, Supawit Chockchowwat, Hanxi Fang, Ribhav Sahu, Sumay Thakurdesai, Kantanat Pridaphatrakun, Yongjoo Park:
Demonstration of ElasticNotebook: Migrating Live Computational Notebook States. 540-543
Panels
- Angela Bonifati, M. Tamer Özsu, Yuanyuan Tian, Hannes Voigt, Wenyuan Yu, Wenjie Zhang:
The Future of Graph Analytics. 544-545
- Carlo Curino, Raghu Ramakrishnan:
AI for Systems. 546
Tutorials
- Xupeng Miao, Zhihao Jia, Bin Cui:
Demystifying Data Management for Large Language Models. 547-555
- Faeze Faghih, Tobias Ziegler, Zsolt István, Carsten Binnig:
SmartNICs in the Cloud: The Why, What and How of In-network Processing for Data-Intensive Applications. 556-560
- Rong Zhu, Lianggui Weng, Bolin Ding, Jingren Zhou:
Learned Query Optimizer: What is New and What is Next. 561-569
- Mohammad Javad Amiri, Divyakant Agrawal, Amr El Abbadi, Boon Thau Loo:
Distributed Transaction Processing in Untrusted Environments. 570-579
- Raul Castro Fernandez, Arnab Nandi:
Responsible Sharing of Spatiotemporal Data. 580-584
- Aidan Hogan, Domagoj Vrgoc:
Querying Graph Databases at Scale. 585-589
- Sourav S. Bhowmick, S. H. Annabel Chen, Divesh Srivastava:
Cognitive Psychology Meets Data Management: State of the Art and Future Directions. 590-596
- James Jie Pan, Jianguo Wang, Guoliang Li:
Vector Database Management Techniques and Systems. 597-604
- Angela Bonifati, Riccardo Tommasini:
An Overview of Continuous Querying in (Modern) Data Systems. 605-612
- Dirk Habich, Johannes Pietrzyk:
SIMDified Data Processing - Foundations, Abstraction, and Advanced Techniques. 613-621
- Gao Cong, Jingyi Yang, Yue Zhao:
Machine Learning for Databases: Foundations, Paradigms, and Open problems. 622-629
- Xuan Luo, Jian Pei:
Applications and Computation of the Shapley Value in Databases and Machine Learning. 630-635
- Prashant Pandey, Martín Farach-Colton, Niv Dayan, Huanchen Zhang:
Beyond Bloom: A Tutorial on Future Feature-Rich Filters. 636-644
Workshop Summaries
- Carsten Binnig, Nesime Tatbul:
International Workshop on Data Management on New Hardware (DaMoN). 645-646
- Danica Porobic, Tianzheng Wang:
Second Workshop on Simplicity in Management of Data (SiMoD). 647-648
- Rajesh Bordawekar, Oded Shmueli, Yael Amsterdamer, Renata Borovica-Gajic, Donatella Firmani:
Seventh International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM). 649-650
- Madelon Hulsebos, Matteo Interlandi, Shreya Shankar:
Eighth Workshop on Data Management for End-to-End Machine Learning (DEEM). 651-652
- Olaf Hartig, Zoi Kaoudi:
GRADES-NDA'24: 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA). 653-654
- Philippe Cudré-Mauroux, Andrea Ko, Robert Wrembel:
Fourth International Workshop on Big Data in Emergent Distributed Environments (BiDEDE). 655-656
- Jean-Daniel Fekete, Kexin Rong, Behrooz Omidvar-Tehrani, Roee Shraga:
Eighth Workshop on Human-In-the-Loop Data Analytics (HILDA). 657-658
- Daphne Miedema, Sourav S. Bhowmick, Michael Liut:
Third International Workshop on Data Systems Education (DataEd'24). 659-660
- Abolfazl Asudeh, Sainyam Galhotra, Amir Gilad, Babak Salimi, Brit Youngmann:
First Workshop on Governance, Understanding and Integration of Data for Effective and Responsible AI (GUIDE-AI). 661-662
- Ibrahim Sabek, Immanuel Trummer, Stefan Prestel:
First Workshop on Quantum Computing and Quantum-Inspired Technology for Data-Intensive Systems and Applications (Q-Data). 663-664
- Anja Gruenheid, Manuel Rigger:
Tenth International Workshop on Testing Database Systems (DBTest). 665-666