research-article

Lindorm-UWC: An Ultra-Wide-Column Database for Internet of Vehicles

Published: 08 November 2024

Abstract

In Internet of Vehicles (IoV) systems, intelligent vehicles generate huge amounts of data that support diverse services and applications. In practice, database systems are deployed in the cloud to manage data uploaded from vehicles and to provide real-time query capabilities. However, existing database systems are ill-suited to this workload because IoV data contains a large number of metrics and is written at extremely high throughput. To better understand IoV data and the challenges it poses to the underlying database systems, we conduct the first extensive empirical study of real-world IoV workloads. Based on the findings of this study, we design Lindorm-UWC, a database purpose-built for IoV systems. It implements a distributed architecture and a cold/hot data separation mechanism to accommodate massive amounts of IoV data. In each data partition, it deploys an ultra-wide-column storage engine to efficiently handle the querying and ingestion of multi-metric data. We evaluate Lindorm-UWC under different data scales and various types of queries. Our experimental results show that it consistently achieves higher write throughput (an increase of over 79%) and competitive query performance compared with various alternative solutions. Lindorm-UWC has been serving IoV enterprise customers on Alibaba Cloud since 2019, managing tens of petabytes of IoV data.


Published In

Proceedings of the VLDB Endowment, Volume 17, Issue 12
August 2024
837 pages
Editors: Meihui Zhang, Cyrus Shahabi

Publisher

VLDB Endowment
