DOI: 10.1145/3457784.3457806
research-article

Classification of Inundation Level using Tweets in Indonesian Language

Published: 30 July 2021

Abstract

Extreme flood events are expected to occur more frequently, as climate change has yet to show signs of improvement. This may lead to heavier rainfall and to floods that arrive more quickly. Early warning systems can fail to provide timely information when conditions in the field do not match what is known at the information center, for example when a water pump malfunctions or the water level rises relatively quickly. This study therefore aims to provide an alternative source of information on inundation level during floods, based on tweets from Twitter. The proposed model outputs an inundation level category: “high”, “medium”, “low”, or “unknown”. The model was evaluated using 10-fold stratified cross-validation with seven classifier variations. For relevance classification, the best results were 90.6% accuracy (SVM Linear SVC), 89.05% average precision (SVM RBF), 84.10% average recall (Logistic Regression), and 82.03% average F1-score (SVM Linear SVC). For inundation level classification, the best results were 82.74% accuracy, 85.44% average precision, 68.07% average recall, and 71.43% average F1-score, all obtained with SVM Linear SVC.
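
The evaluation terms in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the toy labels are invented, the function names `stratified_folds` and `macro_scores` are hypothetical, and it assumes that "average" precision, recall, and F1-score mean the unweighted (macro) mean over classes, which the abstract does not state. The fold assignment is a simplified, unshuffled version of stratified splitting.

```python
from collections import defaultdict

# Categories named in the abstract.
LEVELS = ["high", "medium", "low", "unknown"]


def stratified_folds(labels, k):
    """Assign sample indices to k folds, keeping class proportions similar."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def macro_scores(y_true, y_pred, classes=LEVELS):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        # Define a score as 0.0 when its denominator is zero.
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Invented, imbalanced toy data: most tweets carry no usable level.
y_true = ["unknown"] * 6 + ["low"] * 2 + ["medium"] + ["high"]
y_pred = ["unknown"] * 6 + ["low", "unknown"] + ["unknown"] + ["high"]

scores = macro_scores(y_true, y_pred)
print(scores)
print(stratified_folds(y_true, 2))
```

On this toy data the accuracy is higher than the macro-averaged recall, because errors on the rare classes weigh heavily in a per-class average. This is one way a gap between accuracy and average recall can arise on imbalanced classes; whether it accounts for the figures reported above cannot be determined from the abstract alone.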

Cited By

  • (2024) Natural Language Processing for Infrastructure Resilience to Natural Disasters: A Scientometric Review. Proceedings of the 7th International Conference on Geotechnics, Civil Engineering and Structures, CIGOS 2024, 4-5 April, Ho Chi Minh City, Vietnam. DOI: 10.1007/978-981-97-1972-3_165 (pp. 1506-1513). Online publication date: 1-Jun-2024
  • (2022) Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets from 2017–2022 and 100 Research Questions. Analytics 1:2 (pp. 72-97). DOI: 10.3390/analytics1020007. Online publication date: 23-Sep-2022
  • Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions. SSRN Electronic Journal. DOI: 10.2139/ssrn.4170991

Published In

ICSCA '21: Proceedings of the 2021 10th International Conference on Software and Computer Applications
February 2021
325 pages
ISBN: 9781450388825
DOI: 10.1145/3457784
© 2021 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Flood
  2. Inundation Level
  3. Text Mining
  4. Twitter

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICSCA 2021
