DOI: 10.1145/3616855.3635772
research-article
K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization

Published: 04 March 2024

Abstract

Large language models (LLMs) have achieved great success in general domains of natural language processing. In this paper, we bring LLMs to the realm of geoscience with the objective of advancing research and applications in this field. To this end, we present the first-ever LLM in geoscience, K2, alongside a suite of resources developed to further promote LLM research within geoscience. For instance, we have curated the first geoscience instruction-tuning dataset, GeoSignal, which aims to align LLM responses to geoscience-related user queries. Additionally, we have established the first geoscience benchmark, GeoBench, to evaluate LLMs in the context of geoscience. In this work, we experiment with a complete recipe for adapting a pre-trained general-domain LLM to the geoscience domain. Specifically, we further train the LLaMA-7B model on a 5.5B-token geoscience text corpus, including over 1 million pieces of geoscience literature, and use GeoSignal's supervised data to fine-tune the model. Moreover, we share a protocol that can efficiently gather domain-specific data and construct domain-supervised data, even in situations where manpower is scarce. Meanwhile, we equip K2 with the ability to use tools so that it can serve as a naive geoscience aide. Experiments conducted on GeoBench demonstrate the effectiveness of our approach and datasets for geoscience knowledge understanding and utilization. We open-source all the training data and K2 model checkpoints at https://github.com/davendw49/k2
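The alignment step described above (supervised fine-tuning on GeoSignal) can be illustrated with a small data-formatting sketch. This is a hypothetical example, not K2's actual pipeline: the `instruction`/`input`/`output` field names follow the common Alpaca-style convention, and GeoSignal's real schema may differ.

```python
# Sketch: rendering one supervised instruction-tuning record into a single
# training string, Alpaca-style. Field names are assumed, not GeoSignal's
# published schema.

def build_prompt(sample: dict) -> str:
    """Render an instruction/input/output record as one training prompt."""
    header = "Below is an instruction that describes a geoscience task."
    if sample.get("input"):  # include the optional context block if present
        return (
            f"{header}\n\n### Instruction:\n{sample['instruction']}\n\n"
            f"### Input:\n{sample['input']}\n\n### Response:\n{sample['output']}"
        )
    return (
        f"{header}\n\n### Instruction:\n{sample['instruction']}\n\n"
        f"### Response:\n{sample['output']}"
    )

# Hypothetical GeoSignal-like record for illustration.
example = {
    "instruction": "What type of rock forms when magma cools at the surface?",
    "input": "",
    "output": "Extrusive igneous rock, such as basalt.",
}

print(build_prompt(example))
```

During fine-tuning, strings like this would be tokenized and used as targets for standard causal-language-modeling loss; the choice of a fixed prompt template simply keeps the supervised data consistent across tasks.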




      Published In

      WSDM '24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining
      March 2024
      1246 pages
      ISBN:9798400703713
      DOI:10.1145/3616855
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. foundation model
      2. geoscience knowledge mining
      3. geoscience large language model

      Qualifiers

      • Research-article


      Conference

      WSDM '24

      Acceptance Rates

      Overall Acceptance Rate 498 of 2,863 submissions, 17%



      Article Metrics

• Downloads (last 12 months): 511
• Downloads (last 6 weeks): 62
Reflects downloads up to 12 Nov 2024


      Cited By

• (2024) Development of a Large-scale Korean Language Model in the Field of Geosciences. Economic and Environmental Geology 57(5), 539-550. DOI: 10.9719/EEG.2024.57.5.539. Online publication date: 29-Oct-2024
• (2024) The Combined Use of GIS and Generative Artificial Intelligence in Detecting Potential Geodiversity Sites and Promoting Geoheritage. Resources 13(9), 119. DOI: 10.3390/resources13090119. Online publication date: 27-Aug-2024
• (2024) Bibliometric Analysis on the Research of Geoscience Knowledge Graph (GeoKG) from 2012 to 2023. ISPRS International Journal of Geo-Information 13(7), 255. DOI: 10.3390/ijgi13070255. Online publication date: 16-Jul-2024
• (2024) GeoLocator: A Location-Integrated Large Multimodal Model (LMM) for Inferring Geo-Privacy. Applied Sciences 14(16), 7091. DOI: 10.3390/app14167091. Online publication date: 13-Aug-2024
• (2024) Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3699518. Online publication date: 8-Oct-2024
• (2024) When geoscience meets generative AI and large language models: Foundations, trends, and future challenges. Expert Systems. DOI: 10.1111/exsy.13654. Online publication date: 11-Jun-2024
• (2024) Assessing named entity recognition efficacy using diverse geoscience datasets. 2024 International Conference on Machine Intelligence for GeoAnalytics and Remote Sensing (MIGARS), 1-3. DOI: 10.1109/MIGARS61408.2024.10544642. Online publication date: 8-Apr-2024
• (2024) Where to Move Next: Zero-shot Generalization of LLMs for Next POI Recommendation. 2024 IEEE Conference on Artificial Intelligence (CAI), 1530-1535. DOI: 10.1109/CAI59869.2024.00277. Online publication date: 25-Jun-2024
• (2024) PreparedLLM: effective pre-pretraining framework for domain-specific large language models. Big Earth Data, 1-24. DOI: 10.1080/20964471.2024.2396159. Online publication date: 8-Sep-2024
• (2024) Future-proofing geotechnics workflows: accelerating problem-solving with large language models. Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 1-18. DOI: 10.1080/17499518.2024.2381026. Online publication date: 25-Jul-2024
      • Show More Cited By
