Enhancing Chinese Address Parsing in Low-Resource Scenarios through In-Context Learning
Figure 1. Model overview. The generator takes the raw text as input and generates a regular expression sequence and a label word sequence. The REB–KNN algorithm then selects similar annotated samples based on these two sequences, and the prompt generator constructs a prompt from the selected samples. The prompt is fed into the Generative Pre-trained Transformer (GPT) model for prediction, and the address parsing result is extracted from the model's output.
Figure 2. CapICL architecture. The sequence generator captures the distribution patterns and boundary features of address types in the raw text, generating regular expression and label word sequences. The REB–KNN algorithm performs regular expression matching and BERT-based semantic similarity computation on these two sequences, selecting annotated samples similar to the query (raw text). The structure of the prompt template is illustrated in the top right corner of the figure; the instruction provides a brief description of the dataset.
Figure 3. Trie data structure representing INF (detailed information such as floor and room numbers) address components. The trie is constructed by reversing the address segmentation and captures the specific characteristics of the addressed entities. The root node stores overall information about the corresponding type: the capacity (the maximum number of subtree root nodes, e.g., "building"), the threshold (the minimum score for selecting information when constructing regular expressions), and the maximum and minimum lengths defining the length range of entities of the current type. The subtrees below the root represent the segmented parts of the address, and all nodes share the same structure, describing the features of the represented string (e.g., "building number"). Each node also maintains left/right 1-neighboring lists containing its neighboring strings. The overall score represents the normalized frequency of the string's occurrence in the dataset.
Figure 4. Directed acyclic graph (DAG) representing Chinese address component types. Nodes represent types of address components, and edges indicate the transition probabilities between types. The weight W of an edge represents the likelihood of transitioning from the current node to the next; the subscript of W joins the types with "-", including the previous node if there is one. The symbols $ and # mark the beginning and end of named entities, respectively.
Figure 5. Arbitrary-granularity segmentation of the raw text. First, the address component type and transition probabilities are obtained from the directed acyclic graph (DAG) based on the current state, and length information is retrieved from the trie. Next, the corresponding regular expression set (RES) is obtained from SORES and matched against the text. An 8-dimensional vector is then constructed from the type information, transition probability, length information, matched start and end positions, regular expression score, match length, and a numerical indicator, and is fed to the binary classifier; a positive prediction marks the segmentation as valid. Finally, the segmentation result is assembled.
Figure 6. Impact of K on model performance. Other experimental settings are the same as in Table 4.
Figure 7. Comparison of model performance on randomly sampled annotated datasets of different sizes. Each sample size was randomly drawn six times, and an independent model evaluation was performed on each annotated dataset.
Abstract
1. Introduction
- We propose CapICL, a novel Chinese address parsing model for low-resource scenarios. Its key component, the sequence generator, is constructed from a small annotated dataset; by capturing the distribution patterns and boundary features of address types, it effectively models the structure and semantics of addresses and mitigates interference from unnecessary variation;
- We introduce an integrated approach that combines regular expression matching with BERT-based semantic similarity computation. Regular expression matching captures the specific patterns inherent in address components, while the BERT-based semantic similarity computation measures the semantic relatedness between address components. Together, they yield significant improvements in address parsing accuracy, particularly in low-resource scenarios;
- Compared with fine-tuning large-scale language models, CapICL is more cost-effective. By using the sequence generator to maximize the utilization of existing resources and knowledge, our approach requires no additional training or fine-tuning, and it achieves strong performance in Chinese address parsing even with limited annotated data and computational resources.
2. Related Work
3. Methodology
3.1. Sequence Generator
3.1.1. Set of Regular Expressions
- Scoring Calculation. Introducing S to represent the tokenized sequence of an address component T, the score of the regular expression generated from T is calculated using Equations (1) and (2);
- Solving Regular Expressions. If, during the search for S in the trie, the search branches at the N-th level (with the root node being the 0-th level), i.e., the node at that level has more than one subtree, we introduce B to represent the number of subtrees. For example, if S is "1号楼" ("Building 1" in English), then in Figure 3 the search branches at the second level (at the node "号"), which has multiple subtrees; assuming there are three subtrees, we have B = 3. If S does not exist in the trie, this is denoted as B = 0. Equation (3) provides the method for solving the regular expression of S; a minimal sketch of the trie lookup follows this list.
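To make the lookup concrete, here is a minimal Python sketch of the branching search described above, assuming a plain nested-dict trie; the richer node layout of Figure 3 (scores, thresholds, neighbor lists) is deliberately omitted, and all names are ours rather than the authors'.

```python
def branching_level(trie, s):
    """Walk the characters of s through a nested-dict trie and return
    (level, n_subtrees) at the first level whose node has more than one
    subtree (the root is level 0). Returns (None, 0) if s is absent."""
    node = trie
    for level, ch in enumerate(s, start=1):  # first character sits at level 1
        if ch not in node:
            return None, 0                   # s does not exist in the trie
        node = node[ch]
        if len(node) > 1:                    # branching: more than one subtree
            return level, len(node)
    return None, 1                           # no branching along s

# For "1号楼", a trie whose node "号" has three subtrees yields (2, 3).
trie = {"1": {"号": {"楼": {}, "院": {}, "馆": {}}}}
print(branching_level(trie, "1号楼"))        # -> (2, 3)
```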
3.1.2. Directed Acyclic Graph
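As a rough illustration of the structure in Figure 4, the sketch below estimates first-order transition weights between component types from annotated type sequences. The paper's weights can additionally condition on the previous node (the W subscripts include it when present), which this simplification omits; all names are ours.

```python
from collections import defaultdict

def build_dag(type_sequences):
    """Estimate transition weights W between address component types.
    "$" and "#" mark the beginning and end of a sequence (as in Figure 4)."""
    counts = defaultdict(lambda: defaultdict(int))
    for types in type_sequences:             # e.g., ["LOC", "COM", "INF"]
        path = ["$"] + list(types) + ["#"]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    return {prev: {nxt: n / sum(nxts.values()) for nxt, n in nxts.items()}
            for prev, nxts in counts.items()}

dag = build_dag([["LOC", "COM", "INF"], ["LOC", "INF"], ["LOC", "COM", "INF"]])
print(dag["LOC"])   # e.g., {"COM": 0.67, "INF": 0.33} — W_LOC-COM, W_LOC-INF
```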
3.1.3. Binary Classifier
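The classifier operates on the 8-dimensional vector described in Figure 5. A hedged sketch using scikit-learn follows; the concrete feature encoding is our reading of the figure caption, not the authors' released code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def feature_vector(type_id, trans_prob, length_valid, start, end,
                   re_score, match_len, has_digit):
    """The 8-dimensional classifier input of Figure 5: type information,
    transition probability, length information, matched start/end positions,
    regular expression score, match length, and a numerical indicator."""
    return np.array([type_id, trans_prob, length_valid, start, end,
                     re_score, match_len, has_digit], dtype=float)

clf = LogisticRegression(max_iter=1000)
# X: stacked feature vectors for candidate segments; y: 1 = valid, 0 = invalid.
# clf.fit(X, y); clf.predict(feature_vector(...).reshape(1, -1))
```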
Algorithm 1. Automatic Construction of Sequence Generator.
3.1.4. Automatic Construction of Sequence Generator
3.2. Sequence Generation
Algorithm 2. Segmentation Algorithm.
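Since Algorithm 2 implements the flow of Figure 5, here is a hedged Python reading of that loop, reusing the build_dag, feature_vector, and clf sketches above. The greedy longest-match acceptance rule, the trie length table, and the SORES layout are our simplifications, not the authors' algorithm verbatim.

```python
import re

TYPE_IDS = {"LOC": 0, "COM": 1, "INF": 2}      # illustrative type encoding

def segment(text, dag, lengths, sores, clf):
    """Greedy sketch of arbitrary-granularity segmentation (Figure 5).
    lengths: {type: (min_len, max_len)} from the trie root nodes.
    sores:   {type: [(regex, score), ...]} — the regular expression sets."""
    result, pos, prev = [], 0, "$"
    while pos < len(text):
        best = None
        for nxt, w in dag.get(prev, {}).items():   # candidate next types
            if nxt == "#":
                continue
            min_len, max_len = lengths[nxt]        # length range from the trie
            for pattern, score in sores[nxt]:
                m = re.match(pattern, text[pos:])
                if not m or not (min_len <= m.end() <= max_len):
                    continue
                vec = feature_vector(TYPE_IDS[nxt], w, 1.0, pos, pos + m.end(),
                                     score, m.end(),
                                     float(any(c.isdigit() for c in m.group())))
                # keep the longest candidate the classifier marks as valid
                if clf.predict(vec.reshape(1, -1))[0] == 1 and \
                        (best is None or m.end() > best[2]):
                    best = (nxt, m.group(), m.end())
        if best is None:
            break                                  # no valid continuation
        result.append((best[0], best[1]))
        pos, prev = pos + best[2], best[0]
    return result
```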
- Generating the regular expression sequence. The regular expression sequence is obtained by concatenating the regular expressions of the segmentation result (of the kind shown in Table 2) with ".*?";
- Generating the label word sequence. In contrast to the regular expression sequence, we use not only the label words from the segmentation result but also the type information: each label word is first concatenated with its corresponding type, and the pairs are then joined with spaces to form the label word sequence. A minimal sketch of both steps follows this list.
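A minimal sketch of both steps, assuming the segmentation result is a list of (type, regular expression, label word) triples in the notation of Table 2; the triples and the word-then-type ordering within each pair are illustrative guesses, not the authors' exact format.

```python
# Illustrative segmentation result in Table 2's notation
# (@$ is the paper's shorthand inside regular expressions).
segments = [("LOC", ".湖", "龙湖"),
            ("COM", "郡(.城|.期)*", "梧桐郡"),
            ("INF", "@$-@$-@$", "1-11-111")]

# Regular expression sequence: concatenate the regular expressions with ".*?".
regex_sequence = ".*?".join(regex for _, regex, _ in segments)

# Label word sequence: pair each label word with its type, join with spaces.
label_word_sequence = " ".join(f"{word} {typ}" for typ, _, word in segments)

print(regex_sequence)        # .湖.*?郡(.城|.期)*.*?@$-@$-@$
print(label_word_sequence)   # 龙湖 LOC 梧桐郡 COM 1-11-111 INF
```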
3.3. Prompt Generation
3.3.1. KNN Demonstration Examples
- Regular Expression Matching. The generated regular expression sequence is used to match the annotated samples, selecting M samples similar to the raw text, where M is at most K/2;
- BERT Semantic Similarity Calculation. Although regular expressions are effective at matching the desired samples, they require the data distribution to meet specific criteria. To improve the precision and robustness of our model, we use BERT-based semantic similarity to select the remaining K−M samples. Specifically, when computing the semantic similarity between the raw input sample and each example in the small annotated set, we use the label word sequence instead of the original text: a BERT-based model such as SentenceBERT [30,31] evaluates the similarity between the label word sequence of the raw text and that of each annotated sample, and the K−M samples with the highest similarity are chosen, as depicted in Figure 2. A sketch of the full selection procedure follows this list.
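A sketch of the full selection procedure, under stated assumptions: annotated is a list of (raw text, label word sequence) pairs, the paper's @$ shorthand has already been expanded into ordinary regular expression syntax, and the SentenceBERT checkpoint is an arbitrary multilingual choice rather than the one used in the paper.

```python
import re
from sentence_transformers import SentenceTransformer, util

def reb_knn(regex_sequence, label_word_sequence, annotated, k):
    """Select k demonstration examples: regular expression matching first
    (at most k // 2 samples), then SentenceBERT similarity over label word
    sequences for the remaining k - M."""
    matched = [s for s in annotated if re.search(regex_sequence, s[0])][: k // 2]
    rest = [s for s in annotated if s not in matched]
    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
    query_emb = model.encode(label_word_sequence, convert_to_tensor=True)
    cand_embs = model.encode([s[1] for s in rest], convert_to_tensor=True)
    sims = util.cos_sim(query_emb, cand_embs)[0]
    top = sims.argsort(descending=True)[: k - len(matched)]
    return matched + [rest[int(i)] for i in top]
```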
3.3.2. Prompt Template
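A sketch of how the template could be assembled (the full rendered prompt appears in Appendix A); the helper and its argument layout are our reconstruction, not the authors' code.

```python
def build_prompt(instruction, examples, query):
    """Assemble the prompt: an instruction describing the dataset, the K
    selected demonstration examples, and the query to be parsed."""
    lines = ["Instruction:", instruction, "Examples:"]
    for raw_text, parsed in examples:        # parsed: annotated output string
        lines += [f"Input: {raw_text}", f"Output: {parsed}"]
    lines += ["Query:", f"Input: {query}", "Output:"]
    return "\n".join(lines)
```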
3.4. Model Prediction
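A hedged sketch of the prediction step using the 2023-era OpenAI Python client; the model name and decoding parameters are illustrative, not settings reported in the paper.

```python
import openai

openai.api_key = "YOUR_API_KEY"   # placeholder

def parse_address(prompt):
    """Send the generated prompt to a GPT model and return the completion,
    from which the [['T': ..., 'E': ...]] parse is extracted."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",            # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,                    # deterministic output for extraction
    )
    return response["choices"][0]["message"]["content"]
```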
4. Experiment
4.1. Dataset
4.2. Main Experimental Results
4.3. Impact of K on the Model
4.4. Ablation Study
- The effectiveness of REB–KNN in improving model performance is validated: with random sample selection, the F1 score dropped to nearly 30% on the Logistic dataset and to nearly 20% on the ChineseAddress dataset;
- The BERT semantic similarity selection module plays a crucial role in REB–KNN, especially the variant based on the label word sequence: on the ChineseAddress dataset, FSB–KNN dropped to an F1 score of only about 2%;
- Although regular expression matching improves the F1 score, its effect is limited, since in many cases it yields no valid matches. Its impact was more pronounced on the Logistic dataset, where more samples could be matched.
4.5. Stability Analysis
5. Discussion
5.1. CapICL Effectiveness and K-Value Impact
5.2. Role of REB–KNN Algorithm
5.3. Limitations and Future Directions
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. The Prompt in Figure 2
- Instruction: The following information is derived from Chinese logistics addresses. Each sentence represents a complete logistics address, and the entities are fully continuous. The labels and meanings of the entities are as follows: LOC represents road information, such as Jinshui East Road; PLT represents the name of a residential area or unit, such as Oasis Cloud Top Community; INF represents other auxiliary detailed information, such as Unit 4 in Building C. In general, LOC is placed at the beginning of the sentence, but sometimes it can also appear at the end.
- Examples:
  - Input: `龙湖九溪郡1-1-1`
  - Output: `[['T': 'LOC', 'E': '龙湖', 'T': 'PLT', 'E': '九溪郡', 'T': 'INF', 'E': '1-1-1']]`
  - Input: `龙湖镇梧桐郡1*-1-9**`
  - Output: `[['T': 'LOC', 'E': '龙湖镇', 'T': 'COM', 'E': '梧桐郡', 'T': 'INF', 'E': '1*-1-9**']]`
  - Input: `龙湖镇九溪郡3**-2***`
  - Output: `[['T': 'LOC', 'E': '龙湖镇', 'T': 'COM', 'E': '九溪郡', 'T': 'INF', 'E': '3**-2***']]`
  - Input: `龙湖镇九溪郡二期2**-2***`
  - Output: `[['T': 'LOC', 'E': '龙湖镇', 'T': 'COM', 'E': '九溪郡二期', 'T': 'INF', 'E': '2**-2***']]`
- Query:
  - Input: `龙湖浩创梧桐郡1-11-111`
  - Output:
References
1. Wang, J.; Hu, Y.; Joseph, K. NeuroTPR: A Neuro-net Toponym Recognition Model for Extracting Locations from Social Media Messages. Trans. GIS 2020, 24, 719–735.
2. Tao, L.; Xie, Z.; Xu, D.; Ma, K.; Qiu, Q.; Pan, S.; Huang, B. Geographic Named Entity Recognition by Employing Natural Language Processing and an Improved BERT Model. ISPRS Int. J. Geo-Inf. 2022, 11, 598.
3. Stock, K.; Yousaf, J. Context-Aware Automated Interpretation of Elaborate Natural Language Descriptions of Location through Learning from Empirical Data. Int. J. Geogr. Inf. Sci. 2018, 32, 1087–1116.
4. Berragan, C.; Singleton, A.; Calafiore, A.; Morley, J. Transformer Based Named Entity Recognition for Place Name Extraction from Unstructured Text. Int. J. Geogr. Inf. Sci. 2023, 37, 747–766.
5. Li, H.; Lu, W.; Xie, P.; Li, L. Neural Chinese Address Parsing. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 3421–3431.
6. Karimzadeh, M.; Pezanowski, S.; MacEachren, A.M.; Wallgrün, J.O. GeoTxt: A Scalable Geoparsing System for Unstructured Text Geolocation. Trans. GIS 2019, 23, 118–136.
7. Hu, X.; Hu, Y.; Resch, B.; Kersten, J. Geographic Information Extraction from Texts (GeoExT). In Proceedings of the Advances in Information Retrieval: 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, 2–6 April 2023; Proceedings, Part III; Springer: Berlin/Heidelberg, Germany, 2023; pp. 398–404.
8. Hongwei, Z.; Qingyun, D.U.; Zhangjian, C.; Chen, Z. A Chinese Address Parsing Method Using RoBERTa-BiLSTM-CRF. Geomat. Inf. Sci. Wuhan Univ. 2022, 47, 665–672.
9. Gritta, M.; Pilehvar, M.T.; Collier, N. A Pragmatic Guide to Geoparsing Evaluation. Lang. Resour. Eval. 2020, 54, 683–712.
10. Hedderich, M.A.; Lange, L.; Adel, H.; Strötgen, J.; Klakow, D. A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 2545–2568.
11. Chowdhery, A.; Narang, S.; Devlin, J.; Bosma, M.; Mishra, G.; Roberts, A.; Barham, P.; Chung, H.W.; Sutton, C.; Gehrmann, S.; et al. PaLM: Scaling Language Modeling with Pathways. arXiv 2022, arXiv:2204.02311.
12. Wu, Z.; Wang, Y.; Ye, J.; Kong, L. Self-Adaptive In-Context Learning: An Information Compression Perspective for In-Context Example Selection and Ordering. arXiv 2023, arXiv:2212.10375.
13. Min, S.; Lyu, X.; Holtzman, A.; Artetxe, M.; Lewis, M.; Hajishirzi, H.; Zettlemoyer, L. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? arXiv 2022, arXiv:2202.12837.
14. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models Are Few-Shot Learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 6–12 December 2020; Curran Associates Inc.: Red Hook, NY, USA, 2020; pp. 1877–1901.
15. Gao, T.; Fisch, A.; Chen, D. Making Pre-trained Language Models Better Few-shot Learners. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 3816–3830.
16. Chen, J.; Liu, Q.; Lin, H.; Han, X.; Sun, L. Few-Shot Named Entity Recognition with Self-describing Networks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 5711–5722.
17. Han, C.; Zhu, R.; Kuang, J.; Chen, F.; Li, X.; Gao, M.; Cao, X.; Wu, W. Meta-Learning Triplet Network with Adaptive Margins for Few-Shot Named Entity Recognition. arXiv 2023, arXiv:2302.07739.
18. Zhou, B.; Zou, L.; Hu, Y.; Qiang, Y.; Goldberg, D. TopoBERT: Plug and Play Toponym Recognition Module Harnessing Fine-tuned BERT. arXiv 2023, arXiv:2301.13631.
19. Liu, J.; Shen, D.; Zhang, Y.; Dolan, B.; Carin, L.; Chen, W. What Makes Good In-Context Examples for GPT-3? In Proceedings of the Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, Dublin, Ireland, 27 May 2022; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 100–114.
20. Liu, P.; Yuan, W.; Fu, J.; Jiang, Z.; Hayashi, H.; Neubig, G. Pre-Train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Comput. Surv. 2023, 55, 195:1–195:35.
21. Sun, T.; Shao, Y.; Qian, H.; Huang, X.; Qiu, X. Black-Box Tuning for Language-Model-as-a-Service. In Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022; pp. 20841–20855.
22. Lu, Y.; Bartolo, M.; Moore, A.; Riedel, S.; Stenetorp, P. Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 8086–8098.
23. Zhao, Z.; Wallace, E.; Feng, S.; Klein, D.; Singh, S. Calibrate Before Use: Improving Few-shot Performance of Language Models. In Proceedings of the 38th International Conference on Machine Learning, Virtual Event, 18–24 July 2021; pp. 12697–12706.
24. Ling, G.M.; Xu, A.P.; Wang, W. Research of Address Information Automatic Annotation Based on Deep Learning. Acta Electronica Sinica 2020, 48, 2081.
25. Ling, G.; Xu, A.; Wang, C.; Wu, J. REBDT: A Regular Expression Boundary-Based Decision Tree Model for Chinese Logistics Address Segmentation. Appl. Intell. 2023, 53, 6856–6872.
26. Tennant, P.W.G.; Murray, E.J.; Arnold, K.F.; Berrie, L.; Fox, M.P.; Gadd, S.C.; Harrison, W.J.; Keeble, C.; Ranker, L.R.; Textor, J.; et al. Use of Directed Acyclic Graphs (DAGs) to Identify Confounders in Applied Health Research: Review and Recommendations. Int. J. Epidemiol. 2021, 50, 620–632.
27. Shen, W.; Wu, S.; Yang, Y.; Quan, X. Directed Acyclic Graph Network for Conversational Emotion Recognition. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 1551–1560.
28. Ferguson, K.D.; McCann, M.; Katikireddi, S.V.; Thomson, H.; Green, M.J.; Smith, D.J.; Lewsey, J.D. Evidence Synthesis for Constructing Directed Acyclic Graphs (ESC-DAGs): A Novel and Systematic Method for Building Directed Acyclic Graphs. Int. J. Epidemiol. 2020, 49, 322–329.
29. Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Ichter, B.; Xia, F.; Chi, E.H.; Le, Q.V.; Zhou, D. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Adv. Neural Inf. Process. Syst. 2022, 35, 24824–24837.
30. Reimers, N.; Gurevych, I. Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 3982–3992.
31. Reimers, N.; Gurevych, I. Making Monolingual Sentence Embeddings Multilingual Using Knowledge Distillation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 4512–4525.
32. Petroni, F.; Rocktäschel, T.; Riedel, S.; Lewis, P.; Bakhtin, A.; Wu, Y.; Miller, A. Language Models as Knowledge Bases? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 2463–2473.
33. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models Are Unsupervised Multitask Learners. OpenAI Blog 2019, 1, 9.
34. Schick, T.; Schütze, H. It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 2339–2352.
35. Ma, K.; Tan, Y.; Xie, Z.; Qiu, Q.; Chen, S. Chinese Toponym Recognition with Variant Neural Structures from Social Media Messages Based on BERT Methods. J. Geogr. Syst. 2022, 24, 143–169.
36. Liu, W.; Fu, X.; Zhang, Y.; Xiao, W. Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 5847–5858.
Table 1. Examples of generated regular expressions with their scores and label words.

| Regular Expression | Score | Label Word |
| --- | --- | --- |
| `@$号楼@$. (.-\|@$\|.楼\|.区\|(@$)@#\|.户\|.院) *` | 3.84 | 1号楼1 (Room 1, Building 1) |
| `@$-@$@#户(.-\|@$\|.楼\|.区\|(@$)@#\|.户\|.院) *` | 3.81 | 1-1东户 (Unit 1, East Wing, Building 1) |
| `@$@C@$@C@$ (.-\|@$\|.楼\|.区\|(@$)@#\|.户\|.院) *` | 3.79 | 1-1-1 (Room 1, Unit 1, Building 1) |
Table 2. Regular expressions and label words for each address component type.

| Type | Regular Expression | Label Word |
| --- | --- | --- |
| LOC | `.湖` | 龙湖 (dragon lake) |
| COM | `郡(.城\|.期\|(小\|街\|新)区\|.园\|(尚)郡\|.苑\|.家) *` | 民安东郡 (a community) |
| INF | `@$-@$-@$(.-\|@$\|.楼\|.区\|(@$)@-#\|.户\|.院) *` | 1-1-1 (Room 1, Unit 1, Building 1) |
Table 3. Statistics of the two datasets.

| Name | Training | Validation | Test | Example | Labels |
| --- | --- | --- | --- | --- | --- |
| ChineseAddress | 8957 | 2985 | 2985 | 下城区上塘路9号浙江昆剧团8室 (Room 8, Zhejiang Kunqu Opera Troupe, No. 9 Shangtang Rd, Xiacheng District) | country, prov, city, district, devzone, town, community, road, subroad, and 21 other types [5] |
| Logistic | 1422 | 474 | 474 | 李@@~1*********8~~~石南路与翠竹街公园道一号2期*号楼**楼东*户 (@@ Li~1*********8~~~Unit *, East of the *th floor in Building *, Phase 2 of Park Road No. 1, Intersection of Shinan Road and Cuizhu Street) | LOC, COM, INF |
Table 4. Main experimental results (%) on the ChineseAddress and Logistic datasets.

| Method | ChineseAddress P | ChineseAddress R | ChineseAddress F | Logistic P | Logistic R | Logistic F |
| --- | --- | --- | --- | --- | --- | --- |
| BERT–CRF | 86.02 | 85.87 | 85.94 | 88.04 | 90.10 | 89.06 |
| BERT–LSTM–CRF [35] | 86.13 | 86.01 | 86.07 | 88.50 | 87.41 | 87.95 |
| BERT–Softmax | 86.58 | 85.83 | 86.20 | 87.96 | 88.37 | 88.17 |
| RoBERTa–LSTM–CRF [8] | 85.95 | 86.50 | 86.22 | 88.17 | 88.73 | 88.45 |
| LEBERT–CRF [36] | 86.54 | 86.08 | 86.31 | 87.66 | 89.49 | 88.57 |
| ALBERT–LSTM–CRF [2] | 86.27 | 86.45 | 86.36 | 88.07 | 88.97 | 88.52 |
| APLT [5] | 89.14 | 87.71 | 88.42 | 88.37 | 88.89 | 88.63 |
| CapICL | 93.39 | 89.74 | 91.53 | 91.90 | 89.49 | 90.68 |
Table 5. Ablation results (%) for REB–KNN sample selection.

| Ablation Method | ChineseAddress P | ChineseAddress R | ChineseAddress F | Logistic P | Logistic R | Logistic F |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline (REB–KNN) | 93.39 | 89.74 | 91.53 | 91.90 | 89.49 | 90.68 |
| Rand–KNN w/o RE | 17.39 | 21.88 | 19.83 | 19.17 | 32.35 | 26.68 |
| RE–KNN w/o B | 21.69 | 27.44 | 24.86 | 22.20 | 25.31 | 23.85 |
| RawB–KNN w/o RE | 13.77 | 19.25 | 16.75 | 14.04 | 19.96 | 17.22 |
| FSB–KNN w/o RE | 2.01 | 2.85 | 2.45 | 8.11 | 12.33 | 10.34 |