Abstract
In the current wave of digital transformation, Frequently Asked Questions (FAQ) answering systems have become a crucial technology for replacing traditional manual customer service and efficiently addressing high-frequency issues. This paper focuses on two real business scenarios in the financial industry, banking and funds, and designs and implements FAQ question-answering systems based on question semantic similarity that are suitable for both the cold start and domain adaptation phases. In the banking scenario, we confront the cold start problem: to mitigate the anisotropy of pre-trained model representations, we employ unsupervised SimCSE, which uses dropout as data augmentation. In the fund scenario, where an ample labeled dataset is available for fine-tuning, we adopt the improved supervised CoSENT objective, which applies a unified optimization criterion across the training and prediction stages of SBERT. Experimental results indicate that CoSENT yields superior sentence embeddings. Starting from real-world scenarios, we also propose a practical data accumulation process for FAQ question-answering systems, spanning from the cold start phase to fine-tuning domain-adapted models. In conclusion, the FAQ question-answering systems constructed in this paper can effectively meet the cold start and domain adaptation requirements of different business scenarios, providing valuable practical and theoretical references for enterprises undergoing digital transformation.
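The abstract names two training objectives: unsupervised SimCSE, which builds positive pairs from two dropout-perturbed forward passes of the same sentence, and CoSENT, which directly optimizes the ordering of cosine similarities used at prediction time. As a rough, self-contained sketch of how these two losses are typically computed (not the authors' released code), the PyTorch snippet below implements both; the function names, the temperature of 0.05, and the scale of 20 follow common reference implementations and are assumptions here.

```python
import torch
import torch.nn.functional as F


def simcse_unsup_loss(emb_a: torch.Tensor, emb_b: torch.Tensor,
                      temperature: float = 0.05) -> torch.Tensor:
    """Unsupervised SimCSE (InfoNCE) loss.

    emb_a and emb_b are embeddings of the *same* batch of sentences from two
    forward passes, so the only difference between the views is the random
    dropout mask inside the encoder.
    """
    emb_a = F.normalize(emb_a, dim=-1)
    emb_b = F.normalize(emb_b, dim=-1)
    # Cosine similarity between every sentence pair in the batch: (batch, batch).
    sim = emb_a @ emb_b.t() / temperature
    # The positive for sentence i is its own second view, i.e. the diagonal.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)


def cosent_loss(emb_a: torch.Tensor, emb_b: torch.Tensor,
                labels: torch.Tensor, scale: float = 20.0) -> torch.Tensor:
    """CoSENT loss: every negative pair's cosine similarity should fall below
    every positive pair's, matching how cosine similarity is used at prediction time.
    """
    cos = F.cosine_similarity(emb_a, emb_b, dim=-1) * scale
    # diff[i, j] = cos(pair i) - cos(pair j); keep only entries where pair i is
    # a negative (label 0) and pair j is a positive (label 1).
    diff = cos[:, None] - cos[None, :]
    mask = (labels[:, None] < labels[None, :]).float()
    diff = diff - (1.0 - mask) * 1e12  # push invalid combinations to -inf
    # log(1 + sum exp(diff)) via logsumexp with an appended zero term.
    diff = torch.cat([torch.zeros(1, device=cos.device), diff.flatten()])
    return torch.logsumexp(diff, dim=0)


if __name__ == "__main__":
    # Toy check with random embeddings: batch of 8 question pairs, dim 768.
    a, b = torch.randn(8, 768), torch.randn(8, 768)
    print(simcse_unsup_loss(a, a.clone()))  # two "dropout views" of one batch
    print(cosent_loss(a, b, torch.tensor([1, 0, 1, 0, 1, 0, 1, 0])))
```

In the unsupervised case the in-batch sentences other than the diagonal entry act as negatives, while CoSENT compares only cosine similarities of labeled pairs against each other, which is exactly the quantity used for FAQ retrieval at inference.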
References
Araci, D.: FinBERT: financial sentiment analysis with pre-trained language models. arXiv preprint arXiv:1908.10063 (2019)
Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. In: Advances in Neural Information Processing Systems, vol. 13 (2000)
Brown, T., et al.: Language models are few-shot learners. Adv. Neural. Inf. Process. Syst. 33, 1877–1901 (2020)
Chamekh, A., Mahfoudh, M., Forestier, G.: Sentiment analysis based on deep learning in e-commerce. In: Memmi, G., Yang, B., Kong, L., Zhang, T., Qiu, M. (eds.) KSEM 2022. LNCS, vol. 13369, pp. 498–507. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-10986-7_40
Chen, J., Chen, Q., Liu, X., Yang, H., Lu, D., Tang, B.: The BQ corpus: a large-scale domain-specific Chinese corpus for sentence semantic equivalence identification. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4946–4951 (2018)
Gao, J., He, D., Tan, X., Qin, T., Wang, L., Liu, T.Y.: Representation degeneration problem in training natural language generation models. arXiv preprint arXiv:1907.12009 (2019)
Gao, T., Yao, X., Chen, D.: SimCSE: simple contrastive learning of sentence embeddings. arXiv preprint arXiv:2104.08821 (2021)
Hinton, G.E., et al.: Learning distributed representations of concepts. In: Proceedings of the Eighth Annual Conference of the Cognitive Science Society, Amherst, MA, vol. 1, p. 12 (1986)
Hu, C., Xiao, K., Wang, Z., Wang, S., Li, Q.: Extracting prerequisite relations among wikipedia concepts using the clickstream data. In: Qiu, H., Zhang, C., Fei, Z., Qiu, M., Kung, S.-Y. (eds.) KSEM 2021. LNCS (LNAI), vol. 12815, pp. 13–26. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-82136-4_2
Huang, P.S., He, X., Gao, J., Deng, L., Acero, A., Heck, L.: Learning deep structured semantic models for web search using clickthrough data. In: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, pp. 2333–2338 (2013)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT (2019)
Levenshtein, V.I., et al.: Binary codes capable of correcting deletions, insertions, and reversals. In: Soviet Physics Doklady, Soviet Union, vol. 10, pp. 707–710 (1966)
Li, B., Zhou, H., He, J., Wang, M., Yang, Y., Li, L.: On the sentence embeddings from pre-trained language models. arXiv preprint arXiv:2011.05864 (2020)
Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, vol. 26 (2013)
Nguyen, N.T.H., Ha, P.P.D., Nguyen, L.T., Van Nguyen, K., Nguyen, N.L.T.: SPBERTQA: a two-stage question answering system based on sentence transformers for medical texts. In: Memmi, G., Yang, B., Kong, L., Zhang, T., Qiu, M. (eds.) KSEM 2022. LNCS, vol. 13369, pp. 371–382. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-10986-7_30
Peng, S., Cui, H., Xie, N., Li, S., Zhang, J., Li, X.: Enhanced-RCNN: an efficient method for learning sentence similarity. In: Proceedings of the Web Conference 2020, pp. 2500–2506 (2020)
Radford, A., Narasimhan, K., Salimans, T., Sutskever, I., et al.: Improving language understanding by generative pre-training (2018)
Radford, A., et al.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)
Ramos, J., et al.: Using TF-IDF to determine word relevance in document queries. In: Proceedings of the First Instructional Conference on Machine Learning, vol. 242, pp. 29–48. Citeseer (2003)
Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084 (2019)
Robertson, S., Zaragoza, H., et al.: The probabilistic relevance framework: BM25 and beyond. Found. Trends® Inf. Retrieval 3(4), 333–389 (2009)
Su, J.: CoSENT (1): a more effective sentence vector scheme than Sentence-BERT (2022). https://kexue.fm/archives/8847
Su, J., Cao, J., Liu, W., Ou, Y.: Whitening sentence representations for better semantics and faster retrieval. arXiv preprint arXiv:2103.15316 (2021)
Sun, K., Luo, X., Luo, M.Y.: A survey of pretrained language models. In: Memmi, G., Yang, B., Kong, L., Zhang, T., Qiu, M. (eds.) KSEM 2022. LNCS, vol. 13369, pp. 442–456. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-10986-7_36
Sun, Y., et al.: Circle loss: a unified perspective of pair similarity optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6398–6407 (2020)
Sun, Y., et al.: ERNIE 3.0: large-scale knowledge enhanced pre-training for language understanding and generation. arXiv preprint arXiv:2107.02137 (2021)
Wang, Z., Hamza, W., Florian, R.: Bilateral multi-perspective matching for natural language sentences. arXiv preprint arXiv:1702.03814 (2017)