Abstract
Tabular data is a common form of relational data in many domains where machine learning is applied, especially finance, healthcare, and industry. However, traditional methods often fail to fully exploit the information embedded in table samples, including the meaning of the features themselves (column descriptions) as well as contextual information. In this paper, we propose BertTab, a model that transforms a table sample into a sentence describing that sample by filling an utterance template with the sample's categorical features. The sentence is then converted into contextual embeddings by a pre-trained BERT model. Finally, the contextual embeddings are fused with the original features to obtain a richer and more complete representation of the sample, which yields higher performance. We evaluate the model on three datasets. Compared with the benchmark model, BertTab improves AUC-ROC, AUC-PR, and accuracy by an average of 2.10%, 4.43%, and 0.48%, respectively, across the three datasets. Ablation experiments show that introducing column descriptions for categorical features and modeling the context of a sample's categorical features, combined with fusion of the raw features, have a positive effect on model performance.
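The pipeline described in the abstract (template filling, sentence embedding, fusion with the original features) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the column names and descriptions, the template wording, and the use of concatenation as the fusion step are hypothetical. The function `toy_sentence_embedding` is only a deterministic placeholder so that the example runs without downloading a model; in the method described above, this role is played by a pre-trained BERT encoder.

```python
import hashlib
from typing import Callable, Dict, List, Sequence

# Hypothetical column descriptions; the wording used in the paper is not
# given in the abstract.
COLUMN_DESCRIPTIONS: Dict[str, str] = {
    "job": "occupation of the client",
    "marital": "marital status",
}


def row_to_sentence(cat_features: Dict[str, str],
                    descriptions: Dict[str, str]) -> str:
    """Fill a simple utterance template with one row's categorical features."""
    clauses = [
        f"the {descriptions.get(col, col)} is {val}"
        for col, val in cat_features.items()
    ]
    body = "; ".join(clauses)
    # Upper-case only the first character so feature values keep their case.
    return body[:1].upper() + body[1:] + "."


def toy_sentence_embedding(sentence: str, dim: int = 8) -> List[float]:
    """Placeholder encoder (assumption): hash bytes scaled to [0, 1].

    This is NOT a contextual embedding; it stands in for a pre-trained
    BERT model so the sketch has no external dependencies.
    """
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def fuse(numeric_features: Sequence[float],
         sentence_vec: Sequence[float]) -> List[float]:
    """Fuse by concatenation (assumption; the abstract only says 'fused')."""
    return list(numeric_features) + list(sentence_vec)


def encode_row(numeric_features: Sequence[float],
               cat_features: Dict[str, str],
               descriptions: Dict[str, str],
               embed: Callable[[str], List[float]] = toy_sentence_embedding,
               ) -> List[float]:
    """Template -> sentence -> embedding -> fused feature vector."""
    sentence = row_to_sentence(cat_features, descriptions)
    return fuse(numeric_features, embed(sentence))


if __name__ == "__main__":
    row_cat = {"job": "teacher", "marital": "married"}
    row_num = [35.0, 1200.5]
    print(row_to_sentence(row_cat, COLUMN_DESCRIPTIONS))
    print(len(encode_row(row_num, row_cat, COLUMN_DESCRIPTIONS)))
```

Passing a different `embed` callable, for example one that wraps a real pre-trained language model, leaves the rest of the sketch unchanged; the fused vector would then be fed to whatever downstream classifier is being trained.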
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xie, M., An, H., Han, S., Mao, J., Jiang, Y., Wang, J. (2025). BertTab: Table Learning with Feature Descriptions and Context. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15041. Springer, Singapore. https://doi.org/10.1007/978-981-97-8795-1_4
DOI: https://doi.org/10.1007/978-981-97-8795-1_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8794-4
Online ISBN: 978-981-97-8795-1
eBook Packages: Computer Science, Computer Science (R0)