Offline handwritten mathematical expression recognition based on YOLOv5s

Fei Li^1,2, Hongbo Fang^1, Dengzhun Wang^1, Ruixin Liu^1, Qing Hou^3 & Benliang Xie^1,2 (ORCID: orcid.org/0000-0002-9854-8300)

Abstract

Error accumulation is a persistent challenge in traditional offline handwritten mathematical expression recognition (OHMER) because handwritten formulas have a two-dimensional structure and are written with considerable variability. In this study, an OHMER method based on YOLOv5s is proposed. First, YOLOv5s is used to recognize the category and spatial location of each symbol in the expression image. Second, a spatial attention mechanism is introduced into YOLOv5s to enlarge the differences among symbol categories and improve accuracy. Third, a bidirectional long short-term memory network (BiLSTM) is introduced to give the symbols context-related information. Finally, the contextual relevance of the symbols is improved by increasing the number of BiLSTM layers, achieving an accuracy of 95.67%. A mathematical expression relationship tree is built from the symbol recognition results, and clustering theory is used to analyze the two-dimensional structure of the expressions. The expression recognition accuracy on the CROHME 2019 test set was 65.47%. The recognition rate of YOLOv5s_SB3CT is second only to that of PAL. However, the recognition rate of YOLOv5s_SB3CT is higher than that of PAL when fewer than three errors are allowed. This finding indicates that the proposed model is more fault-tolerant and stable than the other models.
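The comparison with PAL above relies on recognition rates measured at different error tolerances. The following is a minimal sketch of how such a tolerance-based rate can be computed, assuming each test expression has already been assigned an integer error count. The function name `rate_within` and the sample error counts are hypothetical illustrations, not the authors' evaluation code or data from the paper.

```python
from typing import Sequence


def rate_within(errors_per_expression: Sequence[int], max_errors: int) -> float:
    """Return the fraction of expressions with at most `max_errors` errors."""
    if not errors_per_expression:
        raise ValueError("need at least one expression")
    hits = sum(1 for e in errors_per_expression if e <= max_errors)
    return hits / len(errors_per_expression)


# Hypothetical error counts for eight test expressions (not data from the paper).
errors = [0, 0, 1, 3, 0, 2, 5, 1]

# Report the exact-match rate (0 errors) and the rates with 1 and 2 errors allowed.
for k in range(3):
    print(f"<= {k} errors: {rate_within(errors, k):.3f}")
```

With a tolerance of zero, the function gives the exact-match expression rate. Raising the tolerance shows how many near-miss predictions a model produces, which is the sense in which the abstract describes the model as fault-tolerant.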

Data availability

The data set used in this article comes from CROHME; the other data supporting this study's findings are available from the corresponding author upon reasonable request.

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (No. 61562009), the Open Fund Project in Semiconductor Power Device Reliability Engineering Center of Ministry of Education (No. ERCMEKFJJ2019-06), Guizhou Provincial Science and Technology Projects (No. [2023]060), and the Guizhou Provincial Science and Technology Support Plan (No. [2022]003).

Author information

Authors and Affiliations

College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
Fei Li, Hongbo Fang, Dengzhun Wang, Ruixin Liu & Benliang Xie
Power Semiconductor Device Reliability Engineering Center of the Ministry of Education, Guiyang, 550025, China
Fei Li & Benliang Xie
Guizhou Communication Industry Service Co., Ltd, Guiyang, 550002, China
Qing Hou

Corresponding author

Correspondence to Benliang Xie.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, F., Fang, H., Wang, D. et al. Offline handwritten mathematical expression recognition based on YOLOv5s. Vis Comput 40, 1439–1452 (2024). https://doi.org/10.1007/s00371-023-02859-1

Accepted: 01 April 2023
Published: 22 April 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s00371-023-02859-1
