research-article

Combining OCL and natural language: a call for a community effort

Authors:

Lola Burgueño

MODELS '22: Proceedings of the 25th International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings

Pages 908 - 912

https://doi.org/10.1145/3550356.3561542

Published: 09 November 2022

Abstract

The growing popularity and availability of pretrained natural language models open the door to many interesting applications that combine natural language (NL) with software artefacts. Two examples are the generation of code excerpts from NL instructions and the verbalization of programs in NL to facilitate their comprehension.

Many of these language models have been trained with open source software datasets and therefore "understand" a variety of programming languages, but not OCL.

We argue that OCL needs to jump on the machine learning bandwagon or it will risk losing its appeal as a constraint specification language. To that end, the key first task is to jointly create an OCL corpus amenable to natural language processing.
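As a rough illustration of what a corpus "amenable to natural language processing" could look like, the sketch below pairs an NL sentence with an OCL constraint and stores the pairs as JSON Lines. The record layout, the field names (`nl`, `ocl`, `source`), and the sample constraint are assumptions made for this example; they are not taken from the paper or from any existing corpus.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class NlOclPair:
    """Hypothetical corpus record: one NL sentence paired with one OCL constraint."""
    nl: str      # natural language description of the constraint
    ocl: str     # the OCL expression itself
    source: str  # provenance of the pair (free text, assumed field)


def to_jsonl(pairs):
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


def from_jsonl(text):
    """Parse JSON Lines back into records, skipping blank lines."""
    return [NlOclPair(**json.loads(line))
            for line in text.splitlines() if line.strip()]


# Illustrative sample only; not drawn from any real dataset.
pairs = [
    NlOclPair(
        nl="The age of a person must not be negative.",
        ocl="context Person inv: self.age >= 0",
        source="illustrative example",
    ),
]

serialized = to_jsonl(pairs)
restored = from_jsonl(serialized)
```

A line-oriented format such as this is one option among many; it is chosen here only because it is easy to append to and to stream into common NLP tooling.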

Cited By

Marchezan L, Assunção W, Herac E, Shafiq S, Egyed A (2024). Exploring Dependencies Among Inconsistencies to Enhance the Consistency Maintenance of Models. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 147-158. Online publication date: 12-Mar-2024.
https://doi.org/10.1109/SANER60148.2024.00023
Abukhalaf S, Hamdaqa M, Khomh F (2023). On Codex Prompt Engineering for OCL Generation: An Empirical Study. 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR), 148-157. Online publication date: May-2023.
https://doi.org/10.1109/MSR59073.2023.00033

Index Terms

Combining OCL and natural language: a call for a community effort
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems
    2. Robotics
2. Networks
  1. Network properties
    1. Network reliability

Information & Contributors

Information

Published In

MODELS '22: Proceedings of the 25th International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings

October 2022

1003 pages

ISBN:9781450394673

DOI:10.1145/3550356

Conference Chairs:
Thomas Kühn (Karlsruhe Institute of Technology, Germany),
Vasco Sousa (University of Montréal, Canada)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

Univ. of Montreal: University of Montreal
IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Ministerio de Ciencia, Innovación y Universidades

Conference

MODELS '22

Sponsor:

SIGSOFT

MODELS '22: ACM/IEEE 25th International Conference on Model Driven Engineering Languages and Systems

October 23 - 28, 2022

Montreal, Quebec, Canada

Acceptance Rates

Overall Acceptance Rate 144 of 506 submissions, 28%
