DOI: 10.1145/3524610.3527878
Short paper

pycefr: Python competency level through code analysis

Published: 20 October 2022

Abstract

Python is known to be a versatile language, well suited for both beginners and advanced users. Some elements of the language are easier to understand than others: some are found in any kind of code, while others are used only by experienced programmers. The use of these elements leads to different ways to code, depending on the experience with the language and the knowledge of its elements, the general programming competence and programming skills, etc. In this paper, we present pycefr, a tool that detects the use of the different elements of the Python language, effectively measuring the level of Python proficiency required to comprehend and deal with a fragment of Python code. Following the well-known Common European Framework of Reference for Languages (CEFR), widely used for natural languages, pycefr categorizes Python code into six levels, depending on the proficiency required to create and understand it. We also discuss different use cases for pycefr: identifying code snippets that can be understood by developers with a certain proficiency, labeling code examples in online resources such as Stack Overflow and GitHub to suit them to a certain level of competency, helping in the onboarding process of new developers in Open Source Software projects, etc. A video shows availability and usage of the tool: https://tinyurl.com/ypdt3fwe.
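To make the idea in the abstract concrete, the following is a minimal sketch of how language elements in a code fragment could be detected and mapped to the six CEFR levels (A1 to C2). It is not pycefr's implementation: the names `LEVEL_BY_NODE`, `level_counts`, and `required_level` are hypothetical, the assignment of elements to levels is an illustrative assumption, and taking the highest level found as the level of the fragment is also an assumption.

```python
import ast

# The six CEFR levels, ordered from most basic to most advanced.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Hypothetical mapping from Python syntax elements to levels.
# Assumption for illustration only; this is not pycefr's actual table.
LEVEL_BY_NODE = {
    ast.Assign: "A1",
    ast.If: "A1",
    ast.For: "A2",
    ast.FunctionDef: "A2",
    ast.ListComp: "B1",
    ast.With: "B1",
    ast.Lambda: "B2",
    ast.GeneratorExp: "C1",
    ast.AsyncFunctionDef: "C2",
}


def level_counts(source):
    """Count how many detected elements fall into each level."""
    counts = {level: 0 for level in LEVELS}
    for node in ast.walk(ast.parse(source)):
        level = LEVEL_BY_NODE.get(type(node))
        if level is not None:
            counts[level] += 1
    return counts


def required_level(source):
    """Return the highest level among detected elements, or None if none match."""
    counts = level_counts(source)
    used = [level for level in LEVELS if counts[level] > 0]
    return used[-1] if used else None
```

Under this assumed mapping, `required_level("squares = [x * x for x in range(10)]")` returns `"B1"`, because the list comprehension is the most advanced element detected in the fragment.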

Cited By

  • (2023) Do Developers Present Proficient Code Snippets in Their README Files? An Analysis of PyPI Libraries in GitHub. Journal of Information Processing 31, 679-688. DOI: 10.2197/ipsjjip.31.679. Online publication date: 2023.
  • (2023) Towards Assessment of Practicality of Introductory Programming Course Using Vocabulary of Textbooks, Assignments, and Actual Projects. 2023 IEEE 35th International Conference on Software Engineering Education and Training (CSEE&T), 199-200. DOI: 10.1109/CSEET58097.2023.00046. Online publication date: Aug-2023.
  • (2022) Visualizing Contributor Code Competency for PyPI Libraries: Preliminary Results. 2022 29th Asia-Pacific Software Engineering Conference (APSEC), 472-476. DOI: 10.1109/APSEC57359.2022.00065. Online publication date: Dec-2022.

Information & Contributors

Information

Published In

ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension
May 2022
698 pages
ISBN: 9781450392983
DOI: 10.1145/3524610
  • Conference Chairs: Ayushi Rastogi, Rosalia Tufano
  • General Chair: Gabriele Bavota
  • Program Chairs: Venera Arnaoudova, Sonia Haiduc
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Short-paper
