DOI: 10.5555/3155562.3155576

Perceived language complexity in GitHub issue discussions and their effect on issue resolution

Published: 30 October 2017

Abstract

Modern software development is increasingly collaborative. Open Source Software (OSS) projects are the bellwether: they support dynamic teams with tools for code sharing, communication, and issue tracking. The success of an OSS project depends on team communication. For example, in issue discussions, individuals rely on rhetoric to argue their position while also maintaining technical relevance. Rhetoric and technical language sit at opposite ends of a language complexity spectrum: the former is stylistically natural; the latter is terse and concise. Issue discussions embody this duality, as developers use rhetoric to describe technical issues. The style mix in any discussion can define group culture and affect performance; for example, issue resolution times may be longer if discussion is imprecise.
Using GitHub, we studied issue discussions to understand whether project-specific language differences exist and to what extent users conform to a language norm. We built project-specific and overall GitHub language models to study the effect of perceived language complexity on multiple responses. We find that experienced users conform to project-specific language norms, that popular individuals use overall GitHub language rather than project-specific language, and that conformance to project-specific language norms reduces issue resolution times. We also provide a tool to calculate project-specific perceived language complexity.
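
To make the idea of a project-specific versus overall language model concrete, here is a minimal sketch. It assumes that perceived language complexity can be approximated by the per-token cross-entropy of a comment under a language model; the abstract does not state which model or smoothing method was used, so the add-one-smoothed bigram model, the `BigramModel` class, the whitespace tokenizer, and the toy corpora below are all illustrative assumptions rather than the authors' tool.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


class BigramModel:
    """Hypothetical bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, documents):
        self.context_counts = Counter()  # how often each token is a context
        self.bigram_counts = Counter()   # how often each (prev, cur) pair occurs
        self.vocab = {"</s>", "<unk>"}   # end marker and unknown-word token
        for doc in documents:
            tokens = ["<s>"] + tokenize(doc) + ["</s>"]
            self.vocab.update(tokens[1:])
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def prob(self, prev, cur):
        """Smoothed P(cur | prev); sums to 1 over the vocabulary."""
        v = len(self.vocab)
        return (self.bigram_counts[(prev, cur)] + 1) / (self.context_counts[prev] + v)

    def cross_entropy(self, text):
        """Average negative log2-probability per token, in bits."""
        words = [t if t in self.vocab else "<unk>" for t in tokenize(text)]
        tokens = ["<s>"] + words + ["</s>"]
        pairs = list(zip(tokens, tokens[1:]))
        return -sum(math.log2(self.prob(p, c)) for p, c in pairs) / len(pairs)


# Toy corpora (invented for illustration only).
project_docs = [
    "the parser fails on nested brackets",
    "parser crashes on nested input",
    "fix the parser for nested brackets",
]
other_docs = [
    "thanks for the great library",
    "i really love this project",
    "could you please add documentation",
]

project_lm = BigramModel(project_docs)               # project-specific model
overall_lm = BigramModel(project_docs + other_docs)  # stand-in for an overall model

comment = "the parser fails on nested input"
project_bits = project_lm.cross_entropy(comment)
overall_bits = overall_lm.cross_entropy(comment)
print(f"project-specific: {project_bits:.2f} bits/token")
print(f"overall:          {overall_bits:.2f} bits/token")
```

In this sketch, a lower cross-entropy under the project-specific model than under the overall model would be read as the comment conforming more closely to the project's language norm. A real analysis would need a far larger corpus and a more careful tokenizer and smoothing scheme.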


Cited By

  • (2019) Studying the difference between natural and programming language corpora. Empirical Software Engineering 24:4 (1823-1868). DOI: 10.1007/s10664-018-9669-7. Online publication date: 1-Aug-2019
  • (2018) ICSD. Proceedings of the 34th Annual Computer Security Applications Conference (542-552). DOI: 10.1145/3274694.3274742. Online publication date: 3-Dec-2018
  • (2018) The Power of Bots. Proceedings of the ACM on Human-Computer Interaction 2:CSCW (1-19). DOI: 10.1145/3274451. Online publication date: 1-Nov-2018


Published In

ASE '17: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering
October 2017
1033 pages
ISBN: 9781538626849

Publisher

IEEE Press

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%
