DOI: 10.1109/ICSE.2019.00065
research-article

How reliable is the crowdsourced knowledge of security implementation?

Published: 25 May 2019

Abstract

Stack Overflow (SO) is the most popular online Q&A site for developers to share their expertise in solving programming issues. Given multiple answers to a question, developers may take the accepted answer, the answer from a person with high reputation, or the one most frequently suggested. However, researchers recently observed that the suggested code of popular SO answers contains exploitable security vulnerabilities, and that this code has found its way into security-sensitive, high-profile applications that millions of users install every day. This observation inspires us to explore the following questions: How much can we trust the security implementation suggestions on SO? If suggested answers are vulnerable, can developers rely on the community's dynamics to infer the vulnerability and identify a secure counterpart?
To answer these questions, we conducted a comprehensive study of security-related SO posts, contrasting secure and insecure advice with the community-given content evaluation. In doing so, we investigated whether SO's gamification approach to incentivizing users is effective in improving the security properties of distributed code examples. Moreover, we traced the distribution of duplicated samples over given answers to test whether community behavior facilitates or prevents the propagation of secure and insecure code suggestions within SO.
We compiled 953 different groups of similar security-related code examples and labeled their security, identifying 785 secure answer posts and 644 insecure answer posts. Compared with secure suggestions, insecure ones had higher view counts (36,508 vs. 18,713), received a higher score (14 vs. 5), and had significantly more duplicates (3.8 vs. 3.0) on average. Of the posts provided by highly reputable, so-called trusted users, 34% were insecure.
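A comparison of this kind, between two groups of posts on a skewed metric such as score or view count, can be illustrated with a rank-based effect size. The sketch below computes Cliff's delta for two samples. It is only an illustration under stated assumptions: the abstract does not say which test or effect size produced the reported differences, the function name `cliffs_delta` is hypothetical, and the numbers are toy values, not data from the study.

```python
from itertools import product


def cliffs_delta(xs, ys):
    """Return Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y).

    The value lies in [-1, 1]; 0 means neither group tends to dominate.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical toy scores for illustration only, NOT data from the study.
insecure_scores = [14, 9, 22, 3, 17]
secure_scores = [5, 2, 8, 11, 4]

delta = cliffs_delta(insecure_scores, secure_scores)
print(f"Cliff's delta = {delta:.2f}")
```

A positive delta means values in the first group tend to exceed those in the second. The pairwise loop is quadratic in the sample sizes, which is acceptable for a sketch but would need a rank-based formulation for large samples.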
Our findings show that, given the distribution of secure and insecure code on SO, users who are laymen in security rely on additional advice and guidance. However, the community-given feedback does not allow them to differentiate secure from insecure choices. The reputation mechanism fails to indicate trustworthy users with respect to security questions, ultimately leaving other users wandering around alone in a software security minefield.

Published In

ICSE '19: Proceedings of the 41st International Conference on Software Engineering
May 2019
1318 pages

Publisher

IEEE Press

Author Tags

  1. crowdsourced knowledge
  2. security implementation
  3. social dynamics
  4. stack overflow

Qualifiers

  • Research-article

Conference

ICSE '19
Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
