research-article

How reliable is the crowdsourced knowledge of security implementation?

Authors:

Mengsu Chen,

Felix Fischer,

Na Meng,

Xiaoyin Wang,

Jens GrossklagsAuthors Info & Claims

ICSE '19: Proceedings of the 41st International Conference on Software Engineering

Pages 536 - 547

https://doi.org/10.1109/ICSE.2019.00065

Published: 25 May 2019 Publication History

Get Access

Abstract

Stack Overflow (SO) is the most popular online Q&A site for developers to share their expertise in solving programming issues. Given multiple answers to a certain question, developers may take the accepted answer, the answer from a person with high reputation, or the one frequently suggested. However, researchers recently observed that SO contains exploitable security vulnerabilities in the suggested code of popular answers, which found their way into security-sensitive high-profile applications that millions of users install every day. This observation inspires us to explore the following questions: How much can we trust the security implementation suggestions on SO? If suggested answers are vulnerable, can developers rely on the community's dynamics to infer the vulnerability and identify a secure counterpart?

To answer these highly important questions, we conducted a comprehensive study on security-related SO posts by contrasting secure and insecure advice with the community-given content evaluation. Thereby, we investigated whether SO's gamification approach on incentivizing users is effective in improving security properties of distributed code examples. Moreover, we traced the distribution of duplicated samples over given answers to test whether the community behavior facilitates or prevents propagation of secure and insecure code suggestions within SO.

We compiled 953 different groups of similar security-related code examples and labeled their security, identifying 785 secure answer posts and 644 insecure answer posts. Compared with secure suggestions, insecure ones had higher view counts (36,508 vs. 18,713), received a higher score (14 vs. 5), and had significantly more duplicates (3.8 vs. 3.0) on average. 34% of the posts provided by highly reputable so-called trusted users were insecure.

Our findings show that based on the distribution of secure and insecure code on SO, users being laymen in security rely on additional advice and guidance. However, the community-given feedback does not allow differentiating secure from insecure choices. The reputation mechanism fails in indicating trustworthy users with respect to security questions, ultimately leaving other users wandering around alone in a software security minefield.

References

[1]

"Stack Overflow goes beyond Q&As and launches crowdsourced documentation," https://techcrunch.com/2016/07/21/stack-overflow-goes-beyond-qas-and-launches-crowdsourced-documentation/, 2016.

Abstract

References

Cited By

Recommendations

An empirical study of IoT security aspects at sentence-level in developer textual discussions

Automatic query reformulation for code search using crowdsourced knowledge

RACK: code search in the IDE using crowdsourced knowledge

Comments

Information

Published In

Sponsors

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Login options

Full Access

View options

PDF

eReader

Share

Share this Publication link

Share on social media

Affiliations