
An exploratory study on the repeatedly shared external links on Stack Overflow

Empirical Software Engineering

Abstract

On Stack Overflow, users reuse 11,926,354 external links to share resources hosted outside the Stack Overflow website. These external links connect to existing programming-related knowledge and extend the crowdsourced knowledge on Stack Overflow. Some external links, which we call repeated external links, are shared multiple times. We observe that 82.5% of the link sharing activities (i.e., sharing links in any question, answer, or comment) on Stack Overflow share external resources, and 57.0% of the occurrences of external links share repeated external links. However, it is still unclear what types of external resources are repeatedly shared. To help users manage their knowledge, we investigate the characteristics of the repeated external links in knowledge sharing on Stack Overflow. We observe that external links pointing to text resources (hosted on documentation websites, tutorial websites, etc.) are repeatedly shared the most. We also observe that different users repeatedly share the same knowledge in the form of repeated external links, which increases the effort of maintaining that knowledge (e.g., updating invalid links in multiple posts). Repeated external links can bring risks to the software engineering process, as 1) the same users can repeatedly share external links for the purpose of promotion, and 2) external links can point to webpages overloaded with information, making it difficult for users to retrieve the relevant information. Our findings provide insights to Stack Overflow moderators and researchers. For example, we encourage Stack Overflow to centrally manage commonly occurring knowledge in the form of repeated external links in order to better maintain the crowdsourced knowledge on Stack Overflow.
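To make the notion of a repeated external link concrete, the following is a minimal sketch of how the share of link occurrences that belong to repeated external links could be computed. It is an illustration under stated assumptions, not the procedure used in the study: the set `INTERNAL_HOSTS`, the helper names, and the sample URLs are hypothetical, links are compared by exact string match with no URL normalization, and websites are distinguished by their full domain.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: hosts treated as internal to Stack Overflow (illustrative only).
INTERNAL_HOSTS = {"stackoverflow.com", "www.stackoverflow.com"}


def full_domain(url: str) -> str:
    """Return the lowercased host of a URL, e.g. 'docs.oracle.com'."""
    return (urlsplit(url).hostname or "").lower()


def is_external(url: str) -> bool:
    """A link is external if it has a host that is not an internal host."""
    host = full_domain(url)
    return bool(host) and host not in INTERNAL_HOSTS


def repeated_link_share(shared_links):
    """Count occurrences per external link and return (counts, share),
    where share is the fraction of external link occurrences that belong
    to links shared more than once."""
    counts = Counter(url for url in shared_links if is_external(url))
    total = sum(counts.values())
    repeated = sum(n for n in counts.values() if n > 1)
    return counts, (repeated / total if total else 0.0)


# Hypothetical sample: one external link shared twice, one shared once,
# and one internal link that is ignored.
links = [
    "https://docs.oracle.com/javase/8/docs/api/",
    "https://docs.oracle.com/javase/8/docs/api/",
    "https://www.oracle.com/java/",
    "https://stackoverflow.com/help/how-to-ask",
]
counts, share = repeated_link_share(links)
print(round(share, 2))  # 0.67: two of the three external occurrences are repeats
```

Exact string matching undercounts repeats when the same resource is shared under different URL spellings (for example, with and without a trailing slash), so a real analysis would likely need a normalization step before counting.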


Notes

  1. https://meta.stackoverflow.com/q/358992/

  2. https://stackoverflow.com/help/how-to-ask

  3. https://stackoverflow.com/help/minimal-reproducible-example

  4. https://stackoverflow.com/help/formatting

  5. https://stackoverflow.com/editing-help#code

  6. https://stackoverflow.com/questions/28207373/

  7. https://github.com/jersey/jersey/blob/master/examples/https-clientserver-grizzly/src/main/java/org/glassfish/jersey/examples/httpsclientservergrizzly/SecurityFilter.java

  8. https://stackoverflow.com/editing-help

  9. https://meta.stackoverflow.com/q/252811/

  10. For example, we consider docs.oracle.com and www.oracle.com to be different websites because they have different full domains.

  11. https://zenodo.org/record/3255045#.XYWaMyh3iUk

  12. http://en.wikipedia.org/wiki/Internal_link

  13. https://en.wikipedia.org/wiki/Website

  14. https://meta.stackexchange.com/q/90342

  15. https://meta.stackexchange.com/questions/313790/i-stack-imgur-seems-to-be-down

  16. https://meta.stackoverflow.com/questions/341016/is-it-ok-to-re-upload-externally-hosted-images-on-stack-overflows-imgur

  17. https://stackoverflow.com/help/minimal-reproducible-example

  18. https://stackoverflow.com/editing-help#code

  19. https://meta.stackoverflow.com/questions/358992/ive-been-told-to-create-a-runnable-example-with-stack-snippets-how-do-i-do

  20. https://stackoverflow.com/q/28886508/

  21. http://bugs.python.org/issue22942

  22. https://stackoverflow.com/q/22021491/

  23. https://stackoverflow.com/help/how-to-ask

  24. https://www.google.com/#q=_IOWR_BAD+OR+_IOR_BAD+OR+_IOW_BAD&safe=off

  25. https://stackoverflow.com/q/22021491/22021641

  26. https://www.google.com/search?q=_IOR_BAD+lkml

  27. https://meta.stackexchange.com/q/176445

  28. https://stackoverflow.com/q/2660914

  29. http://goo.gl/b93ns

  30. http://www.youtube.com/watch?v=_CruQY55HOk

  31. https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

  32. https://help.archive.org/hc/en-us/articles/360004716091-Wayback-Machine-General-Information


Acknowledgments

This research was partially supported by the National Science Foundation of China (No. U20A20173), Key Research and Development Program of Zhejiang Province (No. 2021C01014), and the National Research Foundation, Singapore under its Industry Alignment Fund – Prepositioning (IAF-PP) Funding Initiative. Any opinions, findings, and conclusions, or recommendations expressed in this material are those of the author(s) and do not reflect the views of Huawei and the National Research Foundation, Singapore.

Author information

Corresponding author

Correspondence to Xin Xia.

Additional information

Communicated by: Emerson Murphy-Hill

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, J., Zhang, H., Xia, X. et al. An exploratory study on the repeatedly shared external links on Stack Overflow. Empir Software Eng 27, 19 (2022). https://doi.org/10.1007/s10664-021-10028-y

  • DOI: https://doi.org/10.1007/s10664-021-10028-y
