DOI: 10.5555/876889.880369
Article

Using latent semantic analysis to identify similarities in source code to support program understanding

Published: 13 November 2000

Abstract

The paper describes the results of applying Latent Semantic Analysis (LSA), an advanced information retrieval method, to program source code and associated documentation. Latent semantic analysis is a corpus-based statistical method for inducing and representing aspects of the meanings of words and passages (of natural language) as reflected in their usage. This methodology is assessed for application to the domain of software components (i.e., source code and its accompanying documentation). Here LSA is used as the basis for clustering software components. This clustering is used to assist in the understanding of a nontrivial software system, namely a version of Mosaic. Applying latent semantic analysis to source code and internal documentation in support of program understanding is a new application of this method and a departure from its usual application domain of natural language.
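The general idea of LSA on source code can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tokenize` and `lsa_similarity`, the raw term counts (no weighting scheme), the choice of `k`, and the toy C-like snippets in the usage note are all assumptions made for illustration. The sketch builds a term-by-document matrix from identifiers and comments, truncates its singular value decomposition to `k` dimensions, and compares documents by cosine similarity in the reduced space.

```python
import re
import numpy as np

def tokenize(text):
    # Hypothetical tokenizer: lowercase alphabetic runs, so identifiers
    # such as fetch_url are split into "fetch" and "url".
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]

def lsa_similarity(docs, k=2):
    """Return a document-by-document cosine similarity matrix in a
    k-dimensional LSA space (raw term counts; an illustrative assumption)."""
    vocab = sorted({t for d in docs for t in tokenize(d)})
    index = {t: i for i, t in enumerate(vocab)}

    # Term-by-document count matrix: rows are terms, columns are documents.
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in tokenize(d):
            A[index[t], j] += 1.0

    # Truncated SVD: keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one row per document

    # Cosine similarity between the reduced document vectors.
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # guard against all-zero vectors
    unit = doc_vecs / norms
    return unit @ unit.T
```

A possible use, with made-up snippets standing in for software components: `lsa_similarity(["/* fetch html page */ int fetch_url(char *url)", "/* load html document */ void load_document(char *url)", "/* draw widget */ void draw_button(int x)"], k=2)` returns a 3-by-3 matrix whose entries could then be fed to any clustering procedure, for example by grouping components whose similarity exceeds a chosen threshold.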

Published In

ICTAI '00: Proceedings of the 12th IEEE International Conference on Tools with Artificial Intelligence
November 2000

Publisher

IEEE Computer Society

United States

Author Tags

  1. LSA
  2. Mosaic
  3. computational linguistics
  4. corpus based statistical method
  5. information retrieval
  6. information retrieval method
  7. internal documentation
  8. latent semantic analysis
  9. natural language
  10. natural languages
  11. nontrivial software system
  12. program understanding
  13. reverse engineering
  14. software component clustering
  15. software components
  16. source code
  17. source code similarities
  18. statistical analysis
  19. system documentation

Qualifiers

  • Article

Cited By

  • (2018) Leveraging the agile development process for selecting invoking/excluding tests to support feature location. Proceedings of the 26th Conference on Program Comprehension, pp. 370-379. DOI: 10.1145/3196321.3196354. Online publication date: 28-May-2018
  • (2016) Object injection vulnerability discovery based on latent semantic indexing. Proceedings of the 31st Annual ACM Symposium on Applied Computing, pp. 801-807. DOI: 10.1145/2851613.2851865. Online publication date: 4-Apr-2016
  • (2015) Clustering source code elements by semantic similarity using Wikipedia. Proceedings of the Fourth International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering, pp. 13-18. DOI: 10.5555/2820668.2820672. Online publication date: 16-May-2015
  • (2011) Experiences with text mining large collections of unstructured systems development artifacts at JPL. Proceedings of the 33rd International Conference on Software Engineering, pp. 701-710. DOI: 10.1145/1985793.1985891. Online publication date: 21-May-2011
  • (2010) Towards mining replacement queries for hard-to-retrieve traces. Proceedings of the 25th IEEE/ACM International Conference on Automated Software Engineering, pp. 245-254. DOI: 10.1145/1858996.1859046. Online publication date: 20-Sep-2010
  • (2010) A machine learning approach for tracing regulatory codes to product specific requirements. Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, pp. 155-164. DOI: 10.1145/1806799.1806825. Online publication date: 1-May-2010
  • (2008) Identifying domain expertise of developers from source code. Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 981-989. DOI: 10.1145/1401890.1402007. Online publication date: 24-Aug-2008
  • (2007) Semantic clustering. Information and Software Technology 49:3, pp. 230-243. DOI: 10.1016/j.infsof.2006.10.017. Online publication date: 1-Mar-2007
  • (2006) MUDABlue. Journal of Systems and Software 79:7, pp. 939-953. DOI: 10.1016/j.jss.2005.06.044. Online publication date: 1-Jul-2006
  • (2005) Supervised categorization of JavaScript™ using program analysis features. Proceedings of the Second Asia conference on Asia Information Retrieval Technology, pp. 160-173. DOI: 10.1007/11562382_13. Online publication date: 13-Oct-2005
