Abstract
Understanding the domain topics of a software system is essential for maintaining and reusing it. However, as software continually evolves and grows in size, it becomes increasingly difficult to understand. To address this problem, recent research has extracted domain topics using information retrieval techniques such as Latent Dirichlet Allocation (LDA), and LDA-based techniques have been studied especially actively. However, most of this research uses only unstructured information, such as identifiers and comments, and ignores structured information such as call relationships, so the extracted topics can fail to reflect the characteristics of the program. In this paper, we propose a method that generates documents and extracts topics using both structured and unstructured information. We also build indexes based on the frequency of identifiers in the source code, and propose a system that extracts association rules based on the co-occurrence of methods. We further establish a system that provides highly reliable search results for user queries by combining domain topics, scored indexes, and association rule information. Finally, we implemented the TEXAS2 system for this study and confirmed high user satisfaction with its search results in a performance test.
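To make the indexing and association-rule steps concrete, the following is a minimal sketch and not the TEXAS2 implementation. It assumes that each method is represented by the list of identifiers in its body, and that a "transaction" is the set of methods observed together (for example, methods invoked from the same caller). The function names `split_identifier`, `build_index`, and `pair_rules`, and the support and confidence thresholds, are hypothetical choices made for illustration only.

```python
import re
from collections import Counter
from itertools import combinations


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def build_index(methods):
    """Build a frequency-based index (assumed input: method -> identifiers).

    Returns a dict mapping each term to {method name: term frequency}.
    """
    index = {}
    for name, identifiers in methods.items():
        counts = Counter(
            term for ident in identifiers for term in split_identifier(ident)
        )
        for term, n in counts.items():
            index.setdefault(term, {})[name] = n
    return index


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Derive pairwise association rules x -> y from co-occurring methods.

    support(x, y)  = fraction of transactions containing both x and y
    confidence(x -> y) = count(x, y) / count(x)
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(transactions)
    item_count = Counter(m for t in transactions for m in set(t))
    pair_count = Counter(
        pair for t in transactions for pair in combinations(sorted(set(t)), 2)
    )
    rules = {}
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = c / item_count[x]
            if confidence >= min_confidence:
                rules[(x, y)] = (support, confidence)
    return rules
```

In a search setting, the index could rank methods by the frequency of the query terms, and the rules could then suggest methods that tend to appear together with the top-ranked ones. How TEXAS2 actually combines these scores with the extracted topics is not specified in the abstract.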
Acknowledgement
This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (NRF-2014M3C4A7030505).
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Hwang, S., Lee, Y., Nam, Y. (2017). TEXAS2: A System for Extracting Domain Topic Using Link Analysis and Searching for Relevant Features. In: Park, J., Pan, Y., Yi, G., Loia, V. (eds) Advances in Computer Science and Ubiquitous Computing. UCAWSN 2016, CUTE 2016, CSA 2016. Lecture Notes in Electrical Engineering, vol 421. Springer, Singapore. https://doi.org/10.1007/978-981-10-3023-9_113
DOI: https://doi.org/10.1007/978-981-10-3023-9_113
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3022-2
Online ISBN: 978-981-10-3023-9
eBook Packages: Engineering, Engineering (R0)