short-paper

The ineffectiveness of domain-specific word embedding models for GUI test reuse

Authors:

Farideh Khalili,

Valerio Terragni,

Leonardo Mariani,

Abbas Heydarnoori

ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension

Pages 560 - 564

https://doi.org/10.1145/3524610.3527873

Published: 20 October 2022

Abstract

Reusing test cases across similar applications can significantly reduce testing effort. Some recent test reuse approaches successfully exploit word embedding models to semantically match GUI events across Android apps. It is a common understanding that word embedding models trained on domain-specific corpora perform better on specialized tasks. Our recent study confirms this understanding in the context of Android test reuse. It shows that word embedding models trained with a corpus of the English descriptions of apps in the Google Play Store lead to a better semantic matching of Android GUI events. Motivated by this result, we hypothesize that we can further increase the effectiveness of semantic matching by partitioning the corpus of app descriptions into domain-specific corpora. Our experiments do not confirm our hypothesis. This paper sheds light on this unexpected negative result that contradicts the common understanding.
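To make the idea of semantically matching GUI events concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that each GUI event is described by a short text label, that a label is embedded by averaging the vectors of its words, and that the best match is the target label with the highest cosine similarity. The vectors in `TOY_VECTORS` and the names `embed`, `cosine`, and `best_match` are hypothetical and chosen only for illustration. A real setup would load vectors from a word embedding model trained on a corpus such as app descriptions.

```python
import math

# Hypothetical 3-dimensional word vectors, hand-picked for illustration only.
# A trained word embedding model would supply these in a real setup.
TOY_VECTORS = {
    "sign":   [0.9, 0.1, 0.0],
    "log":    [0.8, 0.2, 0.1],
    "in":     [0.7, 0.3, 0.0],
    "search": [0.0, 0.9, 0.2],
    "find":   [0.1, 0.8, 0.3],
    "delete": [0.1, 0.0, 0.9],
    "remove": [0.0, 0.1, 0.8],
}


def embed(label):
    """Average the vectors of the known words in a GUI event label."""
    vectors = [TOY_VECTORS[w] for w in label.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None  # no known word, so the label cannot be embedded
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(source_label, target_labels):
    """Return the target label most similar to the source label, or None."""
    src = embed(source_label)
    if src is None:
        return None
    scored = []
    for label in target_labels:
        vec = embed(label)
        if vec is not None:
            scored.append((cosine(src, vec), label))
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    # Match an event of a source app against candidate events of a target app.
    print(best_match("Sign in", ["Find", "Log in", "Remove"]))
```

Under this sketch, the domain-specific variant that the abstract hypothesizes about would amount to replacing the single vector table with one table per app category, each trained on the descriptions of that category only.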

Cited By

F. Khalili, L. Mariani, A. Mohebbi, M. Pezzè, and V. Terragni. 2024. Semantic matching in GUI test reuse. Empirical Software Engineering 29, 3 (2024). Online publication date: 9 May 2024.
https://dl.acm.org/doi/10.1007/s10664-023-10406-8


Published In

ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension

May 2022

698 pages

ISBN: 9781450392983

DOI: 10.1145/3524610

Conference Chairs:
Ayushi Rastogi (University of Groningen, The Netherlands)
Rosalia Tufano (USI Università della Svizzera italiana, Switzerland)

General Chair:
Gabriele Bavota (USI Università della Svizzera italiana, Switzerland)

Program Chairs:
Venera Arnaoudova (Washington State University, United States of America)
Sonia Haiduc (Florida State University, United States of America)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Funding Sources

SNF

Conference

ICPC '22

Sponsor:

SIGSOFT

ICPC '22: 30th International Conference on Program Comprehension

May 16 - 17, 2022

Virtual Event
