DOI: 10.1145/1923947.1923954
research-article

F007: finding rediscovered faults from the field using function-level failed traces of software in the field

Published: 01 November 2010

Abstract

Studies show that approximately 50% to 90% of the failures reported from the field are rediscoveries of previous faults, and that approximately 80% of the failures originate from approximately 20% of the code. Despite this, identifying the origin of failures in system code remains an arduous activity that consumes substantial resources. Prior fault discovery techniques for field traces either require many passing and failing traces, discover only crashing failures, or identify coarse-grained code, such as files, as the source of the fault. This paper describes a new method (F007) that identifies finer-grained faulty code (faulty functions) using only failed traces of deployed software. F007 extracts patterns of function calls from a historical collection of function-level failed traces, and then trains decision trees on the extracted function-call patterns for each known faulty function. For a new failure trace, F007 predicts a ranked list of faulty functions based on the probability of fault proneness obtained from the decision trees. Our case study on the Siemens suite shows that F007: (a) can identify rediscovered faulty functions (with new or old faults) with 60% to 86% accuracy, (b) needs to examine approximately 5% to 10% of the code for the Siemens suite, and (c) can discover the faulty functions in every new failed trace by using a small collection of previous failed traces. Thus, F007 can correctly identify the faulty functions for the majority (80% to 90%) of field failures given knowledge of a fault in a small percentage (20%) of functions.
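
The pipeline described in the abstract (extract function-call patterns from failed traces, train one decision tree per known faulty function, rank functions for a new trace by predicted probability) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a "pattern" is a pair of consecutive calls and uses a small Gini-based binary tree, since the abstract does not say which pattern definition or tree learner F007 uses. All function names, traces, and labels below are hypothetical.

```python
def call_patterns(trace, n=2):
    """Set of length-n runs of consecutive calls (assumed pattern definition)."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def build_tree(rows, labels, features, depth=0, max_depth=3):
    """Grow a small binary tree; each leaf stores the fraction of positive labels."""
    if depth == max_depth or len(set(labels)) <= 1 or not features:
        return sum(labels) / len(labels)
    best = None
    for f in features:
        has = [y for r, y in zip(rows, labels) if f in r]
        lacks = [y for r, y in zip(rows, labels) if f not in r]
        if not has or not lacks:
            continue  # this pattern does not split the traces
        score = (len(has) * gini(has) + len(lacks) * gini(lacks)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return sum(labels) / len(labels)
    f = best[1]
    rest = [g for g in features if g != f]
    yes = [(r, y) for r, y in zip(rows, labels) if f in r]
    no = [(r, y) for r, y in zip(rows, labels) if f not in r]
    return (f,
            build_tree([r for r, _ in yes], [y for _, y in yes], rest, depth + 1, max_depth),
            build_tree([r for r, _ in no], [y for _, y in no], rest, depth + 1, max_depth))

def predict(tree, row):
    """Walk the tree for one trace's pattern set and return the leaf probability."""
    while isinstance(tree, tuple):
        f, yes, no = tree
        tree = yes if f in row else no
    return tree

def train(history):
    """history: list of (failed_trace, faulty_function). One-vs-rest tree per function."""
    rows = [call_patterns(t) for t, _ in history]
    features = sorted(set().union(*rows))
    return {fn: build_tree(rows, [int(f == fn) for _, f in history], features)
            for fn in sorted({f for _, f in history})}

def rank(models, trace):
    """Known faulty functions, most probable first (ties broken by name)."""
    row = call_patterns(trace)
    scores = {fn: predict(tree, row) for fn, tree in models.items()}
    return sorted(scores, key=lambda fn: (-scores[fn], fn))

# Hypothetical historical failed traces, each labeled with its faulty function.
history = [
    (["main", "parse", "lookup", "emit"], "lookup"),
    (["main", "parse", "lookup", "lookup", "emit"], "lookup"),
    (["main", "parse", "format", "emit"], "format"),
    (["main", "read", "format", "emit"], "format"),
]
models = train(history)
print(rank(models, ["main", "read", "lookup", "emit"]))
```

In this toy data the new trace shares no call pair with the traces labeled `format`, so `lookup` is ranked first. A developer would then inspect functions in ranked order, which is how the "percentage of code examined" measure in the abstract can be read.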


Cited By

  • (2015) Towards an emerging theory for the diagnosis of faulty functions in function-call traces. Proceedings of the Fourth SEMAT Workshop on General Theory of Software Engineering, pp. 59-68. DOI: 10.5555/2820167.2820180. Online publication date: 16-May-2015
  • (2011) Diagnosing new faults using mutants and prior faults (NIER track). Proceedings of the 33rd International Conference on Software Engineering, pp. 960-963. DOI: 10.1145/1985793.1985959. Online publication date: 21-May-2011

Information & Contributors

Information

Published In

CASCON '10: Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative Research
November 2010
482 pages

Publisher

IBM Corp.

United States

Qualifiers

  • Research-article

Conference

CASCON '10
CASCON '10: Center for Advanced Studies on Collaborative Research
November 1 - 4, 2010
Toronto, Ontario, Canada

Acceptance Rates

Overall Acceptance Rate 24 of 90 submissions, 27%


