Article

Graph-Based Procedural Abstraction

Authors:

M. PhilippsenAuthors Info & Claims

CGO '07: Proceedings of the International Symposium on Code Generation and Optimization

Pages 259 - 270

https://doi.org/10.1109/CGO.2007.14

Published: 11 March 2007 Publication History

Abstract

Procedural abstraction (PA) extracts duplicate code segments into a newly created method and hence reduces code size. For embedded micro computers the amount of memory is still limited so code reduction is an important issue. This paper presents a novel approach to PA, that is especially targeted towards embedded systems. Earlier approaches of PA are blind with respect to code reordering, i.e., two code segments with the same semantic effect but with different instruction orders were not detected as candidates for PA. Instead of instruction sequences, in our approach the data flow graphs of basic blocks are considered. Compared to known PA techniques more than twice the number of instructions can be saved on a set of binaries, by detecting frequently appearing graph fragments with a graph mining tool based on the well known gSpan algorithm. The detection and extraction of graph fragments is not as straight forward as extracting sequential code fragments. NP-complete graph operations and special rules to decide which parts can be abstracted are needed. However, this effort pays off as smaller sizes significantly reduce costs on mass-produced embedded systems.

References

[1]

{1} Advanced RISC Machines Ltd. (ARM). An Introduction to THUMB, March 1995.

[2]

{2} R. Agrawal, T. Imielinski, and A. N. Swami. Mining Association Rules between Sets of Items in Large Databases. In P. Buneman and S. Jajodia, editors, Proc. 1993 ACM SIGMOD Int'l Conf. on Management of Data, pages 207-216, Washington, D. C., USA, 1993. ACM Press.

Digital Library

[3]

{3} aipop. AbsInt Angew. Informatik GmbH, Saarbrücken, http://www.AbsInt.com/aipop.

[4]

{4} B. Baker. On finding duplication and near-duplication in large software systems. In Second Working Conference on Reverse Engineering, pages 86-95, Los Alamitos, CA, USA, 1995. IEEE Computer Society.

Digital Library

[5]

{5} V. Barthelmann. Advanced Compiling Techniques to Reduce RAM Usage of Static Operating Systems. PhD thesis, Universität Erlangen-Nürnberg, Erlangen, Germany, 2004.

[6]

{6} I. Baxter, A. Yahin, L. De Moura, M. Sant'Anna, and L. Bier. Clone detection using abstract syntax trees. In Proceedings of the 1998 International Conference on Software Maintenance, pages 368-377, Bethesda, Maryland, USA, 1998. IEEE Computer Society.

Digital Library

[7]

{7} J. Bentley. Programming Pearls: Squeezing Space. Communications of the ACM, 27(5):416-421, May 1984.

Digital Library

[8]

{8} Á. Beszédes, R. Ferenc, T. Gyimóthy, A. Dolenc, and K. Karsisto. Survey of code-size reduction methods. ACM Computing Surveys, 35(3):223-267, Sept. 2003.

Digital Library

[9]

{9} C. Borgelt and M. Berthold. Mining Molecular Fragments: Finding Relevant Substructures of Molecules. In Proc. IEEE Int'l Conf. on Data Mining ICDM, pages 51-58, Maebashi City, Japan, 2002. IEEE Computer Society Press.

Digital Library

[10]

{10} P. Brisk, A. Kaplan, R. Kastner, and M. Sarrafzadeh. Instruction generation and regularity extraction for reconfigurable processors. In Proceedings of the Int'l Conf. on Compilers, Architecture, and Synthesis for Embedded Systems, pages 262-269, New York, NY, USA, 2002. ACM Press.

Digital Library

[11]

{11} P. Brisk, J. Macbeth, A. Nahapetian, and M. Sarrafzadeh. A dictionary construction technique for code compression systems with echo instructions. In Y. Paek and R. Gupta, editors, Proceedings of the Conf. on Languages, Compilers, and Tools for Embedded Systems, pages 105-114, Chicago, USA, June 2005. ACM.

Digital Library

[12]

{12} T. Callahan, P. Chong, A. DeHon, and J. Wawrzynek. Fast module mapping and placement for datapaths in FPGAs. In ACM/SIGDA Int'l Symp. on Field Programmable Gate Arrays , pages 123-132, Monterey, CA, 1998.

Digital Library

[13]

{13} W. Cheung, W. Evans, and J. Moses. Predicated instructions for code compaction. In 7th Int'l Workshop on Software and Compilers for Embedded Systems (SCOPES), number 2826 in Lecture Notes in Computer Science, pages 17-32, 2003.

[14]

{14} A. Chowdhary, S. Kale, P. Saripella, N. Sehgal, and R. Gupta. A general approach for regularity extraction in datapath circuits. In IEEE Int'l Conf. on Computer Aided Design (ICCAD), pages 332-339, 1998.

Digital Library

[15]

{15} D. J. Cook and L. B. Holder. Graph-based data mining. IEEE Intelligent Systems, 15(2):32-41, 2000.

Digital Library

[16]

{16} B. De Sutter, B. De Bus, and K. De Bosschere. Sifting out the mud: Low level c++ code reuse. In Proceedings of the 17th ACM SIGPLAN Conf. on Object-oriented programming, systems, languages, and applications, pages 275-291, New York, NY, USA, 2002. ACM Press.

Digital Library

[17]

{17} B. De Sutter, H. Vandierendonck, B. De Bus, and K. De Bosschere. On the side-effects of code abstraction. In Proceedings of the 2003 ACM SIGPLAN Conf. on Languages, Compilers, and Tools for Embedded Systems, pages 244-253, New York, NY, USA, 2003. ACM Press.

Digital Library

[18]

{18} S. K. Debray, W. S. Evans, R. Muth, and B. de Sutter. Compiler Techniques for Code Compaction. ACM Trans. on Progamming Languages and Systems, 22(2):378-415, Mar. 2000.

Digital Library

[19]

{19} Diablo. Parallel Information System Group, University of Gent, Belgien, http://www.elis.ugent.be/diablo.

[20]

{20} dietlibc - a libc optimized for small size. http://www.fefe.de/dietlibc/.

[21]

{21} F. V. Fomin, F. Grandoni, and D. Kratsch. Measure and Conquer: A Simple O(n<sup>0.288n</sup>) Independent Set Algorithm. In Proc. of the 17th ACM-SIAM Symp. on Discrete Algorithms: SODA '06, pages 18-25, Miami, Florida, USA, Jan. 2006. ACM and SIAM.

Digital Library

[22]

{22} C. Fraser, E. Myers, and A. Wendt. Analyzing and compressing assembly code. In Proceedings of the ACM SIGPLAN '84 Symp. on Compiler Construction, pages 117-121, New York, NY, USA, 1984. ACM Press.

Digital Library

[23]

{23} C. W. Fraser, E. W. Myers, and A. L. Wendt. Analyzing and Compressing Assembly Code. In Proc. 1984 ACM SIGPLAN Symp. an Compiler Construction, pages 117-121, Montréal, Canada, June 1984.

Digital Library

[24]

{24} S. Furber. ARM System Architecture. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1996.

Digital Library

[25]

{25} S. Ghazizadeh and S. S. Chawathe. Seus: Structure extraction using summaries. In S. Lange, K. Satoh, and C. H. Smith, editors, 5th Int'l Conf. on Discovery Science, volume 2534 of Lecture Notes in Computer Science, pages 71-85, Lübeck, Germany, Nov. 2002. Springer.

Digital Library

[26]

{26} M. Guthaus, J. Ringenberg, D. Ernst, T. Austin, T. Mudge, and R. Brown. Mibench: A free, commercially representative embedded benchmark suite. In Proceedings of the 4th IEEE Annual Workshop on Workload Characterization, pages 3-14, Austin, TX, USA, 2001. IEEE Computer Society.

Digital Library

[27]

{27} J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraphs in the presence of isomorphism. In Proc. of the 3rd IEEE Int'l Conf. on Data Mining ICDM, pages 549-552, Melbourne, FL, USA, Nov. 2003. IEEE Press.

Digital Library

[28]

{28} A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In PKDD '00: Proceedings of the 4th European Conf. on Principles of Data Mining and Knowledge Discovery, pages 13-23, London, UK, 2000. Springer.

Digital Library

[29]

{29} R. Komondoor and S. Horwitz. Eliminating duplication in source code via procedure extraction. Technical report, Comupter Sciences Department University of Wisconsin-Madison, Madison, WI, USA, 2002.

[30]

{30} D. Kumlander. A new exact Algorithm for the Maximum-Weight Clique Problem based on a Heuristic Vertex-Coloring and a Backtrack Search. In T. H. A. Le and D. T. Pham, editors, 5th Int'l Conf. on Modelling, Computation and Optimization in Information Systems and Management Sciences: MCO '04, pages 202-208, Metz, France, July 2004. Hermes Science Publishing Ltd.

[31]

{31} M. Kuramochi and G. Karypis. Finding frequent patterns in a large sparse graph. In M. W. Berry, U. Dayal, C. Kamath, and D. B. Skillicorn, editors, Proceedings of the Fourth SIAM Int'l Conf. on Data Mining, pages 243-271, Orlando, FL, USA, Apr. 2004. SIAM.

Digital Library

[32]

{32} T. Meinl and I. Fischer. Subgraph mining. In J. Wang, editor, Encyclopedia of Data Warehousing and Mining, pages 1059-1063. Idea Group, Hershey, PA, USA, 2005.

[33]

{33} T. Meinl, I. Fischer, and M. Philippsen. Parallel Mining for Frequent Fragments on a Shared-Memory Multiprocessor -Results and Java-Obstacles-. In M. Bauer, A. Kröner, and B. Brandherm, editors, LWA 2005 - Beiträge zur GI-Workshopwoche Lernen, Wissensentdeckung, Adaptivität, pages 196-201, 2005.

[34]

{34} S. O. Memik, G. Memik, R. Jafari, and E. Kursun. Global resource sharing for synthesis of control data flow graphs on fpgas. In ACM/IEEE Design Automation Conference (DAC), pages 604-609, Anaheim, CA, June 2003.

Digital Library

[35]

{35} P. C. Nguyen, K. Ohara, H. Motoda, and T. Washio. Cl-gbi: A novel approach for extracting typical patterns from graph-structured data. In T. B. Ho, D. Cheung, and H. Liu, editors, Advances in Knowledge Discovery and Data Mining, 9th Pacific-Asia Conference, volume 3518 of Lecture Notes in Computer Science, pages 639-649, Hanoi, Vietnam, May 2005. Springer.

Digital Library

[36]

{36} S. Nijssen and J. N. Kok. The Gaston Tool for Frequent Subgraph Mining. Proc. Int'l Workshop on Graph-Based Tools (Grabats), Electronic Notes in Theoretical Computer Science, 127(1):77-87, 2004.

Digital Library

[37]

{37} S. Nijssen and J. N. Kok. Frequent subgraph miners: Run-time don't say everything. In T. Gärtner, G. C. Garriga, and T. Meinl, editors, Proceedings of the Int'l Workshop on Mining and Learning with Graphs (MLG 2006), pages 173-180, Berlin, Germany, 2006.

[38]

{38} W. Pugh. Compressing Java Class Files. In Proc. ACM SIGPLAN Conf. on Programming Language Design and Implementation (PLDI), pages 247-258, Atlanta, GA, USA, May 1999.

Digital Library

[39]

{39} D. Rao and F. Kurdahi. Partitioning by regularity extraction. In ACM/IEEE Design Automation Conference, pages 235- 238, Anaheim, CA, June 1992.

Digital Library

[40]

{40} B. D. Sutter, B. D. Bus, and K. D. Bosschere. Link-time binary rewriting techniques for program compaction. ACM Trans. Prog. Lang. Syst., 27(5):882-945, 2005.

Digital Library

[41]

{41} F. Tip, P. F. Sweeney, C. Laffra, A. Eisma, and D. Streeter. Practical Extraction Techniques for Java. ACM Trans. on Programming Languages and Systems, 24(6):625-666, Nov. 2002.

Digital Library

[42]

{42} N. Vanetik, E. Gudes, and S. Shimony. Computing frequent graph patterns from semistructured data. In Proc. of 2002 IEEE Int'l Conf. on Data Mining, pages 458-464, Maebashi City, Japan, Dec. 2002. IEEE Computer Society.

Digital Library

[43]

{43} T. Washio and H. Motoda. State of the Art of Graph-based Data Mining. SIGKDD Explorations Newsletter, 5(1):59- 68, July 2003.

Digital Library

[44]

{44} M. Wörlein, T. Meinl, I. Fischer, and M. Philippsen. A quantitative comparison of the subgraph miners MoFa, gSpan, FFSM, and Gaston. In A. Jorge, L. Torgo, P. Brazdil, R. Camacho, and J. Gama, editors, Knowledge Discovery in Database: PKDD 2005, Lecture Notes in Computer Science, pages 392-403, Porto, Portugal, 2005. Springer.

[45]

{45} X. Yan and J. Han. gSpan: Graph-based substructure pattern mining. In Proceedings IEEE Int'l Conf. on Data Mining ICDM, pages 721-723, Maebashi City, Japan, 2002. IEEE Computer Society Press.

Digital Library

[46]

{46} X. Yan and J. Han. Closegraph: Mining closed frequent graph patterns. In Proceedings of the 9th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, pages 286-295, Washington, DC, USA, 2003. ACM Press.

Digital Library

[47]

{47} M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. New Algorithms for Fast Discovery of Association Rules. In D. Heckerman, H. Mannila, D. Pregibon, R. Uthurusamy, and M. Park, editors, Proc. of 3rd Int'l Conf. on Knowledge Discovery and Data Mining, pages 283-296, Newport Beach, CA, USA, Aug. 1997. AAAI Press.

[48]

{48} D. C. Zaretsky, G. Mittal, R. P. Dick, and P. Banerjee. Dynamic template generation for resource sharing in control and data flow graphs. In Proceedings of the 19th Int'l Conf. on VLSI Design, pages 465-468, Hyderabad, India, Jan. 2006. IEEE Computer Society.

Digital Library

Cited By

Chen KSu CHakert CBuschjäger SLee CLee JMorik KChen J(2022)Efficient Realization of Decision Trees for Real-Time InferenceACM Transactions on Embedded Computing Systems10.1145/350801921:6(1-26)Online publication date: 18-Oct-2022
https://dl.acm.org/doi/10.1145/3508019
Rocha RPetoumenos PWang ZCole MHazelwood KLeather HHenkel JLiu X(2021)HyFM: function merging for freeProceedings of the 22nd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems10.1145/3461648.3463852(110-121)Online publication date: 22-Jun-2021
https://dl.acm.org/doi/10.1145/3461648.3463852
da Silva AKind Bde Souza Magalhães JRocha JGuimarães BPereira FLee J(2021)AnghaBenchProceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization10.1109/CGO51591.2021.9370322(378-390)Online publication date: 27-Feb-2021
https://dl.acm.org/doi/10.1109/CGO51591.2021.9370322
Show More Cited By

Index Terms

Graph-Based Procedural Abstraction

Recommendations

Graph-Based Procedural Abstraction
Procedural Abstraction with Reverse Prefix Trees
CGO '09: Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization

For memory constrained environments like embedded systems, optimization for size is often as important as, if not more important than, optimization for execution speed. A common technique for compacting code is procedural abstraction. Equivalent code ...
Detecting malware based on DNS graph mining
Special issue on Big Data in Future Sensing

Malware remains a major threat to nowadays Internet. In this paper, we propose a DNS graph mining-based malware detection approach. A DNS graph is composed of DNS nodes, which represent server IPs, client IPs, and queried domain names in the process of ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

CGO '07: Proceedings of the International Symposium on Code Generation and Optimization

March 2007

346 pages

ISBN:0769527647

Sponsors

Publisher

IEEE Computer Society

United States

Publication History

Published: 11 March 2007

Check for updates

Qualifiers

Article

Conference

CGO07

Sponsor:

CGO07: 5th Annual IEEE / ACM International Symposium on Code Generation and Optimization

March 11 - 14, 2007

Acceptance Rates

CGO '07 Paper Acceptance Rate 27 of 84 submissions, 32%;

Overall Acceptance Rate 312 of 1,061 submissions, 29%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

7
Total Citations
View Citations
394
Total Downloads

Downloads (Last 12 months)4
Downloads (Last 6 weeks)0

Reflects downloads up to 16 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Chen KSu CHakert CBuschjäger SLee CLee JMorik KChen J(2022)Efficient Realization of Decision Trees for Real-Time InferenceACM Transactions on Embedded Computing Systems10.1145/350801921:6(1-26)Online publication date: 18-Oct-2022
https://dl.acm.org/doi/10.1145/3508019
Rocha RPetoumenos PWang ZCole MHazelwood KLeather HHenkel JLiu X(2021)HyFM: function merging for freeProceedings of the 22nd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems10.1145/3461648.3463852(110-121)Online publication date: 22-Jun-2021
https://dl.acm.org/doi/10.1145/3461648.3463852
da Silva AKind Bde Souza Magalhães JRocha JGuimarães BPereira FLee J(2021)AnghaBenchProceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization10.1109/CGO51591.2021.9370322(378-390)Online publication date: 27-Feb-2021
https://dl.acm.org/doi/10.1109/CGO51591.2021.9370322
Mokri PHempstead MXie Y(2020)Early-stage automated accelerator identification tool for embedded systems with limited areaProceedings of the 39th International Conference on Computer-Aided Design10.1145/3400302.3415733(1-8)Online publication date: 2-Nov-2020
https://dl.acm.org/doi/10.1145/3400302.3415733
Rocha RPetoumenos PWang ZCole MLeather HKandemir MJimborean AMoseley T(2019)Function merging by sequence alignmentProceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization10.5555/3314872.3314892(149-163)Online publication date: 16-Feb-2019
https://dl.acm.org/doi/10.5555/3314872.3314892
Elghazel HHacid M(2011)Aggregated search in graph databasesProceedings of the 8th international conference on Graph-based representations in pattern recognition10.5555/2009206.2009218(92-101)Online publication date: 18-May-2011
https://dl.acm.org/doi/10.5555/2009206.2009218
Schaeckeler SShang W(2009)Procedural Abstraction with Reverse Prefix TreesProceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization10.1109/CGO.2009.25(243-253)Online publication date: 22-Mar-2009
https://dl.acm.org/doi/10.1109/CGO.2009.25

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents