A Survey of Automatic Protocol Reverse Engineering Tools

Published: 09 December 2015

Abstract

Computer network protocols define the rules by which two entities communicate over a network of unique hosts. Many protocol specifications are unknown, unavailable, or minimally documented, which prevents thorough analysis of these protocols for security purposes. For example, modern botnets often use undocumented and unique application-layer communication protocols to maintain command and control over numerous distributed hosts. Inferring the specification of closed protocols has numerous applications, such as intelligent deep packet inspection, enhanced intrusion detection system algorithms for communications, and integration with legacy software packages. The multitude of closed protocols, coupled with existing time-intensive reverse engineering methodologies, has spawned investigation into automated approaches for reverse engineering closed protocols. This article summarizes and organizes previously presented automatic protocol reverse engineering tools by approach. Approaches that focus on reverse engineering the finite state machine of a target protocol are separated from those that focus on reverse engineering the protocol format.

      Published In

      ACM Computing Surveys  Volume 48, Issue 3
      February 2016
      619 pages
      ISSN: 0360-0300
      EISSN: 1557-7341
      DOI: 10.1145/2856149
      Editor: Sartaj Sahni

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 09 December 2015
      Accepted: 01 September 2015
      Revised: 01 July 2015
      Received: 01 August 2014
      Published in CSUR Volume 48, Issue 3

      Author Tags

      1. Protocol reverse engineering
      2. communication security

      Qualifiers

      • Survey
      • Research
      • Refereed
