Improving Password Guessing Using Byte Pair Encoding

Xingxing Wang, Dakui Wang, Xiaojun Chen, Rui Xu, Jinqiao Shi & Li Guo

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10599)

Included in the following conference series:

International Conference on Information Security

Abstract

Many recent password guessing algorithms based on the Probabilistic Context-Free Grammars (PCFGs) model have brought significant improvements in password cracking. These algorithms extract common semantic patterns (letter semantic patterns, date patterns, keyboard patterns, etc.) from passwords and model the password construction process using PCFGs. However, a large fraction of integral segments in passwords still appear to carry no semantics. Can those segments be analyzed in depth to further improve password cracking? Motivated by this challenge, this paper employs the Byte Pair Encoding (BPE) algorithm for password segmentation, extracting the non-semantic patterns that people frequently and subconsciously use in passwords. Based on this segmentation, we propose a BPE-PCFGs model to generate password guesses. Furthermore, we combine the existing common semantic patterns with BPE patterns to construct a new Rich-BPE-PCFGs password generator. Experimental results on large-scale password sets show that our Rich-BPE-PCFGs model obtains a 2.36%–37.56% improvement over the original PCFGs model, making it a good complement to existing password guessing algorithms.
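To make the segmentation step concrete, the following is a minimal sketch of generic BPE applied to password strings. It is not the authors' implementation: the function names (`learn_bpe_merges`, `merge_pair`, `segment`), the stopping rule (stop when the best pair occurs fewer than twice), and the toy password list are all assumptions made for illustration, and the PCFG stage of the paper is not modeled here.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(passwords, num_merges):
    """Learn up to `num_merges` merge rules (hypothetical helper, generic BPE)."""
    # Each distinct password starts as a tuple of single characters with its frequency.
    corpus = Counter(tuple(pw) for pw in passwords)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by password frequency.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Ties between equally frequent pairs are resolved arbitrarily.
        best, count = pair_counts.most_common(1)[0]
        if count < 2:  # assumed stopping rule: ignore pairs seen only once
            break
        merges.append(best)
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            new_corpus[merge_pair(symbols, best)] += freq
        corpus = new_corpus
    return merges


def segment(password, merges):
    """Segment a password by replaying the learned merges in order."""
    symbols = tuple(password)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy training list (made up for illustration only).
toy_passwords = ["qazwsx", "qaz123", "1qaz2wsx", "wsx123", "abc123"]
toy_merges = learn_bpe_merges(toy_passwords, num_merges=6)
print(toy_merges)
print(segment("qaz123abc", toy_merges))
```

Replaying the merges in the order they were learned means a password is segmented into the most frequent substrings seen in training, regardless of whether those substrings carry any recognizable meaning.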

Author information

Authors and Affiliations

School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China
Xingxing Wang, Xiaojun Chen, Rui Xu, Jinqiao Shi & Li Guo
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Dakui Wang, Xiaojun Chen, Rui Xu, Jinqiao Shi & Li Guo

Corresponding author

Correspondence to Dakui Wang.

Editor information

Editors and Affiliations

Inria, Tokyo, Japan
Phong Q. Nguyen
Singapore University of Technology and Design, Singapore, Singapore
Jianying Zhou

About this paper

Cite this paper

Wang, X., Wang, D., Chen, X., Xu, R., Shi, J., Guo, L. (2017). Improving Password Guessing Using Byte Pair Encoding. In: Nguyen, P., Zhou, J. (eds) Information Security. ISC 2017. Lecture Notes in Computer Science, vol 10599. Springer, Cham. https://doi.org/10.1007/978-3-319-69659-1_14

DOI: https://doi.org/10.1007/978-3-319-69659-1_14
Published: 20 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69658-4
Online ISBN: 978-3-319-69659-1
eBook Packages: Computer Science, Computer Science (R0)
