DOI: 10.1145/3460120.3484743
research-article

Chunk-Level Password Guessing: Towards Modeling Refined Password Composition Representations

Published: 13 November 2021

Abstract

The security of textual passwords hinges on the guessing models that attackers adopt, and the representation of password composition is an influential factor in those models. Unfortunately, conventional models treat a password coarsely, either as a sequence of characters or as a sequence of natural-language words, and neither representation is specific to passwords. Experience shows that passwords exhibit refined internal patterns, e.g., "4ever", "ing", or "2015", which vary significantly across periods and regions. However, state-of-the-art guessing models (e.g., Markov) cannot automatically capture these refined representations or their security impact.
In this paper, we model a password as a composition of several chunks, where a chunk is a sequence of related characters that frequently appear together. Based on this concept, we propose a password-specific segmentation method that automatically splits passwords into chunks, and then build three chunk-level guessing models, adapted from Markov, Probabilistic Context-Free Grammar (PCFG), and neural-network-based models. In an extensive evaluation with over 250 million passwords, these chunk-level models improve guessing efficiency by an average of 5.7%, 51.2%, and 41.9%, respectively, in an offline guessing scenario, showcasing the power of a suitable password representation during attacks. By analysing these efficient attacks, we find that the presence of common chunks in a password is a stronger indicator of password vulnerability than character-class complexity. To protect users against such attacks, we develop a client-side, real-time password strength meter that estimates a password's resistance based on chunk-level guessing models.
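
The abstract does not spell out the segmentation algorithm, so the following is only a rough sketch of the general idea of splitting passwords into frequently co-occurring chunks. It assumes a byte-pair-encoding-style procedure (repeatedly merging the most frequent adjacent pair of symbols); the function names `learn_merges`, `apply_merge`, and `segment`, the toy password list, and the stopping rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one joined symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_merges(passwords, num_merges):
    """Learn up to `num_merges` merge rules from a list of passwords (assumed BPE-style)."""
    # Start with each password split into single characters.
    corpus = Counter(tuple(pw) for pw in passwords)
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pair_counts[(a, b)] += freq
        if not pair_counts:
            break
        # Pick the most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        # Hypothetical stopping rule: a pair seen only once is not a "frequent" chunk.
        if pair_counts[best] < 2:
            break
        merges.append(best)
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            new_corpus[tuple(apply_merge(symbols, best))] += freq
        corpus = new_corpus
    return merges


def segment(password, merges):
    """Split `password` into chunks by replaying the learned merges in order."""
    symbols = list(password)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols


if __name__ == "__main__":
    # Toy data for illustration only; a real model would train on a large leaked corpus.
    toy = ["love4ever", "4ever2015", "forever2015", "4everyoung"]
    rules = learn_merges(toy, num_merges=20)
    for pw in toy:
        print(pw, "->", segment(pw, rules))
```

Once passwords are segmented this way, a chunk-level model would treat each chunk, rather than each character, as the basic unit when estimating probabilities.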

    Information & Contributors

    Information

    Published In

    CCS '21: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security
    November 2021
    3558 pages
    ISBN:9781450384544
    DOI:10.1145/3460120

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. chunks
    2. data driven guessing models
    3. password composition

    Qualifiers

    • Research-article

    Funding Sources

    • NSFC
    • National Key Projects of Research and Development
    • STCSM Key Projects

    Conference

    CCS '21
    CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security
    November 15 - 19, 2021
    Virtual Event, Republic of Korea

    Acceptance Rates

    Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

    Cited By

    • (2025) Similarities: The Key Factors Influencing Cross-Site Password Guessing Performance. Electronics 14:5 (945). DOI: 10.3390/electronics14050945. Online publication date: 27-Feb-2025
    • (2025) Password region attribute classification based on multi-granularity cascade fusion. Connection Science 37:1. DOI: 10.1080/09540091.2025.2461092. Online publication date: 7-Feb-2025
    • (2024) The impact of exposed passwords on honeyword efficacy. Proceedings of the 33rd USENIX Conference on Security Symposium (559-576). DOI: 10.5555/3698900.3698932. Online publication date: 14-Aug-2024
    • (2024) Prob-Hashcat: Accelerating Probabilistic Password Guessing with Hashcat by Hundreds of Times. Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses (674-692). DOI: 10.1145/3678890.3678919. Online publication date: 30-Sep-2024
    • (2024) Stealing Trust: Unraveling Blind Message Attacks in Web3 Authentication. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (555-569). DOI: 10.1145/3658644.3670323. Online publication date: 2-Dec-2024
    • (2024) GuessFuse: Hybrid Password Guessing With Multi-View. IEEE Transactions on Information Forensics and Security 19 (4215-4230). DOI: 10.1109/TIFS.2024.3376246. Online publication date: 2024
    • (2024) Sequential-Spatial-Aware Attribute Classification of Ultra-Short Password. 2024 4th International Conference on Electronic Information Engineering and Computer Science (EIECS) (1123-1126). DOI: 10.1109/EIECS63941.2024.10800624. Online publication date: 27-Sep-2024
    • (2024) PagPassGPT: Pattern Guided Password Guessing via Generative Pretrained Transformer. 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (429-442). DOI: 10.1109/DSN58291.2024.00049. Online publication date: 24-Jun-2024
    • (2024) Password cracking using chunk similarity. Future Generation Computer Systems 150:C (380-394). DOI: 10.1016/j.future.2023.09.013. Online publication date: 1-Jan-2024
    • (2024) Decoding developer password patterns. Computers and Security 145:C. DOI: 10.1016/j.cose.2024.103974. Online publication date: 1-Oct-2024
