DOI: 10.1145/3658664.3659657
short-paper
Open access

Co-Stega: Collaborative Linguistic Steganography for the Low Capacity Challenge in Social Media

Published: 24 June 2024

Abstract

Social media platforms, with their extensive, real-time text data, are important application environments for linguistic generative steganography. However, the fragmented and context-constrained nature of social media text leaves very little capacity for hiding messages, which makes linguistic steganography impractical on real social media platforms. Worse, even for high-capacity linguistic generative steganography, the upper bound on capacity drops to a low level once the method must meet even a slightly strict security level under Cachin's model. To overcome the low capacity challenge, we identified an indicator of capacity that is independent of any specific steganography method and used it to analyze the origins of this challenge. We then proposed a novel linguistic steganography framework named Collaborative Steganography (Co-Stega). Co-Stega exploits existing texts and the contextual relevance between texts in social media: it collaboratively embeds secret messages in an existing text and in its contextually related text, via efficient retrieval and generation respectively. Additionally, we proposed a simple technique called the "Entropy Enhancement Strategy", which increases the entropy of generated text and thereby further enhances capacity. Our evaluation shows that Co-Stega significantly improves capacity while maintaining text quality, making it a valuable extension for linguistic generative steganography on social media platforms.
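
The link the abstract draws between entropy, capacity, and Cachin's security level can be illustrated with a minimal sketch. It is not the paper's indicator or its Entropy Enhancement Strategy, neither of which is specified in the abstract. It assumes only two standard facts: a generative scheme that samples faithfully from a model's next-token distribution cannot embed more bits per token, in expectation, than that distribution's Shannon entropy, and Cachin's model measures security as the relative entropy (KL divergence) between cover and stego distributions. The distributions `peaked` and `flat` and the helper `text_capacity_bits` are hypothetical, chosen for illustration.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def kl_divergence(p, q):
    """Relative entropy D(p || q) in bits.

    In Cachin's model a stegosystem is epsilon-secure when
    D(P_cover || P_stego) <= epsilon. Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def text_capacity_bits(step_distributions):
    """Hypothetical helper: sum of per-token entropies over one text.

    This is an entropy-based ceiling on the expected payload of a scheme
    that samples faithfully from the model; it is not a payload that any
    particular method is guaranteed to reach.
    """
    return sum(shannon_entropy(d) for d in step_distributions)


# Hypothetical next-token distributions (illustrative values only).
peaked = [0.90, 0.05, 0.03, 0.02]   # strongly constrained context
flat = [0.25, 0.25, 0.25, 0.25]     # loosely constrained context

short_constrained_post = [peaked] * 10   # ten tokens, little freedom
short_open_post = [flat] * 10            # ten tokens, more freedom

print(f"constrained: {text_capacity_bits(short_constrained_post):.2f} bits")
print(f"open:        {text_capacity_bits(short_open_post):.2f} bits")
print(f"KL(peaked || flat) = {kl_divergence(peaked, flat):.2f} bits")
```

Under these assumptions, a short text whose tokens are tightly constrained by context offers only a few bits of headroom, while raising per-token entropy raises the ceiling. Any method that departs from the model's distribution to gain capacity pays for it in a nonzero divergence.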

    Published In

    IH&MMSec '24: Proceedings of the 2024 ACM Workshop on Information Hiding and Multimedia Security
    June 2024
    305 pages
    ISBN: 9798400706370
    DOI: 10.1145/3658664
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. entropy
    2. large language model
    3. linguistic steganography
    4. social media

    Qualifiers

    • Short-paper

    Conference

    IH&MMSEC '24

    Acceptance Rates

    Overall Acceptance Rate 128 of 318 submissions, 40%
