DOI: 10.1145/3427228.3427281
research-article

VibLive: A Continuous Liveness Detection for Secure Voice User Interface in IoT Environment

Published: 08 December 2020

Abstract

Voice user interfaces (VUIs) are increasingly used to authenticate users to devices and applications. This widespread adoption of VUIs in IoT environments such as homes and businesses raises extensive privacy and security concerns. Recent VUIs that rely on traditional voice authentication methods are vulnerable to spoofing attacks, in which a malicious party spoofs the VUI with pre-recorded or synthesized voice commands of the genuine user. In this paper, we design VibLive, a continuous liveness detection system for secure VUIs in IoT environments. The underlying principle of VibLive is to capture the dissimilarities between bone-conducted vibrations and air-conducted voices when a human speaks, and to use them for liveness detection. VibLive is a text-independent system that verifies live users and detects spoofing attacks without requiring users to enroll specific passphrases. Moreover, VibLive is practical and transparent: it requires neither additional operations nor extra hardware beyond the loudspeaker and microphone that VUIs commonly have. Our evaluation with 25 participants under different IoT-oriented experimental settings shows that VibLive is highly effective, with over 97% detection accuracy. The results also show that VibLive is robust across various usage scenarios.
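
The idea of comparing a vibration channel with an air-conducted channel can be illustrated with a toy sketch. This is not VibLive's algorithm, which the abstract does not describe. It assumes, purely for illustration, that a bone-conducted vibration signal carries relatively less high-frequency energy than the air-conducted voice, and it compares one hypothetical feature (the fraction of spectral energy below a cutoff) on synthetic two-tone signals. The function names, the 1 kHz cutoff, and all signal parameters are assumptions, not values from the paper.

```python
import math


def low_band_fraction(samples, sample_rate, cutoff_hz):
    """Fraction of spectral energy below cutoff_hz, using a naive DFT."""
    n = len(samples)
    low = 0.0
    total = 0.0
    for k in range(1, n // 2):  # skip the DC bin, stay below Nyquist
        angle = 2 * math.pi * k / n
        re = sum(s * math.cos(angle * i) for i, s in enumerate(samples))
        im = sum(s * math.sin(angle * i) for i, s in enumerate(samples))
        power = re * re + im * im
        total += power
        if k * sample_rate / n < cutoff_hz:
            low += power
    return low / total if total else 0.0


def tone_mix(components, sample_rate, n):
    """Sum of sine tones; components is a list of (frequency_hz, amplitude)."""
    return [
        sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in components)
        for i in range(n)
    ]


SAMPLE_RATE = 8000   # Hz (assumed)
FRAME_LEN = 256      # samples; 250 Hz and 2000 Hz fall exactly on DFT bins
CUTOFF_HZ = 1000     # hypothetical split between "low" and "high" bands

# Synthetic stand-ins (assumptions): the "air" signal has equal low and high
# tones, while the "vibration" signal has its high tone strongly attenuated.
air = tone_mix([(250, 1.0), (2000, 1.0)], SAMPLE_RATE, FRAME_LEN)
vibration = tone_mix([(250, 1.0), (2000, 0.1)], SAMPLE_RATE, FRAME_LEN)

air_low = low_band_fraction(air, SAMPLE_RATE, CUTOFF_HZ)
vib_low = low_band_fraction(vibration, SAMPLE_RATE, CUTOFF_HZ)

# A large gap between the two channels is the kind of dissimilarity a
# liveness detector could look for; any decision threshold is hypothetical.
gap = vib_low - air_low
print(f"air low-band fraction: {air_low:.3f}")
print(f"vibration low-band fraction: {vib_low:.3f}")
print(f"gap: {gap:.3f}")
```

In this sketch the two channels differ only in how much high-frequency energy survives, so the gap is large by construction. A real system would need features and thresholds derived from measured signals.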

      Information & Contributors

      Information

      Published In

      ACSAC '20: Proceedings of the 36th Annual Computer Security Applications Conference
      December 2020
      962 pages
      ISBN:9781450388580
      DOI:10.1145/3427228
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. bone-conducted vibrations
      2. liveness detection
      3. voice user interface

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ACSAC '20

      Acceptance Rates

      Overall Acceptance Rate 104 of 497 submissions, 21%

      Cited By

      • (2024) Application of Deep Learning Models for Bone-Conducted Speech Signals Extracted in the Form of Bone Conduction Headphones. Journal of the Korean Society of Manufacturing Technology Engineers 33:1 (27-34). DOI: 10.7735/ksmte.2024.33.1.27. Online publication date: 15-Feb-2024
      • (2024) Live Speech Recognition via Earphone Motion Sensors. IEEE Transactions on Mobile Computing 23:6 (7284-7300). DOI: 10.1109/TMC.2023.3333214. Online publication date: Jun-2024
      • (2024) Room-scale Voice Liveness Detection for Smart Devices. IEEE Transactions on Dependable and Secure Computing (1-14). DOI: 10.1109/TDSC.2024.3367269. Online publication date: 2024
      • (2024) IoT Avatar: Turning Various Objects into Avatars. Advances in Network-Based Information Systems (398-405). DOI: 10.1007/978-3-031-72325-4_39. Online publication date: 20-Sep-2024
      • (2023) LiveProbe: Exploring Continuous Voice Liveness Detection via Phonemic Energy Response Patterns. IEEE Internet of Things Journal 10:8 (7215-7228). DOI: 10.1109/JIOT.2022.3228819. Online publication date: 15-Apr-2023
      • (2023) VoShield: Voice Liveness Detection with Sound Field Dynamics. IEEE INFOCOM 2023 - IEEE Conference on Computer Communications (1-10). DOI: 10.1109/INFOCOM53939.2023.10229038. Online publication date: 17-May-2023
      • (2023) Navigating the Privacy and Cybersecurity Risks of Smart Homes. 2023 24th International Arab Conference on Information Technology (ACIT) (1-6). DOI: 10.1109/ACIT58888.2023.10453878. Online publication date: 6-Dec-2023
      • (2023) Security and privacy problems in voice assistant applications. Computers and Security 134:C. DOI: 10.1016/j.cose.2023.103448. Online publication date: 1-Nov-2023
      • (2022) MetaEar: Imperceptible Acoustic Side Channel Continuous Authentication Based on ERTF. Electronics 11:20 (3401). DOI: 10.3390/electronics11203401. Online publication date: 20-Oct-2022
      • (2022) A Continuous Articulatory-Gesture-Based Liveness Detection for Voice Authentication on Smart Devices. IEEE Internet of Things Journal 9:23 (23320-23331). DOI: 10.1109/JIOT.2022.3199995. Online publication date: 1-Dec-2022
