
Pushing the Limits of Acoustic Spatial Perception via Incident Angle Encoding

Published: 15 May 2024

Abstract

With the growing popularity of smart speakers, numerous novel acoustic sensing applications have been proposed for both low-frequency human speech and high-frequency inaudible sounds. Spatial information plays a crucial role in these applications, enabling various location-based services. However, typical commercial microphone arrays are limited in their spatial perception of inaudible sounds because their sparse array geometries are optimized for low-frequency speech. In this paper, we introduce MetaAng, a system that augments microphone arrays with wideband spatial perception across both speech signals and inaudible sounds by leveraging the spatial encoding capabilities of acoustic metasurfaces. Our design is grounded in the fact that acoustic metasurfaces, while sensitive to high-frequency signals, are almost non-responsive to low-frequency speech because of the significant wavelength discrepancy. This observation allows us to integrate acoustic metasurfaces with a sparse array geometry, simultaneously enhancing the spatial perception of high-frequency and low-frequency acoustic signals. To achieve this, we first use acoustic metasurfaces and a configuration optimization algorithm to encode unique features for each incident angle. Then, we propose an unrolling soft thresholding network that employs neural-enhanced priors and compressive sensing for high-accuracy, high-resolution multi-source angle estimation. We implement a prototype, and experimental results demonstrate that MetaAng remains robust across various scenarios and facilitates multiple applications, including localization and tracking.
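The abstract names soft thresholding and compressive sensing but gives no algorithmic detail. As a rough illustration of the general idea only, and not the MetaAng network itself, the sketch below runs the classical iterative soft-thresholding algorithm (ISTA), the kind of iteration that unrolled networks typically turn into learnable layers. Everything in it is an assumption made for illustration: the random matrix `A` merely stands in for per-angle signatures, and the sizes, the sparsity weight `lam`, and all function names are hypothetical.

```python
import math
import random


def soft_threshold(v, tau):
    """Shrink v toward zero by tau (proximal operator of tau * |v|)."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r without forming the transpose."""
    return [sum(A[m][j] * r[m] for m in range(len(A))) for j in range(len(A[0]))]


def largest_eigenvalue(A, n_iter=50):
    """Estimate the largest eigenvalue of A^T A by power iteration."""
    v = [1.0] * len(A[0])
    est = 0.0
    for _ in range(n_iter):
        w = rmatvec(A, matvec(A, v))
        est = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / est for wi in w]
    return est


def ista(A, y, lam, n_iter):
    """Minimise 0.5 * ||y - A x||^2 + lam * ||x||_1 by proximal gradient steps."""
    L = largest_eigenvalue(A)  # step size is 1 / L
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [yi - pi for yi, pi in zip(y, matvec(A, x))]
        grad_step = rmatvec(A, residual)
        x = [soft_threshold(xi + gi / L, lam / L) for xi, gi in zip(x, grad_step)]
    return x


def demo():
    """Recover two hypothetical source angles from 24 measurements over 36 candidates."""
    rng = random.Random(0)
    n_meas, n_angles = 24, 36  # illustrative sizes, not taken from the paper
    A = [[rng.gauss(0.0, 1.0) / math.sqrt(n_meas) for _ in range(n_angles)]
         for _ in range(n_meas)]
    truth = {7: 1.0, 22: 0.9}  # angle-grid index -> source amplitude
    x_true = [truth.get(j, 0.0) for j in range(n_angles)]
    y = matvec(A, x_true)  # noiseless measurements
    x_hat = ista(A, y, lam=0.05, n_iter=1000)
    ranked = sorted(range(n_angles), key=lambda j: -abs(x_hat[j]))
    return sorted(ranked[:2]), x_hat, A, y


if __name__ == "__main__":
    strongest, _, _, _ = demo()
    print("strongest angle bins:", strongest)
```

In an unrolled variant, a fixed number of these iterations would become network layers with learned step sizes and thresholds; how MetaAng does this, and what its neural-enhanced priors look like, is not stated in the abstract.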

Cited By

  • (2025) MetaSonic: Advancing Robot Localization With Directional Embedded Acoustic Signals. IEEE Robotics and Automation Letters 10:2 (1704-1711). DOI: 10.1109/LRA.2024.3524903. Online publication date: Feb-2025


    Information & Contributors

    Information

    Published In

    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies  Volume 8, Issue 2
    June 2024
    1330 pages
    EISSN:2474-9567
    DOI:10.1145/3665317
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 May 2024
    Published in IMWUT Volume 8, Issue 2


    Author Tags

    1. Acoustic sensing
    2. angle estimation
    3. compressive sensing
    4. metasurface

    Qualifiers

    • Research-article
    • Research
    • Refereed


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 425
    • Downloads (Last 6 weeks): 38
    Reflects downloads up to 13 Feb 2025
