TapNet: The Design, Training, Implementation, and Applications of a Multi-Task Learning CNN for Off-Screen Mobile Input

Authors: Michael Xuelin Huang, Nazneen Nazneen, Alexander Chao, Shumin Zhai

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems

Article No.: 282, Pages 1 - 11

https://doi.org/10.1145/3411764.3445626

Published: 07 May 2021

Abstract

To make off-screen interaction without specialized hardware practical, we investigate using deep learning methods to process the common built-in IMU sensor (accelerometers and gyroscopes) on mobile phones into a useful set of one-handed interaction events. We present the design, training, implementation and applications of TapNet, a multi-task network that detects tapping on the smartphone. With phone form factor as auxiliary information, TapNet can jointly learn from data across devices and simultaneously recognize multiple tap properties, including tap direction and tap location. We developed two datasets consisting of over 135K training samples, 38K testing samples, and 32 participants in total. Experimental evaluation demonstrated the effectiveness of the TapNet design and its significant improvement over the state of the art. Along with the datasets, codebase, and extensive experiments, TapNet establishes a new technical foundation for off-screen mobile input.

Supplementary Material

VTT File (3411764.3445626_videofigurecaptions.vtt), 5.44 KB

Supplementary Materials (3411764.3445626_supplementalmaterials.zip), 2.40 MB

MP4 File (3411764.3445626_videofigure.mp4), supplemental video, 27.87 MB

Cited By

Tsunoda R, Choi M, Shizuki B (2024) Thumb-to-Finger Gesture Recognition Using COTS Smartwatch Accelerometers. Proceedings of the International Conference on Mobile and Ubiquitous Multimedia, 184-195. Online publication date: 1-Dec-2024
https://dl.acm.org/doi/10.1145/3701571.3701600
Khanna P, Ramakrishnan I, Jain S, Bi X, Balasubramanian A (2024) Hand Gesture Recognition for Blind Users by Tracking 3D Gesture Trajectory. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-15. Online publication date: 11-May-2024
https://dl.acm.org/doi/10.1145/3613904.3642602
Khanna P, Feiz S, Xu J, Ramakrishnan I, Jain S, Bi X, Balasubramanian A, Costa X, Al Hassanieh H, Asadi A, Cox L, Perino D, Widmer J, Giustiniano D (2023) AccessWear: Making Smartphone Applications Accessible to Blind Users. Proceedings of the 29th Annual International Conference on Mobile Computing and Networking, 1-16. Online publication date: 2-Oct-2023
https://dl.acm.org/doi/10.1145/3570361.3592495

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Human-centered computing
  1. Human computer interaction (HCI)

Index terms have been assigned to the content through auto-classification.

Published In

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems

May 2021, 10862 pages

ISBN: 9781450380966

DOI: 10.1145/3411764

General Chairs: Yoshifumi Kitamura (Tohoku University, Japan), Aaron Quigley (University of New South Wales, Australia)

Program Chairs: Katherine Isbister (University of California Santa Cruz, USA), Takeo Igarashi (The University of Tokyo, Japan)

Publications Chairs: Pernille Bjørn (University of Copenhagen, Denmark), Steven Drucker (Microsoft Research, USA)

Copyright © 2021 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

CHI '21

Sponsor: SIGCHI

CHI '21: CHI Conference on Human Factors in Computing Systems

May 8 - 13, 2021

Yokohama, Japan

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Bibliometrics

Total Citations: 4

Total Downloads: 1,432

Downloads (Last 12 months): 383

Downloads (Last 6 weeks): 51

Reflects downloads up to 16 Feb 2025

Citations

Cited By

Choi H, Cutkosky M, Stanley A (2023) Integrated Pneumatic Sensing and Actuation for Soft Haptic Devices. IEEE Robotics and Automation Letters 8:11, 7591-7598. Online publication date: Nov-2023
https://doi.org/10.1109/LRA.2023.3320494
