
Evaluating Intelligibility and Battery Drain of Mobile Sign Language Video Transmitted at Low Frame Rates and Bit Rates

Published: 14 November 2015

Abstract

Mobile sign language video conversations can become unintelligible if high video transmission rates cause network congestion and delayed video. In an effort to understand the perceived lower limits of intelligible sign language video intended for mobile communication, we evaluated sign language video transmitted at four low frame rates (1, 5, 10, and 15 frames per second [fps]) and four low fixed bit rates (15, 30, 60, and 120 kilobits per second [kbps]) at a constant spatial resolution of 320 × 240 pixels. We discovered an “intelligibility ceiling effect,” in which increasing the frame rate above 10fps did not improve perceived intelligibility, and increasing the bit rate above 60kbps produced diminishing returns. Given the study parameters, our findings suggest that relaxing the recommended frame rate and bit rate to 10fps at 60kbps will provide intelligible video conversations while reducing total bandwidth consumption to 25% of the ITU-T standard (at least 25fps and 100kbps). As part of this work, we developed the Human Signal Intelligibility Model, a new conceptual model useful for informing evaluations of video intelligibility and our methodology for creating linguistically accessible web surveys for deaf people. We also conducted a battery-savings experiment quantifying battery drain when sign language video is transmitted at the lower frame rates and bit rates. Results confirmed that increasing the transmission rates monotonically decreased the battery life.

References

[1]
N. Ahmed, T. Natarajan, and K. R. Rao. 1974. Discrete cosine transform. IEEE Transactions on Computers C-23, 1, 90--93.
[2]
L. Aimar, L. Merritt, E. Petit, et al. 2005. x264 - a free H.264/AVC encoder. Retrieved 04/01/07 from http://www.videolan.org/developers/x264.html.
[3]
Apple. 2013. Apple - QuickTime - Download. Retrieved September 30, 2015 from http://www.apple.com/quicktime/download/.
[4]
ARM. 2008. The architecture for the digital world. Retrieved September 30, 2015 from http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0419c/index.html.
[5]
B. Arons. 1997. SpeechSkimmer: A system for interactively skimming recorded speech. ACM Transactions on Computer-Human Interaction 4, 1, 3--38.
[6]
F. Asim. 2013. AndroSensor. Retrieved September 30, 2015 from http://www.fivasim.com/androsensor.html.
[7]
Asterisk. 2014. Asterisk. Retrieved September 30, 2015 from http://www.asterisk.org/.
[8]
AT&T. 2014. AT&T. Retrieved September 30, 2015 from http://www.att.com/shop/wireless/data-plans.html#fbid=027qt05YFJ6.
[9]
S. Bae, T. N. Pappas, and B. Juang. 2009. Spatial resolution and quantization noise tradeoffs for scalable image compression. ICASSP, IEEE, II--945--II--948.
[10]
D. Barnlund. 1970. A Transactional Model of Communication. Harper & Row. New York, NY.
[11]
D. K. Berlo. 1960. The Process of Communication. Holt, Rinehart, & Winston, New York, NY.
[12]
A. Cavender, R. Ladner, and E. Riskin. 2006. MobileASL: Intelligibility of sign language video as constrained by mobile phone technology. Proceedings of ASSETS, 71--78.
[13]
B. Chen. 2013. AT&T allows FaceTime for limited data users. What about unlimited? The New York Times. Retrieved September 30, 2015 from http://bits.blogs.nytimes.com/2013/01/16/facetime-limited-data-att/?_php=true&_type=blogs&_r=0.
[14]
J. Y. C. Chen and J. E. Thropp. 2007. Review of low frame rate effects on human performance. IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans 37, 6, 1063--1076.
[15]
N. Cherniavsky, J. Chon, J. O. Wobbrock, R. Ladner, and E. Riskin. 2009. Activity analysis enabling real-time video communication on mobile phones for deaf users. UIST.
[16]
J. Chon. 2011. Real-time sign language video communication over cell phones. Ph.D. thesis. University of Washington. 1--105.
[17]
J. Chon, N. Cherniavsky, E. Riskin, and R. Ladner. 2009. Enabling access through real-time sign language communication over cell phones. Asilomar Conference on Signals, Systems, and Computers, 588--592.
[18]
F. Ciaramello and S. Hemami. 2011. A computational intelligibility model for assessment and compression of American Sign Language video. IEEE Transactions on Image Processing 20, 11.
[19]
L. De Cicco, S. Mascolo, and V. Palmisano. 2008. Skype video responsiveness to bandwidth variations. Proceedings of NOSSDAV.
[20]
H. Clark. 1985. Language use and language users. In: Handbook of Social Psychology. Harper & Row, New York, NY, 179--231.
[21]
Convo. 2011. Convo. Retrieved September 30, 2015 from https://www.convorelay.com/.
[22]
C. Cumming and M. Rodda. 1989. Advocacy, prejudice, and role modeling in the Deaf community. Social Psychology 1, 129, 5--12.
[23]
Doubango Telecom. 2009. IMSDroid-High Quality Video SIP/IMS client for Google Android. Retrieved September 30, 2015 from http://code.google.com/p/imsdroid/.
[24]
R. Feghali, F. Speranza, D. Wang, and A. Vincent. 2007. Video quality metric for bit rate control via joint adjustment of quantization and frame rate. IEEE Transactions on Broadcasting 53, 1, 441--446.
[25]
D. Fitzgerald. 2013. How much smartphone data do you really need? The Wall Street Journal. Retrieved September 30, 2015 from http://blogs.wsj.com/digits/2013/08/01/how-much-smartphone-data-do-you-really-need/.
[26]
K. Harrigan. 1995. The SPECIAL system: Self-paced education with compressed interactive audio learning. Journal of Research on Computing in Education 3, 27, 361--370.
[27]
G. W. Heiman and R. D. Tweney. 1981. Intelligibility and comprehension of time compressed sign language narratives. Journal of Psycholinguistic Research 10, 1, 3--15.
[28]
J. J. Higgins and S. Tashtoush. 1994. An aligned rank transform test for interaction. Nonlinear World 1, 2, 201--211.
[29]
J. Hollington. 2013. Costs associated with using FaceTime. iLounge. Retrieved September 30, 2015 from http://www.ilounge.com/index.php/articles/comments/costs-associated-with-using-facetime/.
[30]
S. Holm. 1979. A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6, 2, 65--70.
[31]
S. Hooper, C. Miller, S. Rose, and G. Veletsianos. 2007. The effects of digital video quality on learner comprehension in an American sign language assessment environment. Sign Language Studies 8, 1, 42--58.
[32]
B. F. Johnson and J. K. Caird. 1996. The effect of frame rate and video information redundancy on the perceptual learning of American sign language gestures. In Proceedings of the CHI’96 Conference Companion on Human Factors in Computing Systems, ACM, New York, NY. 121--122.
[33]
R. Koul. 2003. Synthetic speech perception in individuals with and without disabilities. 19, 1, 49--58.
[34]
Kurtnoise. 2009. Yet another MP4 box user interface for Windows users. Retrieved September 30, 2015 from http://yamb.unite-video.com/index.html.
[35]
H. Lane. 1992. The Mask of Benevolence: Disabling the Deaf Community. Alfred A. Knopf, Inc., New York, NY.
[36]
S. Lawson. 2011. Mobile growth driving out unlimited data. Retrieved September 30, 2015 from http://www.pcworld.com/businesscenter/article/242376/mobile_growth_driving_out_unlimited_data.html.
[37]
C. Lucas and C. Valli. 2000. Linguistics of American Sign Language: An Introduction. Gallaudet University Press, Washington, DC.
[38]
J. Maher. 1996. Seeing Language in Sign: The Work of William C. Stokoe. Gallaudet University Press, Washington, DC.
[39]
G. Marshall. 2014. How much 4G data do you really need? Retrieved September 30, 2015 from http://www.techradar.com/us/news/phone-and-communications/mobile-phones/how-much-4g-data-do-you-really-need--1176594.
[40]
M. Masry and S. S. Hemami. 2001. An analysis of subjective quality in low bit rate video. International Conference on Image Processing, IEEE, 465--468.
[41]
M. Masry and S. Hemami. 2003. CVQE: A metric for continuous video quality evaluation at low bit rates. SPIE Human Vision and Electronic Imaging.
[42]
J. McCarthy, M. A. Sasse, and D. Miras. 2004. Sharp or smooth? Comparing the effects of quantization vs. frame rate for streamed video. Proceedings of the CHI.
[43]
Merriam-Webster. 2003. The Merriam-Webster Dictionary. http://www.merriam-webster.com (8 May 2003).
[44]
Microsoft. 2013. How much data will Skype use on my mobile phone? http://community.skype.com/t5/Other-features/How-much-data-does-skype-use/td-p/897886.
[45]
I. Munoz-Baell and T. Ruiz. 2000. Empowering the deaf. Epidemiology and Community Health 1, 54, 40--44.
[46]
Cisco. 2015. Cisco visual networking index: Global mobile data traffic forecast update, 2014--2019. http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white_paper_c11-520862.pdf.
[47]
A. Nemethova, M. Ries, M. Zavodsky, and M. Rupp. 2006. PSNR-based estimation of subjective time-variant video quality for mobiles. Proceedings of MESAQIN 2006, Prague, Czech Republic, June 2006.
[48]
N. Omoigui, L. He, A. Gupta, J. Grudin, and E. Sanocki. 1999. Time-compression. Proceedings of CHI, ACM Press, New York, NY, 136--143.
[49]
A. Oppenheim and R. Schafer. 1975. Discrete-Time Signal Processing. Pearson.
[50]
C. Padden and T. Humphries. 2005. Inside Deaf Culture. Harvard University Press, Boston, MA.
[51]
J. Postel. 1980. User Datagram Protocol--RFC 768. https://tools.ietf.org/html/rfc768.
[52]
Purple. 2014. Purple VRS on Your Devices. Retrieved September 30, 2015 from http://www.purple.us/.
[53]
T. Reagan. 1995. A social culture understanding of deafness: American Sign Language and the culture of deaf people. Intercultural Relations 19, 2, 239--251.
[54]
I. Richardson. 2004. Vcodex: H.264 tutorial white papers. http://www.vcodex.com/h264.html.
[55]
E. Riskin, R. Ladner, and J. Wobbrock. 2012. MobileASL. University of Washington. Retrieved September 30, 2015 from http://mobileasl.cs.washington.edu/.
[56]
J. Rosenberg, H. Schulzrinne, G. Camarillo, et al. 2002. SIP: Session Initiation Protocol. RFC 3261. https://tools.ietf.org/html/rfc3261.
[57]
A. Saks and G. Hellström. 2006. Quality of conversation experience in sign language, lip reading and text. ITU-T Workshop on End-to-end QoE/QoS.
[58]
C. E. Shannon. 1948. A mathematical theory of communication. The Bell System Technical Journal 27, 379--423, 623--656.
[59]
Skype. 2011. Skype. Retrieved September 30, 2015 from http://www.skype.com/intl/en-us/home.
[60]
Sorenson. 2014. Sorenson Communications. Retrieved September 30, 2015 from http://www.sorenson.com/.
[61]
G. Sperling, M. Landy, Y. Cohen, and M. Pavel. 1985. Intelligible encoding of ASL image sequences at extremely low information rates. Computer Vision, Graphics, and Image Processing 31, 335--391.
[62]
Statistic Brain Research Institute. 2012. Skype statistics. Retrieved September 30, 2015 from http://www.statisticbrain.com/skype-statistics.
[63]
Q. Huynh-Thu and M. Ghanbari. 2008. Scope of validity of PSNR in image/video quality assessment. Electronics Letters 44, 13, 800--801.
[64]
T-Mobile. 2014. T-Mobile. Retrieved September 30, 2015 from http://www.t-mobile.com/cell-phone-plans/individual.html#lshop_plans_1.
[65]
J. J. Tran, B. Flowers, E. Riskin, R. Ladner, and J. O. Wobbrock. 2014. Analyzing the intelligibility of real-time mobile sign language video transmitted below recommended standards. Proceedings of ASSETS, 177--184.
[66]
J. J. Tran, J. Kim, J. Chon, E. Riskin, R. Ladner, and J. O. Wobbrock. 2011. Evaluating quality and comprehension of real-time sign language video on mobile phones. Proceedings of ASSETS, 115--122.
[67]
J. J. Tran, E. Riskin, R. Ladner, and J. O. Wobbrock. 2013. Increasing mobile sign language video accessibility by relaxing video transmission standards. Third Mobile Accessibility Workshop at Proceedings of CHI.
[68]
Verizon. 2014. Verizon Wireless. Retrieved September 30, 2015 from http://www.verizonwireless.com/b2c/index.html.
[69]
Y. Wang and Y. Ou. 2012. Modeling rate and perceptual quality of scalable video as functions of quantization and frame rate and its application in scalable video adaptation. IEEE Transactions on Circuits and Systems for Video Technology, 671--682.
[70]
Z. Wang, A. Bovik, and L. Lu. 2002. Why is image quality assessment so difficult? ITASS, 3313--3316.
[71]
E. Weber. 1834. De pulsu, resorptione, auditu et tactu. Annotationes anatomicae et physiologicae.
[72]
T. Wiegand, H. Schwarz, A. Joch, F. Kossentini, and G. Sullivan. 2003. Rate-constrained coder control and comparison of video coding standards. IEEE Transactions on Circuits and Systems for Video Technology 13, 7, 688--703.
[73]
S. Winkler and P. Mohandas. 2008. The evolution of video quality measurement: From PSNR to hybrid metrics. IEEE Transactions on Broadcasting 54, 3, 660--668.
[74]
J. O. Wobbrock, L. Findlater, D. Gergle, and J. J. Higgins. 2011. The Aligned Rank Transform for nonparametric factorial analyses using only ANOVA procedures. Proceedings of CHI, 143--146.
[75]
G. Yadavalli, S. Hemami, and M. Masry. 2003. Frame rate preferences in low bit rate video. Proceedings of the International Conference on Image Processing, IEEE, 441--444.
[76]
E. Zeman. 2010. iPhone 4 jailbreak unlocks 3G FaceTime calls. Information Week. Retrieved September 30, 2015 from http://www.informationweek.com/mobile/mobile-devices/iphone-4-jailbreak-unlocks-3g-facetime-calls/d/d-id/1091309?
[77]
ZVRS. 2014. ZVRS Communication Service for the Deaf, Inc. http://www.zvrs.com/products/softwareapps.




    Published In

    ACM Transactions on Accessible Computing  Volume 7, Issue 3
    Special Issue (Part 2) of Papers from ASSETS 2013
    November 2015
    79 pages
    ISSN:1936-7228
    EISSN:1936-7236
    DOI:10.1145/2836329
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 November 2015
    Accepted: 01 June 2015
    Revised: 01 June 2015
    Received: 01 June 2014
    Published in TACCESS Volume 7, Issue 3


    Author Tags

    1. American Sign Language
    2. Deaf community
3. intelligibility
    4. battery power
    5. bit rate
    6. communication model
    7. comprehension
    8. frame rate
    9. mobile phone
    10. smartphone
    11. video compression
    12. web survey

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Google
    • National Science Foundation


    Article Metrics

    • Downloads (Last 12 months)12
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 14 Nov 2024

    Cited By

    • (2024) The Acceptance of Culturally Adapted Signing Avatars Among Deaf and Hard-of-Hearing Individuals. IEEE Access 12, 78624--78640. DOI: 10.1109/ACCESS.2024.3407128. Online publication date: 2024.
    • (2023) User Perceptions and Preferences for Online Surveys in American Sign Language: An Exploratory Study. Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, 1--17. DOI: 10.1145/3597638.3608444. Online publication date: 22-Oct-2023.
    • (2022) American Sign Language Words Recognition Using Spatio-Temporal Prosodic and Angle Features: A Sequential Learning Approach. IEEE Access 10, 15911--15923. DOI: 10.1109/ACCESS.2022.3148132. Online publication date: 2022.
    • (2020) Creating questionnaires that align with ASL linguistic principles and cultural practices within the Deaf community. Proceedings of the 22nd International ACM SIGACCESS Conference on Computers and Accessibility, 1--4. DOI: 10.1145/3373625.3418071. Online publication date: 26-Oct-2020.
    • (2019) Toward a Sign Language-Friendly Questionnaire Design. The Journal of Deaf Studies and Deaf Education 24, 4, 333--345. DOI: 10.1093/deafed/enz021. Online publication date: 4-Jul-2019.
    • (2016) Deaf and Hard of Hearing Individuals' Perceptions of Communication with Hearing Colleagues in Small Groups. Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility, 271--272. DOI: 10.1145/2982142.2982198. Online publication date: 23-Oct-2016.
