
DOI: 10.1145/3377325.3377526

Speech modulated typography: towards an affective representation model

Published: 17 March 2020

Abstract

The transcription of expressive speech into text is a lossy process, since traditional textual resources typically cannot fully represent prosody. Speech modulated typography aims to narrow the gap between expressive speech and its textual transcription, with potential applications in affect-sensitive text interfaces, closed captioning, and automated voice transcription. This paper proposes and evaluates two representation models that map prosody-related acoustic features of expressive speech onto the axes of a variable font. Our experiment tested participants' preferences for four of these modulations: font weight, letter width, letter slant, and baseline shift. Each modulation represented utterances expressed in one of five emotions (anger, happiness, neutrality, sadness, and surprise). Participants preferred font weight for sentences spoken with intensity and baseline shift for quieter utterances. In both cases, the distance between each syllable's fundamental frequency and the centroid frequency was a good predictor of these preferences.
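The core idea the abstract describes, acoustic features driving the axes of a variable font, can be sketched as a simple linear scaling from a feature's range onto an axis's design range. The feature names and axis ranges below are illustrative assumptions for the sketch, not the paper's exact model:

```python
def feature_to_axis(value, feat_min, feat_max, axis_min, axis_max):
    """Linearly map an acoustic feature reading (e.g. syllable intensity)
    onto a variable-font axis range, clamping out-of-range inputs."""
    if feat_max == feat_min:
        return float(axis_min)
    t = (value - feat_min) / (feat_max - feat_min)
    t = max(0.0, min(1.0, t))  # clamp to the axis's design range
    return axis_min + t * (axis_max - axis_min)

# Illustrative use: louder syllables get a heavier 'wght' axis value
# (the 100-900 range is the common OpenType weight convention).
weight = feature_to_axis(0.8, 0.0, 1.0, 100, 900)  # -> 740.0
```

In a web rendering context, a value like this could then be applied per syllable via CSS `font-variation-settings`; the experiment itself compared weight against letter width, slant, and baseline shift as competing target axes.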






Published In

IUI '20: Proceedings of the 25th International Conference on Intelligent User Interfaces
March 2020
607 pages
ISBN:9781450371186
DOI:10.1145/3377325


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. affective computing
  2. digital typography
  3. emotion visualization
  4. variable fonts
  5. visual prosody

Qualifiers

  • Short-paper

Conference

IUI '20

Acceptance Rates

Overall Acceptance Rate 746 of 2,811 submissions, 27%



Article Metrics

  • Downloads (last 12 months): 55
  • Downloads (last 6 weeks): 8
Reflects downloads up to 21 Nov 2024


Cited By

  • (2024) Caption Royale: Exploring the Design Space of Affective Captions from the Perspective of Deaf and Hard-of-Hearing Individuals. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-17. DOI: 10.1145/3613904.3642258. Online publication date: 11-May-2024.
  • (2024) Hearing with the eyes: modulating lyrics typography for music visualization. The Visual Computer 40, 11, 8345-8361. DOI: 10.1007/s00371-023-03239-5. Online publication date: 19-Jan-2024.
  • (2023) Exploring the Design Space of Automatically Generated Emotive Captions for Deaf or Hard of Hearing Users. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1-10. DOI: 10.1145/3544549.3585880. Online publication date: 19-Apr-2023.
  • (2023) Affective Typography: The Effect of AI-Driven Font Design on Empathetic Story Reading. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1-7. DOI: 10.1145/3544549.3585625. Online publication date: 19-Apr-2023.
  • (2023) Visualization of Speech Prosody and Emotion in Captions: Accessibility for Deaf and Hard-of-Hearing Users. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-15. DOI: 10.1145/3544548.3581511. Online publication date: 19-Apr-2023.
  • (2023) Hidden Bawls, Whispers, and Yelps: Can Text Convey the Sound of Speech, Beyond Words? IEEE Transactions on Affective Computing 14, 1, 6-16. DOI: 10.1109/TAFFC.2022.3174721. Online publication date: 1-Jan-2023.
  • (2023) AI-Based Visualization of Voice Characteristics in Lecture Videos' Captions. Artificial Intelligence in Education Technologies: New Development and Innovative Practices, 111-124. DOI: 10.1007/978-981-19-8040-4_8. Online publication date: 1-Jan-2023.
  • (2020) Visualizing Voice Characteristics with Type Design in Closed Captions for Arabic. 2020 International Conference on Cyberworlds (CW), 196-203. DOI: 10.1109/CW49994.2020.00039. Online publication date: Sep-2020.
