DOI: 10.1145/1111360.1111396
Article

Software usando reconhecimento e síntese de voz: o estado da arte para o Português brasileiro (Software using speech recognition and synthesis: the state of the art for Brazilian Portuguese)

Published: 23 October 2005

Abstract

Speech is a natural interface for human-computer interaction. Internationally, speech (or voice) technology is a well-developed field, with a wide variety of academic and industrial software. Most of this software assumes that a recognizer or synthesizer is available and can be programmed through an API. In contrast, few public-domain resources exist for Brazilian Portuguese. This work discusses some of these issues and compares SAPI and JSAPI, the APIs promoted by Microsoft and Sun, respectively. We also present two examples: a JSAPI-based tic-tac-toe game that uses Portuguese digit recognition, and a computer-aided language learning (CALL) application that uses SAPI-based speech recognition in English and synthesis in Portuguese.
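To make the tic-tac-toe example more concrete, the following is a minimal sketch of how a recognized Portuguese digit word could be turned into a board move. It is not the authors' implementation and makes no JSAPI calls: the class name `DigitMoves`, the `applyMove` helper, the vocabulary table, and the cell numbering 1 to 9 are all assumptions introduced for illustration. A real application would obtain the recognized word from a recognizer's result callback.

```java
import java.util.Locale;
import java.util.Map;

public class DigitMoves {
    // Hypothetical vocabulary: Portuguese digit words mapped to board cells 1-9.
    // "tr\u00eas" is "três", written with a Unicode escape to avoid encoding issues.
    static final Map<String, Integer> DIGITS = Map.of(
        "um", 1, "dois", 2, "tr\u00eas", 3,
        "quatro", 4, "cinco", 5, "seis", 6,
        "sete", 7, "oito", 8, "nove", 9);

    // Places the player's mark in the cell named by the recognized word.
    // Returns false if the word is not in the vocabulary or the cell is taken.
    static boolean applyMove(char[] board, String recognizedWord, char player) {
        Integer cell = DIGITS.get(recognizedWord.trim().toLowerCase(Locale.ROOT));
        if (cell == null || board[cell - 1] != ' ') {
            return false;
        }
        board[cell - 1] = player;
        return true;
    }

    public static void main(String[] args) {
        char[] board = "         ".toCharArray(); // nine empty cells
        // In a real system these strings would come from the speech recognizer.
        System.out.println(applyMove(board, "cinco", 'X')); // true: center cell was free
        System.out.println(applyMove(board, "cinco", 'O')); // false: cell already taken
        System.out.println(applyMove(board, "dez", 'O'));   // false: word not in vocabulary
    }
}
```

Restricting the vocabulary to nine digit words keeps the recognition task small, which is one plausible reason a digit-based game suits a setting with few language resources.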


Cited By

  • (2017) Use of Automatic Speech Recognition Systems for Multimedia Applications. Proceedings of the 23rd Brazilian Symposium on Multimedia and the Web, pp. 33-36. DOI: 10.1145/3126858.3131630. Online publication date: 17-Oct-2017
  • (2010) A baseline system for continuous speech recognition of Brazilian Portuguese using the West Point Brazilian Portuguese speech corpus. Proceedings of the 9th International Conference on Computational Processing of the Portuguese Language, pp. 132-141. DOI: 10.1007/978-3-642-12320-7_18. Online publication date: 27-Apr-2010
  • (2008) Desenvolvimento e avaliação de um sistema multimodal e multiusuário de navegação web. Companion Proceedings of the XIV Brazilian Symposium on Multimedia and the Web, pp. 29-32. DOI: 10.1145/1809980.1809989. Online publication date: 26-Oct-2008


Published In

ACM Other conferences
CLIHC '05: Proceedings of the 2005 Latin American conference on Human-computer interaction
October 2005
361 pages
ISBN:1595932240
DOI:10.1145/1111360
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Sponsors

  • Tecnologia Virtual
  • SIG-CHI Mexico
  • SIG-CHI Brazil
  • Create-Net
  • Microsoft Research
  • SMCC
  • ITESM Cuernavaca
  • Pullman de Morelos

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 14 of 42 submissions, 33%


