SpeechSkimmer: interactively skimming recorded speech

DOI: 10.1145/168642.168661
Published: 01 December 1993

References

[1]
Aaronson, D., Markowitz, N., and Shapiro, H. Perception and Immediate Recall of Normal and Compressed Auditory Sequences. Perception and Psychophysics 9, 4 (1971), 338-344.
[2]
Arons, B. Hyperspeech: Navigating in Speech-Only Hypermedia. In Hypertext '91, ACM, 1991, pp. 133-146.
[3]
Arons, B. Techniques, Perception, and Applications of Time-Compressed Speech. In Proceedings of 1992 Conference, American Voice I/O Society, Sep. 1992, pp. 169-177.
[4]
Arons, B. Tools for Building Asynchronous Servers to Support Speech and Audio Applications. In UIST '92. Proceedings of the ACM Symposium on User Interface Software and Technology, Nov. 1992, pp. 71-78.
[5]
Beasley, D.S. and Maki, J.E. Time- and Frequency-Altered Speech. In Contemporary Issues in Experimental Phonetics. Academic Press, Lass, N.J., editor, Ch. 12, pp. 419-458, 1976.
[6]
Buxton, W., Gaver, B., and Bly, S. The Use of Non-Speech Audio at the Interface, ACM SIGCHI, 1991, Tutorial Notes.
[7]
Chen, F.R. and Withgott, M. The Use of Emphasis to Automatically Summarize Spoken Discourse. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, IEEE, 1992, pp. 229-233.
[8]
De Souza, P. A Statistical Approach to the Design of an Adaptive Self-Normalizing Silence Detector. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-31, 3 (Jun. 1983), 678-684.
[9]
Degen, L., Mander, R., and Salomon, G. Working with Audio: Integrating Personal Tape Recorders and Desktop Computers. In CHI '92, ACM, Apr. 1992, pp. 413-418.
[10]
Fairbanks, G., Everitt, W.L., and Jaeger, R.P. Method for Time or Frequency Compression-Expansion of Speech. Transactions of the Institute of Radio Engineers, Professional Group on Audio AU-2 (1954), 7-12. Reprinted in G. Fairbanks, Experimental Phonetics: Selected Articles, University of Illinois Press, 1966.
[11]
Foulke, E. The Perception of Time Compressed Speech. In Perception of Language. Charles E. Merrill Publishing Company, Kjeldergaard, P.M., Horton, D.L., and Jenkins, J.J., editors, Ch. 4, pp. 79-107, 1971.
[12]
Furnas, G.W. Generalized Fisheye Views. In CHI '86, ACM, 1986, pp. 16-23.
[13]
Gaver, W.W. Auditory Icons: Using Sound in Computer Interfaces. Human-Computer Interaction 2 (1989), 167-177.
[14]
Gerber, S.E. and Wulfeck, B.H. The Limiting Effect of Discard Interval on Time-Compressed Speech. Language and Speech 20, 2 (1977), 108-115.
[15]
Glavitsch, U. and Schäuble, P. A System for Retrieving Speech Documents. In 15th Annual International SIGIR '92, ACM, 1992, pp. 168-176.
[16]
Gruber, J.G. A Comparison of Measured and Calculated Speech Temporal Parameters Relevant to Speech Activity Detection. IEEE Transactions on Communications COM-30, 4 (Apr. 1982), 728-738.
[17]
Gruber, J.G. and Le, N.H. Performance Requirements for Integrated Voice/Data Networks. IEEE Journal on Selected Areas in Communications SAC-1, 6 (Dec. 1983), 981-1005.
[18]
Grudin, J. Why CSCW Applications Fail: Problems in the Design and Evaluation of Organizational Interfaces. In CHI '88, 1988.
[19]
Heiman, G.W., Leo, R.J., Leighbody, G., and Bowler, K. Word Intelligibility Decrements and the Comprehension of Time-Compressed Speech. Perception and Psychophysics 40, 6 (1986), 407-411.
[20]
Hejna Jr., D.J. Real-Time Time-Scale Modification of Speech via the Synchronized Overlap-Add Algorithm, Master's thesis, Department of Electrical Engineering and Computer Science, MIT, Feb. 1990.
[21]
Houle, G.R., Maksymowicz, A.T., and Penafiel, H.M. Back-End Processing for Automatic Gisting Systems. In Proceedings of 1988 Conference, American Voice I/O Society, 1988.
[22]
Jeffries, R., Miller, J.R., Wharton, C., and Uyeda, K.M. User Interface Evaluation in the Real World: A Comparison of Four Techniques. In CHI '91, ACM, Apr. 1991, pp. 119-124.
[23]
Lamel, L.F., Rabiner, L.R., Rosenberg, A.E., and Wilpon, J.G. An Improved Endpoint Detector for Isolated Word Recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-29, 4 (Aug. 1981), 777-785.
[24]
Lass, N.J. and Leeper, H.A. Listening Rate Preference: Comparison of Two Time Alteration Techniques. Perceptual and Motor Skills 44 (1977), 1163-1168.
[25]
Lee, H.H. and Un, C.K. A Study of on-off Characteristics of Conversational Speech. IEEE Transactions on Communications COM-34, 6 (Jun. 1986), 630-637.
[26]
Levelt, W.J.M. Speaking: From Intention to Articulation, MIT Press (1989).
[27]
Lynch Jr., J.F., Josenhans, J.G., and Crochiere, R.E. Speech/Silence Segmentation for Real-Time Coding via Rule Based Adaptive Endpoint Detection. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, IEEE, 1987, pp. 1348-1351.
[28]
Mackinlay, J.D., Robertson, G.G., and Card, S.K. The Perspective Wall: Detail and Context Smoothly Integrated. In CHI '91, ACM, 1991, pp. 173-179.
[29]
UnMouse User's Manual, Microtouch Systems Inc., Wilmington, MA.
[30]
Mills, M., Cohen, J., and Wong, Y.Y. A Magnifier Tool for Video Data. In CHI '92, ACM, Apr. 1992, pp. 93-98.
[31]
Minifie, F.D. Durational Aspects of Connected Speech Samples. In Time-Compressed Speech. Scarecrow, Duker, S., editor, pp. 709-715, 1974.
[32]
Neuburg, E.P. Simple Pitch-Dependent Algorithm for High Quality Speech Rate Changing. Journal of the Acoustical Society of America 63, 2 (1978), 624-625.
[33]
O'Shaughnessy, D. Speech Communication: Human and Machine, Addison-Wesley (1987).
[34]
O'Shaughnessy, D. Recognition of Hesitations in Spontaneous Speech. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, IEEE, 1992, pp. 1521-1524.
[35]
Rabiner, L.R. and Sambur, M.R. An Algorithm for Determining the Endpoints of Isolated Utterances. The Bell System Technical Journal 54, 2 (Feb. 1975), 297-315.
[36]
Reich, S.S. Significance of Pauses for Speech Perception. Journal of Psycholinguistic Research 9, 4 (1980), 379-389.
[37]
Resnick, P. and Virzi, R.A. Skip and Scan: Cleaning Up Telephone Interfaces. In CHI '92, ACM, Apr. 1992, pp. 419-426.
[38]
Rose, R.C. Techniques for Information Retrieval from Speech Messages. The Lincoln Lab Journal 4, 1 (1991), 45-60.
[39]
Roucos, S. and Wilgus, A.M. High Quality Time-Scale Modification for Speech. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, IEEE, 1985, pp. 493-496.
[40]
Savoji, M.H. A Robust Algorithm for Accurate Endpointing of Speech Signals. Speech Communication 8 (1989), 45-60.
[41]
Schmandt, C. and Arons, B. A Conversational Telephone Messaging System. IEEE Transactions on Consumer Electronics CE-30, 3 (Aug. 1984), xxi-xxiv.
[42]
Scott, R.J. Time Adjustment in Speech Synthesis. Journal of the Acoustical Society of America 41, 1 (1967), 60-65.
[43]
Stifelman, L.J., Arons, B., Schmandt, C., and Hulteen, E.A. VoiceNotes: A Speech Interface for a Hand-Held Voice Notetaker. In Proceedings of INTERCHI Conference, ACM SIGCHI, 1993.
[44]
Wightman, C.W. and Ostendorf, M. Automatic Recognition of Intonational Features. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, IEEE, 1992, pp. 1221-1224.
[45]
Wilcox, L., Smith, I., and Bush, M. Wordspotting for Voice Editing and Audio Indexing. In CHI '92, ACM SIGCHI, 1992, pp. 655-656.

Information

Published In

UIST '93: Proceedings of the 6th annual ACM symposium on User interface software and technology
December 1993
267 pages
ISBN:089791628X
DOI:10.1145/168642
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. browsing
  2. interactive listening
  3. non-speech audio
  4. speech as data
  5. speech detection
  6. speech skimming
  7. speech user interfaces
  8. time compression

Qualifiers

  • Article

Conference

PRS93

Acceptance Rates

Overall Acceptance Rate 561 of 2,567 submissions, 22%

Cited By

  • (2021) Accessing Media Via an Audio-only Communication Channel: A Log Analysis. Proceedings of the 3rd Conference on Conversational User Interfaces (1-6). DOI: 10.1145/3469595.3469623. Online publication date: 27-Jul-2021
  • (2020) Designing an Eyes-Reduced Document Skimming App for Situational Impairments. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (1-14). DOI: 10.1145/3313831.3376641. Online publication date: 21-Apr-2020
  • (2017) TypeTalker. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (1970-1981). DOI: 10.1145/2998181.2998260. Online publication date: 25-Feb-2017
  • (2017) The effect of pause location on perceived fluency. Applied Psycholinguistics 39:3 (569-591). DOI: 10.1017/S0142716417000534. Online publication date: 23-Nov-2017
  • (2016) Simplified Audio Production in Asynchronous Voice-Based Discussions. Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (1045-1054). DOI: 10.1145/2858036.2858416. Online publication date: 7-May-2016
  • (2014) Bibliography. Semantic Multimedia Analysis and Processing (421-512). DOI: 10.1201/b17080-21. Online publication date: 18-Jun-2014
  • (2014) Input/Output Devices and Interaction Techniques. Computing Handbook, Third Edition (1-54). DOI: 10.1201/b16812-25. Online publication date: 8-May-2014
  • (2013) Treemaps to visualise and navigate speech audio. Proceedings of the 25th Australian Computer-Human Interaction Conference: Augmentation, Application, Innovation, Collaboration (555-564). DOI: 10.1145/2541016.2541021. Online publication date: 25-Nov-2013
  • (2012) Input Technologies and Techniques. Human–Computer Interaction Handbook (95-132). DOI: 10.1201/b11963-9. Online publication date: 14-May-2012
  • (2012) Unlocking the expressivity of point lights. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (1683-1692). DOI: 10.1145/2207676.2208296. Online publication date: 5-May-2012
