Article

An automatic singing voice rectifier design

Authors:

Cheng-Yuan Lin,

J.-S. Roger Jang,

Mao-Yuan Hsu

MULTIMEDIA '03: Proceedings of the eleventh ACM international conference on Multimedia

Pages 267 - 270

https://doi.org/10.1145/957013.957068

Published: 02 November 2003


Abstract

This paper proposes a new approach to automatic singing voice rectification. The rectifier has two components: a recognizer based on dynamic time warping, and a synthesizer based on PSOLA (Pitch Synchronous Overlap and Add) for pitch shifting. The recognizer identifies the locations of the off-key parts of the user's acoustic input. Given the target music score, the synthesizer then corrects the off-key parts by shifting their pitch to match the score. We also conducted singing and listening experiments to evaluate the feasibility of the rectifier, and the results show satisfactory performance.
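The recognizer stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features the dynamic time warping operates on, so the sketch assumes a per-frame pitch track in Hz and a score given as MIDI note numbers. The function names (`hz_to_semitone`, `dtw_path`, `off_key_shifts`, `shift_ratio`) and the half-semitone tolerance are hypothetical choices made for illustration. The PSOLA synthesis step itself is not shown; the sketch only computes the pitch-shift ratio that such a stage would consume.

```python
import math


def hz_to_semitone(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def dtw_path(sung, target):
    """Align two semitone sequences with classic DTW.

    Returns a list of (sung_index, target_index) pairs, using absolute
    pitch difference as the local cost (an assumption of this sketch).
    """
    n, m = len(sung), len(target)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(sung[i - 1] - target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return path


def off_key_shifts(sung_hz, score_midi, tolerance=0.5):
    """Return the semitone correction for each sung frame (0.0 if in tune).

    A frame counts as off-key when it deviates from its aligned score note
    by more than `tolerance` semitones (hypothetical threshold). If a frame
    aligns to several score notes, the last one on the path is used.
    """
    sung = [hz_to_semitone(f) for f in sung_hz]
    shifts = [0.0] * len(sung)
    for i, j in dtw_path(sung, score_midi):
        diff = score_midi[j] - sung[i]
        shifts[i] = diff if abs(diff) > tolerance else 0.0
    return shifts


def shift_ratio(semitones):
    """Frequency ratio for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    # Five pitch frames; the third is sung about one semitone flat.
    sung_hz = [440.0, 440.0, 466.16, 523.25, 523.25]
    score_midi = [69, 71, 72]
    for frame, s in enumerate(off_key_shifts(sung_hz, score_midi)):
        print(frame, round(s, 2), round(shift_ratio(s), 4))
```

Frames whose shift is nonzero are the ones a pitch-shifting stage would modify, by the ratio returned from `shift_ratio`. Using pitch distance as the DTW cost keeps the sketch small, but it can misalign passages that are sung far off-key; a real recognizer might align on other acoustic features.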


Cited By

Lee H., Huang C., Hsu C., Wang W. (2009). Rhythm Speech Lyrics Input for MIDI-Based Singing Voice Synthesis. Proceedings of the 10th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing, pp. 459-468. DOI: 10.1007/978-3-642-10467-1_40. Online publication date: 15-Dec-2009
https://dl.acm.org/doi/10.1007/978-3-642-10467-1_40

Index Terms

An automatic singing voice rectifier design
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Speech recognition


Information

Published In

MULTIMEDIA '03: Proceedings of the eleventh ACM international conference on Multimedia

November 2003

670 pages

ISBN: 1581137222

DOI: 10.1145/957013

General Chairs:
Lawrence Rowe, University of California, Berkeley
Harrick Vin, University of Texas, Austin

Program Chairs:
Thomas Plagemann, University of Oslo
Prashant Shenoy, University of Massachusetts, Amherst
John R. Smith, IBM T.J. Watson Research Center


Publisher

Association for Computing Machinery

New York, NY, United States


Conference

MM03: 2003 11th Annual ACM International Conference on Multimedia

November 2 - 8, 2003

Berkeley, CA, USA

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


Bibliometrics

Total Citations: 1
Total Downloads: 412
Downloads (Last 12 months): 1
Downloads (Last 6 weeks): 0

Reflects downloads up to 28 Jan 2025
