Nature of Speech Signal: Basanta Joshi, PhD
Voice
• The sound produced by humans and other vertebrates using
the lungs and the vocal folds in the larynx, or voice box.
Speech
• Speech is one of the most information-laden signals; speech sounds have a
rich, multi-layered temporal-spectral variation that conveys words,
intention, expression, intonation, accent, speaker identity, gender, age,
style of speaking, state of health of the speaker, and emotion.
• Speech production is a series of complex movements that alter and mold the
basic tone created by the voice into specific, decodable sounds.
Speech Production
• Speech sounds are sensations of air pressure vibrations produced by air
exhaled from the lungs.
Simple view of speech production
• Linguistic
• Phonetics
Speech spectrum
Spectrogram
Speech chain linking speaker and listener
Speech production / speech perception process
Speech signal types
• Periodic vibration of the vocal folds results in voiced speech.
• When the oral cavity is constricted and the velum is lowered, air flows
through the nasal cavity to generate nasal sounds.
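The periodicity that characterizes voiced speech can be illustrated with a small sketch. It is not taken from the slides: the sampling rate (8000 Hz), the pitch (100 Hz), the idealized impulse-train excitation, the white-noise stand-in for an aperiodic source, and all function names are assumptions chosen for illustration.

```python
import random

def impulse_train(n_samples, period):
    """Idealized periodic excitation: one unit pulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def white_noise(n_samples, seed=0):
    """Aperiodic excitation used here as a contrast (assumed, seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def normalized_autocorr(x, lag):
    """Autocorrelation at `lag`, divided by the signal energy (lag 0)."""
    energy = sum(v * v for v in x)
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag)) / energy

fs = 8000            # assumed sampling rate in Hz
f0 = 100             # assumed pitch in Hz
period = fs // f0    # pitch period in samples (80)

voiced = impulse_train(800, period)
unvoiced = white_noise(800)

# A periodic source correlates strongly with itself one pitch period later;
# a noise source does not.
voiced_peak = normalized_autocorr(voiced, period)
unvoiced_peak = normalized_autocorr(unvoiced, period)
print(voiced_peak, unvoiced_peak)
```

The strong autocorrelation peak at the pitch period is one common cue for separating periodic from aperiodic segments; real speech is only quasi-periodic, so the peak is lower than in this idealized case.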
Acoustic phonetics
Waveform
Quasi-periodic
Simplified digital model for human speech production system
Digital model for human speech production
The speech signal is a time-varying signal, and ideally the following points
must be taken into consideration.
For simplicity, the vocal tract is modeled as a tube of non-uniform,
time-varying cross-section with no losses due to viscosity and thermal
conduction at the wall of the tube.
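As a rough illustration of the lossless tube idea, the sketch below computes the resonances of the simplest special case: a uniform tube, closed at the glottis and open at the lips, whose resonances fall at odd multiples of c / (4L). The uniform cross-section, the 17 cm length, the 340 m/s speed of sound, and the function name are assumptions for illustration; the model described above allows a non-uniform, time-varying cross-section.

```python
def uniform_tube_resonances(length_m=0.17, speed_of_sound=340.0, count=3):
    """Resonant frequencies (Hz) of a lossless uniform tube, closed at one
    end and open at the other: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]

# Assumed values: 17 cm tube, c = 340 m/s.
formants = uniform_tube_resonances()
print([round(f) for f in formants])  # [500, 1500, 2500]
```

These evenly spaced resonances are close to the formants of a neutral vowel; changing the tube shape moves them, which is what the non-uniform model captures.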
Discrete time model for speech production
Vocal tract
Vocal transfer function
Vocal transfer function
Excitation and radiation
Excitation
Radiation
Excitation
Representation of speech signal
Other quantization schemes
Auditory perception: psychoacoustics
SPL and loudness
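For reference, sound pressure level (SPL) is conventionally defined as 20 log10(p / p0) dB, with p0 = 20 µPa as the reference pressure in air. The short sketch below applies that definition; the function and constant names are hypothetical and the example pressures are chosen only for illustration.

```python
import math

P_REF = 20e-6  # standard reference pressure in air, in pascals

def spl_db(p_rms):
    """Sound pressure level in dB for an RMS pressure given in pascals."""
    return 20.0 * math.log10(p_rms / P_REF)

print(round(spl_db(20e-6), 1))  # 0.0  (the reference pressure itself)
print(round(spl_db(1.0), 1))    # 94.0 (an RMS pressure of 1 Pa)
```

SPL is a physical measure; perceived loudness also depends on frequency, so equal SPL does not imply equal loudness.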
Masking
Masking
Critical bands
Critical bands
Pitch perception