Cls v2 1 6
Version 2.1.6
--------------------------------------------------------------------------------------
Copyright (c) 2012 Indian Language TTS Consortium & ASR Consortium
Dr Samudravijaya, Tata Institute of Fundamental Research, Mumbai
(chief@tifr.res.in, samudravijaya@gmail.com)
Dr Hema A Murthy, Indian Institute of Technology, Madras
(hema@cse.iitm.ac.in, hema.arunachalam@gmail.com)
--------------------------------------------------------------------------------------------------------------------
This document specifies a standard set of labels (in Roman script) for speech sounds commonly
used in Indian languages. This document lists the label set for 13 languages (currently being
processed by ASR/TTS consortia of TDIL, DIT, GoI). These labels are to be used for computer
processing of spoken Indian languages.
(1) Similar sounds in different languages are given a single label.
(2) The IPA symbol refers to an exemplar (Hindi/Tamil/other) language.
(3) This is not an IPA chart of sounds of Indian languages.
(4) The label set is designed such that the native script is largely recoverable from the
transliteration.
A label consists of a sequence of alphanumeric characters of the Roman alphabet; it does not
contain any special character such as a quote or a hyphen. All labels are written in lower case,
even though labels are case insensitive. Since the number of speech sounds is larger than the number
of letters in the Roman alphabet, a system of suffixes and letter combinations is used for labels.
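The constraints on labels stated above can be checked mechanically. The following is a minimal sketch, not part of the specification: the function name `normalize_label` is hypothetical, and it assumes that "alphanumeric characters of the Roman alphabet" means ASCII letters and digits.

```python
import re

# Assumption: a label is one or more ASCII letters or digits, nothing else.
_LABEL_RE = re.compile(r"[a-z0-9]+")


def normalize_label(raw: str) -> str:
    """Return the canonical lower-case form of a label.

    Labels are case insensitive, so "TXH" and "txh" name the same sound;
    the canonical written form is lower case. Raises ValueError if the
    input contains a special character such as a quote or a hyphen.
    """
    label = raw.strip()
    if not label.isascii():
        raise ValueError(f"label contains non-Roman characters: {raw!r}")
    label = label.lower()
    if not _LABEL_RE.fullmatch(label):
        raise ValueError(f"not a valid label: {raw!r}")
    return label


if __name__ == "__main__":
    for candidate in ["txh", "TXH", "dxhq", "t-x", "k'"]:
        try:
            print(candidate, "->", normalize_label(candidate))
        except ValueError as err:
            print(candidate, "-> rejected:", err)
```

Normalizing on input keeps later comparisons simple: two labels are the same sound exactly when their normalized strings are equal.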
(C) Language-specific notes*: This section has notes on sounds (and labels) specific to a
subset of the languages. Whenever possible, minimal pairs for language-specific phonemes are
provided. In other cases, exemplar words containing common phones in that language are given.
1. Hindi
(i) The compound glyph ज्ञ = "ज् + ञ" is pronounced as a sequence of two phones: "g y". The
morphological analyser of the language will do this conversion from glyph to phone
sequence.
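The glyph-to-phone rule in the Hindi note can be illustrated with a small rewrite step. This is only a sketch under stated assumptions, not the consortium's morphological analyser: the names `COMPOUND_TO_PHONES` and `expand_compounds` are hypothetical, and the compound glyph is assumed to be stored as the code point sequence U+091C (ja), U+094D (virama), U+091E (nya).

```python
from typing import Dict, List

# Assumption: the compound glyph is encoded as ja + virama + nya.
JNYA = "\u091c\u094d\u091e"

# Hypothetical rewrite table; the note above names only this one compound.
COMPOUND_TO_PHONES: Dict[str, List[str]] = {JNYA: ["g", "y"]}


def expand_compounds(text: str) -> List[str]:
    """Replace each listed compound glyph by its phone labels.

    Characters that are not part of a listed compound are passed through
    unchanged, one code point per output item; labelling them is left to
    the rest of the pipeline.
    """
    out: List[str] = []
    i = 0
    while i < len(text):
        for glyph, phones in COMPOUND_TO_PHONES.items():
            if text.startswith(glyph, i):
                out.extend(phones)
                i += len(glyph)
                break
        else:
            # No compound starts here: copy a single code point.
            out.append(text[i])
            i += 1
    return out


if __name__ == "__main__":
    word = JNYA + "\u093e\u0928"  # ja + virama + nya + aa-sign + na
    print(expand_compounds(word))
```

Only the compound is rewritten here; the remaining glyphs would still need to be mapped to labels through the main table.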
2. Marathi
Label   IPA     Glyph   Minimal pair                                   Exemplar words / Comment
nxh     /ɳh/    ण्ह      -                                              -
nh      /nh/    न्ह      नाणे (coin), न्हाणे (take bath)                   -
mh      /mh/    म्ह      म्हण (say), मण ()                               -
rh      /rh/    ऱ्ह      रास (heap), ऱ्हास (decay)                        -
lh      /lh/    ल्ह      -                                              उल्हास, केल्हाय
wh      /wh/    व्ह      वाळ (a kind of cereal), व्हाळ (will become)      -
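The six Marathi labels above all correspond to a conjunct of a sonorant with ह, written with a virama as in rows 74 to 79 of the main table. A minimal sketch of how such conjuncts could be located in a word is given below. The function name `aspirated_labels` is hypothetical, and the code assumes that ऱ is stored as the single code point U+0931 rather than as र followed by a nukta.

```python
from typing import Dict, List

VIRAMA = "\u094d"
HA = "\u0939"

# Base consonant -> label of its aspirated counterpart (from the table above).
ASPIRATED: Dict[str, str] = {
    "\u0923": "nxh",  # retroflex nasal
    "\u0928": "nh",   # dental nasal
    "\u092e": "mh",
    "\u0931": "rh",   # assumption: precomposed RRA, not RA + nukta
    "\u0932": "lh",
    "\u0935": "wh",
}


def aspirated_labels(word: str) -> List[str]:
    """List the labels of all consonant + virama + ha conjuncts in a word."""
    found: List[str] = []
    for i in range(len(word) - 2):
        if word[i] in ASPIRATED and word[i + 1] == VIRAMA and word[i + 2] == HA:
            found.append(ASPIRATED[word[i]])
    return found


if __name__ == "__main__":
    # The nh minimal pair: only the second word contains the conjunct.
    print(aspirated_labels("\u0928\u093e\u0923\u0947"))              # "coin"
    print(aspirated_labels("\u0928\u094d\u0939\u093e\u0923\u0947"))  # "take bath"
```

Applied to a minimal pair, the function returns an empty list for the plain word and the aspirated label for the other, which is exactly the contrast the table documents.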
(1) This is a common Label Set (in Roman script) for the purpose of
computer processing of spoken Indian languages.
(2) Similar sounds in different languages are given a single label.
(3) The IPA label refers to an exemplar (Hindi/Tamil/other) language.
(4) This is NOT an IPA chart of sounds of Indian languages.
1 a a अ अ अ અ -
2 ax ɔ - ऑ - ઑ ଅ
3 aa aː आ आ आ આ ଆ
4 axx ə - - - - -
5 i ɪ, i इ इ इ ઇ ଇ, ଈ
6 ii iː ई ई ई ઈ -
7 u u, ʊ उ उ उ ઉ ଉ, ଊ
8 eu ɯ - - - - -
9 uu uː ऊ ऊ ऊ ઊ -
10 rq - ऋ,ॠ ऋ,ॠ ऋ - ଋ
11 e e - - - - ଏ
12 ee eː ए ए,ऎ ए એ -
13 ea ɛ - - - - -
14 ei ɛː ऐ - ऐ ઐ -
15 ai aɪ - ऐ - - ଐ
16 oi oj - - - - -
17 o o ओ ओ,ऒ ओ ઓ ଓ
18 oo oː - - - - -
19 ae æ - ऍ - ઍ -
20 au aʊ - औ - - ଔ
21 ou oʊ औ - औ ઔ -
22 k k क क क ક କ
23 kh kʰ ख ख ख ખ ଖ
24 g g ग ग ग ગ ଗ
25 gh ɡʰ घ घ घ ઘ ଘ
26 ng ŋ ङ ङ ङ ઙ ଙ
27 c tʃ च च च ચ ଚ
28 ch tʃʰ छ छ छ છ ଛ
29 cx t̪ʃ - च - - -
30 j dʒ ज ज ज જ ଜ,ଯ
31 jh dʒʰ झ झ झ ઝ ଝ
32 jx d̪ʃ - ज - - -
33 nj ɲ ञ ञ - ઞ ଞ
34 tx ʈ ट ट ट ટ ଟ
35 txh ʈʰ ठ ठ ठ ઠ ଠ
36 dx ɖ ड ड ड ડ ଡ
37 dxh ɖʰ ढ ढ ढ ઢ ଢ
38 nx ɳ ण ण ण ણ ଣ
39 t t̪ त त त ત ତ
40 th t̪ʰ थ थ थ થ ଥ
41 d d̪ द द द દ ଦ
42 dh d̪ʰ ध ध ध ધ ଧ
43 n n न,ऩ न,ऩ न ન ନ
44 nd - - - - - -
45 p p प प प પ ପ
46 ph pʰ फ फ फ ફ ଫ
47 b b ब ब ब બ ବ
48 bh bʰ भ भ भ ભ ଭ
49 m m म म म મ ମ
50 y j य,य़ य,य़ य ય ୟ
51 r r र,ऱ र,ऱ र ર ର
52 l l ल ल ल લ ଲ
53 lx ɭ - ळ,ऴ ळ ળ ଳ
54 w ʋ व व व વ ୱ,ଵ
55 sh ʃ श श श શ -
56 sx ʂ ष ष ष ષ -
57 s s स स स સ ସ, ଶ, ଷ
58 h ɦ ह ह ह હ ହ
59 kq q क़ क़ - - -
60 khq x ख़ ख़ - - -
61 gq ɣ ग़ ग़ - - -
62 z z ज़ ज़ - - -
63 jhq ʒ झ़ झ़ - - -
64 dxq ɽ ड़ ड़ ड़ - ଡ
65 dxhq ɽʰ ढ़ ढ़ ढ़ - ଢ
66 dhq - - - ध़ - -
67 f f फ़ फ़ - - -
68 bq - - - ब़ - -
69 yq - - - - - -
70 nq - - - - - -
71 rx ɾ̪ - - - - -
72 sq - - - स़ - -
73 zh ɻ - - - - -
74 nxh ɳʰ - ण्ह - - -
75 nh nʰ - न्ह - - -
76 mh mʰ - म्ह म्ह - -
77 rh rʰ - ऱ्ह - - -
78 lh lʰ - ल्ह - - -
79 wh wʰ - व्ह व्ह - -
80 q - ◌ं ◌ं ◌ं ◌ં ◌ଂ
81 hq - ◌ः ◌ः ◌ः ◌ઃ ◌ଃ
82 mq - ◌ँ ◌ँ ◌ँ ◌ઁ ◌ଁ
83 x x - - - - -
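The principle that similar sounds in different languages share one label can be seen by inverting a few rows of the table above. The sketch below copies four rows by hand (Devanagari, Gujarati and Odia glyphs only); the names `ROWS` and `label_of` are hypothetical, and a real tool would load the complete table instead.

```python
from typing import Dict, List, Optional

# A few rows copied from the table above: label -> glyphs in three scripts.
ROWS: Dict[str, List[str]] = {
    "k": ["क", "ક", "କ"],
    "kh": ["ख", "ખ", "ଖ"],
    "tx": ["ट", "ટ", "ଟ"],
    "m": ["म", "મ", "ମ"],
}

# Inverted index: every glyph, whatever its script, points to its label.
GLYPH_TO_LABEL: Dict[str, str] = {
    glyph: label for label, glyphs in ROWS.items() for glyph in glyphs
}


def label_of(glyph: str) -> Optional[str]:
    """Return the common label of a glyph, or None if it is not listed."""
    return GLYPH_TO_LABEL.get(glyph)


if __name__ == "__main__":
    for glyph in ["क", "ક", "କ", "ट", "ଟ"]:
        print(glyph, "->", label_of(glyph))
```

Because the lookup is keyed on the glyph alone, acoustic models trained on one language can share units with another language without any script-specific logic.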
1 a - - - - অ অ अ
2 ax অ অ অ অ - - -
3 aa আ আ আ আ আ আ आ
4 axx - - - - অ অ -
5 i ই,ঈ ই ই,ঈ ই ই ই इ
6 ii - ঈ - ঈ ঈ ঈ ई
7 u উ,ঊ উ উ,ঊ উ উ উ उ
8 eu - - - - - - -
9 uu - ঊ - ঊ ঊ ঊ ऊ
10 rq ঋ,ৠ ঋ,ৠ ঋ ঋ - - ऋ
11 e এ এ এ এ এ এ ए
12 ee - - - - - - -
13 ea - - - - - - -
14 ei - - - - - - ऐ
15 ai - - - - ঐ ঐ -
16 oi ঐ ঐ ঐ ঐ ওই ওই -
17 o ও ও ও ও ও ও ओ
18 oo - - - - - - -
19 ae - অযা - - - - -
20 au - - - - ঔ ঔ -
21 ou ঔ ঔ ঔ ঔ - - औ
22 k ক ক ক ক ক ক क
23 kh খ খ খ খ খ খ ख
24 g গ গ গ গ গ গ ग
25 gh ঘ ঘ ঘ ঘ ঘ ঘ घ
26 ng ঙ,◌ং ঙ ঙ,◌ং ঙ ঙ ঙ ङ
27 c চ চ - - চ চ च
28 ch ছ ছ - - ছ ছ छ
29 cx - - - - - - -
30 j জ,য জ জ,য জ,য জ জ ज
31 jh ঝ ঝ ঝ ঝ ঝ ঝ झ
32 jx - - - - - - -
33 nj - - ঞ ঞ ঞ ঞ ञ
34 tx ট ট - ট - ট ट
35 txh ঠ ঠ - ঠ - ঠ ठ
36 dx ড ড - ড - ড ड
37 dxh ঢ ঢ - ঢ - ঢ ढ
38 nx ণ ণ - ণ - ণ ण
40 th থ থ ঠ,থ থ ঠ,থ থ थ
41 d দ দ ড,দ দ ড,দ দ द
42 dh ধ ধ ঢ,ধ ধ ঢ,ধ ধ ध
43 n ন ন ণ,ন ন ণ,ন ন न
44 nd - - - - - - -
45 p প প প প প প प
46 ph ফ ফ ফ ফ ফ ফ फ
47 b ব ব ব ব ব ব ब
48 bh ভ ভ ভ ভ ভ ভ भ
49 m ম ম ম ম ম ম म
50 y য় য য় য় য়,য য়,য य
51 r র র ৰ,ড় ৰ,ড় র র र
52 l ল ল ল ল ল ল ल
53 lx - - - - - - -
54 w - ওয় ৱ ৱ ৱ ৱ व
55 sh শ,ষ শ - - শ শ श
56 sx - ষ - - ষ ষ ष
57 s স স চ, ছ চ, ছ স স स
58 h হ হ হ হ হ হ ह
59 kq - - - - - - -
60 khq - - - - - - -
61 gq - - - - - - -
62 z জ় - - - - - -
63 jhq - - - - - - झ़
64 dxq ড় ড় - - - - ड़
65 dxhq ঢ় ঢ় ঢ় ঢ় - - ढ़
66 dhq - - - - - - -
67 f - ফ় - - - - -
68 bq - - - - - - -
69 yq - য় - - - - -
70 nq - - - - - - -
71 rx - - - - - - -
72 sq - - - - - - -
73 zh - - - - - - -
74 nxh - - - - - - -
75 nh - - - - - - -
76 mh - - - - - - -
77 rh - - - - - - -
78 lh - - - - - - -
79 wh - - - - - - -
80 q - ◌ং - ◌ং ◌ং ◌ং ◌ं
81 hq - ◌ঃ ◌ঃ ◌ঃ - - ◌ः
82 mq ◌ঁ ◌ঁ ◌ঁ ◌ঁ ◌ঁ ◌ঁ -
83 x - - শ,ষ,স শ,ষ,স - - -
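Rows in the format of the table above (serial number, label, then one cell per language column, with "-" for a missing glyph and commas between alternatives) are straightforward to read by program. The parser below is a minimal sketch with a hypothetical name, `parse_row`; it assumes the table has no IPA column, as is the case for this table, and that no glyph cell contains internal whitespace other than after a comma.

```python
import re
from typing import List, Tuple


def parse_row(line: str) -> Tuple[int, str, List[List[str]]]:
    """Split one table row into (serial number, label, glyph cells).

    Each cell becomes a list of alternative glyphs; a "-" cell becomes
    an empty list.
    """
    # Alternatives are sometimes written with a space after the comma;
    # remove that space so the alternatives stay in one cell.
    compact = re.sub(r",\s+", ",", line.strip())
    fields = compact.split()
    if len(fields) < 3:
        raise ValueError(f"row has too few fields: {line!r}")
    number, label = int(fields[0]), fields[1]
    cells = [[] if cell == "-" else cell.split(",") for cell in fields[2:]]
    return number, label, cells


if __name__ == "__main__":
    number, label, cells = parse_row("57 s স স চ, ছ চ, ছ স স स")
    print(number, label, cells)
```

Keeping every cell as a list, even when it holds a single glyph, lets later code treat one-to-one and one-to-many rows uniformly.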
1 a அ അ అ ಅ
2 ax - - - -
3 aa ஆ ആ ఆ ಆ
4 axx - - - -
5 i இ ഇ ఇ ಇ
6 ii ஈ ഈ ఈ ಈ
7 u உ ഉ ఉ ಉ
8 eu உ ◌് - ◌್
9 uu ஊ ഊ ఊ ಊ
10 rq - ഋ ఋ,ౠ ಋ,ೠ
11 e எ എ ఎ ಎ
12 ee ஏ ഏ ఏ ಏ
13 ea - - - ಎ
14 ei - - - -
15 ai ஐ ഐ ఐ ಐ
16 oi - - - -
17 o ஒ ഒ ఒ ಒ
18 oo ஓ ഓ ఓ ಓ
19 ae - - - -
20 au ஔ ഔ ఔ ಔ
21 ou - - - -
22 k க ക క ಕ
23 kh - ഖ ఖ ಖ
24 g கv ഗ గ ಗ
25 gh - ഘ ఘ ಘ
26 ng ங ങ ఙ ಙ
27 c ச ച చ ಚ
28 ch - ഛ ఛ ಛ
29 cx - - - -
30 j ஜ ജ జ ಜ
31 jh - ഝ ఝ ಝ
32 jx - - - -
33 nj ஞ ഞ ఞ ಞ
34 tx ட ട ట ಟ
35 txh - ഠ ఠ ಠ
36 dx டv ഡ డ ಡ
37 dxh - ഢ ఢ ಢ
38 nx ண ണ ణ ಣ
39 t த ത త ತ
40 th - ഥ థ ಥ
41 d தv ദ ద ದ
42 dh - ധ ధ ಧ
43 n ந,ன ന న ನ
44 nd ந - - -
45 p ப പ ప ಪ
46 ph - ഫ ఫ ಫ
47 b பv ബ బ ಬ
48 bh - ഭ భ ಭ
49 m ம മ మ ಮ
50 y ய യ య ಯ
51 r ர ര ర ರ
52 l ல ല ల ಲ
53 lx ள ള ళ ಳ
54 w வ വ వ ವ
55 sh - ശ శ ಶ
56 sx ஷ ഷ ష ಷ
57 s ஸ സ స ಸ
58 h ஹ ഹ హ ಹ
59 kq - - - -
60 khq - - - -
61 gq - - - -
62 z - - - -
63 jhq - - - -
64 dxq - ട - -
65 dxhq - - - -
66 dhq - - - -
67 f ஃப - - -
68 bq - - - -
69 yq - - - -
70 nq - ന - -
71 rx ற റ ఱ ಱ
72 sq - - - -
73 zh ழ ഴ - -
74 nxh - - - -
75 nh - - - -
76 mh - - - -
77 rh - - - -
78 lh - - - -
79 wh - - - -
80 q - ◌ം ◌ం ◌ಂ
81 hq - ◌ഃ ◌ః ◌ಃ
82 mq - - - -
83 x - - - -
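The label set is designed so that the native script is largely, not always, recoverable from the labels. A few rows from the first glyph column of the table above show both directions of ambiguity: one label listing two glyphs, and one glyph listed under two labels. The sketch below copies those rows by hand; the names `TAMIL`, `labels_with_several_glyphs` and `glyphs_with_several_labels` are hypothetical.

```python
from typing import Dict, List

# First glyph column of rows 7, 8, 9, 38, 43 and 44 above: label -> glyphs.
TAMIL: Dict[str, List[str]] = {
    "u": ["உ"],
    "eu": ["உ"],
    "uu": ["ஊ"],
    "nx": ["ண"],
    "n": ["ந", "ன"],
    "nd": ["ந"],
}


def labels_with_several_glyphs(column: Dict[str, List[str]]) -> List[str]:
    """Labels that cannot be mapped back to a unique glyph."""
    return [label for label, glyphs in column.items() if len(glyphs) > 1]


def glyphs_with_several_labels(column: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Glyphs whose label depends on context (more than one label listed)."""
    by_glyph: Dict[str, List[str]] = {}
    for label, glyphs in column.items():
        for glyph in glyphs:
            by_glyph.setdefault(glyph, []).append(label)
    return {g: labels for g, labels in by_glyph.items() if len(labels) > 1}


if __name__ == "__main__":
    print(labels_with_several_glyphs(TAMIL))
    print(glyphs_with_several_labels(TAMIL))
```

Labels in the first list need extra information (for example, word position) to restore the original spelling, and glyphs in the second need pronunciation rules to choose the right label.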