Abstract
The major language of Nepal, known today as Nepali, is spoken as a mother tongue by nearly half the population and as a second language by nearly all of the rest. A considerable volume of computational linguistics work has been done on Nepali, both in research establishments and in commercial organizations. However, another 94 languages are indigenous to the country, and their situation is not good. Before computational linguistics methods can be applied to a language, the language must be represented in the computer, yet most of the languages of Nepal have no written tradition, let alone any computer support. Full computational processing needs the written form, and it is here that we encounter barriers or, at best, inappropriate compromises. We look at the situation in Nepal, setting aside the 17 cross-border languages whose major speaker populations lie outside Nepal. This leaves only three languages with written traditions: Nepali, which is well served; Newari, which has over 1000 years of written tradition but has so far been frustrated in attempts to encode its writing; and Limbu, whose writing is encoded, though with defects. Many of the remaining languages may be written in Devanagari, but their speakers aspire to something different: a writing system that relates to their own language and is visually distinctive enough to mark their identity. We look at what can be done for these remaining languages and speculate whether a common writing system and encoding could cover all the languages of Nepal.
Inevitably we must focus on Unicode, the current standard for the computer encoding of writing. We find that language activists in Nepal do not adequately understand what is possible with the technology and pursue objectives within Unicode that are neither necessary nor helpful. External experts, for their part, have only a limited understanding of the issues involved and of the requirements of living languages and their users, and instead pursue scholarly interests that offer limited support for living users.
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Hall, P., Bal, B.K., Dhakhwa, S., Regmi, B.N. (2014). Issues in Encoding the Writing of Nepal’s Languages. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_5
DOI: https://doi.org/10.1007/978-3-642-54906-9_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54905-2
Online ISBN: 978-3-642-54906-9
eBook Packages: Computer Science, Computer Science (R0)