Issues in Encoding the Writing of Nepal’s Languages

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8403)

Abstract

The major language of Nepal, known today as Nepali, is spoken as a mother tongue by nearly half the population and as a second language by nearly all of the rest. A considerable volume of computational linguistics work has been done on Nepali, both in research establishments and in commercial organizations. However, there are another 94 languages indigenous to the country, and the situation for these is not good. To apply computational linguistics methods to a language, it must first be represented in the computer, but most of the languages of Nepal have no written tradition, let alone any support by computers. It is the written form that is needed for full computational processing, and it is here that we encounter barriers or, at best, inappropriate compromises. We look at the situation in Nepal, ignoring the 17 cross-border languages whose major speaker populations lie outside Nepal. We are left with only three languages with written traditions: Nepali, which is well served; Newari, which has over 1000 years of written tradition but has so far been frustrated in attempts to encode its writing; and Limbu, whose writing is encoded, though with defects. Many of the remaining languages may be written in Devanagari, but their speakers aspire to something different: a writing system that relates to their languages and is visually distinctive enough to mark their identity. We look at what can be done for these remaining languages and speculate whether a common writing system and encoding could cover all the languages of Nepal.
Inevitably, we must focus on Unicode, the current standard for the computer encoding of writing. We find that language activists in Nepal do not adequately understand what is possible with the technology and pursue objectives within Unicode that are neither necessary nor helpful. External experts, meanwhile, have only a limited understanding of the issues involved and of the requirements of living languages and their users, and instead pursue scholarly interests that offer limited support for living users.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hall, P., Bal, B.K., Dhakhwa, S., Regmi, B.N. (2014). Issues in Encoding the Writing of Nepal’s Languages. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_5

  • DOI: https://doi.org/10.1007/978-3-642-54906-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54905-2

  • Online ISBN: 978-3-642-54906-9

  • eBook Packages: Computer Science (R0)
