Abstract
In today’s global village, it is critical that key information tools, such as web search engines, e-Commerce portals and e-Governance systems, work seamlessly across multiple natural languages. We propose a new flexible architecture, Multilingual Information processing on Relational Architecture (MIRA), that supports multilingual processing effectively and efficiently in relational database systems, the primary storage mechanism for such deployments. We propose new linguistic matching operators that extend the standard lexicographic matching of database systems into the phonetic and semantic domains. We further show that the performance of such systems may be made language-neutral. The proposed architecture is based on standards and is hence amenable to easy implementation in any type of query processing or information retrieval system. In this paper, we present our approach to implementing this architecture and outline the host of research issues opened up by the inherently fuzzy nature of the alternative matching semantics.
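To make the idea of a phonetic matching operator concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes that each string can be converted to a phoneme string and that two strings match when the normalized edit distance between their phoneme strings falls under a threshold. The phoneme table, the function names and the threshold value of 0.3 are all hypothetical, chosen only for illustration.

```python
# Hypothetical phoneme table standing in for a real text-to-phoneme
# converter. The transcriptions are illustrative, not authoritative.
NEHRU_EN = ("en", "Nehru")
NEHRU_HI = ("hi", "नेहरू")
NEHRU_TA = ("ta", "நேரு")
GANDHI_EN = ("en", "Gandhi")

PHONEMES = {
    NEHRU_EN: "nehru",
    NEHRU_HI: "neːhruː",
    NEHRU_TA: "neːru",
    GANDHI_EN: "gandhi",
}


def to_phonemes(tagged_text):
    """Look up the phoneme string for a (language, text) pair."""
    return PHONEMES[tagged_text]


def edit_distance(a, b):
    """Levenshtein distance with unit costs, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def phonetic_match(left, right, threshold=0.3):
    """Return True when the two tagged strings sound alike.

    The threshold is an assumed tuning parameter: the fraction of the
    longer phoneme string that may differ.
    """
    pa, pb = to_phonemes(left), to_phonemes(right)
    longest = max(len(pa), len(pb))
    if longest == 0:
        return True
    return edit_distance(pa, pb) / longest <= threshold


if __name__ == "__main__":
    print(phonetic_match(NEHRU_EN, NEHRU_HI))
    print(phonetic_match(NEHRU_EN, GANDHI_EN))
```

The threshold is what makes the match fuzzy: raising it admits more spelling and script variants at the cost of more false matches, which is one of the trade-offs a database operator of this kind would have to expose.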
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kumaran, A. (2004). MIRA: Multilingual Information Processing on Relational Architecture. In: Lindner, W., Mesiti, M., Türker, C., Tzitzikas, Y., Vakali, A.I. (eds) Current Trends in Database Technology - EDBT 2004 Workshops. EDBT 2004. Lecture Notes in Computer Science, vol 3268. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30192-9_2
DOI: https://doi.org/10.1007/978-3-540-30192-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23305-3
Online ISBN: 978-3-540-30192-9
eBook Packages: Computer Science (R0)