Abstract
Maintenance cost is a critical issue in the successful integration of database systems and information retrieval systems (IRSs). For a robust integration of search engines, a signature file filter can effectively eliminate the maintenance cost and offers a more natural fit between database and text retrieval systems. Building the merged database and signature-based text retrieval system on an object-oriented database management system (OODBMS) extends its usability and brings complementary advantages to both databases and IRSs. In this paper, we present a new approach for integrating OODBMSs and IRSs that maintains flexibility and avoids the overhead of a mapping process by encapsulating the documents and the signature-based IR methods into storable objects that are kept in the database. In addition, we develop a novel signature file approach based on a statistical corpus extraction technique, which can effectively reduce the false drop probability for text retrieval from the underlying document database.
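The signature file idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes classic superimposed coding, not the statistical corpus extraction technique the paper proposes, whose details the abstract does not give. The class `Document`, the functions `term_signature` and `search`, and the constants `SIG_BITS` and `BITS_PER_TERM` are hypothetical names and values chosen for illustration only.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed value)
BITS_PER_TERM = 3   # bits set per term (assumed value)


def term_signature(term: str) -> int:
    """Map a term to a bit pattern with at most BITS_PER_TERM bits set."""
    sig = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.sha256(f"{i}:{term.lower()}".encode("utf-8")).digest()
        pos = int.from_bytes(digest[:4], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


class Document:
    """Hypothetical storable object: text plus its superimposed signature."""

    def __init__(self, doc_id: int, text: str) -> None:
        self.doc_id = doc_id
        self.text = text
        self.signature = 0
        # Superimpose (bitwise OR) the signatures of all terms in the text.
        for term in text.split():
            self.signature |= term_signature(term)

    def may_contain(self, term: str) -> bool:
        """Signature filter: never misses a real match, may over-accept."""
        query = term_signature(term)
        return (self.signature & query) == query

    def contains(self, term: str) -> bool:
        """Exact check used to weed out false drops."""
        return term.lower() in (word.lower() for word in self.text.split())


def search(docs: list, term: str) -> tuple:
    """Return (ids of matching documents, number of false drops)."""
    candidates = [d for d in docs if d.may_contain(term)]
    hits = [d.doc_id for d in candidates if d.contains(term)]
    return hits, len(candidates) - len(hits)


docs = [
    Document(1, "signature files filter text"),
    Document(2, "object oriented database"),
]
hits, false_drops = search(docs, "signature")
```

A false drop here is a document whose signature passes the filter although the text does not contain the term. Widening `SIG_BITS` or choosing terms more selectively lowers that probability, which is the quantity the abstract says the proposed method reduces.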
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, CH., Chien, LF. (1999). An OODBMS-IRS Integration Based on a Statistical Corpus Extraction Method for Document Management. In: Bench-Capon, T.J., Soda, G., Tjoa, A.M. (eds) Database and Expert Systems Applications. DEXA 1999. Lecture Notes in Computer Science, vol 1677. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48309-8_5
DOI: https://doi.org/10.1007/3-540-48309-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66448-2
Online ISBN: 978-3-540-48309-0
eBook Packages: Springer Book Archive