Abstract
Quality control has become increasingly important as more online databases are integrated into digital libraries, because the quality of the underlying data can have a dramatic effect on the search effectiveness of an online system. Authority work, the discovery and reconciliation of variant forms of strings in bibliographic entries, will become more difficult as a result. Spelling variants, misspellings, and translation and transliteration differences all make information harder to retrieve. This paper is a case study of our efforts to automate the creation of an authority file for authors' institutional affiliations in the Astrophysics Data System. The techniques surveyed here for detecting and categorizing variant forms have broader applicability and may be used to help automate authority work for other bibliographic fields.
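To make the idea of detecting variant forms concrete, the following is a minimal sketch, not the method used in the paper. It assumes that variants can be found by normalizing affiliation strings and grouping those within a small Levenshtein edit distance of each other. The function names, the greedy grouping strategy, and the `max_ratio` threshold are all hypothetical choices made for illustration.

```python
import re


def normalize(s: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    s = re.sub(r"[^a-z0-9 ]+", " ", s.lower())
    return " ".join(s.split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance by dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cluster_variants(strings, max_ratio=0.2):
    """Greedy single-pass grouping (an illustrative assumption): a string joins
    the first cluster whose representative is within
    max_ratio * (length of the longer string) edits; otherwise it starts a
    new cluster."""
    clusters = []  # list of (normalized representative, [original strings])
    for s in strings:
        n = normalize(s)
        for rep, members in clusters:
            limit = max_ratio * max(len(rep), len(n))
            if edit_distance(rep, n) <= limit:
                members.append(s)
                break
        else:
            clusters.append((n, [s]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    affiliations = [
        "Univ. of Virginia",
        "Univ of Virgina",  # misspelling
        "Harvard-Smithsonian Center for Astrophysics",
        "Harvard Smithsonian Center for Astrophysics",
    ]
    for group in cluster_variants(affiliations):
        print(group)
```

Each resulting group is a candidate set of variant forms that a cataloger could review and map to a single authoritative entry. A length-relative threshold is used here so that long institution names tolerate proportionally more edits than short ones. A sketch like this would not catch abbreviations or translated names, which differ by far more than a few characters.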
This work was supported in part by Dept. of Energy grant no. DE-FG05-95ER25254, NSF grant CDA-9529253, DARPA contract N66001-97-C-8542, and a NASA Graduate Student Researchers Program fellowship.
The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
French, J.C., Powell, A.L., Schulman, E., Pfaltz, J.L. (1997). Automating the construction of authority files in digital libraries: A case study. In: Peters, C., Thanos, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1997. Lecture Notes in Computer Science, vol 1324. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026721
DOI: https://doi.org/10.1007/BFb0026721
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63554-3
Online ISBN: 978-3-540-69597-4
eBook Packages: Springer Book Archive