
Customizing information capture and access

Published: 01 January 1997


This article presents a customizable architecture for software agents that capture and access information in large, heterogeneous, distributed electronic repositories. The key idea is to exploit underlying structure at various levels of granularity to build high-level indices with task-specific interpretations. Information agents construct such indices and are configured as a network of reusable modules called structure detectors and segmenters. We illustrate our architecture with the design and implementation of smart information filters in two contexts: retrieving stock market data from Internet newsgroups and retrieving technical reports from Internet FTP sites.
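As a rough illustration of the idea in the abstract, the sketch below wires one segmenter and one structure detector into a small pipeline that builds a task-specific index. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `paragraph_segmenter`, `table_detector`, and `build_index`, the blank-line segmentation rule, and the "three or more whitespace-separated columns" table heuristic are all hypothetical choices made for this example.

```python
import re
from typing import Callable, Dict, List

# Hypothetical module types (assumptions for this sketch):
# a segmenter splits text into pieces, a detector tests one piece for a structure.
Segmenter = Callable[[str], List[str]]
Detector = Callable[[str], bool]


def paragraph_segmenter(text: str) -> List[str]:
    """Split text into blocks separated by one or more blank lines."""
    return [b.strip("\n") for b in re.split(r"\n\s*\n", text) if b.strip()]


def table_detector(block: str) -> bool:
    """Guess that a block is a table: at least two lines, and every line
    has three or more columns separated by runs of two or more spaces."""
    lines = block.splitlines()
    if len(lines) < 2:
        return False
    return all(len(re.split(r"\s{2,}", ln.strip())) >= 3 for ln in lines)


def build_index(text: str, segmenter: Segmenter,
                detectors: Dict[str, Detector]) -> Dict[str, List[str]]:
    """Run every detector over every segment and group the hits by detector name."""
    index: Dict[str, List[str]] = {name: [] for name in detectors}
    for piece in segmenter(text):
        for name, detect in detectors.items():
            if detect(piece):
                index[name].append(piece)
    return index


# Made-up newsgroup-style post used only to exercise the sketch.
post = (
    "Closing prices for today follow.\n\n"
    "IBM   101.5   +0.3\n"
    "ACME  12.25   -1.0\n\n"
    "See you tomorrow."
)
index = build_index(post, paragraph_segmenter, {"table": table_detector})
```

Because segmenters and detectors are plain functions with fixed signatures, the same `build_index` could be reconfigured for a different task by swapping in other modules, which is the kind of reuse the abstract describes.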






Published In

ACM Transactions on Information Systems  Volume 15, Issue 1
Jan. 1997
101 pages


Association for Computing Machinery

New York, NY, United States




Author Tags

  1. information gathering
  2. software agents
  3. table recognition



Cited By

  • (2020) Analysis and research of news gathering and editing process based on data mining. 2020 2nd International Conference on Applied Machine Learning (ICAML), 220-223. DOI: 10.1109/ICAML51583.2020.00053. Online publication date: Oct-2020
  • (2008) Service oriented architecture for financial customer relationship management. Proceedings of the second international conference on Distributed event-based systems, 301-304. DOI: 10.1145/1385989.1386027. Online publication date: 1-Jul-2008
  • (2006) Table-processing paradigms: a research survey. International Journal of Document Analysis and Recognition (IJDAR) 8:2-3, 66-86. DOI: 10.1007/s10032-006-0017-x. Online publication date: 9-May-2006
  • (2005) Extraction of Keyterms by Simple Text Mining for Business Information Retrieval. Proceedings of the IEEE International Conference on e-Business Engineering, 332-339. DOI: 10.1109/ICEBE.2005.66. Online publication date: 12-Oct-2005
  • (2005) Information retrieval, information structure, and information agents. Intelligent Hypertext, 145-182. DOI: 10.1007/BFb0023964. Online publication date: 10-Jun-2005
  • (2004) Three-tier multi-agent architecture for asset management consultant. IEEE International Conference on e-Technology, e-Commerce and e-Service, 2004. EEE '04. 2004, 173-176. DOI: 10.1109/EEE.2004.1287305. Online publication date: 2004
  • (2003) A lightweight tool for easy Web site navigation. Proceedings of the 7th International Conference on Properties and Applications of Dielectric Materials (Cat. No.03CH37417), 134-143. DOI: 10.1109/WISE.2003.1254477. Online publication date: 2003
  • (2003) Personalizing Interactions with Information Systems, 323-382. DOI: 10.1016/S0065-2458(03)57007-3. Online publication date: 2003
  • (2002) A multi-agent decision support system for stock trading. IEEE Network: The Magazine of Global Internetworking 16:1, 20-27. DOI: 10.1109/65.980541. Online publication date: 1-Jan-2002
  • (2001) Information and knowledge exchange in a multi-agent system for stock trading. 2001 Enterprise Networking, Applications and Services Conference Proceedings. EntNet@SUPERCOMM2001 (Cat. No.01EX543), 47-55. DOI: 10.1109/ENTNET.2001.981989. Online publication date: 2001
