DOI: 10.1145/1321440.1321583
Poster

Towards efficient search on unstructured data: an intelligent-storage approach

Published: 06 November 2007

Abstract

Applications that create and consume unstructured data have grown in both the scale of their storage requirements and the complexity of their search primitives. We consider two such applications: exhaustive search, and the integration of structured and unstructured data. Current block-based storage systems are either incapable of addressing the challenges brought forth by these applications or address them inefficiently. We propose a storage framework that efficiently stores and searches unstructured and structured data while controlling storage management costs. Experimental results from our prototype show that the proposed system can provide impressive performance and feature benefits.

Published In

CIKM '07: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management
November 2007
1048 pages
ISBN: 9781595938039
DOI: 10.1145/1321440
Publisher

Association for Computing Machinery, New York, NY, United States
