Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

FlexSaaS: A Reconfigurable Accelerator for Web Search Selection

Published: 17 February 2019 Publication History

Abstract

Web search engines deploy large-scale selection services on CPUs to identify a set of web pages that match user queries. An FPGA-based accelerator can exploit various levels of parallelism and provide a lower latency, higher throughput, more energy-efficient solution than commodity CPUs. However, maintaining such a customized accelerator in a commercial search engine is challenging because selection services are changed often. This article presents our design for FlexSaaS (Flexible Selection as a Service), an FPGA-based accelerator for web search selection. To address efficiency and flexibility challenges, FlexSaaS abstracts computing models and separates memory access from computation. Specifically, FlexSaaS (i) contains a reconfigurable number of matching processors that can handle various possible query plans, (ii) decouples index stream reading from matching computation to fetch and decode index files, and (iii) includes a universal memory accessor that hides the complex memory hierarchy and reduces host data access latency. Evaluated on FPGAs in the selection service of a commercial web search--the Bing web search engine—FlexSaaS can be evolved quickly to adapt to new updates. Compared to the software baseline, FlexSaaS on Arria 10 reduces average latency by 30% and increases throughput by 1.5×.

References

[1]
Intel. n.d. Altera SDK for OpenCL. Available at https://www.altera.com/.
[2]
Intel. n.d. Intel Vtune Amplifier. Retrieved January 26, 2019 from https://software.intel.com/en-us/intel-vtune-amplifier-xe.
[3]
OpenMP. n.d. OpenMP Home Page. Retrieved January 26, 2019 from https://www.openmp.org/.
[4]
Xilinx. n.d. SDAccel Development Environment. Available at https://www.xilinx.com/.
[5]
Falk Scholer, Hugh E. Williams, John Yiannis, and Justin Zobel. 2002. Compression of inverted indexes for fast query evaluation. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 222--229.
[6]
Vo Ngoc Anh and Alistair Moffat. 2005. Inverted index compression using word-aligned binary codes. Information Retrieval 8, 1 (2005), 151--166.
[7]
Naiyong Ao, Fan Zhang, Di Wu, Douglas S. Stones, Gang Wang, Xiaoguang Liu, et al. 2011. Efficient parallel lists intersection and index compression algorithms using graphics processing units. Proceedings of the VLDB Endowment 4, 8 (2011), 470--481.
[8]
Dan Blandford and Guy Blelloch. 2002. Index compression through document reordering. In Proceedings of the 2002 Data Compression Conference (DCC’02). IEEE, Los Alamitos, CA, 342--351.
[9]
Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual web search engine. In Proceedings of the 7th International World Wide Web Conference (WWW’98).
[10]
Andrei Z. Broder and Monika Rauch Henzinger. 1998. Information retrieval on the web. In Proceedings of the 39th Annual Symposium on Foundations of Computer Science (FOCS’98). 6.
[11]
Jared Casper and Kunle Olukotun. 2014. Hardware acceleration of database operations. In Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, New York, NY, 151--160.
[12]
Adrian M. Caulfield, Eric S. Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, et al. 2016. A cloud-scale acceleration architecture. In Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’16). IEEE, Los Alamitos, CA, 1--13.
[13]
Jeffrey Dean. 2009. Challenges in building large-scale information retrieval systems: Invited talk. In Proceedings of the 2nd ACM International Conference on Web Search and Data Mining. ACM, New York, NY, 1.
[14]
Shuai Ding, Jinru He, Hao Yan, and Torsten Suel. 2009. Using graphics processors for high performance IR query processing. In Proceedings of the 18th International Conference on World Wide Web. ACM, New York, NY, 421--430.
[15]
Tiziano Fagni, Raffaele Perego, Fabrizio Silvestri, and Salvatore Orlando. 2006. Boosting the performance of web search engines: Caching and prefetching query results by exploiting historical usage data. ACM Transactions on Information Systems 24, 1 (2006), 51--78.
[16]
Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, et al. 2016. EIE: Efficient inference engine on compressed deep neural network. In Proceedings of the 43rd International Symposium on Computer Architecture. IEEE, Los Alamitos, CA, 243--254.
[17]
Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, et al. 2017. In-datacenter performance analysis of a tensor processing unit. arXiv:1704.04760.
[18]
Erik Lindholm, John Nickolls, Stuart Oberman, and John Montrym. 2008. NVIDIA Tesla: A unified graphics and computing architecture. IEEE Micro 28, 2 (2008), 39--55.
[19]
Shaoli Liu, Zidong Du, Jinhua Tao, Dong Han, Tao Luo, Yuan Xie, et al. 2016. Cambricon: An instruction set architecture for neural networks. In Proceedings of the 43rd International Symposium on Computer Architecture. IEEE, Los Alamitos, CA, 393--405.
[20]
Ikuo Magaki, Moein Khazraee, Luis Vega Gutierrez, and Michael Bedford Taylor. 2016. ASIC clouds: Specializing the datacenter. In Proceedings of the 43rd International Symposium on Computer Architecture. IEEE, Los Alamitos, CA, 178--190.
[21]
DAC Manning. 1995. Introduction. In Introduction to Industrial Minerals. Springer, 1--16.
[22]
Jian Ouyang, Shiding Lin, Wei Qi, Yong Wang, Bo Yu, and Song Jiang. 2014. SDA: Software-defined accelerator for large-scale DNN systems. In Proceedings of the 2014 IEEE Hot Chips 26 Symposium (HCS’14). IEEE, Los Alamitos, CA, 1--23.
[23]
Jian Ouyang, Wei Qi, Yong Wang, Yichen Tu, Jing Wang, and Bowen Jia2016. SDA: Software-defined accelerator for general-purpose big data analysis system. In Proceedings of the 2016 IEEE Hot Chips 28 Symposium (HCS’16). IEEE, Los Alamitos, CA, 1--23.
[24]
Raghu Prabhakar, Yaqi Zhang, David Koeplinger, Matt Feldman, Tian Zhao, Stefan Hadjis, et al. 2017. Plasticine: A reconfigurable architecture for parallel patterns. In Proceedings of the 44th Annual International Symposium on Computer Architecture. ACM, New York, NY, 389--402.
[25]
Andrew Putnam, Adrian M. Caulfield, Eric S. Chung, Derek Chiou, Kypros Constantinides, John Demme, et al. 2014. A reconfigurable fabric for accelerating large-scale datacenter services. In Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture (ISCA’14). IEEE, Los Alamitos, CA, 13--24.
[26]
Rimon Tadros. 2015. Accelerating Web Search Using GPUs. Ph.D. Dissertation. University of British Columbia.
[27]
Wim Vanderbauwhede, Leif Azzopardi, and Mahmoud Moadeli. 2009. FPGA-accelerated information retrieval: High-efficiency document filtering. In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL’09). IEEE, Los Alamitos, CA, 417--422.
[28]
Di Wu, Fan Zhang, Naiyong Ao, Fang Wang, Xiaoguang Liu, and Gang Wang. 2009. A batched GPU algorithm for set intersection. In Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks (ISPAN’09). IEEE, Los Alamitos, CA, 752--756.
[29]
Di Wu, Fan Zhang, Naiyong Ao, Gang Wang, Xiaoguang Liu, and Jing Liu. 2010. Efficient lists intersection by CPU-GPU cooperative computing. In Proceedings of the IEEE International Symposium on Parallel and Distributed Processing, Workshops, and Phd Forum (IPDPSW’10). IEEE, Los Alamitos, CA, 1--8.
[30]
Jing Yan, Zhan-Xiang Zhao, Ning-Yi Xu, Xi Jin, Lin-Tao Zhang, and Feng-Hsiung Hsu. 2012. Efficient query processing for web search engine with FPGAs. In Proceedings of the IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM’12). IEEE, Los Alamitos, CA, 97--100.
[31]
Jiangong Zhang, Xiaohui Long, and Torsten Suel. 2008. Performance of compressed inverted list caching in search engines. In Proceedings of the 17th International Conference on World Wide Web. ACM, New York, NY, 387--396.
[32]
Justin Zobel and Alistair Moffat. 2006. Inverted files for text search engines. ACM Computing Surveys 38, 2 (2006), 6.
[33]
Marcin Zukowski, Sandor Heman, Niels Nes, and Peter Boncz. 2006. Super-scalar RAM-CPU cache compression. In Proceedings of the 22nd International Conference on Data Engineering (ICDE’06). IEEE, Los Alamitos, CA, 59.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Reconfigurable Technology and Systems
ACM Transactions on Reconfigurable Technology and Systems  Volume 12, Issue 1
March 2019
115 pages
ISSN:1936-7406
EISSN:1936-7414
DOI:10.1145/3310278
  • Editor:
  • Deming Chen
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 February 2019
Accepted: 01 December 2018
Revised: 01 October 2018
Received: 01 June 2018
Published in TRETS Volume 12, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. FPGA
  2. inverted index
  3. selection service
  4. web search engine

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Contribution during internship at Microsoft Research Asia

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 254
    Total Downloads
  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)2
Reflects downloads up to 13 Nov 2024

Other Metrics

Citations

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media