DOI: 10.1145/2791405.2791470
research-article

Entropy based content filtering for Mobile Web Page Adaptation

Published: 10 August 2015

Abstract

The global growth in mobile device usage and the availability of internet services on phones have increased internet use on these devices. However, mobile devices are constrained by limited screen space, lower bandwidth, and less processing capability. Most websites are designed to be viewed on desktop PCs, so mobile internet users find them difficult to browse. In this paper we present an approach that filters the informative content of a web page and rearranges it to fit the available screen space of a mobile device. The web page is first segmented into visual blocks, and an entropy measure is then computed for each block in terms of content entropy and feature entropy. A neural network classifier is then trained to separate main content from noise content.
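The abstract does not give the entropy formulas, so the following is only a minimal sketch of how a per-block entropy measure might be computed, assuming plain Shannon entropy over empirical frequencies. The function names, the whitespace tokenizer, and the choice of HTML tag names as block "features" are all illustrative assumptions, not the authors' definitions.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy, in bits, of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty block carries no information
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def block_entropies(block_text, block_tags):
    """Return assumed content and feature entropy for one visual block.

    Assumption: content entropy is taken over the block's word tokens and
    feature entropy over the HTML tag names found inside the block.
    """
    tokens = block_text.lower().split()
    return {
        "content_entropy": shannon_entropy(tokens),
        "feature_entropy": shannon_entropy(block_tags),
    }


# Hypothetical blocks: a repetitive navigation bar and a short article body.
nav_block = block_entropies("home home home home", ["a", "a", "a", "a"])
article_block = block_entropies(
    "entropy separates informative blocks from noise", ["p", "h2", "img", "p"]
)
```

In this sketch the repetitive navigation block scores zero on both measures, while the article block scores higher. Values like these could serve as input features for a classifier such as the neural network the abstract mentions.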


Cited By

  • (2020) Web page filtering for kids. International Journal of Information Technology. DOI: 10.1007/s41870-020-00474-0. Online publication date: 19-May-2020
  • (2017) Web content adaptation using 2D bin packing algorithm. International Journal of Information Technology 9:2 (139-146). DOI: 10.1007/s41870-017-0019-6. Online publication date: 7-Jun-2017


Published In

WCI '15: Proceedings of the Third International Symposium on Women in Computing and Informatics
August 2015
763 pages
ISBN:9781450333610
DOI:10.1145/2791405
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. DOM
  2. Entropy
  3. Similarity measure
  4. Visual Blocks

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WCI '15

Acceptance Rates

WCI '15 Paper Acceptance Rate 98 of 452 submissions, 22%;
Overall Acceptance Rate 98 of 452 submissions, 22%


