research-article

HW-Forest: Deep Forest with Hashing Screening and Window Screening

Authors:

Xindong Wu

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 16, Issue 6

Article No.: 123, Pages 1 - 24

https://doi.org/10.1145/3532193

Published: 30 July 2022

Abstract

As a novel deep learning model, gcForest has been widely used in various applications. However, the multi-grained scanning stage of gcForest produces many redundant feature vectors, which increases the time cost of the model. To screen out redundant feature vectors, we introduce a hashing screening mechanism for multi-grained scanning and propose a model called HW-Forest, which adopts two strategies: hashing screening and window screening. In the hashing screening strategy, HW-Forest employs a perceptual hashing algorithm to calculate the similarity between feature vectors; this similarity is used to remove the redundant feature vectors produced by multi-grained scanning, which can significantly decrease time cost and memory consumption. Furthermore, we adopt a self-adaptive instance screening strategy called window screening to improve the performance of our approach; it can achieve higher accuracy without hyperparameter tuning on different datasets. Our experimental results show that HW-Forest achieves higher accuracy than other models while also reducing time cost.
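The idea of hashing screening can be illustrated with a minimal sketch. The abstract does not specify the hash function, similarity threshold, or window settings, so everything below is an assumption made for illustration: the function names (`sliding_windows`, `average_hash`, `hashing_screen`), the mean-threshold (aHash-style) fingerprint, and the Hamming-distance cutoff `max_hamming` are hypothetical and are not taken from the paper.

```python
import numpy as np

def sliding_windows(x, size, stride=1):
    """Multi-grained scanning on a 1-D instance: stack every contiguous window of length `size`."""
    return np.stack([x[i:i + size] for i in range(0, len(x) - size + 1, stride)])

def average_hash(v):
    """Binary fingerprint (assumed aHash-style): 1 where an entry exceeds the vector's mean."""
    return (v > v.mean()).astype(np.uint8)

def hashing_screen(windows, max_hamming=0):
    """Return indices of windows to keep.

    A window is kept only if its hash differs from the hash of every
    previously kept window by more than `max_hamming` bits; otherwise it
    is treated as redundant and skipped.
    """
    kept_idx, kept_hashes = [], []
    for i, w in enumerate(windows):
        h = average_hash(w)
        if all(np.count_nonzero(h != k) > max_hamming for k in kept_hashes):
            kept_idx.append(i)
            kept_hashes.append(h)
    return kept_idx

# A periodic signal yields many windows with identical fingerprints.
x = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
windows = sliding_windows(x, size=4)   # 5 windows of length 4
kept = hashing_screen(windows)         # indices of non-redundant windows
print(len(windows), kept)
```

In this toy input, only two distinct fingerprints occur among the five windows, so only two windows would be passed on to the forests. A larger `max_hamming` screens more aggressively, trading possible information loss for lower time and memory cost.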



Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning algorithms
    2. Machine learning approaches
      1. Classification and regression trees


Information & Contributors

Information

Published In

ACM Transactions on Knowledge Discovery from Data Volume 16, Issue 6

December 2022

631 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/3543989

Editor:
Charu Aggarwal
IBM T. J. Watson Research, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 July 2022

Online AM: 04 May 2022

Accepted: 01 April 2022

Revised: 01 March 2022

Received: 01 November 2021

Published in TKDD Volume 16, Issue 6

Qualifiers

Research-article
Refereed

Funding Sources

National Natural Science Foundation of China
National Key Research and Development Program of China
Natural Science Foundation of Hebei Province, China
Graduate Student Innovation Program of Hebei Province
