
Towards federated feature selection: Logarithmic division for resource-conscious methods

Published: 25 September 2024

Abstract

Feature selection is a popular preprocessing step to reduce the dimensionality of the data while preserving the important information. In this paper, we propose an efficient and green feature selection method based on information theory, with the novelty of using logarithmic division and resorting to fixed-point precision. Moreover, we extend these advancements by adapting the Mutual Information calculation to federated scenarios. The results of experiments conducted on several datasets indicate the potential of our proposal, as it does not incur significant information loss compared to the double-precision method, either in the features selected or in the subsequent classification step. Our method has also shown significant potential in federated scenarios, where experiments demonstrated a lossless feature selection process that maintains classification results compared with the centralised versions. These findings open up the possibility of a new family of green feature selection methods, which would help to minimise energy consumption, lower carbon emissions and increase adaptability to Internet of Things environments.
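The core idea behind the logarithmic-division approach can be illustrated on the Mutual Information estimator itself: every probability ratio p(x,y)/(p(x)p(y)) is evaluated in the log domain, where division becomes subtraction of logarithms. The sketch below is a minimal double-precision illustration of this identity, not the paper's fixed-point implementation; the function name is ours.

```python
import math
from collections import Counter

def mutual_information_logdiv(xs, ys):
    """Mutual information (in bits) between two discrete sequences.

    The probability ratio p(x,y) / (p(x) p(y)) equals c_xy * N / (c_x * c_y)
    in terms of raw counts, so its logarithm can be computed with no
    division at all:
        log2(c_xy) + log2(N) - log2(c_x) - log2(c_y)
    """
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    log_n = math.log2(n)
    mi = 0.0
    for (x, y), c in cxy.items():
        # log of the probability ratio, via subtraction instead of division
        log_ratio = math.log2(c) + log_n - math.log2(cx[x]) - math.log2(cy[y])
        mi += c * log_ratio
    return mi / n  # normalise counts to probabilities once at the end
```

Beyond avoiding costly divisions, working with logarithms of integer counts is what makes a fixed-point, low-precision realisation of the estimator feasible.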

Highlights

Implementation of green feature selection methods based on information theory.
Study of logarithmic division to reduce energy and memory consumption.
Federated Mutual Information calculation enables data privacy in IoT environments.
Feature selection with logarithmic division obtains results similar to double-precision.
Lossless and robust methods have been developed for federated scenarios.
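The lossless claim for the federated setting follows from the fact that Mutual Information depends only on global contingency counts, so clients can share counts instead of raw samples and the server's aggregate recovers the centralised result exactly. The following Python sketch illustrates this aggregation scheme under our own naming; it is an assumption-laden simplification, not the paper's protocol.

```python
import math
from collections import Counter

def local_joint_counts(xs, ys):
    """Computed on-device: only these joint counts are sent to the
    server; the raw samples never leave the client."""
    return Counter(zip(xs, ys))

def federated_mutual_information(client_counts):
    """Server side: sum the clients' joint counts, then compute MI.

    Summing per-client counts is lossless because MI is a function of
    the global contingency table alone, so the result equals the
    centralised computation on the pooled data.
    """
    joint = Counter()
    for c in client_counts:
        joint.update(c)
    n = sum(joint.values())
    cx, cy = Counter(), Counter()
    for (x, y), c in joint.items():
        cx[x] += c
        cy[y] += c
    mi = 0.0
    for (x, y), c in joint.items():
        # probability ratio evaluated in the log domain
        mi += c * (math.log2(c) + math.log2(n)
                   - math.log2(cx[x]) - math.log2(cy[y]))
    return mi / n
```

Splitting a dataset across two clients and aggregating their counts yields exactly the same MI value as computing it on the pooled data, which is the sense in which the federated calculation is lossless.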



Published In

Neurocomputing, Volume 596, Issue C, September 2024, 611 pages

Publisher

Elsevier Science Publishers B.V., Netherlands


Author Tags

  1. Feature selection
  2. Mutual information
  3. Low precision
  4. Internet of Things
  5. Federated learning
  6. Logarithmic division

Qualifiers

  • Research-article
