
NMNN: Newtonian Mechanics-based Natural Neighbor algorithm

Published: 18 October 2024

Abstract

The natural neighbor (NaN) algorithm, a parameter-free alternative to KNN, is widely used in fields such as pattern recognition and machine learning. However, the original NaN algorithm uses only the Euclidean distance between samples as its similarity criterion, which does not adequately reflect the relationships among samples and harms both convergence and accuracy. Newtonian mechanics, by contrast, accounts for the effect of forces from several perspectives. This paper therefore introduces Newtonian mechanics into the calculation of inter-sample relationships and proposes the Newtonian Mechanics-based Natural Neighbor algorithm (NMNN). First, it incorporates Newton's first, second, and third laws and the concept of gravity into the natural neighbor algorithm. Next, it computes the mass of each sample from its natural neighbor density. It then updates the similarity matrix through Newtonian mechanics, making the calculated results more accurate; the formation of natural neighbors can thus be regarded as samples spontaneously attracting one another under this traction. Finally, experimental results show that the NMNN algorithm converges better than the NaN algorithm and achieves stronger results in applications such as imbalanced data, clustering, and outlier detection.
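The abstract's core idea — giving each sample a "mass" derived from its neighborhood density and scoring pairwise similarity with a gravitational force law — can be illustrated with a minimal sketch. This is not the paper's exact formulation: the density estimate (inverse mean k-NN distance) and the force form F = m_i · m_j / d² are assumptions chosen to mirror the description, and `gravity_similarity` is a hypothetical helper name.

```python
import numpy as np

def gravity_similarity(X, k=3):
    """Sketch of a gravity-style similarity matrix.

    Each sample's "mass" is a simple density estimate (inverse mean
    distance to its k nearest neighbors), and pairwise similarity
    follows the Newtonian form F = m_i * m_j / d^2.
    """
    # Pairwise Euclidean distances between all samples
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Mass: inverse of the mean distance to the k nearest neighbors
    knn_d = np.sort(d, axis=1)[:, 1:k + 1]  # column 0 is the self-distance (0)
    mass = 1.0 / (knn_d.mean(axis=1) + 1e-12)
    # Gravitational "force" between every pair of samples
    with np.errstate(divide="ignore"):
        F = np.outer(mass, mass) / np.square(d)
    np.fill_diagonal(F, 0.0)  # no self-attraction
    return F

# Three samples in a dense cluster plus one distant outlier
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
F = gravity_similarity(X, k=2)
print(F[0, 1] > F[0, 3])  # dense, nearby samples attract far more strongly
```

Under this force law, two nearby high-density samples attract each other much more strongly than a dense sample and a sparse outlier at the same raw distance would, which is the qualitative behavior the paper attributes to its Newtonian similarity update.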




          Published In

          Information Sciences: an International Journal, Volume 681, Issue C (October 2024), 1022 pages

          Publisher

          Elsevier Science Inc., United States


          Author Tags

          1. Natural neighbor
          2. Newtonian mechanics
          3. Unbalanced data
          4. Clustering
          5. Outlier detection

          Qualifiers

          • Research-article
