
NMNN: Newtonian Mechanics-based Natural Neighbor algorithm

Published: 18 October 2024

Abstract

The natural neighbor (NaN) algorithm, a parameter-free alternative to KNN, is widely used in fields such as pattern recognition and machine learning. However, the original NaN algorithm uses only the Euclidean distance between samples as its similarity criterion, which does not adequately reflect the relationships among samples and harms both convergence and accuracy. Newtonian mechanics, by contrast, accounts for the effect of forces from several perspectives. This paper therefore introduces Newtonian mechanics into the calculation of inter-sample relationships and proposes the Newtonian Mechanics-based Natural Neighbor algorithm (NMNN). First, it incorporates Newton's first, second, and third laws and the concept of gravity into the natural neighbor algorithm. Next, it computes the mass of each sample from its natural neighbor density. It then updates the similarity matrix through Newtonian mechanics, making the calculated results more accurate; the formation of natural neighbors can thus be regarded as samples spontaneously attracting one another under this traction. Finally, experimental results show that the NMNN algorithm converges better than the NaN algorithm and achieves stronger results in applications such as imbalanced data, clustering, and outlier detection.
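The abstract's core idea — giving each sample a "mass" derived from its neighborhood density and scoring pairwise similarity with a gravitational force law — can be illustrated with a minimal sketch. This is not the paper's exact formulation: the density estimate (inverse mean k-NN distance) and the force form F = m_i · m_j / d² are assumptions chosen to mirror the description, and `gravity_similarity` is a hypothetical helper name.

```python
import numpy as np

def gravity_similarity(X, k=3):
    """Sketch of a gravity-style similarity matrix.

    Each sample's "mass" is a simple density estimate (inverse mean
    distance to its k nearest neighbors), and pairwise similarity
    follows the Newtonian form F = m_i * m_j / d^2.
    """
    # Pairwise Euclidean distances between all samples
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Mass: inverse of the mean distance to the k nearest neighbors
    knn_d = np.sort(d, axis=1)[:, 1:k + 1]  # column 0 is the self-distance (0)
    mass = 1.0 / (knn_d.mean(axis=1) + 1e-12)
    # Gravitational "force" between every pair of samples
    with np.errstate(divide="ignore"):
        F = np.outer(mass, mass) / np.square(d)
    np.fill_diagonal(F, 0.0)  # no self-attraction
    return F

# Three samples in a dense cluster plus one distant outlier
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
F = gravity_similarity(X, k=2)
print(F[0, 1] > F[0, 3])  # dense, nearby samples attract far more strongly
```

Under this force law, two nearby high-density samples attract each other much more strongly than a dense sample and a sparse outlier at the same raw distance would, which is the qualitative behavior the paper attributes to its Newtonian similarity update.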




          Published In

          Information Sciences: an International Journal, Volume 681, Issue C (October 2024), 1022 pages

          Publisher

          Elsevier Science Inc., United States


          Author Tags

          1. Natural neighbor
          2. Newtonian mechanics
          3. Unbalanced data
          4. Clustering
          5. Outlier detection

          Qualifiers

          • Research-article
