FUDT: A Fuzzy Uncertain Decision Tree Algorithm for Classification of Uncertain Data

  • Research Article - Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

Classifying uncertain data has become one of the more tedious tasks in the data mining domain. Uncertain data consist of tuples with distinct probability distributions, which help to identify tuples of a similar class. For uncertain data, the feature vector is not single-valued but a function. In this paper, we propose a fuzzy entropy and a similarity measure to characterize uncertain data through a binary decision tree algorithm. Fuzzy entropy is used to find the best split point for the decision tree when handling uncertain data. The similarity measure is used to make better decisions for the uncertain data with high accuracy. Initially, the fuzzy entropy of each feature vector is calculated to select the best feature vector. Then, the best split is selected from the selected feature vector. The binary tree is grown from the uncertain training data. Once the split points are selected, the constructed decision tree is evaluated in the testing phase: the uncertain test data are passed through the trained decision tree to obtain the classified data. Experimental analyses are carried out to evaluate the performance of the proposed FUDT approach. The proposed FUDT algorithm is compared with the existing classification algorithm UDT in terms of accuracy and running time. The experimental analysis concludes that the FUDT algorithm outperforms the existing UDT algorithm.
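The split-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's fuzzy entropy formula or its similarity measure, so everything below is an assumption made for illustration: each uncertain feature value is modelled as an interval (lo, hi), a tuple belongs to the left branch of a split z with a fractional degree, and the score is a membership-weighted Shannon entropy standing in for the paper's fuzzy entropy. The names left_membership, weighted_entropy, split_entropy and best_split are hypothetical and are not taken from FUDT.

```python
import math


def left_membership(interval, z):
    """Degree in [0, 1] to which an uncertain value (lo, hi) lies at or below z.

    Assumption: a linear membership over the interval; the paper may differ.
    """
    lo, hi = interval
    if z >= hi:
        return 1.0
    if z <= lo:
        return 0.0
    return (z - lo) / (hi - lo)


def weighted_entropy(class_weights):
    """Shannon entropy (base 2) of a class distribution given fractional weights."""
    total = sum(class_weights.values())
    if total == 0:
        return 0.0
    h = 0.0
    for w in class_weights.values():
        if w > 0:
            p = w / total
            h -= p * math.log2(p)
    return h


def split_entropy(column, labels, z):
    """Membership-weighted entropy of splitting one feature column at z."""
    left, right = {}, {}
    for interval, label in zip(column, labels):
        m = left_membership(interval, z)
        left[label] = left.get(label, 0.0) + m
        right[label] = right.get(label, 0.0) + (1.0 - m)
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return (n_left / n) * weighted_entropy(left) + (n_right / n) * weighted_entropy(right)


def best_split(data, labels):
    """Return (entropy, feature_index, split_point) with the lowest entropy.

    data: non-empty list of tuples, each a list of (lo, hi) intervals per feature.
    Candidate split points are the interval end points of each feature.
    """
    best = None
    for f in range(len(data[0])):
        column = [row[f] for row in data]
        candidates = sorted({end for lo, hi in column for end in (lo, hi)})
        for z in candidates:
            h = split_entropy(column, labels, z)
            if best is None or h < best[0]:
                best = (h, f, z)
    return best


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 carries no information.
    data = [
        [(0.0, 1.0), (5.0, 6.0)],
        [(0.5, 1.5), (5.0, 6.0)],
        [(3.0, 4.0), (5.0, 6.0)],
        [(3.5, 4.5), (5.0, 6.0)],
    ]
    labels = ["a", "a", "b", "b"]
    print(best_split(data, labels))
```

Applying best_split recursively to the two weighted partitions would grow a binary tree in the manner the abstract outlines; the similarity measure used at decision time is not sketched here because the abstract does not specify it.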

Author information

Corresponding author

Correspondence to S. Meenakshi.

About this article

Cite this article

Meenakshi, S., Venkatachalam, V. FUDT: A Fuzzy Uncertain Decision Tree Algorithm for Classification of Uncertain Data. Arab J Sci Eng 40, 3187–3196 (2015). https://doi.org/10.1007/s13369-015-1800-0

  • DOI: https://doi.org/10.1007/s13369-015-1800-0
