Abstract
Univariate decision trees (UDTs) suffer from the inherent problems of replication, repetition, and fragmentation. Multivariate decision trees (MDTs) have been proposed to overcome some of these problems. Close examination of the conventional ways of building MDTs, however, reveals that the fragmentation problem still persists. A novel approach is suggested to minimize the fragmentation problem by separating hyperplane search from decision tree building. This is achieved by feature transformation. Let the initial feature vector be x and the new feature vector after a feature transformation T be y, i.e., y = T(x). We can obtain an MDT by (1) building a UDT on y; and (2) replacing the new features y at each node with combinations of the initial features x. We elaborate on the advantages of this approach, the details of T, and why it is expected to perform well. Experiments are conducted to confirm the analysis, and the results are compared to those of C4.5, OC1, and CART.
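To make the two-step recipe concrete, below is a minimal sketch, not the paper's actual algorithm: it assumes T is a linear transformation (scikit-learn's PCA is used purely as a stand-in, since the abstract does not pin down a particular T). A univariate tree is built on y = T(x), and each univariate test y_j <= t is then rewritten as a hyperplane test over the initial features x, where w_j is the j-th row of the projection matrix and mu the centering offset.

```python
# Sketch only: PCA stands in for the transformation T, which the
# abstract leaves unspecified; the mapping back to x relies on T
# being linear.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

X, labels = load_iris(return_X_y=True)

# Step 1: transform features, y = T(x). PCA computes y = W(x - mu).
pca = PCA(n_components=2)
Y = pca.fit_transform(X)

# Step 2: build an ordinary univariate decision tree on the new features y.
udt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Y, labels)

# Step 3: re-express each univariate test "y_j <= t" as the multivariate
# test "w_j . x <= t + w_j . mu" over the initial features, yielding an MDT.
tree = udt.tree_
for node in range(tree.node_count):
    j = tree.feature[node]
    if j >= 0:  # internal node (leaves store feature == -2)
        w = pca.components_[j]                 # hyperplane normal (row of W)
        t = tree.threshold[node] + w @ pca.mean_
        terms = " ".join(f"{c:+.2f}*x{i}" for i, c in enumerate(w))
        print(f"node {node}: {terms} <= {t:.2f}")
```

Because T is linear here, every recovered split is an oblique hyperplane in the original feature space; a nonlinear T would instead yield nonlinear multivariate tests at the nodes.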
References
K.P. Bennett and O.L. Mangasarian. Neural network training via linear programming. In P.M. Pardalos, editor, Advances in Optimization and Parallel Computing, pages 56–67. Elsevier Science Publishers B.V., Amsterdam, 1992.
L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone. Classification and Regression Trees. Wadsworth & Brooks/Cole Advanced Books & Software, 1984.
C.E. Brodley and P.E. Utgoff. Multivariate decision trees. Machine Learning, 19:45–77, 1995.
M. Dash and H. Liu. Feature selection methods for classification. Intelligent Data Analysis: An International Journal, 1(3), 1997. http://www-east.elsevier.com/ida/free.htm.
U.M. Fayyad and K.B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pages 1022–1027. Morgan Kaufmann Publishers, Inc., 1993.
J.H. Friedman, R. Kohavi, and Y. Yun. Lazy decision trees. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 717–724, 1996.
L. Fu. Neural Networks in Computer Intelligence. McGraw-Hill, 1994.
B. Hassibi and D.G. Stork. Second order derivatives for network pruning: Optimal brain surgeon. Neural Information Processing Systems, 5:164–171, 1993.
D. Heath, S. Kasif, and S. Salzberg. Learning oblique decision trees. In Proceedings of the Thirteenth International Joint Conference on AI, pages 1002–1007, France, 1993.
K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129–134. Menlo Park: AAAI Press/The MIT Press, 1992.
H. Liu and R. Setiono. Chi2: Feature selection and discretization of numeric attributes. In J.F. Vassilopoulos, editor, Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence, November 5–8, 1995, pages 388–391, Herndon, Virginia, 1995. IEEE Computer Society.
C. Matheus and L. Rendell. Constructive induction on decision trees. In Proceedings of International Joint Conference on AI, pages 645–650, August 1989.
C.J. Merz and P.M. Murphy. UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html. Irvine, CA: University of California, Department of Information and Computer Science, 1996.
John Mingers. An empirical comparison of selection measures for decision-tree induction. Machine Learning, 3:319–342, 1989.
S. Murthy, S. Kasif, S. Salzberg, and R. Beigel. OC1: Randomized induction of oblique decision trees. In Proceedings of AAAI Conference (AAAI’93), pages 322–327. AAAI Press/The MIT Press, 1993.
G. Pagallo and D. Haussler. Boolean feature discovery in empirical learning. Machine Learning, 5:71–99, 1990.
J.R. Quinlan. Induction of decision trees. Machine Learning, 1(1):81–106, 1986.
J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
D.E. Rumelhart, J.L. McClelland, and the PDP Research Group. Parallel Distributed Processing, volume 1. Cambridge, Mass.: The MIT Press, 1986.
I.K. Sethi. Neural implementation of tree classifiers. IEEE Trans. on Systems, Man, and Cybernetics, 25(8), August 1995.
R. Setiono. A penalty-function approach for pruning feedforward neural networks. Neural Computation, 9(1):185–204, 1997.
R. Setiono and H. Liu. Understanding neural networks via rule extraction. In Proceedings of International Joint Conference on AI, 1995.
R. Setiono and H. Liu. Analysis of hidden representations by greedy clustering. Connection Science, 10(1):21–42, 1998.
J.W. Shavlik, R.J. Mooney, and G.G. Towell. Symbolic and neural learning algorithms: An experimental comparison. Machine Learning, 6(2):111–143, 1991.
G.G. Towell and J.W. Shavlik. Extracting refined rules from knowledge-based neural networks. Machine Learning, 13(1):71–101, 1993.
P.E. Utgoff and C.E. Brodley. An incremental method for finding multivariate splits for decision trees. In Machine Learning: Proceedings of the Seventh International Conference, pages 58–65, University of Texas, Austin, Texas, 1990.
R. Vilalta, G. Blix, and L. Rendell. Global data analysis and the fragmentation problem in decision tree induction. In M. van Someren and G. Widmer, editors, Machine Learning: ECML-97, pages 312–326. Springer-Verlag, 1997.
J. Wnek and R.S. Michalski. Hypothesis-driven constructive induction in AQ17-HCI: A method and experiments. Machine Learning, 14, 1994.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Liu, H., Setiono, R. (1998). Feature Transformation and Multivariate Decision Tree Induction. In: Arikawa, S., Motoda, H. (eds) Discovery Science. DS 1998. Lecture Notes in Computer Science, vol 1532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49292-5_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65390-5
Online ISBN: 978-3-540-49292-4