Deep Regressor Stacking for Air Ticket Prices Prediction
Abstract
Purchasing air tickets at the lowest price is a challenging task for consumers, since prices fluctuate over time under the influence of several factors. To support users’ decisions, several price prediction techniques have been developed. Considering that this problem can be addressed with multi-target approaches from Machine Learning, this work proposes a novel method aimed at improving air ticket price prediction. The method, called Deep Regressor Stacking (DRS), applies a naive deep learning methodology to reach more accurate predictions. To evaluate the contribution of DRS, it was compared with single-target regression and two state-of-the-art multi-target regression methods (Stacked Single Target and Ensemble of Regressor Chains). All four approaches were built on the Random Forest and Support Vector Machine algorithms and run over two real-life airfare datasets. The results show that DRS outperformed the other three methods, making it the most suitable (most predictive) for assisting air passengers in predicting flight ticket prices.
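The abstract names Stacked Single Target (SST) as one of the multi-target baselines but does not describe it, nor how DRS arranges its layers. As a rough illustration of the stacking idea only, the Python sketch below trains one model per target, appends those first-stage estimates to the inputs, and trains a second model per target on the augmented inputs. The k-nearest-neighbour base learner, the function names (fit_knn, fit_sst), and the toy data are assumptions made for this sketch; the work itself uses Random Forest and Support Vector Machine, and this is not the authors' DRS implementation.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Predictor = Callable[[Vector], float]


def fit_knn(X: Sequence[Vector], y: Sequence[float], k: int = 3) -> Predictor:
    """Stand-in base learner (assumption): mean target of the k nearest rows."""
    data = list(zip(X, y))

    def predict(x: Vector) -> float:
        # Rank training rows by squared Euclidean distance to x.
        nearest = sorted(
            data,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
        )[:k]
        return sum(target for _, target in nearest) / len(nearest)

    return predict


def fit_sst(
    X: Sequence[Vector],
    Y: Sequence[Vector],
    fit_base: Callable[..., Predictor] = fit_knn,
) -> Callable[[Vector], List[float]]:
    """Two-stage stacked single-target regression (illustrative sketch)."""
    n_targets = len(Y[0])

    # Stage 1: one independent single-target model per target.
    stage1 = [fit_base(X, [row[t] for row in Y]) for t in range(n_targets)]

    # Augment each training row with the stage-1 estimates of all targets.
    X_aug = [list(x) + [model(x) for model in stage1] for x in X]

    # Stage 2: one model per target, trained on the augmented inputs.
    stage2 = [fit_base(X_aug, [row[t] for row in Y]) for t in range(n_targets)]

    def predict(x: Vector) -> List[float]:
        x_aug = list(x) + [model(x) for model in stage1]
        return [model(x_aug) for model in stage2]

    return predict


# Toy data (hypothetical): one input feature, two correlated targets.
X_train = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
Y_train = [[2.0 * x[0], 2.0 * x[0] + 1.0] for x in X_train]

sst_predict = fit_sst(X_train, Y_train)
print(sst_predict([3.0]))
```

Passing a different `fit_base` swaps the base learner without changing the stacking logic, which mirrors how the compared approaches share the same underlying algorithms. Note that the stage-1 estimates here are computed in-sample; using out-of-sample estimates (for example, from cross-validation) is a common refinement that this sketch omits.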