Abstract
Iterative compilation based on machine learning can automatically predict the best optimizations for new programs. However, efficient prediction models often require repeated training, which incurs high training-time overhead and greatly limits the widespread adoption of the technique. Existing approaches to constructing prediction models often use a random sample search strategy, which easily leads to data redundancy. In addition, to counter run-time noise, each sample program is observed a fixed number of times. When there is very little noise, these repeated observations seriously waste iterative compilation time. Therefore, effectively collecting the best samples for the prediction model and choosing an appropriate number of observations per sample are the key problems in reducing iterative compilation overhead. We propose ALIC, a low-overhead prediction model for iterative compilation optimization parameters. First, we describe the target programs with a static-dynamic feature representation based on feature-class relevance, and construct an initial optimization prediction model with a classifier. Then we use a dynamic number of observations for each sample: the most profitable sample in the candidate sample set is selected and marked, and each mark is equivalent to increasing the number of observations of that sample. Finally, the optimization prediction model is constructed from the intermediate prediction model, which actively learns from the candidate samples. Experimental results show that, when predicting optimization parameters for new programs on the Intel Xeon E5520 and Chinese Shenwei 26010 platforms, the ALIC model achieves average performance improvements of 1.38× (with the ICC 14.0 compiler) and 1.35× (with the GCC 5.4 compiler) on the Xeon platform, and 1.42× (with the SW compiler) on the Shenwei platform. In addition, the ALIC model significantly reduces the training-time overhead of iterative compilation compared with existing approaches.
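The idea of a dynamic number of observations per sample can be illustrated with a minimal Python sketch. This is not the ALIC implementation: the profit criterion (standard error of the mean runtime), the candidate names, the runtimes, and the alternating-sign noise used in place of real timing are all assumptions chosen for illustration. The sketch spends a fixed measurement budget by repeatedly granting one more observation to the candidate whose runtime estimate is currently the least certain.

```python
import math
import statistics


def standard_error(observations):
    """Uncertainty of the mean runtime; infinite until two observations exist."""
    if len(observations) < 2:
        return math.inf
    return statistics.stdev(observations) / math.sqrt(len(observations))


def collect_observations(candidates, measure, budget):
    """Spend `budget` measurements, always on the most uncertain candidate."""
    observations = {c: [] for c in candidates}
    for _ in range(budget):
        # "Marking" a candidate here means granting it one more observation.
        chosen = max(candidates, key=lambda c: standard_error(observations[c]))
        observations[chosen].append(measure(chosen))
    return observations


def make_measure(true_runtime, noise):
    """Hypothetical stand-in for compiling and timing a program variant.

    Noise alternates in sign (+, -, +, ...) so the example is reproducible.
    """
    calls = {c: 0 for c in true_runtime}

    def measure(candidate):
        sign = 1 if calls[candidate] % 2 == 0 else -1
        calls[candidate] += 1
        return true_runtime[candidate] + sign * noise[candidate]

    return measure


# Assumed candidates: two quiet configurations and one noisy configuration.
true_runtime = {"O2": 1.00, "O3": 0.80, "O3-unroll": 0.78}
noise = {"O2": 0.001, "O3": 0.001, "O3-unroll": 0.05}

obs = collect_observations(list(true_runtime), make_measure(true_runtime, noise), budget=30)
counts = {c: len(v) for c, v in obs.items()}
print(counts)  # {'O2': 2, 'O3': 2, 'O3-unroll': 26}
```

Under these assumptions the two quiet configurations are measured only twice each, while the noisy one absorbs the rest of the budget. A fixed-repetition scheme would instead have spent ten measurements on every configuration.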
Acknowledgements
This work was funded by the National Key Research and Development Program, "High-Performance Computing" key special project (Grant No. 2016YFB0200503).
Cite this article
Liu, H., Zhao, R., Wang, Q. et al. ALIC: A Low Overhead Compiler Optimization Prediction Model. Wireless Pers Commun 103, 809–829 (2018). https://doi.org/10.1007/s11277-018-5479-x
DOI: https://doi.org/10.1007/s11277-018-5479-x