Abstract
Cubemap projection (CMP) has become a potential panoramic data format because of its efficiency. However, the default CMP coordinate system, with its fixed viewpoint, may cause distortion, especially around the boundaries of each projection plane. To improve the quality of panoramic images in CMP, we propose a content-aware CMP optimization method based on deep Q-learning. The key idea is to predict an angle by which to rotate the image in equirectangular projection (ERP), so that foreground objects are kept away from the edges of each projection plane after the image is re-projected with CMP. First, the panoramic image in ERP is preprocessed to obtain a foreground pixel map. Second, we feed the foreground map into the proposed deep convolutional network (ConvNet) to obtain the predicted rotation angle. The model parameters are trained with a deep Q-learning scheme. Experimental results show that our method keeps more foreground pixels in the center of each projection plane than the baseline.
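The objective described above can be illustrated with a minimal sketch: given a binary foreground map in ERP, rotate it and count how many foreground pixels fall in the central region of a cube face. Everything specific here is an assumption made for illustration, not the paper's implementation: the rotation is restricted to yaw (a horizontal circular shift of the ERP image), "central" is defined by a hypothetical `margin` parameter, pixels are counted on the ERP grid rather than on the resampled cube faces, and a brute-force search over candidate angles stands in for the ConvNet trained with deep Q-learning. The function names are hypothetical.

```python
import numpy as np

def central_mask(height, width, margin=0.25):
    """ERP-sized boolean mask: True where a pixel projects into the central
    part of its cube face (both face coordinates within 1 - margin)."""
    # Pixel-center longitudes in [-pi, pi) and latitudes in (-pi/2, pi/2).
    lon = (np.arange(width) + 0.5) / width * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(height) + 0.5) / height * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Unit viewing direction for every ERP pixel.
    d = np.stack([np.cos(lat) * np.cos(lon),
                  np.cos(lat) * np.sin(lon),
                  np.sin(lat)], axis=-1)
    a = np.abs(d)
    # Dividing by the dominant component maps the pixel onto its cube face:
    # the dominant axis becomes 1, the other two are |u| and |v| in [0, 1].
    scaled = a / a.max(axis=-1, keepdims=True)
    # The middle value after sorting is max(|u|, |v|), i.e. edge proximity.
    edge_proximity = np.sort(scaled, axis=-1)[..., 1]
    return edge_proximity <= 1.0 - margin

def central_foreground(fg, angle_deg, mask):
    """Count foreground pixels inside the central mask after a yaw rotation.
    Assumption: a positive angle shifts content toward higher longitude."""
    shift = int(round(angle_deg / 360.0 * fg.shape[1]))
    rotated = np.roll(fg, shift, axis=1)
    return int(np.logical_and(rotated, mask).sum())

def best_yaw(fg, candidates, margin=0.25):
    """Exhaustive stand-in for the learned predictor: score every candidate
    angle and return the best one together with all scores."""
    mask = central_mask(fg.shape[0], fg.shape[1], margin=margin)
    scores = [central_foreground(fg, a, mask) for a in candidates]
    return candidates[int(np.argmax(scores))], scores
```

A call such as `best_yaw(fg, [0, 15, 30, 45, 60, 75])` on a foreground map whose object straddles a face boundary should prefer an angle that moves the object toward a face center. Because cube faces repeat every 90 degrees in yaw, candidates outside [0, 90) add no information under this simplified setup.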
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant 31670553, Grant 61871270 and Grant 61672443), in part by the Guangdong Natural Science Foundation of China under Grant 2016A030310058, in part by the Natural Science Foundation of SZU (Grant 827000144) and in part by the National Engineering Laboratory for Big Data System Computing Technology of China.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, Z., Wang, X., Zhou, Y., Zou, L., Jiang, J. (2020). Content-Aware Cubemap Projection for Panoramic Image via Deep Q-Learning. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_25
DOI: https://doi.org/10.1007/978-3-030-37734-2_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37733-5
Online ISBN: 978-3-030-37734-2
eBook Packages: Computer Science, Computer Science (R0)