Abstract
Realizing full autonomy has been a hot research topic for autonomous vehicles in recent years. For a long time, most efforts toward this goal have concentrated on understanding the scenes surrounding the ego-vehicle (the autonomous vehicle itself). By completing low-level vision tasks, such as detecting, tracking and segmenting the surrounding traffic participants, e.g., pedestrians, cyclists and vehicles, the scenes can be interpreted. However, for an autonomous vehicle, low-level vision tasks alone are largely insufficient for comprehensive scene understanding. What are the past, ongoing and future states of the scene participants? Answering this deeper question is what steers vehicles toward truly full automation, just as it guides human drivers. Motivated by this consideration, this paper investigates the interpretation of traffic scenes in autonomous driving from an event-reasoning perspective. To this end, we survey the most relevant literature and the state of the art in scene representation, event detection and intention prediction in autonomous driving. In addition, we discuss the open challenges and problems in this field and endeavor to provide possible solutions.
Additional information
This work was supported by National Key R&D Program Project of China (No. 2016YFB1001004), Natural Science Foundation of China (Nos. 61751308, 61603057, 61773311), China Postdoctoral Science Foundation (No. 2017M613152), and Collaborative Research with MSRA.
Recommended by Associate Editor Matjaz Gams
Jian-Ru Xue received the M. Sc. and Ph.D. degrees from Xi'an Jiaotong University (XJTU), China in 1999 and 2003, respectively. He was with Fuji Xerox, Japan from 2002 to 2003, and visited the University of California at Los Angeles, USA from 2008 to 2009. He is currently a professor with the Institute of Artificial Intelligence and Robotics at XJTU. He served as a co-organization chair of the Asian Conference on Computer Vision and the Virtual Systems and Multimedia Conference. He also served as a PC member of the Pattern Recognition Conference in 2012, and of the Asian Conference on Computer Vision in 2010 and 2012.
His research interests include computer vision, visual navigation, and scene understanding for autonomous systems.
Jian-Wu Fang received the Ph.D. degree in signal and information processing from the University of Chinese Academy of Sciences, China in 2015. He is currently an assistant professor at the School of Electronic and Control Engineering, Chang'an University, China, and a postdoctoral researcher at the Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, China.
His research interests include computer vision, pattern recognition and scene understanding.
Pu Zhang received the B. Sc. degree in automation from Southeast University, China in 2016. She is currently a Ph.D. candidate at the Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, China.
Her research interests include computer vision and on-road scene understanding.
Xue, JR., Fang, JW. & Zhang, P. A Survey of Scene Understanding by Event Reasoning in Autonomous Driving. Int. J. Autom. Comput. 15, 249–266 (2018). https://doi.org/10.1007/s11633-018-1126-y