Towards Trajectory Forecasting From Detection

Published: 01 October 2023

Abstract

Trajectory forecasting for traffic participants (e.g., vehicles) is critical for autonomous platforms to make safe plans. Most current trajectory forecasting methods assume that object trajectories have already been extracted, and they build trajectory predictors directly on ground-truth trajectories. However, this assumption does not hold in practice: trajectories obtained from object detection and tracking are inevitably noisy, which can cause serious forecasting errors in predictors trained on ground-truth trajectories. In this paper, we propose to predict trajectories directly from detection results, without relying on explicitly formed trajectories. Unlike traditional methods, which encode an agent's motion cues from its clearly defined trajectory, we extract motion information solely from the affinity cues among detection results, designing an affinity-aware state update mechanism to manage the state information. In addition, since there may be multiple plausible matching candidates, we aggregate their states. These designs take the uncertainty of data association into account, which mitigates the undesirable effect of noisy trajectories produced by data association and improves the robustness of the predictor. Extensive experiments validate the effectiveness of our method and its ability to generalize across different detectors and forecasting schemes.
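The abstract describes replacing hard data association with a soft, affinity-weighted update of each agent's state. The snippet below is a minimal sketch of that idea, not the paper's released code: given per-agent states from the previous frame, detection features from the current frame, and an affinity matrix between them, every plausible matching candidate contributes to the state update in proportion to its affinity. All names (`AffinityAwareUpdate`, `det_dim`, `state_dim`) and the choice of a GRU cell are illustrative assumptions.

```python
import torch
import torch.nn as nn


class AffinityAwareUpdate(nn.Module):
    """Sketch of an affinity-aware state update: each agent's state is
    refreshed from ALL plausible detection matches, weighted by a softmax
    over affinity scores, instead of from one hard-associated detection."""

    def __init__(self, det_dim: int, state_dim: int):
        super().__init__()
        self.embed = nn.Linear(det_dim, state_dim)   # encode raw detection features
        self.gru = nn.GRUCell(state_dim, state_dim)  # recurrent state update

    def forward(self, states: torch.Tensor, dets: torch.Tensor,
                affinity: torch.Tensor) -> torch.Tensor:
        # states:   (N, state_dim)  per-agent states from the previous frame
        # dets:     (M, det_dim)    detection features in the current frame
        # affinity: (N, M)          association scores (higher = likelier match)
        # Soft assignment keeps the uncertainty of data association instead
        # of committing to a single trajectory.
        weights = torch.softmax(affinity, dim=1)   # (N, M)
        # Aggregate the embedded detections of all plausible candidates.
        agg = weights @ self.embed(dets)           # (N, state_dim)
        # Recurrent update driven by the aggregated observation.
        return self.gru(agg, states)


# Example: 3 tracked agents, 5 detections in the current frame.
update = AffinityAwareUpdate(det_dim=16, state_dim=64)
states = torch.zeros(3, 64)
dets = torch.randn(5, 16)
affinity = torch.randn(3, 5)
new_states = update(states, dets, affinity)  # (3, 64)
```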


Cited By

  • Heterogeneous Trajectory Forecasting via Risk and Scene Graph Learning, IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 11, pp. 12078–12091, Nov. 2023. DOI: 10.1109/TITS.2023.3287186


Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 45, Issue 10, Oct. 2023, 1331 pages

Publisher

IEEE Computer Society, United States


Qualifiers

  • Research-article
