Hierarchical Attention Learning of Scene Flow in 3D Point Clouds (IEEE Transactions on Image Processing)
This is the official implementation of the TIP 2021 paper "Hierarchical Attention Learning of Scene Flow in 3D Point Clouds" by Guangming Wang, Xinrui Wu, Zhe Liu and Hesheng Wang.
If you find our work useful in your research, please cite:
@ARTICLE{9435105,
author={Wang, Guangming and Wu, Xinrui and Liu, Zhe and Wang, Hesheng},
journal={IEEE Transactions on Image Processing},
title={Hierarchical Attention Learning of Scene Flow in 3D Point Clouds},
year={2021},
volume={30},
number={},
pages={5168-5181},
doi={10.1109/TIP.2021.3079796}}
Abstract: Scene flow represents the 3D motion of every point in dynamic environments. Like the optical flow that represents the motion of pixels in 2D images, 3D motion representation of scene flow benefits many applications, such as autonomous driving and service robots. This paper studies the problem of scene flow estimation from two consecutive 3D point clouds. In this paper, a novel hierarchical neural network with double attention is proposed for learning the correlation of point features in adjacent frames and refining scene flow from coarse to fine layer by layer. The proposed network has a new more-for-less hierarchical architecture. The more-for-less means that the number of input points is greater than the number of output points for scene flow estimation, which brings more input information and balances the precision and resource consumption. In this hierarchical architecture, scene flow of different levels is generated and supervised respectively. A novel attentive embedding module is introduced to aggregate the features of adjacent points using a double attention method in a patch-to-patch manner. The proper layers for flow embedding and flow supervision are carefully considered in our network design. Experiments show that the proposed network outperforms the state-of-the-art performance of 3D scene flow estimation on the FlyingThings3D and KITTI Scene Flow 2015 datasets. We also apply the proposed network to the realistic LiDAR odometry task, which is a key problem in autonomous driving. The experimental results demonstrate that our proposed network can outperform the ICP-based method and shows good practical application ability.
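To make the notion of scene flow concrete, here is a minimal, dependency-free sketch: each point of the first frame gets a 3D displacement vector, and adding it gives the point's position in the second frame. The function names `warp` and `mean_epe` are hypothetical and not part of this repository, and the mean end-point error shown is a commonly used measure that we assume here purely for illustration.

```python
import math

def warp(points, flow):
    """Move each point of frame 1 by its flow vector (hypothetical helper)."""
    return [tuple(p + f for p, f in zip(pt, fl)) for pt, fl in zip(points, flow)]

def mean_epe(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth flow vectors."""
    errors = [
        math.sqrt(sum((p - g) ** 2 for p, g in zip(pf, gf)))
        for pf, gf in zip(pred_flow, gt_flow)
    ]
    return sum(errors) / len(errors)

# Toy data: two points, their true motion, and an imperfect prediction.
pc1 = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
gt_flow = [(1.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
pred_flow = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

warped = warp(pc1, gt_flow)        # positions of the points in frame 2
error = mean_epe(pred_flow, gt_flow)
```

The network in this repository operates on much larger point clouds; the toy lists above only illustrate the data layout of one flow vector per point.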
python 3.6.8
CUDA 9.0
TensorFlow 1.12.0
numpy 1.16.1
g++ 5.4.0
The TF operators are included under tf_ops. Compile them first by running make in each ops subfolder (check the Makefile). If necessary, update arch in the Makefiles to match the CUDA Compute Capability of your GPU.
cd ./tf_ops/sampling
make
cd ../grouping
make
cd ../3d_interpolation
make
cd ..
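If you prefer to drive the build from Python, the steps above can be sketched as a small loop. This is only a convenience sketch: the helper names `ops_dirs` and `build_all` and the `dry_run` switch are hypothetical, and the code assumes it is started from the repository root so that `./tf_ops` resolves correctly.

```python
import os
import subprocess

# Subfolders named in the build commands above.
OPS = ("sampling", "grouping", "3d_interpolation")

def ops_dirs(tf_ops_root="./tf_ops"):
    """Return the path of every ops subfolder that has to be compiled."""
    return [os.path.join(tf_ops_root, name) for name in OPS]

def build_all(tf_ops_root="./tf_ops", dry_run=False):
    """Run `make` in each ops subfolder; stop at the first failure.

    With dry_run=True nothing is executed and only the planned
    directories are returned.
    """
    built = []
    for path in ops_dirs(tf_ops_root):
        if not dry_run:
            # check=True raises CalledProcessError if make fails.
            subprocess.run(["make"], cwd=path, check=True)
        built.append(path)
    return built
```

Calling `build_all()` is equivalent to the three `make` invocations listed above; `build_all(dry_run=True)` just lists the directories that would be built.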
For a fair comparison with previous methods, we adopt the preprocessing steps of HPLFlowNet. Please refer to the HPLFlowNet repo for details. We also copy the preprocessing instructions here for your reference.
- FlyingThings3D:
Download and unzip the "Disparity", "Disparity Occlusions", "Disparity change", "Optical flow", "Flow Occlusions" for DispNet/FlowNet2.0 dataset subsets from the FlyingThings3D website (we used the paths from this file; torrent downloads have since been added). They will be unzipped into the same directory, RAW_DATA_PATH. Then run the following script for 3D reconstruction:
python3 data_preprocess/process_flyingthings3d_subset.py --raw_data_path RAW_DATA_PATH --save_path SAVE_PATH/FlyingThings3D_subset_processed_35m --only_save_near_pts
- KITTI Scene Flow 2015:
Download and unzip KITTI Scene Flow Evaluation 2015 to the directory RAW_DATA_PATH. Run the following script for 3D reconstruction:
python3 data_preprocess/process_kitti.py RAW_DATA_PATH SAVE_PATH/KITTI_processed_occ_final
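Since RAW_DATA_PATH refers to a different directory for each dataset, it can help to assemble both commands in one place. The sketch below only builds the argument lists from the two commands shown above; the function name `preprocess_commands` and its parameter names are hypothetical.

```python
import os

def preprocess_commands(raw_ft3d, raw_kitti, save_path):
    """Build the two preprocessing commands as argument lists.

    raw_ft3d / raw_kitti: the RAW_DATA_PATH of each dataset.
    save_path: the SAVE_PATH shared by both processed datasets.
    """
    ft3d_cmd = [
        "python3", "data_preprocess/process_flyingthings3d_subset.py",
        "--raw_data_path", raw_ft3d,
        "--save_path", os.path.join(save_path, "FlyingThings3D_subset_processed_35m"),
        "--only_save_near_pts",
    ]
    kitti_cmd = [
        "python3", "data_preprocess/process_kitti.py",
        raw_kitti,
        os.path.join(save_path, "KITTI_processed_occ_final"),
    ]
    return ft3d_cmd, kitti_cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains `data_preprocess`.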
Train the network by running
sh command.sh
Please remember to specify the mode (train), dataset, GPU, model (path to HALFlowNet model), data_ft3d_path, data_kitti_path, log_dir, and num_point in the scripts.
The training results and the best model will be saved in log_dir.
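Before launching a long training run, it may be worth checking that none of the settings listed above was left empty. The following is a hypothetical pre-flight check, not part of the repository: it works on a plain dictionary that you fill in yourself and does not read or parse command.sh.

```python
# Setting names as listed above for training.
REQUIRED_TRAIN_SETTINGS = (
    "mode", "dataset", "GPU", "model",
    "data_ft3d_path", "data_kitti_path", "log_dir", "num_point",
)

def missing_settings(settings, required=REQUIRED_TRAIN_SETTINGS):
    """Return the required names that are absent, None, or empty strings."""
    return [name for name in required if settings.get(name) in (None, "")]

# Example with placeholder values; only log_dir is left unset here.
example = {
    "mode": "train",
    "dataset": "DATASET",
    "GPU": "0",
    "model": "MODEL",
    "data_ft3d_path": "SAVE_PATH/FlyingThings3D_subset_processed_35m",
    "data_kitti_path": "SAVE_PATH/KITTI_processed_occ_final",
    "log_dir": "",
    "num_point": 1,
}
unset = missing_settings(example)
```

The values in `example` are placeholders for illustration only; use the values that match your own setup.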
We provide a pre-trained model in the ./pretrained_model directory.
Please run
sh command.sh
Please remember to specify the mode (eval), GPU, model (path to HALFlowNet model), data_ft3d_path, data_kitti_path, checkpoint_path (path to pre-trained model for testing), and log_dir in the scripts.
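A wrong checkpoint_path is an easy mistake to make, so a quick sanity check can save a failed run. The sketch below assumes that checkpoint_path is a TensorFlow checkpoint prefix stored as an `.index` file plus one or more `.data-*` shards; whether the provided pre-trained model uses exactly this layout is an assumption, and the helper name `checkpoint_looks_complete` is hypothetical.

```python
import glob
import os

def checkpoint_looks_complete(checkpoint_prefix):
    """Return True if both the index file and at least one data shard exist.

    Assumes the checkpoint is stored as <prefix>.index and <prefix>.data-*,
    which is an assumption about the pre-trained model's file layout.
    """
    has_index = os.path.isfile(checkpoint_prefix + ".index")
    # glob.escape guards against special characters in the path itself.
    has_data = bool(glob.glob(glob.escape(checkpoint_prefix) + ".data-*"))
    return has_index and has_data
```

If the check fails, compare the prefix you pass as checkpoint_path with the file names inside ./pretrained_model.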
We thank all the TIP reviewers and the following open-source projects for their help with the implementation:
- PointNet++ (Furthest Points Sampling and TF operators)
- HPLFlowNet (Data Preprocessing)