Abstract: Deep learning dominates the research field of traffic sign detection, but deep-learning-based traffic sign detectors struggle to solve the two tasks of localization and classification simultaneously on realistic, complex traffic scene images, and the images and sign categories provided by the public datasets used by existing algorithms do not cover the situations encountered in real traffic scenes. To address these problems, this paper creates a new road traffic sign dataset and, building on the YOLOv4 algorithm, designs a multi-size feature extraction module and an enhanced feature fusion module to improve the algorithm's ability to locate and classify traffic signs simultaneously, in view of the complexity of realistic traffic scene images and the large variation in traffic sign sizes. Experimental results on the newly created dataset show that the improved algorithm achieves 83.63% mean Average Precision (mAP), higher than several mainstream deep-learning-based object detection algorithms on the same task. The new dataset is publicly available at https://github.com/zhang1018/Traffic-sign-dataset-for-public.
Keywords: Traffic sign detection and recognition, traffic sign datasets, autonomous driving, convolutional neural networks, intelligent traffic system
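The abstract above describes fusing features extracted at multiple sizes, in the style of YOLOv4's multi-scale neck, so that both small and large traffic signs can be localized. The paper does not give the module's exact layout, so the following is only an illustrative sketch of the general idea: a coarse feature map is upsampled and concatenated channel-wise with the next finer map, repeated across the three YOLO detection scales (13, 26, 52 for a 416x416 input). The shapes and channel counts are assumptions, not the authors' configuration.

```python
import numpy as np

def upsample2x(fmap):
    # Nearest-neighbour 2x upsampling over the spatial axes of a (C, H, W) map.
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def fuse(coarse, fine):
    # Bring the coarse map to the fine map's resolution, then
    # concatenate along the channel axis (a minimal FPN-style fusion).
    return np.concatenate([upsample2x(coarse), fine], axis=0)

# Three hypothetical feature maps for a 416x416 input, as (channels, H, W).
p5 = np.random.rand(8, 13, 13)   # coarsest scale (large receptive field)
p4 = np.random.rand(8, 26, 26)
p3 = np.random.rand(8, 52, 52)   # finest scale (small objects)

f4 = fuse(p5, p4)   # shape (16, 26, 26)
f3 = fuse(f4, p3)   # shape (24, 52, 52)
print(f4.shape, f3.shape)
```

In a real network each concatenation would be followed by convolutions that mix the channels; the sketch keeps only the resolution-matching and fusion steps that give the detector access to both coarse semantic and fine spatial information.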
Abstract: End-to-end deep learning has attracted considerable interest in autonomous driving. End-to-end autonomous driving uses deep convolutional neural networks to establish an input-to-output mapping. However, existing end-to-end driving models predict only the steering angle from front-facing camera data and extract spatial-temporal information poorly. Based on deep learning and the attention mechanism, we propose an end-to-end driving model that combines a multi-stream attention module with a multi-stream network. As a multimodal, multitask model, the proposed end-to-end driving model not only fully extracts spatial-temporal information from multiple modalities, but also adopts multitask learning with hard parameter sharing to predict both steering angle and speed. Furthermore, the proposed multi-stream attention module predicts per-stream attention weights from the fused multimodal features, encouraging the model to attend to streams that positively impact the prediction result. We demonstrate the efficiency of the proposed driving model on the public Udacity dataset compared to existing models. Experimental results show that the proposed driving model performs better than existing methods.
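The second abstract combines two ideas: attention weights computed over input streams from the fused multimodal features, and hard parameter sharing, where one shared trunk feeds two task heads (steering angle and speed). The paper's exact network is not given here, so the sketch below is a minimal numpy illustration of those two mechanisms only; the stream contents, dimensions, and random weights are all hypothetical placeholders for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax so the attention weights sum to 1.
    e = np.exp(x - x.max())
    return e / e.sum()

# Feature vectors from three hypothetical streams (e.g. camera frames,
# optical flow, past vehicle states), each already encoded to dim 16.
streams = rng.normal(size=(3, 16))

# Multi-stream attention: score each stream from the concatenated
# (fused) multimodal features, then normalize with a softmax.
w_score = rng.normal(size=(3 * 16, 3)) * 0.1   # placeholder learned weights
weights = softmax(streams.reshape(-1) @ w_score)
fused = weights @ streams                      # attention-weighted sum, shape (16,)

# Hard parameter sharing: one shared trunk, two task-specific heads.
w_shared = rng.normal(size=(16, 8)) * 0.1      # shared across both tasks
h = np.tanh(fused @ w_shared)
steering = float(h @ rng.normal(size=8))       # steering-angle head
speed = float(h @ rng.normal(size=8))          # speed head
print(weights, steering, speed)
```

The key property shown is that the gradient of either task's loss would flow through the same `w_shared` trunk, while the softmax lets the model up-weight streams that help the current prediction.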