MTNAS: Search Multi-task Networks for Autonomous Driving

Hao Liu¹², Dong Li¹³, JinZhang Peng¹³, Qingjie Zhao¹², Lu Tian¹³ & Yi Shan¹³

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12624)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Multi-task learning (MTL) aims to learn shared representations from multiple tasks simultaneously, which has yielded outstanding performance in widespread applications of computer vision. However, existing multi-task approaches often demand manual design of network architectures, including the shared backbone and individual branches. In this work, we propose MTNAS, a practical and principled neural architecture search algorithm for multi-task learning. We focus on searching for the overall optimized network architecture with task-specific branches and a task-shared backbone. Specifically, the MTNAS pipeline consists of two searching stages: branch search and backbone search. For branch search, we separately optimize each branch structure for each target task. For backbone search, we first design a pre-searching procedure to pre-optimize the backbone structure on ImageNet. We observe that searching on such auxiliary large-scale data can not only help learn low-/mid-level features but also offer good initialization of the backbone structure. After backbone pre-searching, we further optimize the backbone structure for learning task-shared knowledge under the overall multi-task guidance. We apply MTNAS to joint learning of object detection and semantic segmentation for autonomous driving. Extensive experimental results demonstrate that our searched multi-task model achieves superior performance for each task and has lower computational complexity compared to prior hand-crafted MTL baselines. Code and searched models will be released at https://github.com/RalphLiu/MTNAS.
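
The stage ordering described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `search` routine is a simple hill-climbing stand-in for whatever optimizer MTNAS actually uses, and the scoring functions, the edge count, the seeds, and the names `search`, `mtnas_pipeline`, and `toy_tasks` are all hypothetical. Only the order of the stages and the operation names (taken from the chapter's note, with hyphenation normalized) come from the source.

```python
import random

# Operation names follow the chapter's note; the search routine, scoring
# functions, and edge count below are hypothetical stand-ins.
CANDIDATE_OPS = [
    "zero", "skip-connect", "max-pool-3x3", "avg-pool-3x3",
    "sep-conv-3x3", "sep-conv-5x5", "dil-conv-3x3", "dil-conv-5x5",
]


def search(score, num_edges=4, trials=50, seed=0, init=None):
    """Hill-climbing stand-in for one search stage (not the paper's optimizer)."""
    rng = random.Random(seed)
    if init is not None:
        best = list(init)
    else:
        best = [rng.choice(CANDIDATE_OPS) for _ in range(num_edges)]
    best_score = score(best)
    for _ in range(trials):
        # Mutate one edge and keep the candidate only if it scores strictly higher.
        candidate = list(best)
        candidate[rng.randrange(len(candidate))] = rng.choice(CANDIDATE_OPS)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


def mtnas_pipeline(task_scores, pretrain_score):
    """Run the stages in the order the abstract describes."""
    # Stage 1: branch search, one independent search per target task.
    branches = {
        task: search(score, seed=i)
        for i, (task, score) in enumerate(task_scores.items())
    }

    # Stage 2a: backbone pre-search on auxiliary data (ImageNet in the paper).
    backbone_init = search(pretrain_score, seed=100)

    # Stage 2b: refine the backbone under joint multi-task guidance,
    # starting from the pre-searched structure. Summing the task scores
    # is an assumption made for this sketch.
    def joint_score(arch):
        return sum(score(arch) for score in task_scores.values())

    backbone = search(joint_score, seed=200, init=backbone_init)
    return {"branches": branches, "backbone_init": backbone_init, "backbone": backbone}


# Toy scores: each one simply counts occurrences of a single operation.
toy_tasks = {
    "detection": lambda arch: arch.count("sep-conv-3x3"),
    "segmentation": lambda arch: arch.count("dil-conv-3x3"),
}
result = mtnas_pipeline(toy_tasks, pretrain_score=lambda arch: arch.count("sep-conv-5x5"))
print(result["backbone"])
```

Because the stand-in search accepts only strict improvements, the refined backbone never scores worse on the joint objective than the pre-searched one, which mirrors the role the abstract assigns to pre-searching as an initialization.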

Notes

1. zero, skip-connect, max-pool-3x3, avg-pool-3x3, sep-conv-3x3, sep-conv-5x5, dil-conv-3x3, dil-conv-5x5.
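
These eight names match the candidate operation set used in DARTS-style differentiable architecture search. As a hedged illustration only, the sketch below assumes such a relaxation, which this page does not confirm for MTNAS: each edge holds one learnable score per operation, a softmax turns the scores into mixing weights, and the final architecture keeps the strongest operation other than `zero`. The function names `op_weights` and `discretize` and the example scores are hypothetical.

```python
import math

# Candidate operations from the note above (hyphenation normalized).
CANDIDATE_OPS = [
    "zero", "skip-connect", "max-pool-3x3", "avg-pool-3x3",
    "sep-conv-3x3", "sep-conv-5x5", "dil-conv-3x3", "dil-conv-5x5",
]


def op_weights(alphas):
    """Softmax over per-operation architecture scores for a single edge."""
    peak = max(alphas)  # subtract the maximum for numerical stability
    exps = [math.exp(a - peak) for a in alphas]
    total = sum(exps)
    return {name: e / total for name, e in zip(CANDIDATE_OPS, exps)}


def discretize(alphas):
    """Keep the highest-scoring operation, ignoring 'zero' as DARTS does."""
    candidates = [i for i, name in enumerate(CANDIDATE_OPS) if name != "zero"]
    return CANDIDATE_OPS[max(candidates, key=lambda i: alphas[i])]


# Hypothetical scores for one edge: 'zero' is largest, 'sep-conv-3x3' is next.
alphas = [5.0, 0.1, 0.2, 0.3, 2.0, 0.4, 0.5, 0.6]
weights = op_weights(alphas)
chosen = discretize(alphas)
print(chosen)
```

With these scores `chosen` is `sep-conv-3x3`: the `zero` operation has the largest weight in the mixture but is excluded when the edge is discretized.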

Author information

Authors and Affiliations

Beijing Institute of Technology, Beijing, China
Hao Liu & Qingjie Zhao
Xilinx Inc., Beijing, China
Dong Li, JinZhang Peng, Lu Tian & Yi Shan

Corresponding author

Correspondence to Hao Liu.

Editor information

Editors and Affiliations

Waseda University, Tokyo, Japan
Hiroshi Ishikawa
Institute of Automation of Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu
Czech Technical University in Prague, Prague, Czech Republic
Tomas Pajdla
University of Pennsylvania, Philadelphia, PA, USA
Jianbo Shi

Cite this paper

Liu, H., Li, D., Peng, J., Zhao, Q., Tian, L., Shan, Y. (2021). MTNAS: Search Multi-task Networks for Autonomous Driving. In: Ishikawa, H., Liu, CL., Pajdla, T., Shi, J. (eds) Computer Vision – ACCV 2020. ACCV 2020. Lecture Notes in Computer Science, vol 12624. Springer, Cham. https://doi.org/10.1007/978-3-030-69535-4_41

DOI: https://doi.org/10.1007/978-3-030-69535-4_41
Published: 25 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69534-7
Online ISBN: 978-3-030-69535-4
eBook Packages: Computer Science, Computer Science (R0)
