MTNAS: Search Multi-task Networks for Autonomous Driving

  • Conference paper
Computer Vision – ACCV 2020 (ACCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12624)

Abstract

Multi-task learning (MTL) aims to learn shared representations from multiple tasks simultaneously, and has yielded strong performance across a wide range of computer vision applications. However, existing multi-task approaches often require manual design of the network architecture, including the shared backbone and the individual branches. In this work, we propose MTNAS, a practical and principled neural architecture search algorithm for multi-task learning. We focus on searching for an overall optimized network architecture with task-specific branches and a task-shared backbone. Specifically, the MTNAS pipeline consists of two search stages: branch search and backbone search. For branch search, we optimize each branch structure separately for its target task. For backbone search, we first design a pre-searching procedure to pre-optimize the backbone structure on ImageNet. We observe that searching on such auxiliary large-scale data not only helps learn low-/mid-level features but also offers a good initialization of the backbone structure. After backbone pre-searching, we further optimize the backbone structure for learning task-shared knowledge under the overall multi-task guidance. We apply MTNAS to joint learning of object detection and semantic segmentation for autonomous driving. Extensive experimental results demonstrate that our searched multi-task model achieves superior performance on each task and has lower computational complexity than prior hand-crafted MTL baselines. Code and searched models will be released at https://github.com/RalphLiu/MTNAS.
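The order of the stages described in the abstract can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: the function name `run_mtnas_pipeline` and the three callables passed into it are hypothetical placeholders, and passing the searched branches into the final backbone search is an assumption about what "overall multi-task guidance" involves.

```python
def run_mtnas_pipeline(tasks, search_branch, presearch_backbone, search_backbone):
    """Hypothetical sketch of the two-stage search order from the abstract.

    All three callables are placeholders supplied by the caller; none of
    them corresponds to a function in the released MTNAS code.
    """
    # Stage 1: branch search, run separately for each target task.
    branches = {task: search_branch(task) for task in tasks}

    # Stage 2a: pre-search the backbone on auxiliary large-scale data
    # (ImageNet in the abstract) to obtain an initial backbone structure.
    initial_backbone = presearch_backbone("ImageNet")

    # Stage 2b: refine the backbone under joint multi-task guidance.
    # Assumption: the already searched branches are available at this point.
    backbone = search_backbone(initial_backbone, branches, tasks)

    return {"backbone": backbone, "branches": branches}


if __name__ == "__main__":
    # Toy stand-ins that only record what they were called with.
    result = run_mtnas_pipeline(
        tasks=["detection", "segmentation"],
        search_branch=lambda task: f"{task}-branch",
        presearch_backbone=lambda dataset: f"backbone@{dataset}",
        search_backbone=lambda init, branches, tasks: f"{init}+joint",
    )
    print(result)
```

Injecting the search routines as arguments keeps the sketch independent of any particular search algorithm, which this page does not specify.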

Notes

  1. zero, skip-connect, max-pool-3x3, avg-pool-3x3, sep-conv-3x3, sep-conv-5x5, dil-conv-3x3, dil-conv-5x5.
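As a rough illustration of how a candidate set like the one in this note is typically used in differentiable architecture search, the sketch below weights the operations on one edge with a softmax and then picks the strongest non-zero operation. This page does not say how MTNAS combines the operations, so the DARTS-style relaxation here is an assumption, and the names `CANDIDATE_OPS`, `mixed_output`, and `derive_op` are hypothetical. Operation names follow the note, with hyphenation normalized.

```python
import math

# Candidate operations from the note above (hyphenation normalized).
CANDIDATE_OPS = [
    "zero", "skip-connect", "max-pool-3x3", "avg-pool-3x3",
    "sep-conv-3x3", "sep-conv-5x5", "dil-conv-3x3", "dil-conv-5x5",
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_output(alphas, op_outputs):
    """Softmax-weighted sum of the per-operation outputs on one edge.

    `alphas` holds one architecture parameter per candidate operation;
    `op_outputs` holds the (here scalar) output of each operation.
    """
    weights = softmax(alphas)
    return sum(w * y for w, y in zip(weights, op_outputs))


def derive_op(alphas):
    """Select the highest-weighted operation, ignoring 'zero'.

    Assumption: discretization follows the DARTS convention of excluding
    the zero operation when choosing the final operation for an edge.
    """
    weights = softmax(alphas)
    ranked = [(w, name) for w, name in zip(weights, CANDIDATE_OPS) if name != "zero"]
    return max(ranked)[1]


if __name__ == "__main__":
    alphas = [5.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
    print(derive_op(alphas))
```

In a real search the operation outputs would be feature maps rather than scalars, and the architecture parameters would be learned by gradient descent alongside the network weights.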

Author information

Corresponding author

Correspondence to Hao Liu.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 250 KB)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, H., Li, D., Peng, J., Zhao, Q., Tian, L., Shan, Y. (2021). MTNAS: Search Multi-task Networks for Autonomous Driving. In: Ishikawa, H., Liu, CL., Pajdla, T., Shi, J. (eds) Computer Vision – ACCV 2020. ACCV 2020. Lecture Notes in Computer Science, vol 12624. Springer, Cham. https://doi.org/10.1007/978-3-030-69535-4_41

  • DOI: https://doi.org/10.1007/978-3-030-69535-4_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69534-7

  • Online ISBN: 978-3-030-69535-4

  • eBook Packages: Computer Science, Computer Science (R0)
