research-article

EPIC: Traffic Engineering-Centric Path Programmability Recovery Under Controller Failures in SD-WANs

Published: 24 October 2024

Abstract

Software-Defined Wide Area Networks (SD-WANs) offer a promising opportunity to enhance the performance of Traffic Engineering (TE). With the help of Software-Defined Networking (SDN), TE can promptly respond to traffic changes and maintain network performance by leveraging a global network view. One of the key benefits of SDN for TE is path programmability, which SDN controllers provide to enable dynamic adjustment of flows’ forwarding paths. However, controller failures pose new challenges for SD-WANs: as the number of offline flows grows, path programmability decreases, which can degrade TE performance. Existing recovery solutions mainly focus on recovering path programmability, but the resulting network performance is unpredictable, and they cannot guarantee consistently satisfactory TE performance, since path programmability evaluates network performance only indirectly. In this paper, we propose EPIC to ensure robust TE performance under controller failures. We observe that frequently rerouted flows can greatly influence TE performance. Motivated by this observation, EPIC introduces a novel metric, the TE performance-centric ratio, to assess the relevance of different path programmability values for TE performance. The key idea of EPIC is to identify frequently rerouted flows during TE operations and to prioritize recovering the path programmability of these flows under controller failures. We formulate an optimization problem to maximize TE performance-centric path programmability and propose an efficient heuristic algorithm to solve it. Evaluation results demonstrate that EPIC can improve average load balancing performance by up to 55.6% compared with baselines.
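The key idea stated in the abstract, identifying frequently rerouted flows and recovering them first, can be illustrated with a toy sketch. This is not EPIC's optimization problem or heuristic, and it does not compute the TE performance-centric ratio, which the abstract does not define. The function names `reroute_counts` and `prioritize_recovery`, the path-history input format, and the single `budget` parameter standing in for the remaining controller capacity are all assumptions made for illustration.

```python
from collections import Counter


def reroute_counts(path_history):
    """Count how often each flow's path changed between consecutive TE rounds.

    path_history: list of dicts, one per TE round, mapping a flow id to its
    path (a tuple of node names). Hypothetical input format.
    """
    counts = Counter()
    for prev, curr in zip(path_history, path_history[1:]):
        for flow, path in curr.items():
            if flow in prev and prev[flow] != path:
                counts[flow] += 1
    return counts


def prioritize_recovery(offline_flows, counts, budget):
    """Pick up to `budget` offline flows, most frequently rerouted first.

    `budget` is an assumed stand-in for the spare control capacity of the
    surviving controllers. Ties are broken by flow id for determinism.
    """
    ranked = sorted(offline_flows, key=lambda f: (-counts[f], f))
    return ranked[:budget]


# Three TE rounds: f1 is rerouted twice, f3 once, f2 never.
history = [
    {"f1": ("a", "b", "c"), "f2": ("a", "c"), "f3": ("b", "c")},
    {"f1": ("a", "c"), "f2": ("a", "c"), "f3": ("b", "a", "c")},
    {"f1": ("a", "b", "c"), "f2": ("a", "c"), "f3": ("b", "a", "c")},
]
counts = reroute_counts(history)
recovered = prioritize_recovery(["f1", "f2", "f3"], counts, budget=2)
```

With a budget of two, the sketch recovers `f1` and `f3` and leaves the never-rerouted `f2` offline, which mirrors the intuition that programmability matters most for flows that TE actually moves.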

Published In

IEEE/ACM Transactions on Networking, Volume 32, Issue 6
Dec. 2024
985 pages

Publisher

IEEE Press
