ACI Multi-Pod
[Figure: ACI fabric with an integrated GBP VXLAN overlay, managed by the APIC (Application Policy Infrastructure Controller). The logical model links Outside (Tenant VRF), Web, App, and DB, with QoS, Filter, and Service policy between them.]
ACI Model Tenant L3, L2 Isolation
• Tenant: self-contained tenant definition, representable as a recursive structured text document
• Outside: L2 or L3
• Application profile: EPGs (EPG WEB, EPG APP SERVER, …) grouping endpoints (EP)
• BD: subnets, with or without flooding semantics
• L3 context (isolated tenant VRF)
ACI MultiPod/MultiSite Use Cases
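[Figure: Multi-Pod: Pod ‘A’ through Pod ‘n’ connected by an IP network running MP-BGP EVPN, under one APIC cluster. Multi-Site: Site ‘A’ through Site ‘n’ connected by an L2/L3 IP network running MP-BGP EVPN.]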
Dual-Fabric with Common Default GW
New in 11.2 (Brazos)
• Two independent ACI fabrics. Two management and configuration domains.
• Active/Active workload. Extend L2 and subnet across.
• Dark fiber/OTV for L2 extension
• Common default GW on both fabrics
• L2 connection between fabrics. Active/Active or Active/Standby
• Consistent end-to-end policy
[Figure: both fabrics host BD1 (GW: 1.1.1.1) and BD2 (GW: 2.2.2.2). New: a common virtual MAC shared by the two fabrics, and a unique “SVI” IP in each fabric.]
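The common-gateway rule above (same gateway address and virtual MAC on both fabrics, a different “SVI” IP in each) can be expressed as a small consistency check. This is a minimal sketch, not an APIC API: the `FabricBdGateway` record, the `check_common_gateway` helper, the virtual MAC value, and the SVI addresses are all hypothetical; only the BD1 gateway 1.1.1.1 comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FabricBdGateway:
    """Hypothetical record of one bridge domain's gateway settings in one fabric."""
    fabric: str
    bd: str
    gateway_ip: str   # common default GW, expected to match across fabrics
    virtual_mac: str  # common virtual MAC, expected to match across fabrics
    svi_ip: str       # "SVI" IP, expected to be unique per fabric


def check_common_gateway(a: FabricBdGateway, b: FabricBdGateway) -> list:
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    if a.bd != b.bd:
        problems.append("records describe different bridge domains")
    if a.gateway_ip != b.gateway_ip:
        problems.append("default GW differs between fabrics")
    if a.virtual_mac.lower() != b.virtual_mac.lower():
        problems.append("virtual MAC differs between fabrics")
    if a.svi_ip == b.svi_ip:
        problems.append("SVI IP must be unique per fabric")
    return problems


# Illustrative values only (MAC and SVI addresses are made up).
site1 = FabricBdGateway("fabric-1", "BD1", "1.1.1.1", "00:22:bd:f8:19:ff", "1.1.1.2")
site2 = FabricBdGateway("fabric-2", "BD1", "1.1.1.1", "00:22:BD:F8:19:FF", "1.1.1.3")
clash = FabricBdGateway("fabric-2", "BD1", "1.1.1.1", "00:22:bd:f8:19:ff", "1.1.1.2")

print(check_common_gateway(site1, site2))
print(check_common_gateway(site1, clash))
```

A check like this could run before the L2 extension is enabled, since a mismatch would only surface later as hosts seeing inconsistent gateway behaviour.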
[Figure: one ACI fabric stretched across DC Site 1 and DC Site 2, with vCenter.]
• Fabric stretched to two sites works as a single fabric deployed within a DC
• One APIC cluster: one management and configuration point
• Anycast GW on all leaf switches
• Works with one or more transit leaf nodes per site; any leaf node can be a transit leaf
• Number of transit leaf nodes and links is dictated by redundancy and bandwidth capacity decisions
• Different options for inter-site links (dark fiber, DWDM, EoMPLS PWs)
Stretched ACI Fabric
Option – Ethernet over MPLS (EoMPLS) Port mode
[Figure: DC Site 1 and DC Site 2, 800 km / 500 miles apart (10 ms RTT), joined by an EoMPLS pseudowire across the WAN. Fabric-facing links are 40G (QSFP-40G-SR4); DC interconnect links are 10G/40G/100G.]
• Port mode EoMPLS used to stretch the ACI fabric over long distance
• DC interconnect links could be 10G (minimum) or higher, with 40G facing the Leaf/Spine nodes
• DWDM or dark fiber provides connectivity between the two sites
• 1.0(3f) release or later, 10 ms max RTT between sites
• Under normal conditions 10 ms allows us to support two DCs up to 800 km apart
• Other ports on the router are used for connecting to the WAN via L3Out
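The relation between the 10 ms RTT limit and the 800 km distance can be checked with simple propagation arithmetic. The sketch below assumes light travels through fiber at roughly 200,000 km/s (an approximation that is not stated on the slide) and ignores equipment and queuing delay; the function names are illustrative.

```python
# Assumption: signal speed in optical fiber is about 200,000 km/s
# (roughly two thirds of the speed of light in vacuum).
FIBER_SPEED_KM_PER_S = 200_000.0


def propagation_rtt_ms(distance_km: float) -> float:
    """Round-trip propagation delay over a fiber path, in milliseconds."""
    one_way_s = distance_km / FIBER_SPEED_KM_PER_S
    return 2 * one_way_s * 1000.0


def fits_rtt_budget(distance_km: float, max_rtt_ms: float = 10.0) -> bool:
    """True if propagation delay alone stays within the RTT budget."""
    return propagation_rtt_ms(distance_km) <= max_rtt_ms


for km in (100, 800, 1200):
    print(km, "km:", round(propagation_rtt_ms(km), 2), "ms RTT,",
          "within budget" if fits_rtt_budget(km) else "over budget")
```

Under these assumptions 800 km costs about 8 ms of round-trip propagation, which leaves some margin inside the 10 ms limit for the pseudowire and intermediate devices. The fiber path is usually longer than the geographic distance, so the real margin is smaller.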
Stretched ACI Fabric
Support for 3 Interconnected Sites (11.2)
[Figure: Site 1, Site 2, and Site 3 interconnected through transit leaf nodes over 2x40G or 4x40G links.]
Agenda
• ACI Introduction and Multi-Fabric Use Cases
• ACI Multi-Fabric Design Options
• ACI Stretched Fabric Overview
• ACI Multi-Pod Solution Deep Dive
• ACI Multi-Site Solutions Overview
• Conclusions
ACI Multi-Pod Solution
Overview
[Figure: Pods connected through the Inter-Pod Network, with MP-BGP EVPN between Pods.]
• Multiple ACI Pods connected by an IP Inter-Pod L3 network; each Pod consists of leaf and spine nodes
• Managed by a single APIC Cluster
• Single Management and Policy Domain
• Forwarding control plane (IS-IS, COOP) fault isolation
• Data Plane VXLAN encapsulation between Pods
• End-to-end policy enforcement
ACI Multi-Pod Solution
Use Cases
[Figure: Pod with leaf nodes and spine nodes connected to the Inter-Pod Network.]
• Handling a 3-tier physical cabling layout
• Cabling constraints (multiple buildings, campus, metro) require a second tier of “spines”
• Preferred option when compared to ToR FEX deployment
Software
The solution will be available from Q3CY16 SW Release
Hardware
The Multi-Pod solution can be supported with all currently shipping Nexus
9000 platforms
The requirement is to use multicast in the Inter-Pod Network for handling
BUM (L2 Broadcast, Unknown Unicast, Multicast) traffic across Pods
ACI Multi-Pod Solution
Supported Topologies
[Figure: supported topologies.
Intra-DC: Pod 1 through Pod n connected over 40G/100G links, one APIC cluster.
Two DC sites connected back-to-back: Pod 1 and Pod 2 over dark fiber/DWDM (up to 10 msec RTT), 10G/40G/100G between sites and 40G/100G within each Pod, one APIC cluster.
A further topology shows Pod 3 attached over L3 with 40G/100G links.]
ACI Multi-Pod Solution
Scalability Considerations
OSPF to peer with the spine nodes and learn VTEP reachability
[Figure: the APIC database is split into shards (Shard 1, Shard 2, Shard 3), with copies of each shard distributed across the three APIC nodes.]
When an APIC fails, a backup copy of the shard is promoted to active and takes over
all tasks associated with that portion of the database
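The promotion behaviour can be illustrated with a toy model. This is a minimal sketch, not the APIC implementation: the `Shard` class, the node names, and the choice to promote the first surviving replica are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    """Toy model of one database shard replicated on several APIC nodes."""
    name: str
    replicas: list    # nodes holding a copy, in (assumed) promotion order
    active: str = ""  # node currently holding the active copy

    def __post_init__(self):
        if not self.active:
            self.active = self.replicas[0]

    def handle_failure(self, failed_node: str) -> None:
        """Drop the failed node; promote a backup if it held the active copy."""
        if failed_node in self.replicas:
            self.replicas.remove(failed_node)
        if self.active == failed_node:
            if not self.replicas:
                raise RuntimeError(f"{self.name}: no replica left to promote")
            self.active = self.replicas[0]


# Three shards, each with its active copy on a different node (hypothetical layout).
shards = [
    Shard("shard-1", ["apic1", "apic2", "apic3"]),
    Shard("shard-2", ["apic2", "apic3", "apic1"]),
    Shard("shard-3", ["apic3", "apic1", "apic2"]),
]

for shard in shards:
    shard.handle_failure("apic1")

for shard in shards:
    print(shard.name, "active on", shard.active, "replicas", shard.replicas)
```

Only the shard whose active copy lived on the failed node changes its active node; the others just lose one backup copy.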
APIC – Design Considerations
[Figure: three APIC nodes, two of them marked as failed (X).]
• Additional APIC nodes increase the system scale (up to 5 nodes supported today) but do not add more redundancy
• APIC allows read-only access to the DB when only one node remains active (standard DB quorum)
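The quorum behaviour can be sketched as a majority rule. The code below assumes each shard keeps three replicas and needs a strict majority of them for writes; the slide only says “standard DB quorum”, so the exact rule and the function name are assumptions.

```python
def shard_access_mode(healthy_replicas: int, total_replicas: int = 3) -> str:
    """Access mode for a shard under an assumed strict-majority quorum rule."""
    if healthy_replicas <= 0:
        return "unavailable"
    if healthy_replicas > total_replicas // 2:
        return "read-write"
    return "read-only"


for healthy in (3, 2, 1, 0):
    print(healthy, "healthy replica(s):", shard_access_mode(healthy))
```

If the replica count per shard stays at three regardless of cluster size, as assumed here, adding a fourth or fifth node spreads shards over more nodes (scale) without changing how many failures a given shard tolerates, which is consistent with the first bullet above.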
[Figure: Multi-Pod discovery and provisioning sequence, with a single APIC Cluster spanning ‘Seed’ Pod 1 and Pod 2. Recoverable step labels, in order:]
• APIC Node 1 connected to a Leaf node in ‘Seed’ Pod 1
• Discovery and provisioning of all the devices in the local Pod
• DHCP response reaches Spine 1, allowing its full provisioning
• Discovery and provisioning of all the devices in the local Pod
• APIC Node 2 connected to a Leaf node in Pod 2
• APIC Node 2 joins the Cluster
• Discover other Pods following the same procedure
ACI Multi-Pod Solution
IPN Control Plane
IPN Global VRF
IP Prefix        Next-Hop
10.0.0.0/16      Pod1-S1, Pod1-S2, Pod1-S3, Pod1-S4
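The IPN routing entry above can be read as one prefix reachable through four spine next hops. The sketch below shows a longest-prefix lookup followed by a per-flow choice among the next hops. Load sharing across the spines by hashing a flow key is an assumption (the slide only lists the next hops), and the helper names are illustrative.

```python
import ipaddress
import zlib

# Entry taken from the slide: one prefix, four spine next hops.
ROUTES = {
    ipaddress.ip_network("10.0.0.0/16"): ["Pod1-S1", "Pod1-S2", "Pod1-S3", "Pod1-S4"],
}


def lookup(dest_ip: str) -> list:
    """Longest-prefix match; returns the next-hop list or an empty list."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [net for net in ROUTES if addr in net]
    if not matches:
        return []
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


def pick_next_hop(dest_ip: str, flow_key: bytes):
    """Pick one next hop per flow by hashing a flow key (assumed behaviour)."""
    hops = lookup(dest_ip)
    if not hops:
        return None
    return hops[zlib.crc32(flow_key) % len(hops)]


print(lookup("10.0.12.1"))
print(pick_next_hop("10.0.12.1", b"192.0.2.10:49152->10.0.12.1:8472"))
print(lookup("192.168.1.1"))
```

Hashing on a flow key keeps packets of one flow on the same path while spreading different flows over the available spines.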
The ACI fabric decouples the tenant endpoint address, its “identifier”, from the location of that
endpoint, which is defined by its “locator” or VTEP address
Forwarding within the fabric is between VTEPs (ACI VXLAN tunnel endpoints) and leverages an
extended VXLAN header format referred to as the ACI VXLAN policy header
The mapping of the internal tenant MAC or IP address to location is performed by the VTEP using
a distributed mapping database
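The identifier/locator split can be illustrated with a toy mapping table: the endpoint address is the key and the VTEP is the value, so an endpoint move only changes the value. This is a minimal sketch with hypothetical class and VTEP names, not the fabric's actual data structure.

```python
class MappingDatabase:
    """Toy identifier-to-locator map (endpoint MAC/IP -> VTEP)."""

    def __init__(self):
        self._locator_by_identifier = {}

    def learn(self, identifier: str, vtep: str) -> None:
        """Record or update the VTEP behind which an endpoint currently sits."""
        self._locator_by_identifier[identifier] = vtep

    def locate(self, identifier: str):
        """Return the current VTEP for an endpoint, or None if unknown."""
        return self._locator_by_identifier.get(identifier)


db = MappingDatabase()
db.learn("10.1.3.11", "leaf1-vtep")
print(db.locate("10.1.3.11"))

# The endpoint moves: its identifier (IP) is unchanged, only the locator is updated.
db.learn("10.1.3.11", "leaf3-vtep")
print(db.locate("10.1.3.11"))
```

Because senders resolve the locator at forwarding time, the endpoint keeps its address when it moves and no tenant-side renumbering is needed.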
Host Routing - Inside
Inline Hardware Mapping DB - 1,000,000+ Hosts
• Proxy Station Table (spine) contains addresses of ‘all’ hosts attached to the fabric
  10.1.3.35    Leaf 3
  10.1.3.11    Leaf 1
  fe80::8e5e   Leaf 4
  fe80::5b1a   Leaf 6
• Global Station Table (leaf) contains a local cache of the fabric endpoints
  10.1.3.35    Leaf 3
  *            Proxy A
• Local Station Table (leaf) contains addresses of ‘all’ hosts attached directly to the Leaf
  10.1.3.11    Port 9
The Forwarding Table on the Leaf Switch is divided between local (directly attached) and
global entries
The Leaf global table is a cached portion of the full global table
If an endpoint is not found in the local cache the packet is forwarded to the ‘default’
forwarding table in the spine switches (1,000,000+ entries in the spine forwarding table)
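The lookup order described above (local table, then the cached global table, then the spine proxy) can be sketched as follows. The table contents reuse the addresses shown on the slide; the function names and the tuple return format are illustrative assumptions.

```python
# Leaf tables, using the entries shown on the slide.
LOCAL_STATION_TABLE = {"10.1.3.11": "Port 9"}    # directly attached hosts
GLOBAL_STATION_TABLE = {"10.1.3.35": "Leaf 3"}   # cached remote endpoints
DEFAULT_PROXY = "Proxy A"                        # the '*' entry

# Spine proxy table: full view of the fabric endpoints.
PROXY_STATION_TABLE = {
    "10.1.3.35": "Leaf 3",
    "10.1.3.11": "Leaf 1",
    "fe80::8e5e": "Leaf 4",
    "fe80::5b1a": "Leaf 6",
}


def leaf_lookup(dest: str):
    """Resolve a destination on the leaf: local port, cached leaf, or proxy."""
    if dest in LOCAL_STATION_TABLE:
        return ("local", LOCAL_STATION_TABLE[dest])
    if dest in GLOBAL_STATION_TABLE:
        return ("remote-leaf", GLOBAL_STATION_TABLE[dest])
    return ("spine-proxy", DEFAULT_PROXY)


def spine_lookup(dest: str):
    """Resolve a destination on the spine proxy; None if the endpoint is unknown."""
    return PROXY_STATION_TABLE.get(dest)


for dest in ("10.1.3.11", "10.1.3.35", "fe80::8e5e"):
    kind, target = leaf_lookup(dest)
    if kind == "spine-proxy":
        print(dest, "->", target, "->", spine_lookup(dest))
    else:
        print(dest, "->", kind, target)
```

The leaf only needs entries for local hosts and for remote endpoints it is actively talking to, while the spine holds the complete table, which is how the fabric scales to the host counts quoted on the slide.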
What is MP-BGP EVPN?
[Figure: inter-Pod forwarding between endpoints 172.16.2.40 (Pod1 L4) and 172.16.1.20 (Pod2 L4). Leaf tables hold learned entries plus default (*) entries pointing to Proxy A or Proxy B. Callouts: “Spine encapsulates traffic to remote Proxy B Spine VTEP” and “Spine encapsulates traffic to local leaf”. MP-BGP EVPN runs between the Pods across the IP network.]
Multi-Pod and GOLF
Intra-DC Deployment – Control Plane
[Figure: GOLF devices connect the WAN to multiple Pods (Web/App, DB) through the IPN, with an MP-BGP EVPN control plane; single APIC cluster, single APIC domain.]
• WAN routes are received on the Pod spines as EVPN routes and translated to VPNv4/VPNv6 routes with the spine proxy TEP as Next-Hop
• Public BD subnets are advertised to GOLF devices with the external spine-proxy TEP as Next-Hop
Multi-Pod and GOLF
Multi-DC Deployment – Data Plane
[Figure: traffic crossing the IPN between Pods (Web/App, DB) under a single APIC cluster; spine proxies labelled Proxy A and Proxy B.]
• Spine encapsulates traffic to the destination VTEP that can then apply policy
Multi-Pod and GOLF
Multi-DC Deployment – Data Plane (2)
[Figure: return traffic from the Pods (Web/App, DB) through the IPN and the GOLF devices to the WAN; single APIC cluster.]
• GOLF devices de-encapsulate traffic and route it to the WAN (or LISP encapsulates to the remote router)
• Traffic is received by the external user
ACI Multi-Pod Solution
Summary
[Figure: separate fabrics, each running IS-IS, COOP, and MP-BGP with its own APIC cluster, connected over an L2/L3 DCI with MP-BGP EVPN between them.]
• Multiple ACI fabrics connected via IP Network
• Separate availability zones with maximum isolation
• Separate APIC clusters, separate management and policy domains, separate fabric control planes
• End-to-end policy enforcement with policy collaboration
• Support multiple sites
• Not bound by distance
ACI Multi-Site
Reachability
[Figure: two fabrics with separate APIC clusters, connected through the Inter-Site Network with MP-BGP EVPN. Each fabric hosts Web1 and Web2. Callouts: “Import Web & App from Fabric ‘B’” and “Export Web & App to Fabric ‘A’”.]
• Policy is applied at the provider of the contract (always at the fabric where the provider endpoint is connected)
• Scoping of changes
• No need to propagate all policies to all fabrics
• Different policy applied based on source EPG (which fabric)
Conclusions