Application Centric Infrastructure (ACI): Datacenter SDN

In this post I’m going to walk through the initial configuration of an ACI fabric, covering these main steps:

Application Centric Infrastructure (ACI) Overview
Initialize the APIC Cluster
Commission the Spine and Leaf Nodes
Set up Primary Policy Constructs (Tenant, AP, & EPGs) and Network Boundaries (VRFs & Bridge Domains)
Establish Host Connectivity (EPGs & Contracts)
Backup Configuration

Application Centric Infrastructure (ACI) Overview

ACI simplifies and accelerates application deployment through the use of an orchestration controller and hierarchical, object-oriented, policy-centric configuration. The solution automatically provisions a VXLAN-powered spine-leaf network that can be deployed in minutes and scaled in seconds. A major focus of the solution is interoperability; it ships with a variety of out-of-the-box integrations, a robust REST API, and a Python SDK.
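
Although I’ll be working through the GUI for the rest of this post, everything below can also be driven through that REST API. As a point of reference, here’s a minimal Python sketch of authenticating to an APIC with the requests library; the URL matches my lab’s out-of-band address and the credentials are placeholders. Later snippets in this post reuse the session and APIC variable defined here.

import requests
import urllib3

urllib3.disable_warnings()              # lab APIC uses a self-signed certificate

APIC = "https://10.0.0.10"              # APIC out-of-band management address

def apic_login(username="admin", password="MyLabPassword"):
    """Authenticate to the APIC and return a requests session holding the auth cookie."""
    session = requests.Session()
    session.verify = False
    payload = {"aaaUser": {"attributes": {"name": username, "pwd": password}}}
    session.post(f"{APIC}/api/aaaLogin.json", json=payload).raise_for_status()
    return session                      # later calls reuse the APIC-cookie automatically

session = apic_login()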

Before jumping into configuration, it’s a good idea to run through a few ACI basics and do a little planning. There are many more in-depth guides out there and I suggest you leverage a few, such as the official Cisco docs and the Unofficial ACI Guide (https://unofficialaciguide.com/). While noted as unofficial, this site houses some of the best ACI information available online. I’ll make substantial use of external content in this post, most notably the following:

ACI Initial Deployment Cookbook: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/white_papers/Cisco-ACI-Initial-Deployment-Cookbook.html

ACI Best Practices: https://unofficialaciguide.com/2018/11/29/aci-best-practice-configurations/

ACI Naming Conventions: https://unofficialaciguide.com/2018/02/16/aci-naming-convention-best-practices/

I’ll explain key concepts, such as tenants, endpoint groups, and contracts, in more detail within the relevant sections.

Here’s the starting topology of my lab, including a pair of Nexus 9336PQ spines, a pair of 9372PX leaves and a single APIC-M1 controller.

ACI Physical Topology

Initialize the APIC Cluster

The first step is to set up the APIC controllers. The process here is straightforward; we’ll make only a few customizations from the default config. In particular, we’ll ensure that the TEP subnet and infra VLAN are not used anywhere else in the network. FYI, the TEP network will not be exposed outside of the ACI fabric (except in certain edge cases). The infra VLAN Cisco recommends is 3912. Here’s what the cluster config looks like:

Cluster configuration ...
  Enter the fabric name [ACI Fabric1]: Hive Fabric
  Enter the fabric ID (1-128) [1]: 
  Enter the number of active controllers in the fabric (1-9) [3]:
  Enter the POD ID (1-9) [1]: 
  Is this a standby controller? [NO]: 
  

  Is this an APIC-X? [NO]:
  Enter the controller ID (1-3) [1]:
  Enter the controller name [apic1]: hive-apic-1
  Enter address pool for TEP addresses [10.0.0.0/16]: 10.255.0.0/16
  Note: The infra VLAN ID should not be used elsewhere in your environment
        and should not overlap with any other reserved VLANs on other platforms.
  Enter the VLAN ID for infra network (2-4094): 3912
  Enter address pool for BD multicast addresses (GIPO) [225.0.0.0/15]:

Out-of-band management configuration ...
  Enable IPv6 for Out of Band Mgmt Interface? [N]:
  Enter the IPv4 address [192.168.10.1/24]: 10.0.0.10/24
  Enter the IPv4 address of the default gateway [None]: 10.0.0.1
  Enter the interface speed/duplex mode [auto]:

admin user configuration ...
  Enable strong passwords? [Y]: 
  Enter the password for admin:

  Reenter the password for admin:

Cluster configuration ...
  Fabric name: Hive Fabric
  Fabric ID: 1
  Number of controllers: 3
  Controller name: hive-apic-1
  POD ID: 1
  Controller ID: 1
  TEP address pool: 10.255.0.0/16
  Infra VLAN ID: 3912
  Multicast address pool: 225.0.0.0/15

Out-of-band management configuration ...
  Management IP address: 10.0.0.10/24
  Default gateway: 10.0.0.1
  Interface speed/duplex mode: auto

admin user configuration ...
  Strong Passwords: Y
  User name: admin
  Password: ************

The above configuration will be applied ...

Warning: TEP address pool, Infra VLAN ID and Multicast address pool
         cannot be changed later, these are permanent until the
         fabric is wiped.

Would you like to edit the configuration? (y/n) [n]:

Commission the Spine and Leaf Nodes

Once the APIC controller comes online, navigate to the management IP (https://<mgmt ip>). With our controller operational we can now commission our fabric. With ACI we get all of the benefits of a VXLAN-based spine/leaf architecture with near-zero provisioning and management overhead. This rapid deployment and focus on simplified scale makes ACI the perfect fit for anyone looking to focus on application policy. First, navigate to Fabric -> Fabric Membership -> Nodes Pending Registration to ensure the leaf switches connected to your APIC controllers are showing up. Note that you won’t see your spines until the leaves are commissioned in the fabric.

Fabric Membership – Nodes Pending Registration

Right-click on one of the leaf nodes and select “Register.” From here, we’ll make our first use of the Unofficial ACI Guide by following its recommendations for spine/leaf numbering: use a range of 101-199 for spines and 201+ for leaves.

Register Node
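
If you’d rather script node registration (handy when you have a stack of leaves to rack), the same step boils down to posting a fabricNodeIdentP object keyed on the switch serial number shown under Nodes Pending Registration. A rough sketch, reusing the session from the login example earlier; the serial and names are placeholders:

# Placeholder serial/name; node IDs follow the 101-199 spine / 201+ leaf convention
payload = {
    "fabricNodeIdentP": {
        "attributes": {
            "serial": "SAL1234ABCD",    # from the Nodes Pending Registration list
            "nodeId": "201",
            "name": "hive-leaf-201",
        }
    }
}
session.post(f"{APIC}/api/mo/uni/controller/nodeidentpol.json",
             json=payload).raise_for_status()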

Once this process is complete for all spines and leaves, you can verify that they are in active state via the “Registered Nodes” tab.

Fabric Membership – Registered Nodes

Now’s also a good time to scope the Topology page. You’ll notice in my environment I have only one APIC controller. Fine for a test drive, but in production it’s recommended to use a cluster of three or more.

** If your APIC, spines, or leaves have been used in an ACI fabric before, be sure to perform a factory reset using the steps in Appendix A before proceeding. If you are working with factory-stock devices, skip this step.

With our fabric in place, we’ll configure a number of objects and policies that allow us to map policy to our leaf nodes. Our goal here is to set up a Virtual Port Channel (vPC) southbound to the pre-configured pair of Fabric Interconnects (FIs). Before we jump to the port configuration, though, we’ll need to set up a few global and leaf policies.

Navigate to Fabric -> Access Policies -> Policies -> Global, right-click “Attachable Access Entity Profile” and click “Create Attachable Access Entity Profile.” All we need is a name for now, as this will serve as the object through which we associate many of our forthcoming policies.

Attachable Access Entity Profile

Sticking to the Fabric -> Access Policies section, navigate to Pools -> VLAN, right-click then click “Create VLAN Pool.” Most environments will need only a single, static VLAN pool encompassing the scope of all necessary VLANs. A second VLAN pool may be necessary down the road for certain integrations. In my lab, I’m going to make a single VLAN pool for VLANs 1-2000.

Create VLAN Pool

Now navigate to Fabric -> Access Policies -> Physical and External Domains, right-click “Physical Domains” and click “Create Physical Domain.” In here, attach the AAEP and VLAN pool, providing the first illustration of how we bind objects together in ACI.

Physical Domain
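
For reference, here’s roughly what this trio of objects (VLAN pool, physical domain, and AAEP) looks like when pushed via the REST API instead of the GUI. The names are placeholders for my lab values, and the snippet reuses the session/APIC variables from the login sketch:

# Placeholders: VLAN pool "Static_VLANPool" (1-2000), physical domain
# "Hive_PhysDom", and AAEP "Hive_AAEP"
pool = {"fvnsVlanInstP": {
    "attributes": {"name": "Static_VLANPool", "allocMode": "static"},
    "children": [{"fvnsEncapBlk": {"attributes": {"from": "vlan-1", "to": "vlan-2000"}}}]}}
session.post(f"{APIC}/api/mo/uni/infra.json", json=pool).raise_for_status()

domain = {"physDomP": {
    "attributes": {"name": "Hive_PhysDom"},
    "children": [{"infraRsVlanNs": {"attributes": {"tDn": "uni/infra/vlanns-[Static_VLANPool]-static"}}}]}}
session.post(f"{APIC}/api/mo/uni.json", json=domain).raise_for_status()

aaep = {"infraAttEntityP": {
    "attributes": {"name": "Hive_AAEP"},
    "children": [{"infraRsDomP": {"attributes": {"tDn": "uni/phys-Hive_PhysDom"}}}]}}
session.post(f"{APIC}/api/mo/uni/infra.json", json=aaep).raise_for_status()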

Next we’ll create a few interface policies, starting with a “Link Level” policy per link type. Navigate to Fabric -> Access Policies -> Policies -> Interface, right-click “Link Level,” then click “Create Link Level Policy.” We’ll just give this a name and specify 10G link speed.

Link Level Policy – 10 Gig

Within the same “Interface” policy hierarchy, create a CDP and an LLDP policy per your spec. I’m going to enable both of these in my lab, so I’ll create a policy for each, named “CDP_Enable” and “LLDP_Enable” accordingly. The last bit here is a port-channel policy, as shown below.

Port Channel Policy – LACP Active
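
These interface-level policies are simple objects, so they are easy to push in one shot via the API as well. A hedged sketch, reusing the earlier session; the 10G and LACP policy names are placeholders, while CDP_Enable and LLDP_Enable match the names used above:

# One post creates all four interface policies under the access policy tree
policies = {
    "infraInfra": {
        "attributes": {},
        "children": [
            {"fabricHIfPol": {"attributes": {"name": "10G_LinkLevel", "speed": "10G"}}},
            {"cdpIfPol": {"attributes": {"name": "CDP_Enable", "adminSt": "enabled"}}},
            {"lldpIfPol": {"attributes": {"name": "LLDP_Enable",
                                          "adminRxSt": "enabled", "adminTxSt": "enabled"}}},
            {"lacpLagPol": {"attributes": {"name": "LACP_Active", "mode": "active"}}},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni/infra.json", json=policies).raise_for_status()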

So far so good: these are just atomic units of networking to be applied to higher-order constructs, such as the policy group we’ll use for our VPC domain. At this point, take another look at the lab topology incorporating the Fabric Interconnects. Keep in mind the VPC configuration must reside on the leaves, allowing FI-A and FI-B to perceive connectivity to a single northbound switch.

ACI Topology – VPC Connectivity to Fabric Interconnects

Navigate to Fabric -> Access Policies -> Interfaces -> Leaf Interfaces -> Policy Groups, right click “VPC Interface” and click “Create VPC Interface group.” We will create one of these for attaching to Fabric Interconnect A and one for Fabric Interconnect B, specifying the feature policies and the AAEP.

VPC Interface Policy Group

Now we need a bit of configuration to ensure our leaves know to form a VPC domain. With ACI you no longer need a VPC peer link, but you do need to create a policy to leverage VPC. This is done by navigating to Fabric -> Access Policies -> Policies -> Switch, right-clicking “Virtual Port Channel Default,” then clicking “Create VPC Explicit Protection Group.” We’ll use the naming guide here again (as with nearly every setting in this setup!), which uses the pair of leaves in question as the name and the ID of the first leaf as the VPC Explicit Protection Group ID.

VPC Explicit Protection Group
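
Here’s an approximate API equivalent of the two objects from this step: a VPC interface policy group tying together the AAEP and interface policies, and the explicit protection group pairing leaves 201 and 202. The policy group names are placeholders, and the referenced policy names come from the earlier sketches:

# vPC interface policy group for FI-A (repeat with a second name for FI-B)
pol_grp = {
    "infraAccBndlGrp": {
        "attributes": {"name": "FI-A_VPC_PolGrp", "lagT": "node"},   # lagT "node" = vPC
        "children": [
            {"infraRsAttEntP": {"attributes": {"tDn": "uni/infra/attentp-Hive_AAEP"}}},
            {"infraRsHIfPol": {"attributes": {"tnFabricHIfPolName": "10G_LinkLevel"}}},
            {"infraRsCdpIfPol": {"attributes": {"tnCdpIfPolName": "CDP_Enable"}}},
            {"infraRsLldpIfPol": {"attributes": {"tnLldpIfPolName": "LLDP_Enable"}}},
            {"infraRsLacpPol": {"attributes": {"tnLacpLagPolName": "LACP_Active"}}},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni/infra/funcprof.json", json=pol_grp).raise_for_status()

# Explicit vPC protection group pairing leaves 201 and 202, ID taken from the first leaf
prot_grp = {
    "fabricExplicitGEp": {
        "attributes": {"name": "201-202", "id": "201"},
        "children": [
            {"fabricNodePEp": {"attributes": {"id": "201"}}},
            {"fabricNodePEp": {"attributes": {"id": "202"}}},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni/fabric/protpol.json", json=prot_grp).raise_for_status()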

You’ll notice that once created, an IP from your TEP (Tunnel Endpoint) pool is assigned as a virtual IP for this logical VPC pair. Now we need to create profiles associated with each physical leaf, assigning policies to specific ports. For this task I could create a separate profile for each leaf or create a single profile for my VPC pair, since their physical connectivity will always match. Given the scope of my lab, I’m going with the former.

Navigate to Fabric -> Access Policies -> Interfaces -> Leaf Interfaces and right-click Profiles, then click “Create Leaf Interface Profile.” This first leaf has just two ports requiring configuration, eth1/25 for FI-A and eth1/26 for FI-B. Create an “Access Port Selector” for each of these ports, selecting the associated “Interface Policy Group.”

Once complete, right-click the leaf interface profile for leaf 201 and click “Clone.” Rename it Leaf202_IntProf and you’re ready to assign these to the physical leaves.

Cloning a Leaf Interface Profile

To map our policies to our physical equipment, navigate to Fabric -> Access Policies -> Switches -> Leaf Switches, right-click “Profiles” then select “Create Leaf Profile.” On the first page, provide a name then add a “Leaf Selector” line item to specify the first leaf (in my case, 201). Press next then select the matching interface profile for this leaf.

Leaf Switch Profile (Switch Selector)
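
For completeness, here’s a rough sketch of the same interface profile and switch profile pushed via the API for leaf 201 (the FI-B selector for eth1/26 and the leaf 202 objects would follow the same pattern). Names like Leaf201_IntProf and Leaf201_SwProf are placeholders consistent with the cloned Leaf202_IntProf above:

# Interface profile for leaf 201: one selector for eth1/25 toward FI-A
intf_prof = {
    "infraAccPortP": {
        "attributes": {"name": "Leaf201_IntProf"},
        "children": [
            {"infraHPortS": {
                "attributes": {"name": "FI-A_eth1-25", "type": "range"},
                "children": [
                    {"infraPortBlk": {"attributes": {"name": "blk1", "fromCard": "1",
                                                     "toCard": "1", "fromPort": "25",
                                                     "toPort": "25"}}},
                    {"infraRsAccBaseGrp": {"attributes": {
                        "tDn": "uni/infra/funcprof/accbundle-FI-A_VPC_PolGrp"}}},
                ],
            }},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni/infra.json", json=intf_prof).raise_for_status()

# Switch profile binding leaf 201 to that interface profile
sw_prof = {
    "infraNodeP": {
        "attributes": {"name": "Leaf201_SwProf"},
        "children": [
            {"infraLeafS": {
                "attributes": {"name": "Leaf201", "type": "range"},
                "children": [{"infraNodeBlk": {"attributes": {"name": "blk1",
                                                              "from_": "201", "to_": "201"}}}],
            }},
            {"infraRsAccPortP": {"attributes": {"tDn": "uni/infra/accportprof-Leaf201_IntProf"}}},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni/infra.json", json=sw_prof).raise_for_status()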

Repeat this process for Leaf 202 and we’re done with the configuration of the physical infrastructure… for now! Onwards to the land of application-centric policy!

Set up Primary Policy Constructs (Tenant, AP, & EPGs) and Network Boundaries (VRFs & Bridge Domains)

Before we configure this aspect of ACI, be sure to understand the hierarchical nature of the components that make up the ACI architecture. Here’s a fantastic rendering of the hierarchy outlined in the book “Troubleshooting ACI Infrastructure” found here: https://aci-troubleshooting-book.readthedocs.io/en/latest/

ACI Policy Model

With this in mind, the obvious place to start is the creation of our tenant. Navigate to the “Tenants” tab and click “Add Tenant.” From here, specify just the name of your tenant, keeping it as concise as possible, and press Submit.

With our tenant created, we can now create sub-elements via the traditional menu on the left or by dragging items onto the topology canvas in the GUI. Let’s take advantage of the canvas and drag a VRF circle onto the blank topology.

Now we’re going to create “Bridge Domains,” which will define our layer 2 (subnet) boundaries. For your reference, here’s an excerpt from my IPAM spreadsheet, outlining the various VLANs and subnets I need to support.

To create the management Bridge Domain (subnet), drag a “Bridge Domain” circle on top of the VRF circle we just created, ensuring there is a line connecting the two. Specify a name then head to the “L3 Configurations” tab and click the “+” next to subnets. Add your subnet gateway IP and subnet, then click OK. Press OK once more in the bridge domain creation page to finalize the BD.

Repeat this process for all necessary subnets.

With these networking constructs complete, we can begin defining application-level boundaries. First, we need an “Application Profile” under which our various hosts (virtual and physical) will reside in groups called “Endpoint Groups” (EPGs). Using the left-hand menu, navigate to your tenant’s tree, right-click “Application Profiles” and click “Create Application Profile.” In here we can specify a name and quickly create the necessary Endpoint Groups aligned with our Bridge Domains. Click the “+” next to EPGs and create an EPG line item for each of your subnets, selecting the appropriate BD along with the Physical Domain.

Application Profile and Endpoint Groups
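
Since the tenant hierarchy is where scripting ACI really starts to pay off, here’s a sketch of the whole branch (tenant, VRF, one bridge domain with its gateway subnet, and an application profile with a single EPG) created in one REST call. The tenant name, BD name, and subnet are placeholders; the VRF (Prod), AP (WebApp_AP), and EPG (Mgmt_EPG) match names used elsewhere in this post:

# Tenant "Hive" (placeholder) with VRF Prod, one BD + gateway subnet, and an AP/EPG
tenant = {
    "fvTenant": {
        "attributes": {"name": "Hive"},
        "children": [
            {"fvCtx": {"attributes": {"name": "Prod"}}},
            {"fvBD": {
                "attributes": {"name": "Mgmt_BD"},
                "children": [
                    {"fvRsCtx": {"attributes": {"tnFvCtxName": "Prod"}}},
                    {"fvSubnet": {"attributes": {"ip": "10.0.10.1/24"}}},  # BD gateway IP/mask
                ],
            }},
            {"fvAp": {
                "attributes": {"name": "WebApp_AP"},
                "children": [
                    {"fvAEPg": {
                        "attributes": {"name": "Mgmt_EPG"},
                        "children": [
                            {"fvRsBd": {"attributes": {"tnFvBDName": "Mgmt_BD"}}},
                            {"fvRsDomAtt": {"attributes": {"tDn": "uni/phys-Hive_PhysDom"}}},
                        ],
                    }},
                ],
            }},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni.json", json=tenant).raise_for_status()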

Now we need to associate physical ports to these endpoint groups, so that ACI understands how the application level policy and physical policy align. To do so, expand the menu under the EPG, right click “Static Ports,” and select “Deploy Static EPG on PC, VPC, or Interface.” In here, select the option for “VPC,” select the VPC for FI-A, and specify the Port Encap VLAN.
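
Under the hood, the static binding itself is a single child object under the EPG pointing at the vPC path. A sketch with placeholder tenant/EPG names and encap VLAN:

# Static vPC binding for the Mgmt EPG toward FI-A (encap VLAN 10 as a placeholder)
binding = {
    "fvRsPathAtt": {
        "attributes": {
            "tDn": "topology/pod-1/protpaths-201-202/pathep-[FI-A_VPC_PolGrp]",
            "encap": "vlan-10",
            "mode": "regular",          # trunked; "untagged"/"native" are the other options
        }
    }
}
session.post(f"{APIC}/api/mo/uni/tn-Hive/ap-WebApp_AP/epg-Mgmt_EPG.json",
             json=binding).raise_for_status()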

Repeat this process for the other EPGs. With these steps complete, we now have an operational ACI fabric that could theoretically allow for host to host communication within a particular EPG. I already have a bunch of endpoints sitting under my Fabric Interconnect, so to validate that my configuration is working I can check to see if ACI is auto-detecting any endpoints. When I navigate to my EPG and look under the Operational -> Client End-Points tabs, I see a bunch of learned MAC addresses, along with a single host IP (due to a ping from the host to the ACI gateway for this EPG’s Bridge Domain).

With our fully functional ACI fabric ready to rock, we can now tackle the task of incorporating this fabric as a cohesive element of our broader infrastructure. For this, I will be establishing layer 3 connectivity with my core switches via OSPF. Zooming out on the lab topology, you can see the way in which I’ve set up physical connectivity and IP addressing:

To configure this connectivity in ACI, we’ll first navigate to Fabric -> Access Policies -> Physical and External Domains, right-click “External Routed Domains,” and click “Create Layer 3 Domain.” For this, I will specify the same name as my VRF (Prod) and associate my AAEP and VLAN pool.

L3 Out Physical Domain

Now we need a few policies to configure the uplink ports. Navigate to Fabric -> Access Policies -> Interfaces -> Leaf Interfaces -> Policy Groups, right-click “Leaf Access Port” and click “Create Access Port Policy Group.” Specify the low level policies for Link Level, CDP, LLDP, and AAEP.

Leaf Access Port Policy for L3 Uplink Ports

Now we’ll head back into our previously created leaf profiles and add this new port policy, aligning it to the uplink ports. Navigate to Fabric -> Access Policies -> Interfaces -> Leaf Interfaces -> Profiles, right-click the profile for Leaf 201 and click “Create Access Port Selector.” In here, specify the port used for the uplink and associate the uplink policy group. Repeat this process for all four uplink ports.

L3 Out Access Port Profile

Now navigate to Tenants -> (Tenant Name) -> Networking, right-click “External Routed Networks,” and click “Create Routed Outside.” Within this routed outside config, specify the IGP of your choice (I’m using OSPF here), along with the VRF and External Routed Domain.

Routed Outside Network Config

Once the basic identity settings are set, click the “+” sign next to “Nodes and Interfaces Protocol Profiles.” Enter a name then click the “+” sign next to Nodes. Select the appropriate node ID for the first leaf (201, in my case) and specify a router ID (RID) for the OSPF process. I’ll keep the checkbox for “Use Router ID as Loopback Address” checked.

L3 Out Node Selector

Press OK to go back to the node profile and then click on the “+” sign next to OSPF interface profiles. Specify a name then click next until you reach the “Interfaces” section. In here, we’re going to create two routed interfaces for leaf 201 using the addressing schema outlined previously. Click the “+” sign next to Routed Interfaces and specify the leaf, port, and IP for the first of our L3 out interfaces.

Routed Interface Profile within Node Profile

Once you complete this process for both routed interfaces on the first leaf node, press OK a few times to complete the creation of the node profile. Create a second node profile for your other leaf, including its two routed interface profiles. Once these steps are complete, you can click “Next” to proceed to the “External EPG Networks” options.

Completed Routed Outside Profile with two Node Profiles

Once at step 2 of the “Create Routed Outside” configuration, click the “+” sign next to “Subnet.” In here, we’re going to essentially create a default route out of the fabric, specifying 0.0.0.0/0 as the subnet.

External EPG Network (Default Route out of the Fabric)
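
The routed outside is the most deeply nested object we’ve built so far. Here’s a trimmed sketch of its skeleton via the API (VRF and L3 domain associations, the OSPF profile, and the catch-all external EPG); I’ve left the node and interface profiles out for brevity, and the L3 Out and external EPG names are placeholders while the VRF/L3 domain name (Prod) matches what was configured above:

# Skeleton of the routed outside; node and interface profiles omitted for brevity
l3out = {
    "l3extOut": {
        "attributes": {"name": "Core_L3Out"},
        "children": [
            {"l3extRsEctx": {"attributes": {"tnFvCtxName": "Prod"}}},
            {"l3extRsL3DomAtt": {"attributes": {"tDn": "uni/l3dom-Prod"}}},
            {"ospfExtP": {"attributes": {"areaId": "0.0.0.1", "areaType": "nssa"}}},
            {"l3extInstP": {
                "attributes": {"name": "All_External"},
                "children": [{"l3extSubnet": {"attributes": {"ip": "0.0.0.0/0"}}}],
            }},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni/tn-Hive.json", json=l3out).raise_for_status()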

At this point, assuming the configuration on your upstream switches is set up properly, you will be able to ping the directly connected interfaces, but you may not see the OSPF neighbor adjacency come up. This could be caused by the area type ACI uses by default, the NSSA OSPF area type. You can either change the OSPF area type in ACI to regular or adjust the OSPF settings on the core switch to accommodate the NSSA area type. You may also find your neighbor relationship stuck in EXSTART, which is likely caused by an MTU mismatch. ACI uses an MTU of 9000 by default, so be sure to accommodate that on your upstream switches.

We now need to reference this L3 Out in our Bridge Domains and decide whether we want these networks to be advertised externally. Navigate to Tenants -> (Tenant Name) -> Networking -> Bridge Domains -> (BD Name), click on the “Policy” tab then the “L3 Configurations” tab. In here, click the “+” sign next to “Associated L3 Outs” and select the L3 Out we just created.

At this point, if we want to externally advertise this Bridge Domain (subnet), navigate to the “Subnets” section of the Bridge Domain and change the scope to “Advertised Externally.”

Advertise EPG Subnet Externally
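
Both of those knobs live on the bridge domain, so the API version is simply a re-post of the BD with the L3 Out association added and the subnet scope flipped to public. Same placeholder names and subnet as before:

# Re-post the BD with the L3 Out association and the subnet scope set to "public"
bd_update = {
    "fvBD": {
        "attributes": {"name": "Mgmt_BD"},
        "children": [
            {"fvRsBDToOut": {"attributes": {"tnL3extOutName": "Core_L3Out"}}},
            {"fvSubnet": {"attributes": {"ip": "10.0.10.1/24", "scope": "public"}}},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni/tn-Hive.json", json=bd_update).raise_for_status()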

With layer 3 connectivity established to the rest of our network, we can now begin crafting policy to establish access between our ACI connected endpoints and the outside world.

Establish Host Connectivity (EPGs & Contracts)

ACI functions on a whitelist “zero trust” model, wherein all connectivity must be explicitly granted. To establish connectivity between two EPGs or between EPGs and the externally routed network, you join the two entities via a contract. The contract is simply a binding point that contains filters that permit specific types of traffic. To visualize the way these pieces work together, here’s a nice diagram from Cisco docs.

ACI Policy – Contracts/Filters

As you can see, there is the concept of a provider (the endpoint providing a service) and the consumer (the entity consuming a service). You link these two entities together via a contract, and then all traffic destined to/from them is permitted or denied based on the filters within that contract. Using this construct, we can build a contract that will permit certain types of network activity to exit our ACI fabric (production app traffic, for instance). We can also use this to ensure that some internal resources are only accessible from appropriate internal servers (a DB cluster, for instance, could be set up to be accessible only from the web app cluster and not permit any other traffic in or out).

Let’s start by establishing access between our management network and our L3 Out network so that our ESXi hosts can join vCenter. To minimize failure domains, I keep some critical services that impact the data plane, such as vCenter, out of my ACI fabric and in a self-contained, mission-critical segment of my network. The first atomic unit of policy here is the filter: we can create a simple “any/any” contract or set up protocol-specific filters as needed. For simplicity’s sake, I’ll create an any/any filter here by navigating to Tenants -> (Tenant Name) -> Contracts, right-clicking “Filters” and clicking “Create Filter.” Specify a name and click the “+” icon next to Entries. Give that new entry a name and leave everything else blank (specifying a catch-all filter).

Filter – Any Traffic

Now under the same “Contracts” menu, right click “Standard” and click “Create Standard.” In the context of ACI, a “standard” contract permits traffic whereas a “taboo” contract denies. A single contract can have multiple “Contract Subjects” within to provide granular policy definition and ease upkeep. For our purposes, we’ll create a single “Contract Subject” and specify the filter we just created.

Contract Creation using Filter
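
Here’s what that filter-plus-contract pair looks like as a single post under the tenant; the filter, contract, and subject names are placeholders:

# Catch-all filter plus a standard contract whose single subject references it
contract = {
    "fvTenant": {
        "attributes": {"name": "Hive"},
        "children": [
            {"vzFilter": {
                "attributes": {"name": "Any_Filter"},
                "children": [{"vzEntry": {"attributes": {"name": "any"}}}],  # no criteria = match all
            }},
            {"vzBrCP": {
                "attributes": {"name": "L3Out_Any"},
                "children": [
                    {"vzSubj": {
                        "attributes": {"name": "Any_Subj"},
                        "children": [{"vzRsSubjFiltAtt": {"attributes": {"tnVzFilterName": "Any_Filter"}}}],
                    }},
                ],
            }},
        ],
    }
}
session.post(f"{APIC}/api/mo/uni.json", json=contract).raise_for_status()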

We must now link the provider and consumer via the contract by either specifying the contract under the appropriate EPGs/Networks, or by using the handy APIC GUI. First we’ll add the L3 Out network to our Application Profile. Navigate to Tenants -> (Tenant Name) -> Application Profile and click on the AP (in my case, WebApp_AP). Now click on the “Topology” tab and drag the L3 cloud icon onto the screen. Specify the L3 Out we just created and press OK, then click “Submit.”

Adding Layer 3 Out to Application Profile

With the L3 cloud icon on our topology, we can use a contract to link EPGs to the L3 Out. Drag the “Contract” circle on top of an EPG until an arrow shows up, then continue dragging it over to the L3 Out. Once complete, a window will pop up where you select “Use Existing Contract” and choose the contract we just created.

Association of an EPG and L3Out via Contract

Once complete, make sure to press “Submit” to confirm the contract association shown in the topology.
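
Under the hood, the association is just a provided-contract relation on one side and a consumed relation on the other. Here’s a sketch with the same placeholder names, with the EPG providing and the L3 Out external EPG consuming, purely as an example (swap the two if your policy direction is reversed):

# The EPG provides the contract and the external EPG consumes it
session.post(f"{APIC}/api/mo/uni/tn-Hive/ap-WebApp_AP/epg-Mgmt_EPG.json",
             json={"fvRsProv": {"attributes": {"tnVzBrCPName": "L3Out_Any"}}}).raise_for_status()

session.post(f"{APIC}/api/mo/uni/tn-Hive/out-Core_L3Out/instP-All_External.json",
             json={"fvRsCons": {"attributes": {"tnVzBrCPName": "L3Out_Any"}}}).raise_for_status()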

This same process can be used to establish EPG to EPG connectivity as needed. In my lab, I’m going to maintain a “Core Services” EPG that will house DNS, DHCP, and NTP servers, which will need to be accessed by other servers in my ACI fabric (but not externally). My “Mgmt_EPG” will need to access this to obtain a DHCP lease. Using this same approach, I can link the Mgmt_EPG and CoreServices_EPG via a contract. With a few of these created, your AP topology will look something like this:

Backup Configuration

ACI has a phenomenal backup mechanism that leverages quick snapshots, allowing operators to easily roll back and perform config diffs. The backup process is simple: navigate to Admin -> Config Rollbacks, specify a snapshot description, then click “Create a snapshot now.” By default this will save the snapshot on the APIC cluster, but you can also store it externally on an SFTP server, etc.

Config Snapshot/Rollback
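
The snapshot can also be triggered through the API by posting a config export policy with the snapshot flag set. A rough sketch with a placeholder name; the attribute values are worth confirming in the API Inspector before you script around them:

# Trigger a one-time local snapshot of the fabric configuration
snapshot = {
    "configExportP": {
        "attributes": {
            "name": "Hive_Snapshot",
            "format": "json",
            "snapshot": "true",         # store as a snapshot rather than an export file
            "adminSt": "triggered",
        }
    }
}
session.post(f"{APIC}/api/mo/uni/fabric.json", json=snapshot).raise_for_status()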

Appendix A – APIC, Spine, & Leaf Factory Reset

Reset APIC Controller:

used-apic# acidiag touch clean
This command will wipe out this device. Proceed? [y/N] y

used-apic# acidiag touch setup
This command will reset the device configuration, Proceed? [y/N] y

Be sure to reboot all of the APICs at the same(ish) time to ensure they don’t learn settings from an in-service controller.

used-apic# acidiag reboot
This command will restart the this device, Proceed? [y/N] y

Now, wipe the spines and leaves.

used-spine-a# acidiag touch clean
used-spine-a# setup-clean-config.sh 
In progress
In progress
...
In progress
Done
used-spine-a# reload
This command will reload the chassis, Proceed (y/n)? [n]: y

If you find this still doesn’t do the trick, use the following command:

used-spine-a# prepare-mfg.sh <image.bin>
reload