Automation
Overview
Automation may seem like an unusual topic in a network design certification, but its impact on both the network and the business is profound. Networks have never been more complex, and leveraging automation can dramatically affect how a network is designed, built, operated, and maintained.
Consider the task of building 12 data centers worldwide. Manually configuring each device at each location could take 12 months or more — one data center per month. With automation, every component, configuration, feature, and capability is templated with corresponding variables. After spending a month building templates, workflows, and orchestration processes, multiple data centers can be built simultaneously with the same resources. What took 12 months manually can be completed in 3 months or less using automation.
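As a rough illustration of what "templated with corresponding variables" can look like, the following minimal sketch renders device configurations for several data centers from one shared template. The template text, hostnames, and addressing scheme are assumptions made up for this example, not taken from any particular platform.

```python
from string import Template

# Hypothetical device template; the variable names are illustrative only.
DEVICE_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface Loopback0\n"
    " ip address $loopback 255.255.255.255\n"
)


def render_site(site_code: str, site_index: int, device_count: int) -> list[str]:
    """Render one configuration per device from the shared template."""
    configs = []
    for n in range(1, device_count + 1):
        configs.append(
            DEVICE_TEMPLATE.substitute(
                hostname=f"{site_code}-leaf{n:02d}",
                loopback=f"10.{site_index}.0.{n}",
            )
        )
    return configs


# The same template serves every data center; only the variables change.
sites = {"dc01": 1, "dc02": 2}
all_configs = {
    code: render_site(code, index, device_count=2) for code, index in sites.items()
}
print(all_configs["dc02"][1])
```

Adding a thirteenth data center in this sketch means adding one entry to `sites`, which is the property that lets several sites be built at once.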
Beyond the initial build, automation can be leveraged for operations and maintenance (O&M) tasks, troubleshooting and resolving network issues, and instantiating business intent end to end. Automation limits user errors, reduces total cost of ownership, reduces network outages, and increases service agility. This is the business impact of automation, and why business leaders will require it within the design of their networking infrastructure.
This chapter focuses on the underlying impact of automation on a business and how a network designer can properly structure a network design for automation. It does not cover how to build or write automation (programming languages, API calls, orchestration tools, data models, etc.), but it does cover the specific capabilities of automation that network designers need to know.
Zero-Touch Provisioning
The overwhelming demand for automation has stressed overall network designs and architectures. Adding new services to an oversaturated network already fulfilling tasks like network management, quality of experience, and optimization is not feasible through manual processes alone. Zero-touch provisioning (ZTP) is one answer to this problem.
ZTP is based on software-defined networking (SDN) solutions and network functions virtualization (NFV) concepts. The intent of ZTP is to have any new network device configured fully and automatically, in a plug-and-play fashion. In addition to reducing capital expenditure, ZTP is designed to reduce user errors and save time. Administrators can leverage automation tools to roll out the different aspects of a network device's configuration, with no input required from the network engineer once the process starts, which is normally when the device is first connected to the production infrastructure.
The benefits of ZTP include:
- All physical cabling is verified correct.
- All devices are running the proper code version.
- Additional sites, pods, and devices can be pre-provisioned ahead of time, before the business needs them.
- All devices push their configurations back to a central repository whenever a save is issued.
- A ZTP controller can restore the latest configuration should a device re-enter the ZTP process.
How ZTP Works
When a new device boots with no startup configuration, it initiates a discovery process. The typical ZTP provisioning workflow (Figure 1) proceeds as follows:
1. The ZTP-enabled device is connected and powered on.
2. The device requests a DHCP address from the DHCP server.
3. The DHCP server sends an IP address, the image name, the configuration file name, and the file server location.
4. The device downloads the image and configuration file from the file server (FTP/TFTP/HTTP).
5. If the downloaded version of software differs from the running version, the device installs the downloaded software and reboots.
6. The device installs the downloaded configuration file.
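The decision logic in the workflow above can be sketched as a small function. This is a simplified model, not any vendor's implementation: the `DhcpOffer` fields and the `plan_ztp_actions` helper are hypothetical names, and the offer is assumed to expose the image version directly so the comparison with the running version is easy to show.

```python
from dataclasses import dataclass


@dataclass
class DhcpOffer:
    """Fields the DHCP server returns in the workflow (names are illustrative)."""
    ip_address: str
    image_name: str
    image_version: str  # assumption: the version is known alongside the image name
    config_file: str
    file_server: str


def plan_ztp_actions(running_version: str, offer: DhcpOffer) -> list[str]:
    """Return the ordered actions a device would take after receiving the offer."""
    actions = [
        f"download {offer.image_name} from {offer.file_server}",
        f"download {offer.config_file} from {offer.file_server}",
    ]
    # Install and reboot only when the downloaded image differs from the running one.
    if offer.image_version != running_version:
        actions.append(f"install {offer.image_name}")
        actions.append("reboot")
    actions.append(f"apply {offer.config_file}")
    return actions


offer = DhcpOffer("192.0.2.10", "os-2.0.bin", "2.0", "leaf01.cfg", "http://192.0.2.1")
for step in plan_ztp_actions(running_version="1.0", offer=offer):
    print(step)
```

A device already running version 2.0 would skip the install and reboot steps and go straight to applying the configuration file.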
Different vendors implement ZTP in slightly different ways, but the core concept remains the same. Cisco devices support ZTP through mechanisms like Plug and Play (PnP) with Cisco DNA Center, or through native mechanisms such as ZTP on IOS XR and PowerOn Auto Provisioning (POAP) on NX-OS.
There must be a minimum amount of network infrastructure created to allow ZTP to communicate with the network devices it will configure. In most cases this means setting up a management network to allow all network devices to be automatically provisioned with ZTP via their out-of-band management interfaces.
Design Considerations for ZTP
From a network design perspective, ZTP introduces several requirements and considerations:
- DHCP infrastructure: A reliable DHCP service must be available at every site where devices will be provisioned. The DHCP server must be configured with the appropriate options to direct devices to the provisioning server.
- Provisioning server: A centralized or distributed file server (FTP, TFTP, HTTP, or HTTPS) must host the configuration files, software images, and bootstrap scripts. This server must be reachable from the management network of the new device.
- Network reachability: The out-of-band or in-band management network must provide Layer 3 connectivity from the new device to the DHCP and provisioning servers. If devices are deployed at remote sites, the WAN must be designed to support this traffic.
- Security: ZTP introduces a security consideration — a device is pulling its configuration from the network before it has been fully secured. Mutual authentication between the device and the provisioning server, use of HTTPS, and certificate-based trust models should be part of the design.
- Software image management: ZTP can also be used to push the correct software image to a device. The design should account for image storage, bandwidth requirements for image downloads, and version control.
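One piece of the security and image management considerations above can be illustrated with an integrity check on a downloaded image. The sketch assumes the expected SHA-256 digest is published through a separate trusted channel, which is an assumption of this example. It complements, rather than replaces, mutual authentication and HTTPS.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def image_is_trusted(image_bytes: bytes, expected_sha256: str) -> bool:
    """Accept the image only if its digest matches the published value."""
    # compare_digest performs a constant-time comparison of the two digests.
    return hmac.compare_digest(sha256_of(image_bytes), expected_sha256.lower())


# Stand-in for a downloaded image; a real image would be read from disk.
image = b"example image contents"
published_digest = sha256_of(image)  # assumption: obtained out of band

print(image_is_trusted(image, published_digest))
print(image_is_trusted(image + b"tampered", published_digest))
```

The first call accepts the unmodified image and the second rejects the altered one, so a device could refuse to install an image that fails the check.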
Scalability and Operational Impact
ZTP has a significant impact on the scalability of network deployments. When hundreds or thousands of devices need to be deployed — whether in a new data center, across branch offices, or as part of a refresh — ZTP allows the deployment to happen in parallel rather than sequentially. This reduces the time to deploy and the number of skilled engineers required on-site.
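The difference between sequential and parallel deployment can be shown with simple arithmetic. The device count, per-device time, and batch size below are illustrative figures, not measurements.

```python
import math


def sequential_hours(devices: int, hours_per_device: float) -> float:
    """Total time when devices are provisioned one after another."""
    return devices * hours_per_device


def parallel_hours(devices: int, hours_per_device: float, batch_size: int) -> float:
    """Total time when up to batch_size devices are provisioned at once."""
    batches = math.ceil(devices / batch_size)
    return batches * hours_per_device


print(sequential_hours(500, 0.5))    # 250.0
print(parallel_hours(500, 0.5, 50))  # 5.0
```

In practice the batch size would be bounded by design factors covered earlier, such as WAN bandwidth for image downloads and provisioning server capacity.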
From an operational standpoint, ZTP also supports device replacement scenarios. If a device fails and is replaced with new hardware, ZTP can automatically provision the replacement with the correct configuration, minimizing downtime and the need for on-site expertise.
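One way to think about device replacement is that the configuration belongs to the position a device fills in the design, not to the hardware itself. The sketch below models that idea with a hypothetical `ReplacementInventory` class. Real ZTP controllers track this differently, so the structure and names here are assumptions for illustration.

```python
from typing import Optional


class ReplacementInventory:
    """Hypothetical store that ties configurations to a network slot, not to hardware."""

    def __init__(self) -> None:
        self._config_by_slot: dict[str, str] = {}
        self._slot_by_serial: dict[str, str] = {}

    def save_config(self, slot: str, config: str) -> None:
        # Called whenever the device in this slot saves its configuration.
        self._config_by_slot[slot] = config

    def assign(self, serial: str, slot: str) -> None:
        # Record which physical unit currently fills the slot.
        self._slot_by_serial[serial] = slot

    def config_for(self, serial: str) -> Optional[str]:
        # Return the latest saved configuration for the slot this unit fills.
        slot = self._slot_by_serial.get(serial)
        return self._config_by_slot.get(slot) if slot else None


inventory = ReplacementInventory()
inventory.assign("SERIAL-OLD", "branch12/access-sw-1")
inventory.save_config("branch12/access-sw-1", "hostname branch12-access-sw-1")

# The failed unit is swapped: only the serial-to-slot assignment changes.
inventory.assign("SERIAL-NEW", "branch12/access-sw-1")
print(inventory.config_for("SERIAL-NEW"))
```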
ZTP is most effective when the network design follows a consistent, templated approach. If every site or device role has a unique, one-off configuration, the value of ZTP diminishes significantly. A well-structured, modular network design is a prerequisite for effective ZTP.
Review Questions
1. Which of the following problems are resolved or reduced using zero-touch provisioning? (Choose two.)
a. The amount of time needed to deploy new infrastructure
b. Maintaining code across different versions
c. Troubleshooting network outages caused by human error
d. Increased time to market on new features, functionality, and capabilities
a and c. Zero-touch provisioning reduces the amount of time needed to deploy new infrastructure and reduces the troubleshooting required for network outages caused by human error.
2. Which of the following are technical benefits of leveraging zero-touch provisioning? (Choose three.)
a. All cabling is validated and correct.
b. Day 1 configuration is loaded.
c. ZTP-enabled devices upload their configuration when a save is issued.
d. All devices are running the proper code version.
a, c, and d. With ZTP we can have all of our cabling validated to ensure it’s correct, ensure we have a saved configuration in a repository of our ZTP-enabled devices, and ensure all ZTP devices are running a specific code version.
Infrastructure as Code
Infrastructure as Code (IaC) is an approach to infrastructure automation that focuses on consistent, repeatable steps for provisioning, configuring, and managing infrastructure. Historically, infrastructure was provisioned manually. Deploying new capabilities, applications, and network devices used to take weeks, months, and even years to complete, which created a demand for a more effective and efficient process. IaC fulfills that demand.
IaC represents infrastructure through machine-readable files that can be reproduced an unlimited number of times. With IaC, we can read the code in the files to identify all the corresponding capabilities, characteristics, features, and functionality that will be instantiated when the infrastructure is provisioned. The entire team can review the code, make necessary changes, leverage version control to track changes, and ensure the code is compliant with the security standards that the business is required to follow.
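To make "machine-readable files" concrete, here is a minimal sketch that stores a device definition as JSON and runs a simple compliance check against it. The `DeviceIntent` fields and the "no Telnet" rule are hypothetical. A real deployment would use its own schema and its own security standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DeviceIntent:
    """Illustrative definition of one device; the fields are assumptions."""
    hostname: str
    role: str
    mgmt_protocols: list[str] = field(default_factory=list)


def to_file_text(intent: DeviceIntent) -> str:
    # Sorted keys keep the text stable, which makes version control diffs readable.
    return json.dumps(asdict(intent), indent=2, sort_keys=True)


def from_file_text(text: str) -> DeviceIntent:
    return DeviceIntent(**json.loads(text))


def policy_violations(intent: DeviceIntent) -> list[str]:
    # Hypothetical rule: the example security standard forbids Telnet.
    return [p for p in intent.mgmt_protocols if p.lower() == "telnet"]


intent = DeviceIntent("dc01-leaf01", "leaf", ["ssh", "telnet"])
text = to_file_text(intent)
restored = from_file_text(text)

print(restored == intent)         # True
print(policy_violations(intent))  # ['telnet']
```

Because the definition is plain text, the whole team can review it, and a check like `policy_violations` can run before anything is provisioned.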
Leveraging IaC for Multiple Environments
One of the most powerful capabilities of IaC is the ability to build identical environments for development, staging, and production from the same set of machine-readable files, as shown in Figure 2. Each environment can be built the exact same way, leveraging the same automation and version control system. The different teams can properly test new features, functionality, and capabilities prior to rolling out in production, which leads to significantly fewer errors and troubleshooting steps.
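The idea of building every environment from the same files can be sketched as one shared definition plus a small set of per-environment variables. The keys and values below are made up for illustration.

```python
import copy

# Shared definition used by every environment (illustrative values).
BASE_DEFINITION = {
    "leaf_count": 2,
    "vlans": [10, 20],
    "ntp_servers": ["192.0.2.123"],
}

# Only these values are allowed to differ between environments.
ENVIRONMENT_VARIABLES = {
    "dev": {"mgmt_subnet": "10.10.0.0/24"},
    "staging": {"mgmt_subnet": "10.20.0.0/24"},
    "prod": {"mgmt_subnet": "10.30.0.0/24"},
}


def build_environment(name: str) -> dict:
    """Combine the shared definition with one environment's variables."""
    env = copy.deepcopy(BASE_DEFINITION)
    env.update(ENVIRONMENT_VARIABLES[name])
    env["name"] = name
    return env


def shared_part(env: dict) -> dict:
    """Return only the settings that come from the shared definition."""
    return {key: value for key, value in env.items() if key in BASE_DEFINITION}


environments = {name: build_environment(name) for name in ENVIRONMENT_VARIABLES}
print(shared_part(environments["dev"]) == shared_part(environments["prod"]))  # True
```

Keeping the environment-specific variables small and explicit is what makes a successful test in staging a meaningful prediction for production.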
Version Control and Source of Truth
Once the machine-readable files that represent the infrastructure have been created, it is best to store them in a version control system like Git. Doing this allows these files to be updated and tracked, and changes can be easily rolled back to a previous version if needed. New infrastructure (servers, virtual machines, or network devices) can be instantiated as needed by pulling from Git or a similar version control system.
The source of truth is the authoritative repository that defines the intended state of the network. This could be:
- A version-controlled repository (e.g., Git) containing configuration templates and variable files
- A network management platform or controller that maintains the intended configuration
- A combination of both, with the repository feeding the controller
When the source of truth is well defined, any drift between the intended state and the actual state of the network can be detected and corrected automatically.
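Drift detection can be sketched as a comparison between the intended configuration from the source of truth and the configuration actually running. This simplified version treats a configuration as an unordered set of lines and ignores hierarchy, which is an assumption that real tools do not make. The sample configurations are invented.

```python
def detect_drift(intended: str, actual: str) -> dict[str, list[str]]:
    """Compare two configurations line by line, ignoring order and blank lines."""
    intended_lines = {line.strip() for line in intended.splitlines() if line.strip()}
    actual_lines = {line.strip() for line in actual.splitlines() if line.strip()}
    return {
        # Present in the source of truth but absent from the device.
        "missing": sorted(intended_lines - actual_lines),
        # Present on the device but absent from the source of truth.
        "unexpected": sorted(actual_lines - intended_lines),
    }


intended = "hostname leaf01\nntp server 192.0.2.123\nsnmp-server location dc01\n"
actual = "hostname leaf01\nntp server 198.51.100.5\nsnmp-server location dc01\n"

drift = detect_drift(intended, actual)
print(drift["missing"])     # ['ntp server 192.0.2.123']
print(drift["unexpected"])  # ['ntp server 198.51.100.5']
```

Automatic correction would then amount to removing the unexpected lines and applying the missing ones, subject to whatever change process the business requires.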
Elasticity
Being able to spin up new infrastructure in a repeatable, consistent fashion can be extremely useful for a business, such as in cases where an application is in high demand at a specific time. In such cases, all corresponding resources (servers, VMs, load balancers, firewalls, network devices) can be automatically instantiated to meet the high demand. Once the demand decreases, these resources can then be decommissioned and deprovisioned. This specific capability is called elasticity and is something all network designers should understand. Businesses will require this capability because it not only provides critical operational efficiency at the right time, but also reduces the long-term cost that manually adding resources would cause.
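Elasticity boils down to a sizing decision that is re-evaluated as demand changes. The sketch below assumes load and per-instance capacity are measured in the same unit (for example, requests per second). The function names and numbers are illustrative.

```python
import math


def desired_instances(current_load: float, capacity_per_instance: float,
                      minimum: int = 1) -> int:
    """Number of instances needed to serve the load, never below the minimum."""
    return max(minimum, math.ceil(current_load / capacity_per_instance))


def scaling_action(running: int, desired: int) -> str:
    """Describe the change needed to move from running to desired instances."""
    if desired > running:
        return f"provision {desired - running}"
    if desired < running:
        return f"decommission {running - desired}"
    return "no change"


peak = desired_instances(current_load=4500, capacity_per_instance=1000)
print(peak, scaling_action(running=1, desired=peak))      # 5 provision 4

quiet = desired_instances(current_load=800, capacity_per_instance=1000)
print(quiet, scaling_action(running=peak, desired=quiet))  # 1 decommission 4
```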
IaC and Change Culture
One of the main adoption concerns with IaC is that this approach welcomes, and even encourages, change. Historically, change was not something infrastructure teams and the business could handle easily. The business would provide the infrastructure team a handful of change windows a year, and in most cases, each was very limited in time.
With IaC, change is a catalyst to improve the reliability and performance of the infrastructure as a whole. Just as source code goes through multiple versions and becomes better with each new release, infrastructure becomes more resilient and reliable with each new version.
Adopting Infrastructure as Code
Here are some high-level steps an organization should complete to properly adopt IaC:
- Organize configurations into objects that can be stored.
- Leverage a source of truth system (such as Git) to store these objects.
- Leverage current workflows and processes to instantiate configurations from these objects stored in the source of truth system.
- Complete full validation testing to ensure proper functionality as intended.
Businesses and their networks are far more complex than in the past. They require consistent uptime because they rely on the network, specifically connectivity to the Internet, for most resources, and they need flexibility and elasticity to support their changing needs. Infrastructure as Code can help support dynamic businesses while enhancing data security and cybersecurity.
IaC does not replace the need for a solid network design. It amplifies the design — a good design becomes consistently good across the entire network, while a bad design becomes consistently bad. The quality of the underlying design is more important than ever when IaC is in play.
Review Questions
3. Which of the following are problems resolved or reduced by leveraging infrastructure as code? (Choose two.)
a. The amount of time needed to deploy new infrastructure
b. Maintaining code across different versions
c. Troubleshooting network outages caused by human error
d. Increased time to market on new features, functionality, and capabilities
a and c. Infrastructure as Code reduces the amount of time needed to deploy new infrastructure and reduces the troubleshooting required for network outages caused by human error.
CI/CD Pipelines
When making manual network changes via the CLI, the recommended practice is to make small, incremental changes rather than large, wholesale ones. Pasting a hundred lines of configuration changes into the CLI and then troubleshooting which part of the paste broke the network is much more difficult than leveraging automation tools to complete the same work. Isolating the part of a change that isn’t working is always troublesome and can often cause service downtime, an outage, or hours of frustration.
Continuous integration/continuous delivery (CI/CD) is a common software development practice used by developers to merge code changes into a central repository multiple times a day, sometimes multiple times an hour, and automate the entire software release process. With continuous integration (CI), each time the code has been changed, the build and test process is automatically executed, providing instant feedback to the different developers on what’s working and, more importantly, what’s not working in their code. With continuous delivery (CD), the relevant resources are automatically provisioned and deployed, which can sometimes consist of multiple disparate stages for more complex projects. The important aspect of a CI/CD pipeline is that all of these processes are fully automated, documented, and visible to the entire project team.
The CI/CD Pipeline Process
The four steps in a CI/CD pipeline are source, build, test, and deploy. Figure 3 illustrates how the continuous integration workflow begins when a network engineer creates a code branch, makes the necessary changes, completes local testing, and commits the changes. These changes are then merged with the master/production branch, which kicks off the build process. If problems are found, they are investigated and the status is reported back. The continuous delivery workflow then instantiates a test environment, runs the full validation test plan, and deploys the candidate changes from the branch.
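The four stages can be sketched as an ordered list of steps in which a failure stops everything after it. This is a toy model of the control flow only. The stage functions here are placeholders that return a fixed result, where a real pipeline tool would run actual build and test jobs.

```python
from typing import Callable


def run_pipeline(stages: list[tuple[str, Callable[[], bool]]]) -> list[str]:
    """Run stages in order and stop at the first one that fails."""
    log = []
    for name, step in stages:
        succeeded = step()
        log.append(f"{name}: {'passed' if succeeded else 'failed'}")
        if not succeeded:
            break  # Later stages, including deploy, never run.
    return log


# Placeholder stages; the test stage is forced to fail for the demonstration.
stages = [
    ("source", lambda: True),
    ("build", lambda: True),
    ("test", lambda: False),
    ("deploy", lambda: True),
]

print(run_pipeline(stages))
# ['source: passed', 'build: passed', 'test: failed']
```

The property worth noticing is that a change that fails testing never reaches the deploy stage, and the log makes the outcome visible to the whole team.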
Applying CI/CD to the Network
A network-focused CI/CD pipeline takes what has worked in software development and applies it to the network by automating the provisioning process — such as day-zero and day-one configuration builds — running automated functionality tests to verify configurations, and deploying changes to the production network. Automated pipelines remove manual errors and help implement standardized change processes.
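An automated test in a network pipeline can be as simple as checking a proposed addressing plan before it is deployed. The following sketch uses Python's standard `ipaddress` module to flag overlapping subnets. The subnets are invented, and a real test plan would cover far more than addressing.

```python
import ipaddress
from itertools import combinations


def overlapping_subnets(subnets: list[str]) -> list[tuple[str, str]]:
    """Return every pair of subnets in the plan that overlap."""
    networks = [ipaddress.ip_network(subnet) for subnet in subnets]
    return [
        (str(first), str(second))
        for first, second in combinations(networks, 2)
        if first.overlaps(second)
    ]


# Candidate change: the third subnet collides with the first.
proposed_plan = ["10.0.0.0/24", "10.0.1.0/24", "10.0.0.128/25"]
conflicts = overlapping_subnets(proposed_plan)
print(conflicts)  # [('10.0.0.0/24', '10.0.0.128/25')]
```

A pipeline could fail the test stage whenever `conflicts` is not empty, catching the error before it reaches the production network.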
Continuous integration leverages direct device communication. Network devices and automation tools talk directly, sharing data and traffic telemetry. The network configurations are then treated as code — Infrastructure as Code (as discussed previously). The network configuration is stored in a version control system (e.g., Git). For a network engineer to make any change, they must first pull down the latest copy of the stored configuration, making their own branch of it. All changes are tested locally, and when testing is completed, the changes are committed and pushed back to the master copy in the version control system, kicking off the new build process.
Benefits of CI/CD
The most common benefits of CI/CD are:
- The time required to enable new features and functionality is decreased.
- The process to make changes is streamlined.
- Changes are deployed more quickly.
- Changes are simplified.
- Changes are small, steady, and frequent, decreasing the overall risk of the change.
- Removal of the human element from making changes reduces human errors.
- Intuitive test automation completed against proposed changes validates that those changes will not cause any unforeseen outages, thus minimizing network outages.
- Low-touch/automated change deployment increases operational efficiency.
- Changes are more productive and lead to a more stable, better network over time.
Review Questions
4. Which of the following are problems resolved or reduced by a CI/CD pipeline? (Choose three.)
a. The amount of time needed to deploy new infrastructure
b. Maintaining code across different versions
c. Troubleshooting network outages caused by human error
d. Increased time to market on new features, functionality, and capabilities
a, c, and d. A CI/CD pipeline reduces the amount of time needed to deploy new infrastructure, reduces the need to troubleshoot network outages caused by human error, and shortens the time to market for new features, functionality, and capabilities.
5. What are the four steps of a CI/CD pipeline?
a. Source, build, test, deploy
b. Build, test, code, deploy
c. Build, test, deploy, monetize
d. Source, test, build, monetize
a. The proper steps in a CI/CD pipeline are source, build, test, and deploy.
Comparing Automation Capabilities
Now that we have covered all of these automation capabilities, let’s compare them from two perspectives: a network design perspective and a business benefits perspective.
Network Design Elements Required for Automation
Table 1 lists the network design elements that should be followed to properly adopt these automation capabilities within a business.
| Network Design Element | Design Principle | Automation Capability Affected |
|---|---|---|
| Each site, pod, and building block of the network design must be templateable and repeatable. This includes specific subnets, port assignments, device naming, and more. No one-off designs, arbitrary topologies, or snowflake implementations should be leveraged. | Modularity | ZTP, IaC, CI/CD |
| Identify the smallest number of standard models and have the discipline to stick to them. For example, small office (≤10 users), medium office (11–25 users), and large office (26+ users). Thresholds can be business-focused (OPEX/CAPEX) or technology-focused (number of ports, switches, line cards). | Hierarchy of design | IaC, CI/CD |
| Everything in the design should be modular building blocks so that if the need arises to add functionality, it can be easily added. For example, adding a second core block or a new Internet/perimeter block should be an easy addition. This extends down to the role a single device performs, such as a CE-PE device. | Modularity, Hierarchy of design | IaC, CI/CD |
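The "smallest number of standard models" principle in Table 1 can be expressed as a simple classification rule. The sketch below uses the example thresholds from the table (small, medium, and large offices by user count). The function name is hypothetical, and a real design might use port counts or cost thresholds instead.

```python
def classify_site(user_count: int) -> str:
    """Map a site to one of three standard models using the example thresholds."""
    if user_count < 0:
        raise ValueError("user_count cannot be negative")
    if user_count <= 10:
        return "small"
    if user_count <= 25:
        return "medium"
    return "large"


for users in (8, 11, 25, 26):
    print(users, classify_site(users))
```

Because every site resolves to one of a few models, each model needs only one template, which is what keeps ZTP, IaC, and CI/CD manageable at scale.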
Business Benefits of Automation
Table 2 highlights the corresponding business benefit for each automation capability.
| Automation Capability | Business Benefits |
|---|---|
| Zero-Touch Provisioning | Reduced CAPEX/OPEX. Decreased time to deployment. Reduced network outages from human errors. |
| Infrastructure as Code | Reduced CAPEX/OPEX. Reduced network outages with fewer human errors. |
| CI/CD Pipelines | Decreased time to market for new features, functionality, and capabilities. Increased revenue in some instances. Reduced CAPEX/OPEX. Reduced network outages with fewer human errors. Increased force multiplier. |
Summary
This chapter focused on the underlying impact of automation on a business and how a network designer can properly structure a network design for automation. It specifically covered the automation capabilities of zero-touch provisioning, Infrastructure as Code, and CI/CD pipelines, along with the corresponding network design elements around them. After highlighting these capabilities individually, the chapter compared them collectively, first from a network design perspective and then from a business perspective. Automation does not replace the need for a solid network design; it amplifies it. A well-structured, modular, and templated network design is a prerequisite for effective automation.