Break scalability barriers in OpenFlow SDN

New support for Table Type Patterns in OpenFlow lifts the ceiling on large deployments and enables interoperability across devices

Over the past couple of years, software-defined networking (SDN) has emerged as a strong alternative to traditional networking approaches in the areas of WAN, data center networks, and network overlay solutions. The primary benefit realized from SDN, besides open networking, is the ability to accelerate service deployments. SDN solutions using OpenFlow tackle complex problems, including dynamic provisioning, interconnection, and fault management. Although the functionality of SDN has evolved and matured, the scalability of SDNs based on OpenFlow has been limited by OpenFlow’s ties to ternary content-addressable memory (TCAM) -- the hardware resource in which OpenFlow was originally designed to be implemented.

The problem is that earlier switch ASICs like Broadcom’s Trident+ had relatively small TCAMs, and early versions of OpenFlow (1.0 through 1.2) could use only TCAM for memory. This limited OpenFlow scalability to somewhere between 1,000 and 2,000 flows. But today, switch ASIC manufacturers are building larger TCAMs, and OpenFlow developers have come up with ways to tap the other memory resources on the ASIC to support more flows.

Flow requirements

Typical data center network infrastructure consists of top-of-rack switches, core switches, and edge switches. Each of these hierarchies needs a different level of flow scalability (Figure 1).


Figure 1: Flow scale requirements in data center networks.

Given the requirements shown in Figure 1, it’s easy to understand the reluctance to implement early versions of OpenFlow. Essentially, these limits stemmed from OpenFlow’s reliance on TCAM for storing flow forwarding information.

The trouble with TCAMs

TCAMs are special memory devices that power most of today’s intelligent networks. Unlike conventional memory, which requires exact binary matches, a TCAM can match on masked bit values -- wildcards. This greatly enhances the usability of TCAM for network applications. In large part, TCAMs made the SDN concept possible. By supporting a policy-based forwarding model with wild-card matching, TCAMs paved the way to a multitude of network applications that enabled exception-based, suboptimal, time-of-day, cost-optimized, and even custom forwarding models. Using the TCAM in network devices as the primary resource for this unique functionality, OpenFlow greatly augmented traditional routing protocols to accommodate core business requirements.
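To make the ternary idea concrete, here is a minimal Python sketch of how a TCAM entry matches a header against a value/mask pair, with the highest-priority match winning. This is a software model for illustration only; a real TCAM performs the comparison across all entries in parallel, in hardware.

```python
# Minimal software model of TCAM ternary matching (illustrative only;
# a real TCAM compares all entries in parallel in hardware).

class TcamEntry:
    def __init__(self, value, mask, priority, action):
        self.value = value        # bits that must match where mask bit = 1
        self.mask = mask          # 1 = care, 0 = don't care (wildcard)
        self.priority = priority
        self.action = action

    def matches(self, key):
        # Only the "care" bits are compared; masked-out bits are wildcards.
        return (key & self.mask) == (self.value & self.mask)

def tcam_lookup(entries, key):
    # Highest-priority matching entry wins -- the priority search model
    # discussed below.
    for entry in sorted(entries, key=lambda e: -e.priority):
        if entry.matches(key):
            return entry.action
    return "default: send to controller"

# Example: match any 32-bit IPv4 destination in 10.0.0.0/8.
entries = [
    TcamEntry(value=0x0A000000, mask=0xFF000000, priority=10,
              action="forward to port 1"),
]
print(tcam_lookup(entries, 0x0A010203))  # 10.1.2.3 -> forward to port 1
print(tcam_lookup(entries, 0xC0A80101))  # 192.168.1.1 -> default
```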

However, as OpenFlow expanded in scope and capabilities (including control plane disaggregation and open networking), the scaling of OpenFlow tables became a major concern and a limiting factor. TCAMs are expensive and power-hungry, and they bloat the cost of the devices that use them. While embedded TCAMs offer a good price point at lower power consumption, they still don’t scale well, and they restrict the search to a priority model. This search limitation is particularly important because, for IP lookups, longest-prefix match and related tree-based algorithms are more effective at scale. DRAMs and SRAMs have long been used for these applications because they cost less and consume less power than TCAM.
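By contrast, IP lookups map naturally onto tree structures held in ordinary SRAM or DRAM. The following is a minimal, illustrative Python sketch of a binary-trie longest-prefix match -- not a production algorithm, just the shape of the approach:

```python
# Minimal binary trie for IPv4 longest-prefix match (illustrative).
# Tree-based lookups like this fit ordinary SRAM/DRAM, unlike TCAM.

class TrieNode:
    def __init__(self):
        self.children = [None, None]
        self.next_hop = None  # set if a prefix ends at this node

class LpmTable:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix, length, next_hop):
        node = self.root
        for i in range(length):
            bit = (prefix >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, addr):
        node, best = self.root, None
        for i in range(32):
            if node.next_hop is not None:
                best = node.next_hop  # remember the longest match so far
            bit = (addr >> (31 - i)) & 1
            if node.children[bit] is None:
                break
            node = node.children[bit]
        else:
            if node.next_hop is not None:
                best = node.next_hop
        return best

table = LpmTable()
table.insert(0x0A000000, 8, "port 1")   # 10.0.0.0/8
table.insert(0x0A010000, 16, "port 2")  # 10.1.0.0/16 (more specific)
print(table.lookup(0x0A010203))  # 10.1.2.3 -> port 2
print(table.lookup(0x0A020203))  # 10.2.2.3 -> port 1
```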

While OpenFlow started out by exploiting TCAM capabilities, it was not optimized for simple forwarding applications such as a typical routing or Layer 2 pipeline. To achieve flow scale in OpenFlow, it is imperative to distribute flows across the memory resources available on the ASIC so that each resource is used optimally. Earlier versions of OpenFlow, from 1.0 through 1.2, relied on single-table implementations -- that is, using TCAMs only for flows. In practice, this meant a software layer would interpret the provisioned flows and compress them into one or more flow entries in hardware.

Table Type Patterns to the rescue

Table Type Patterns (TTP) address TCAM limitations by enabling OpenFlow to access other ASIC tables -- such as the VLAN, MAC, and IP tables -- along with TCAM tables. As the network operating system opens up the different tables in the ASIC, an OpenFlow application can control the population of these tables directly in a normalized way. This means a TTP remains consistent across ASIC architectures, so the SDN application can scale across ASIC architectures without any modifications. For example, a fixed pipeline for IP routing, policy routing, or an MPLS flow could be made consistent across any ASIC implementation.
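Conceptually, a TTP is a machine-readable contract describing the table pipeline an application expects; the ONF publishes the actual schema as JSON. The Python fragment below is a heavily simplified illustration of the idea -- the field names and the mapping helper are our own invention, not the official TTP schema:

```python
# Heavily simplified sketch of what a Table Type Pattern conveys: a named
# pipeline of typed tables that the switch agent maps onto the ASIC's
# VLAN, MAC, IP, and TCAM resources. Field names are invented for
# illustration; the real ONF TTP JSON schema is far richer.

L2_L3_PIPELINE = {
    "name": "simple-l2-l3-acl",
    "tables": [
        {"id": 0, "name": "vlan",   "match": ["in_port", "vlan_vid"]},
        {"id": 1, "name": "mac",    "match": ["eth_dst"]},
        {"id": 2, "name": "ip-lpm", "match": ["ipv4_dst"]},  # LPM table
        {"id": 3, "name": "acl",    "match": ["*"]},         # TCAM
    ],
}

def table_for_match(pipeline, fields):
    """Pick the first hardware table that can hold a flow with the given
    match fields, falling back to the wildcard-capable TCAM table."""
    for table in pipeline["tables"]:
        if table["match"] == ["*"] or set(fields) <= set(table["match"]):
            return table
    raise ValueError("no table supports this match")

print(table_for_match(L2_L3_PIPELINE, ["eth_dst"])["name"])              # mac
print(table_for_match(L2_L3_PIPELINE, ["ipv4_dst", "tcp_dst"])["name"])  # acl
```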

OpenFlow 1.3 introduced multitable support and methods to access and populate tables other than TCAM, including the VLAN, MAC, IP, and MPLS tables. This brought the ability to push flows to the appropriate hardware tables depending on the headers used to define the forwarding path. OpenFlow 1.4 improves the management of flows across multiple tables with sophisticated methods such as bundle messages, eviction and vacancy events, and synchronized tables, among others.
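To show what multitable programming looks like in practice, here is a minimal sketch using the open source Ryu controller framework and OpenFlow 1.3. The table IDs, addresses, and ports are arbitrary examples; the point is the goto-table instruction, which chains a MAC-table lookup into an IP-table lookup instead of burning a TCAM slot for everything:

```python
# Minimal OpenFlow 1.3 multitable sketch using the Ryu controller
# framework. Table IDs, addresses, and ports are arbitrary examples.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

MAC_TABLE, IP_TABLE = 0, 1

class MultiTableApp(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser

        # Table 0: known destination MAC -> continue to the IP table,
        # letting the switch agent place this entry in the MAC table.
        match = parser.OFPMatch(eth_dst='00:11:22:33:44:55')
        inst = [parser.OFPInstructionGotoTable(IP_TABLE)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, table_id=MAC_TABLE,
                                      priority=100, match=match,
                                      instructions=inst))

        # Table 1: masked IPv4 destination match -> output port, a
        # candidate for the ASIC's LPM-capable IP table.
        match = parser.OFPMatch(eth_type=0x0800,
                                ipv4_dst=('10.1.0.0', '255.255.0.0'))
        actions = [parser.OFPActionOutput(2)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS,
                                             actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, table_id=IP_TABLE,
                                      priority=100, match=match,
                                      instructions=inst))
```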

Because the network operating system normalizes access to the ASIC hardware tables, OpenFlow 1.4 allows organizations to build SDN applications that operate across vendor platforms. With TTP, network engineers and operators can now implement SDN at far greater scale -- in some cases up to 2 million flows, a thousand-fold increase with newer ASICs -- while still using standard, white-box hardware.

On the development curve, software always leads hardware because software can be developed more quickly and with a smaller investment. But ASIC vendors are catching up to OpenFlow 1.4 with new products like Broadcom’s Tomahawk and Cavium’s XPliant switch ASICs. These support 256,000 and 2 million flows, respectively (Figure 2).  


Figure 2: Flow support in different generations of ASICs.

Applications for scalable OpenFlow

How do network operators leverage OpenFlow’s scalability? Here are a couple of examples. For ISPs, automation and self-service portals are nirvana for the reduction in opex alone. Suppose a customer wants to increase bandwidth from 10Mbps to 100Gbps, but only from 8 a.m. to 5 p.m., and with a firewall filter and a QoS policy applied. That would be hard to do quickly with standard network provisioning tools and protocols.

ISPs are therefore looking at OpenFlow to achieve this level of automation and granular control. The network uses Layer 2 and Layer 3 protocols as the baseline transport, while OpenFlow rules are used to define the exception-based forwarding that users want. When considering the requirements of multitenancy, dynamic VLANs, virtualized services, and scale, it’s easy to see why scaling the number of OpenFlow rules would be important in this scenario.
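As a sketch of what one such exception rule might look like, the Ryu-based fragment below rate-limits a customer’s traffic with an OpenFlow 1.3 meter. The subnet, rate, and port are invented examples, and the time-of-day scheduling would live in the controller application, which adds and removes the flow on a schedule:

```python
# Sketch of an "exception" rule: rate-limit one customer's traffic with
# an OpenFlow 1.3 meter (Ryu framework). The subnet, rate, and port are
# invented examples; a real app would install/remove the flow on a
# schedule (e.g., 8 a.m. to 5 p.m.) from the controller side.

def install_rate_limited_flow(dp):
    ofp, parser = dp.ofproto, dp.ofproto_parser

    # Meter 1: drop traffic beyond 100,000 kbps (100 Mbps).
    meter = parser.OFPMeterMod(dp, command=ofp.OFPMC_ADD,
                               flags=ofp.OFPMF_KBPS, meter_id=1,
                               bands=[parser.OFPMeterBandDrop(rate=100000)])
    dp.send_msg(meter)

    # Flow: customer subnet -> run through meter 1, then forward.
    match = parser.OFPMatch(eth_type=0x0800,
                            ipv4_src=('192.0.2.0', '255.255.255.0'))
    inst = [parser.OFPInstructionMeter(1),
            parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS,
                                         [parser.OFPActionOutput(3)])]
    dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=200,
                                  match=match, instructions=inst))
```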

Handling elephant and mice flows in the data center is another well-known problem for large enterprises and service providers. Data center networks have typically standardized on some variant of spine-and-leaf architecture, which makes perfect sense when it comes to addressing the needs of east-west traffic and ensuring the ability to quickly add scale.

However, problems can still arise when it comes to handling flows of different sizes and prioritizing when bandwidth is at a premium. The beauty of using OpenFlow in these instances: It does not disrupt what is already working with the spine-and-leaf architecture. Whether it’s MLAG, BGP, or any other standard protocol, network operators have been able to strategically stitch in OpenFlow rules to handle these elephant flows as special cases. Considering the numbers of virtual machines and workloads in data center racks today, it’s easy to see why the ability to increase the number of OpenFlow rules is important.
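For instance, a controller that has detected an elephant flow might pin it to an uncongested uplink with a single high-priority rule layered over the existing fabric. The sketch below (Ryu, OpenFlow 1.3) uses an invented 5-tuple and port:

```python
# Sketch: pin a detected elephant flow to a specific uplink with one
# high-priority OpenFlow rule, leaving the underlying MLAG/BGP fabric
# untouched (Ryu framework; the 5-tuple and port are invented examples).

def pin_elephant_flow(dp, src_ip, dst_ip, dst_port, out_port):
    ofp, parser = dp.ofproto, dp.ofproto_parser
    match = parser.OFPMatch(eth_type=0x0800, ip_proto=6,  # TCP over IPv4
                            ipv4_src=src_ip, ipv4_dst=dst_ip,
                            tcp_dst=dst_port)
    inst = [parser.OFPInstructionActions(
        ofp.OFPIT_APPLY_ACTIONS, [parser.OFPActionOutput(out_port)])]
    # idle_timeout lets the exception rule age out when the transfer ends.
    dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=1000,
                                  idle_timeout=60, match=match,
                                  instructions=inst))

# Example: steer a bulk replication flow to uplink port 48.
# pin_elephant_flow(datapath, '10.1.1.10', '10.2.2.20', 445, 48)
```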

For years, legacy network equipment makers have derided OpenFlow by saying it doesn’t scale to handle major networking tasks. But with advances in software and switch ASICs, the open networking ecosystem is enabling network operators to scale OpenFlow deployments to any size needed, and to move forward with new, tailored services.

Sudhir Modali leads the SDN strategy at Pica8. Prior to joining Pica8, Sudhir spent 20 years at Cisco in various capacities to bring new products and solutions to market. He holds a Bachelor’s degree in Industrial Electronics from Shivaji University, India.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

