I recently taught a Cisco CCNP course for my co-workers. During one of the sessions we had a big discussion on what a "collapsed core" network design is, and whether we were using the term correctly. This post is an adaptation of the notes I made for my students following that discussion, based on the official Cisco Validated Design (CVD) for Campus LAN.
Switch Block design
The basic element in a campus network design is the switch block. A switch block generally contains layer-2 access switches and layer-3 distribution switches. VLANs (and thus broadcast domains) are always restricted to a single switch block, with the layer-2 boundary at the distribution level. Cisco recommends restricting a VLAN to a single switch or switch stack (local VLANs), but in practice it's often necessary to make your VLANs available throughout the switch block (End-to-End VLANs).
In the above picture I have depicted a switch block design. In this design the access switches are stacked for ease of management, but the VLANs are stretched over multiple access stacks (layer-2 in the picture). This means that we have loops in the network, and need Spanning Tree to block the redundant links.
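To put a number on those blocked links, here is a minimal Python sketch. It relies on one general fact: a loop-free tree over V switches uses exactly V - 1 links, so every additional link is redundant. The topology in `switch_block_blocked_links` is my own assumption for illustration (two distribution switches joined by a single layer-2 trunk, every access stack dual-homed to both), and a port-channel counts as one logical link.

```python
def blocked_links(num_switches: int, num_links: int) -> int:
    """Links that spanning tree must block in a connected layer-2 topology.

    A loop-free tree over V switches uses exactly V - 1 links, so every
    link beyond that is redundant and ends up blocked.
    """
    if num_switches < 1:
        raise ValueError("need at least one switch")
    if num_links < num_switches - 1:
        raise ValueError("too few links for a connected topology")
    return num_links - (num_switches - 1)


def switch_block_blocked_links(access_stacks: int) -> int:
    """Hypothetical switch block (an assumption, not taken from the CVD):
    two distribution switches joined by one layer-2 trunk, with each
    access stack dual-homed to both distribution switches."""
    switches = access_stacks + 2      # access stacks + 2 distribution switches
    links = 2 * access_stacks + 1     # two uplinks per stack + the trunk
    return blocked_links(switches, links)


for stacks in (1, 2, 4):
    print(stacks, "access stacks ->", switch_block_blocked_links(stacks), "blocked links")
```

Under these assumptions, one uplink per access stack ends up blocked, so half of the uplink capacity sits idle for any given spanning-tree instance.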
It's pretty common to stack the distribution switches too, in order to eliminate the loops and blocked links in the switch block. This has the disadvantage that you cannot do maintenance on the distribution switches without disrupting the entire switch block, so that might not be a good trade-off for your design. I personally prefer a well-designed MSTP implementation over having a single point of failure in my network. If it's possible, a better solution is to try to limit VLANs to a single switch stack and route all the uplinks.
For a lot of networks, particularly in Europe and/or the smaller enterprise scale, this is all you need for your campus network. I've seen this type of design called "collapsed core" in documentation. This is only half true, and might cause a little confusion when you're doing CCNP/CCDP studies. To clarify this, I first need to explain what Cisco considers to be a "proper" campus network.
According to Cisco, a regular campus network has three layers: core, distribution, and access. I already mentioned the distribution and access layers; together they form a switch block. The core layer is a number of switches (or routers!) that are used to interconnect different switch blocks.
The core and distribution layers connect through layer-3 links only; it is not possible (or desirable, for that matter) to stretch VLANs over multiple switch blocks.
So, when would you need more than one switch block? It's fairly common to see one switch block per location when you have a campus with multiple big buildings. You can also have a separate switch block for the in-house datacenter ("MER"), or a switch block for external connections ("WAN"). In fact, any time you want to limit the impact of an outage, it's smart to work with switch blocks; keep the blast radius as small as possible.
Collapsing the core
A core is called collapsed when you move the role of the core switches to the distribution switches, merging the core and distribution layers. The role we're talking about is interconnecting different switch blocks. We do this by directly connecting the distribution switches to each other, instead of through a core switch:
This saves you some expensive core switches, but it does require you to use more interconnect links: for N distribution switches you would need N(N-1)/2 links, compared to 2N+1 links for a three-tier network. This starts to get really messy when you have more than ~3 switch blocks.
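The two formulas are easy to compare with a small Python sketch. The function names are mine, and the assumption of two distribution switches per switch block is for illustration only; the 2N+1 figure is taken as stated above, which matches N switches each dual-homed to two core switches plus one link between the cores.

```python
def full_mesh_links(n: int) -> int:
    """Links needed to connect n distribution switches in a full mesh."""
    return n * (n - 1) // 2


def three_tier_links(n: int) -> int:
    """Links in the three-tier design, using the 2N+1 figure from the text."""
    return 2 * n + 1


DIST_PER_BLOCK = 2  # assumption: a redundant pair of distribution switches per block

for blocks in range(1, 6):
    n = blocks * DIST_PER_BLOCK
    print(
        f"{blocks} switch block(s), N={n}: "
        f"collapsed core {full_mesh_links(n)} links, "
        f"three-tier {three_tier_links(n)} links"
    )
```

With two distribution switches per block, the full mesh is cheaper up to two switch blocks (6 links versus 9), overtakes the three-tier design at three blocks (15 versus 13), and grows quadratically from there.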
Is this the best design?
All this is pretty much what we've been doing for at least a decade. So are there any better ways of doing things? The answer, obviously, is "it depends". If you want to do some research on other options, check these resources:
The increasing reliance on wireless access and the emergence of 4G and 5G will certainly have an impact on the way you design your campus network. The best advice I can give is probably to design something that can be adapted to suit a changing use of the network, rather than something that is meant to last a decade in an unaltered state. As usual, that is easier said than done.
As an aside, this is also what's happening in the move from chassis switches to leaf/spine architecture: replacing a big expensive chassis with less expensive 1U switches. The chassis backplane is equivalent to the core, which is collapsed to the ECMP spine at a cost of more interconnecting links.