Ground Control to Major Faults: Towards a Fault Tolerant and Adaptive SDN Control Network


To provide high availability and fault tolerance, SDN control planes should be distributed. However, distributed control planes are challenging to design and bootstrap, especially when this must be done in-band, without a dedicated control network, and without relying on legacy protocols. This paper promotes a distributed systems approach to building and maintaining connectivity between a distributed control plane and the data plane. In particular, we make the case for a self-stabilizing distributed control plane in which, starting from any initial configuration, controllers self-organize and quickly establish a communication channel among themselves. Arbitrary network services can then be implemented on top of the resulting managed control plane.
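To make the notion of self-stabilization concrete, the following toy sketch shows a classic textbook rule (a self-stabilizing hop-count computation), not the algorithm proposed in this paper. It assumes a connected topology, synchronous rounds, and non-negative initial values; the names `GRAPH`, `CONTROLLERS`, `stabilize_round`, and `stabilize` are hypothetical. Every node repeatedly recomputes its distance to the nearest controller from its neighbors' values only, so the same correct state is reached no matter how the initial state was corrupted.

```python
# Toy illustration only: a textbook self-stabilizing hop-count rule,
# NOT the paper's protocol. Topology and names are made up for the example.

# Hypothetical topology: two controllers (c1, c2) attached to a line of switches.
GRAPH = {
    "c1": ["s1"],
    "s1": ["c1", "s2"],
    "s2": ["s1", "s3"],
    "s3": ["s2", "s4"],
    "s4": ["s3", "c2"],
    "c2": ["s4"],
}
CONTROLLERS = {"c1", "c2"}


def stabilize_round(graph, controllers, dist):
    """One synchronous round: each node uses only its neighbors' old values."""
    new = {}
    for node, neighbors in graph.items():
        if node in controllers:
            new[node] = 0  # a controller is at distance 0 from itself
        else:
            new[node] = 1 + min(dist[n] for n in neighbors)
    return new


def stabilize(graph, controllers, dist, max_rounds=1000):
    """Repeat rounds until nothing changes; return (state, rounds that changed state)."""
    for rounds in range(max_rounds):
        new = stabilize_round(graph, controllers, dist)
        if new == dist:
            return dist, rounds
        dist = new
    raise RuntimeError("no fixed point reached within max_rounds")


# Start from an arbitrary (corrupted) state; the rule repairs it.
corrupted = {"c1": 7, "s1": 0, "s2": 0, "s3": 9, "s4": 3, "c2": 5}
stable, rounds = stabilize(GRAPH, CONTROLLERS, corrupted)
print(stable, "after", rounds, "rounds")
```

The same loop also handles membership changes in this toy model: calling `stabilize` again with a smaller or larger controller set, starting from the old state, converges to the distances for the new set.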

This paper presents a model for the design of such self-stabilizing control planes and identifies fundamental challenges. We then present techniques that address these challenges, and implement a plug & play distributed control plane that supports automatic topology discovery and management as well as flexible controller membership: controllers can be added and removed dynamically. We argue that our approach can readily be implemented using today’s OpenFlow protocol. Moreover, it comes with interesting security features.

Proceedings of the 2nd Workshop on Dependability Issues on SDN and NFV (DISN'16)