Finding Almost-Invariants in Distributed Systems

Abstract

It is notoriously hard to develop dependable distributed systems. This is partly due to the difficulties in foreseeing various corner cases and failure scenarios while implementing a system that will be deployed over an asynchronous network. In contrast, reasoning about the desired distributed system behavior and the corresponding invariants is easier than reasoning about the code itself. Further, the invariants can be used for testing, theorem proving, and runtime enforcement.

In this paper, we propose an approach to observe the system behavior and automatically infer invariants which reveal implementation bugs. Using our tool, Avenger, we automatically generate a large number of potentially relevant properties, check them within the time and spatial domains using traces of system executions, and filter out all but a few properties before reporting them to the developer. Our key insight in filtering is that a good candidate for an invariant is the one that holds in all but a few cases, i.e., an “almost-invariant”. Our experimental results with the XORP BGP implementation demonstrate Avenger’s ability to identify the almost-invariants that lead the developer to programming errors.

Publication
Proceedings of the 30th IEEE Symposium on Reliable Distributed Systems (SRDS'11)