Way back in June 2015 I wrote about my failure detector project. I wanted to implement failure detectors for two reasons: to learn C++ and get a better understanding of an important and fundamental distributed systems topic.
After all these months, all of my work has converged to libab. It’s a small-ish C library that allows you to create a cluster of interconnected nodes and broadcast a sequence of messages to each of them. It sounds simple enough, but there are two important properties:
If you’re thinking that this sounds like atomic broadcast, you’re right. It’s a lot like atomic broadcast, but not entirely. Atomic broadcast requires that each participant receives all messages; libab only requires a majority.
“So, what’s the point?” you may ask. My goal for this project is to implement enough useful primitives so it’s possible to build much more complicated distributed systems primitives on top. Think about it: given the two properties I listed above, could you write something on top of this library that implements atomic broadcast? I think you can!
It’s available on GitHub for you to take a look at: https://github.com/Preetam/libab
I’m planning on writing a proxy that uses libab in front of Redis. Consistent Redis?! Hopefully I’ll announce something like that in the next State of the State!
Writing a simulation would’ve been easy, but I wanted to implement the real thing. That means I needed a peer-to-peer messaging library that would allow me to broadcast messages between failure detector nodes.
I wanted the following features:
After searching for a while, I couldn’t find a library that suited my needs. I ended up writing it myself. It took a long time to get it right (I didn’t start with libuv), but it ended up being really fun to eventually get right.
Finally, a list of things I find interesting: