Byzantine Reality

Searching for Byzantine failures in the world around us

It's Everyone's Fault

…and no one’s. We have an infrastructure that is ridiculously difficult to maintain and even more difficult to make “simple” changes to. The question of whose fault it is that things ended up this way comes up often. My opinion is that it’s all of our faults and, at the same time, none of our faults.

I don’t blame the original developers for their choice of language (OCaml). As I say on my ‘About’ page, I don’t think the choice of language particularly damns us. However, I believe that a very small number of critical mistakes were made that have resulted in the current state of affairs. I was not an employee when the development took place, so I have no insight into the developers’ working conditions or into how they would rebut the following list. I can only speak to what I have seen, and I make this short list knowing it is incomplete and biased. In no particular order:

    The documentation is incomplete and inconsistent. We have placed most of it on a central server, but the way it is organized makes things hard to find. I had to dig through all the folders to find documentation that I either didn’t know existed or hadn’t seen in a long time. Furthermore, documentation is missing for several key infrastructure services, such as our identity management daemon (RMWD).
    The knowledge transfer process was definitely lacking. This is the main part that I believe is everyone’s fault and no one’s. It’s like that feeling you get when you’re supposed to hang out with your friends and they don’t call you that day, but you don’t call them either. Neither the departing developers nor management made knowledge transfer a priority, and none of us knew the gravity of the situation. Then again, how could we have? That’s no excuse, to be sure; it’s just my rebuttal.

I suppose I could go on, but I think that pretty much covers it. Of course, please give me your input on this, especially if you feel differently.