Byzantine Reality

Searching for Byzantine failures in the world around us

The Technical Side of Being a Good Individual Contributor

I’ve spent the last two and a half years as an individual contributor across many different software projects. Over that time I’ve collected a few patterns that, in general, make my life easier. Some are technical and some are non-technical (e.g., process), but this post focuses on the technical ones. I’ll also stay high level, since these ideas apply across different programming languages and software stacks. I make no claim to originality here; this is really just a brain dump that I wish I could have given to a younger version of myself. And of course, these aren’t hard and fast rules that must be obeyed, just heuristics. If they don’t work for you, don’t use them!

  1. The development process must be as simple and fast as possible. Since you’re spending most of your time in development, this needs to be well-optimized. This statement can go lots of ways, but what I really mean here is that testing whether my change does what it needs to do must be simple and fast. I’ve used stacks that allow me to write unit tests and verify that code does the right thing in seconds, stacks that require an app be pushed to a mobile device and manually verified in ~5 minutes, and stacks that require multiple binaries be built and pushed to a server in 30-40 minutes. Guess which one of these projects I actually enjoy working on? And as a follow-up, guess which one of these projects I’m happy to volunteer to fix bugs on and which I dread? For anything that’s measured in minutes, I pay an even larger cost than the wait itself, because now I have to context switch to something else, work on that for a while, and periodically check in and remember what I was doing on that build that’s taking forever. Anything that can be done to shave time off these projects must be given high priority, since the return is immediately felt by everyone on the team.

  2. If multiple team members don’t understand why your code does something in a particular way, something has gone wrong. I’m the sole maintainer of some of the libraries and servers I write, but that doesn’t give me license to write things in an obscure fashion. Somebody who is not me will eventually have to maintain this, so it should be written in the simplest, cleanest way that works. This is a sentiment usually credited to Mark Twain (though it appears to go back to Blaise Pascal), long ago and in a very different context:

    I didn’t have time to write a short letter, so I wrote a long one instead.

    This isn’t to say that everybody must immediately look at your code and understand what it does and why it does it that way (although that is the goal!). You could be using complex abstractions, or the thing you’re working on could employ counterintuitive business logic. But to fall back on an old classic:

    If you run into one of these situations, what do you do? You’ve got a few options, depending on which side you’re on (that is, did you write the confusing code, or are you trying to tell somebody else that they wrote the confusing code?). If you wrote it, you’re probably getting this feedback in a code review, or maybe a new person joined the project and doesn’t understand why this is done this way. If you actually think the pattern is confusing, ask them (or ask the team via e-mail/IRC/in person) what a better way to do this would be. If you don’t think it’s confusing, explain what problems it avoids, compared to how they would propose doing it (or compared to other systems that don’t do this). To give you a concrete example, we’ve integrated with Java libraries that only expose static methods and don’t have constructors, which makes them hard to unit test. So one route we explored was writing a class that wraps the library, exposing a constructor and non-static methods, and testing against that instead. It added a level of indirection but made our code actually testable. An alternative on Android (where this problem came up) is to use Robolectric, which lets you write a fake that gets installed for static methods/final classes that Mockito can’t normally mock (but I prefer the first option).
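
    The wrapper route from that example might look roughly like the sketch below. All of the names here (`StaticGeoLibrary`, `GeoClient`, `ShippingLabel`, `FakeGeoClient`) are made up for illustration and are not the actual library or wrapper from the project; treat it as one possible shape, not the definitive one.

    ```java
    // Hypothetical stand-in for a third-party library that only exposes static methods.
    final class StaticGeoLibrary {
        private StaticGeoLibrary() {} // no usable constructor

        static String lookupCity(String postalCode) {
            // Imagine a network or native call here that a unit test cannot make.
            throw new UnsupportedOperationException("not reachable from unit tests");
        }
    }

    // Thin wrapper: a constructor plus non-static methods that delegate to the statics.
    class GeoClient {
        GeoClient() {}

        String lookupCity(String postalCode) {
            return StaticGeoLibrary.lookupCity(postalCode);
        }
    }

    // Application code depends on a wrapper instance, not on the static class.
    class ShippingLabel {
        private final GeoClient geo;

        ShippingLabel(GeoClient geo) {
            this.geo = geo;
        }

        String format(String name, String postalCode) {
            return name + ", " + geo.lookupCity(postalCode) + " " + postalCode;
        }
    }

    // Test double: overrides the instance method, so no static mocking is needed.
    class FakeGeoClient extends GeoClient {
        @Override
        String lookupCity(String postalCode) {
            return "Springfield";
        }
    }
    ```

    In a test, `new ShippingLabel(new FakeGeoClient()).format("Ada", "12345")` returns `Ada, Springfield 12345` without ever touching the static library. The cost is the extra level of indirection mentioned above.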

  3. Favor breadth over depth. This is advice that I got when I was interviewing at Google, and it has served me well in my time there. You tend to find two groups of engineers: ones who know one and exactly one thing, very, very well, and ones who know an acceptable amount across many things. Certainly the organization needs both types of engineers, and an engineer who knows how a set of legal requirements is implemented in code is worth their weight in gold. But it also means that they’re stuck on that project until they decide to leave the team, because they’re simply too valuable there compared to elsewhere (and because they tend to go deep on that one skill and not pick up others). Similarly, if you only ever take on projects that touch the XYZ server, you perpetuate a cycle in which you’ll only get feature requests and bugs on the XYZ server. But if you can work on mobile clients and servers and also do productionization (e.g., monitoring, alerting, load testing), you’ll be pulled into all those things and have more impact (sorry for the scary manager word there). This last point is enough to break out into its own bullet point:

  4. Learn how to productionize your systems. I’ve seen lots of engineers who dread this part of the process and like to kick this work to other people. Working on monitoring, alerting, capacity planning, etc. isn’t seen as being as sexy as getting to JUST WRITE CODE. But guess what, you’re gonna be on-call for this eventually and you need to know (1) why your service is suddenly throwing errors at 3am and (2) what you should do about it. Even if you’re not on an on-call rotation, pretend like you are, because a user will file a bug and it won’t have the one piece of data you need to debug the problem, and then you’ll go down a path of rampant speculation about what the problem could be with the next-to-zero information that you have. Also something obvious to state: it might not even be your service that you get paged for. If I’m getting paged because the XYZ service suddenly has elevated error rates, but that service returns HTTP 200 OK even for errors and doesn’t log them, I’m going to be a salty sea snail with whoever owns it in the morning. Find the standards your company puts in place, and either follow them, or challenge them and make them better for everyone. Also, this is kind of trite, but don’t think of productionization as a separate thing that you just staple onto the end of the product. Like security, privacy, etc., keep it in mind as you go. If you’re writing some code and you’re making a ton of serial calls to other services, you already know you’re likely to incur some serious latency that you can probably do something about. Not always, but probably, and it should at least raise a red flag. The last thing you want is a manager person coming to you saying “why the hell is this thing so slow” (god forbid it’s an actual user) after you rushed the implementation to make them happy, and now you’re paying for it in technical debt.
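
    As a rough sketch of that serial-call red flag: when two downstream calls don’t depend on each other, they can be started together, so the total wait is closer to the slower call than to the sum of both. `ProfilePage` and its two suppliers are hypothetical stand-ins for real remote calls, and `CompletableFuture.supplyAsync` runs them on the JDK’s default executor, which may not be the right choice in a real server.

    ```java
    import java.util.concurrent.CompletableFuture;
    import java.util.function.Supplier;

    class ProfilePage {
        // Each Supplier stands in for a (hypothetical) call to another service.
        private final Supplier<String> fetchUser;
        private final Supplier<String> fetchOrders;

        ProfilePage(Supplier<String> fetchUser, Supplier<String> fetchOrders) {
            this.fetchUser = fetchUser;
            this.fetchOrders = fetchOrders;
        }

        // Serial: the second call only starts after the first one has returned,
        // so latency is roughly the sum of both calls.
        String renderSerial() {
            String user = fetchUser.get();
            String orders = fetchOrders.get();
            return user + " | " + orders;
        }

        // Concurrent: both calls are started before waiting on either result,
        // so latency is roughly that of the slower call.
        String renderConcurrent() {
            CompletableFuture<String> user = CompletableFuture.supplyAsync(fetchUser);
            CompletableFuture<String> orders = CompletableFuture.supplyAsync(fetchOrders);
            return user.thenCombine(orders, (u, o) -> u + " | " + o).join();
        }
    }
    ```

    Both methods return the same string; only the waiting pattern differs. This only helps when the calls really are independent, which is the “not always, but probably” part.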

  5. Document what choices you’ve made in the design of your system, the alternatives, and why you rejected them. I’ve joined projects where we had no idea why the original author of the code chose to do something that looked obviously inefficient, and we’ve had to deduce that it was (1) because of a limitation of a system it depended on at the time, which has since been remedied but our side never got updated, (2) because they had to decide on something fast and didn’t think it through, or (3) because it was actually correct, just not at all intuitive why. I don’t just mean “document your code”, although this is obviously something you should do. What I mean is: you should have a design doc for your service that explains what use cases your service was built to solve, what it was not built to solve, sequence diagrams showing what order you expect calls to be made in, and so on. If a problem requires your full brain power to pull everything into context and solve, write that all down in a doc or diagram. Even if you’re the only one maintaining this server (ESPECIALLY if you’re the only one maintaining this server), you’re eventually going to forget the subtle intricacies of why the XYZ server accepts auth tokens with T lifetime and not T-1 lifetime. Save yourself the trouble of having to recalculate all this stuff and dig up all that context by writing it all down.

Next time we’ll run over some thoughts on how to optimize your process (non-technical stuff) so you can be a good individual contributor. Hopefully you get value out of the above – enjoy!