Jerreck McWilliams .NET Dev in DFW

Feature Freeze Failure

One of the questions I asked during Gene Kim's AMA as part of The Unicorn Project book club this week was this:

Part 2 kicks off with the announcement of the one month feature freeze for Parts Unlimited which we saw in TPP was a huge success. I've read about the successful feature freezes with LinkedIn's Inversion and Microsoft's freeze for SQL Server security updates in the early 00s, but I was wondering if yal knew about some unsuccessful feature freezes and what might have characterized those vs the successful ones?

Part of his response was a link to this Twitter thread where Dan McKinley, a former engineer at Etsy, talks about his experience rewriting part of Etsy's early codebase.

Gene asked me to post my synthesis of this, so I figured it was a good excuse to restart the ol' blog, even if it's just a small post. So here's my response :)

In terms of what makes a feature freeze succeed, I think the difference between the story Dan McKinley tells in that thread and the feature freeze described in The Unicorn Project is that the team in TUP 1) clearly understood and agreed on what their problems were, 2) discussed and decided as a team on the best ways to fix them, and most importantly 3) made plans that actually paid down their technical debt.

All of which is contrary to what the team in Dan's story did. His narrative paints a picture of significant disagreement about the actual problems with their existing architecture, a solution that came out of a political battle, and an outcome that created more technical debt for them in the end.

So I think the important question to ask here is "what could Dan's team have done differently that might've led them down a different path?"

Turns out, Dan offers some advice for us on that front. He later links to "Choose Boring Technology", what appears to be a fairly well-known article of his (it was new to me) where he distills some of the lessons he learned in this process.

There's a lot of good stuff in there, but this excerpt is the most topical (and probably my favorite):

One of the most worthwhile exercises I recommend here is to consider how you would solve your immediate problem without adding anything new. First, posing this question should detect the situation where the “problem” is that someone really wants to use the technology. If that is the case, you should immediately abort.

It can be amazing how far a small set of technology choices can go. The answer to this question in practice is almost never “we can’t do it,” it’s usually just somewhere on the spectrum of “well, we could do it, but it would be too hard” [4]. If you think you can’t accomplish your goals with what you’ve got now, you are probably just not thinking creatively enough.

It’s helpful to write down exactly what it is about the current stack that makes solving the problem prohibitively expensive and difficult. This is related to the previous exercise, but it’s subtly different.

New technology choices might be purely additive (for example: “we don’t have caching yet, so let’s add memcached”). But they might also overlap or replace things you are already using. If that’s the case, you should set clear expectations about migrating old functionality to the new system. The policy should typically be “we’re committed to migrating,” with a proposed timeline. The intention of this step is to keep wreckage at manageable levels, and to avoid proliferating locally-optimal solutions.

This process is not daunting, and it’s not much of a hassle. It’s a handful of questions to fill out as homework, followed by a meeting to talk about it. I think that if a new technology (or a new service to be created on your infrastructure) can pass through this gauntlet unscathed, adding it is fine.

I think these questions could do an excellent job of helping you spot things like the "drop-in replacement fallacy" before it starts.

One last thing I wanted to share was another Twitter thread from Dan along these same lines, where he reminds us how important something as simple as a conversation can be in dev.

Outside of that, I don't have much more to add at the moment. So there you go.