It’s not a feature, it’s a bug

Today, feature flags are ubiquitous in software development. While they supercharge your development process, poorly managed flags could put you out of business!

Feature flags, also known as feature toggles (or gates), are the modern way to (1) experiment (A/B test), (2) create tailored end-user experiences, (3) safely roll out features, and (4) mitigate issues in production. The latter two are especially important in mobile apps, where shipping new binaries to user devices takes too long.

Feature flagging is now widespread among SaaS companies. Many leading high-tech enterprises, such as Uber and Spotify, have been using homegrown feature flag solutions for many years. A number of third-party vendors have since emerged that provide feature flagging as a service, accelerating adoption.

Feature flags allow teams to turn features on or off without deploying new code. They do this by letting you inject dynamic configuration values into your program. These values are evaluated and updated at runtime. Feature flag services combine an SDK that provides an ergonomic API with a managed, scalable backend that serves values from a database. In code, flag checks typically appear as the predicates of if statements:

isNewForm, err := client.BoolVariation("new_form", context, false)
if err != nil {
  log.Println(err) // on error, isNewForm holds the default value (false)
}
if isNewForm {
  log.Println("Feature flag `new_form` is enabled.")
}
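To make the "evaluated and updated at runtime" part concrete, here is a minimal sketch of the idea using an in-memory store. FlagStore, NewFlagStore and Set are hypothetical names invented for this illustration; they are not part of any vendor SDK, and a real service would fetch values from its backend rather than from a local map.

```go
package main

import (
	"fmt"
	"sync"
)

// FlagStore is a hypothetical in-memory stand-in for a flag service SDK.
type FlagStore struct {
	mu     sync.RWMutex
	values map[string]bool
}

// NewFlagStore returns an empty store; every flag starts out unknown.
func NewFlagStore() *FlagStore {
	return &FlagStore{values: make(map[string]bool)}
}

// Set simulates the backend pushing a new flag value at runtime.
func (s *FlagStore) Set(key string, value bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.values[key] = value
}

// BoolVariation returns the current value of key, or def if the flag is unknown.
func (s *FlagStore) BoolVariation(key string, def bool) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if v, ok := s.values[key]; ok {
		return v
	}
	return def
}

func main() {
	store := NewFlagStore()
	// Unknown flag: the caller-supplied default is returned.
	fmt.Println(store.BoolVariation("new_form", false))
	// The flag is flipped while the program is running, with no redeploy.
	store.Set("new_form", true)
	fmt.Println(store.BoolVariation("new_form", false))
}
```

The same call site returns a different value before and after Set, which is exactly what makes flags powerful and what makes forgotten ones risky.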

Feature flags, however, have a downside: tech debt due to stale feature flags. Over time, flags become stale after they have been fully rolled out or rolled back. Stale flags lead to unnecessary checks and dead code paths, thus polluting the codebase. These code paths increase complexity, overload developers cognitively when debugging an outage, explode the space of configurations to test, and reduce the reliability of code coverage metrics. In addition, downloading stale flags over the air slows down mobile clients. Unless stale flags are cleaned up regularly, tech debt quickly accumulates.

Beyond the maintenance burden, stale flags can put you out of business or cause your rocket to veer off course within a minute of takeoff. Configuration errors (due to misconfigurations or bugs in the feature flagging service) can light up code paths that haven’t been exercised or tested for years (or ever), resulting in user-facing incidents.

Knight Capital Group was a big player on Wall Street until it lost 460 million dollars because an old code path under a stale feature flag was accidentally exercised. In another unfortunate case, Ariane flight V88 crashed on its maiden voyage because a piece of dead code was executed. This piece was left over from the previous Ariane 4 program, which started nearly a decade before.

Unsurprisingly (or surprisingly), developers do not clean up stale flags consistently.

Today’s feature flagging services do not go far enough in automating stale feature flag removal. At best, they punt the problem to the developers and increase grunt work by surfacing these stale flags and creating tickets for cleaning them up. In practice, teams manually resolve these tickets via dedicated sprints, fix-it weeks, company-wide quality initiatives or, worse, Code Yellows. Given the constant stream of feature flags being added to the codebase, this imposes a continuous maintenance overhead on the development process.

Making matters worse, removing these stale flags is not straightforward. To clean up stale flags properly, a developer has to not only find all usages of stale flags but also propagate constants, fold expressions, inline functions, eliminate dead and unreachable code, and update tests. In addition, they have to create a PR, get it reviewed and then land it. Before they delete the flag from the database, they have to release the cleanups to production and, in the case of mobile apps, wait until all users have updated their apps.

The following example illustrates the complexity of cleaning up flags manually. The diff below shows a case where only the code related to new_form is retained after the feature was rolled out to all customers. After the cleanup, the function reduces to two statements. Here, the call ldClient.BoolVariation("new_form", context, false) is replaced with true (assuming the flag is fully rolled out) and the code is then simplified. Specifically, the variables isNew and err are inlined, which eliminates the first if statement and reduces the second to its then branch. All the code that consequently becomes unreachable, including the call to layout, is removed, along with the now-unused layout function.

- import ld "github.com/launchdarkly/go-server-sdk/v7"

func GetLayout(ldClient LDClient, color string) string {
-  context := ldcontext.NewBuilder("feedback-forms").Build()
-  isNew, err := ldClient.BoolVariation("new_form", context, false)
-  if err != nil {
-    log.Println(err)
-  }
-  if isNew {
    log.Println("Fetching the new form layout")
    return newLayout(color)
-  } 
-  log.Println("Fetching the form layout")
-  return layout(color) 
}

- func layout(color string) string { ... }

func newLayout(color string) string { ... }

Note that, depending on whether or not there are any other usages of the function newLayout(...), the developer might further wish to inline its implementation into GetLayout(...).

Manually deleting code related to stale feature flags is a Sisyphean task that is tedious and error-prone – but it is fully automatable. In fact, Piranha is an automated refactoring tool that we developed at Uber to clean up such stale feature flags. It was a super-hit internally, removing thousands of feature flags and deleting over half a million lines of code.

At Gitar, we’ve built fully automated solutions for such problems that combine program analysis and transformation with AI. If you are battling tech debt due to dead code or obsolete dependencies, please reach out to us.

We invite you to join our Slack community, where we continue to explore and discuss these topics further.