There is no DevOps without feature flags!

If you want to do DevOps, you have to constantly deliver value to your customers and gather evidence of the success in production. This means constant, zero downtime deployments and testing new features with smaller sets of customers (A/B testing, canary releasing).

To achieve this, you have to decouple the feature rollout from your code deployments. This is done with a design pattern called feature toggle, feature switch or feature flag. Originally it was called feature toggle because of its Boolean nature. With A/B testing and canary releasing, the “toggles” became more complex, so today the term “feature flag” is more common.

Whatever you call it, a feature flag changes the runtime behavior of your application depending on a configuration. The configuration can be:

  • per user or user attributes (country, name, group membership etc.),
  • per environment (machines, scaling units, tenants, networks etc.),
  • randomly (X percent of all users)

or a mix of the above. This is really powerful, because you can develop long-term features inside the master branch and release them when you are ready. But it is also dangerous, because you have to maintain compatibility between your features on all levels (persistence, UI etc.), and the complexity of testing all runtime alternatives may increase dramatically.
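To make the three kinds of configuration concrete, here is a minimal sketch in Python. It is not taken from any particular framework: `FlagRule`, `bucket` and `is_enabled` are hypothetical names, and the rule that any single match enables the flag is an assumption made for illustration. The percentage rollout hashes the user id, so the same user always gets the same answer.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FlagRule:
    """Hypothetical rule set for one flag; all field names are illustrative."""
    countries: set = field(default_factory=set)     # per user attribute
    environments: set = field(default_factory=set)  # per environment
    percentage: int = 0                             # share of all users, 0-100


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in 0..99 so repeated checks agree."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, rule: FlagRule, user_id: str,
               country: str, environment: str) -> bool:
    """Assumption: the flag is on if ANY of the three rules matches."""
    if environment in rule.environments:
        return True
    if country in rule.countries:
        return True
    # Percentage rollout: users in the lowest `percentage` buckets get the feature.
    return bucket(flag_name, user_id) < rule.percentage


rule = FlagRule(countries={"DE"}, environments={"staging"}, percentage=10)
print(is_enabled("new-checkout", rule, "user-42", "DE", "production"))
```

Including the flag name in the hash means that a user who lands in the first 10 percent for one flag does not automatically land there for every other flag.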

Feature Flag Frameworks

The first instinct is to just use the configuration system of your application and write your own framework for feature flags. But when you think more about it, this has some disadvantages. A framework for feature flags should:

  • Allow the management of the flags outside of your application
  • Allow you to change the configuration during runtime without any downtime
  • Switch the configuration at once (on all servers and in all components)
  • Have a minimal footprint / a very high performance
  • Be failsafe (return a default value when the service is not available)
  • Allow you to change the configuration per user, machine, percentage (see above)

Implementing a framework that meets these requirements is pretty complex.
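As a rough illustration of two of these requirements (being failsafe and keeping the overhead low), here is a minimal sketch of a client that caches flag values in memory and falls back to a caller-supplied default. `FlagClient` and its `fetch` callable are hypothetical and do not correspond to the API of any real SDK. The sketch also ignores the harder parts, such as pushing changes to all servers at once.

```python
import time


class FlagClient:
    """Hypothetical client: caches flag values and falls back to defaults."""

    def __init__(self, fetch, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch        # callable returning {flag_name: bool}; may raise
        self._ttl = ttl_seconds    # how long cached values are trusted
        self._clock = clock        # injectable so the cache can be tested
        self._cache = {}
        self._expires_at = None

    def _refresh_if_stale(self):
        now = self._clock()
        if self._expires_at is not None and now < self._expires_at:
            return  # cache is still fresh, no remote call on the hot path
        try:
            self._cache = dict(self._fetch())
        except Exception:
            # Flag service unreachable: keep the last known values.
            pass
        # Wait one TTL before trying again, even after a failure.
        self._expires_at = now + self._ttl

    def is_enabled(self, name, default):
        """Return the cached value, or `default` if the flag is unknown."""
        self._refresh_if_stale()
        return self._cache.get(name, default)


def unreachable_service():
    raise ConnectionError("flag service is down")


client = FlagClient(unreachable_service)
print(client.is_enabled("new-checkout", default=False))
```

The default is passed at the call site, so each feature decides for itself what should happen when the flag service cannot be reached.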

There are a lot of open source frameworks for the different languages. For Java there are Togglz, FF4J, Fitchy and Flip. For .NET there are FeatureSwitcher, NFeature, FlipIt, FeatureToggle and FeatureBee. Some use strings, some enums and some classes – but none has a highly scalable backend and a portal to manage your flags (at least none that I know of).

That’s why I have played around with LaunchDarkly over the last few months. This is not just a framework – it’s a complete “feature flag as a service” solution. It has an SDK for .NET, Java, Python, Ruby, Go, Node, JavaScript, iOS, Android and PHP. It has a portal to manage your flags and to set up experiments. It integrates with VSTS and Bitbucket Pipelines, with Slack and HipChat, and with Optimizely and New Relic. I wrote about a demo to show the performance and how fast config changes are applied. The pricing starts at $79 per month for two projects and 10,000 active users and goes up to $699 for unlimited projects and 50,000 active users. It also has an enterprise plan and free plans for academic, non-profit and open source projects. So before building and running your own solution, I would give it a try.

Feature Flags and Technical Debt

If you start with feature flags, chances are high that things get really complex after some time. So when Jim Bird writes that Feature Toggles are one of the Worst Kinds of Technical Debt, it is for a reason. So how do you use feature flags “the right way”?

The first thing is that not all feature flags are the same, and you should not treat them as if they were. There are short-lived feature flags that are used to roll out new features or conduct experiments. They live for some time and then go away. But there are also feature flags that are intended to stay, such as flags that handle licensing (advanced features etc.). And there are mid-term flags for major features that take a long time to develop. So the first thing to do is to create a naming convention for the flags. You may prefix your flag names with short-, temp-, mid- or something like that, so everyone knows how the flag is intended to be used. Make sure to use meaningful names – especially for the long-lived flags – and manage them together with a long description in a central place.
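A naming convention only helps if it is enforced. The following sketch shows one way to do that: a small central registry that rejects names without a lifetime prefix or without a description. The prefixes `short`, `mid` and `long`, the kebab-case rule and the `register` function are assumptions chosen for this example, not a standard.

```python
import re

# Assumed convention: <lifetime>-<kebab-case-name>, e.g. "short-new-checkout".
FLAG_NAME = re.compile(r"^(short|mid|long)-[a-z0-9]+(?:-[a-z0-9]+)*$")

# Central place that holds every flag together with its description.
REGISTRY = {}


def lifetime_of(flag_name: str) -> str:
    """Return the lifetime prefix, or raise if the name breaks the convention."""
    match = FLAG_NAME.match(flag_name)
    if match is None:
        raise ValueError(f"flag name {flag_name!r} does not follow the convention")
    return match.group(1)


def register(flag_name: str, description: str) -> None:
    """Add a flag to the registry; a meaningful description is mandatory."""
    lifetime_of(flag_name)  # validates the name
    if not description.strip():
        raise ValueError(f"flag {flag_name!r} needs a description")
    REGISTRY[flag_name] = description


register("short-new-checkout", "Experiment: one-page checkout for 10% of users")
register("long-advanced-reporting", "Licensing: reporting module for premium plans")
print(sorted(REGISTRY))
```

A check like this can run in a unit test or a build step, so a flag without a prefix or description never reaches the master branch.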

Mid- and long-term flags should be applied at a pretty high level, like bootstrapping your application or switching between microservices. If you find a mid- or long-term flag in a low-level component, you can bet this is technical debt.
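What “a pretty high level” can look like in code is sketched below: the flag is read exactly once while the application is wired together, and everything below that point only sees an ordinary object. The class names, the `build_search_service` function and the flag name `mid-new-search` are made up for this example.

```python
class LegacySearch:
    def search(self, query: str) -> str:
        return f"legacy results for {query}"


class NewSearch:
    def search(self, query: str) -> str:
        return f"new results for {query}"


def build_search_service(flags: dict):
    """Bootstrap code: the only place that reads the mid-term flag."""
    if flags.get("mid-new-search", False):
        return NewSearch()
    return LegacySearch()


def handle_request(search_service, query: str) -> str:
    # Low-level code: no flag checks here, it just uses what it was given.
    return search_service.search(query)


service = build_search_service({"mid-new-search": True})
print(handle_request(service, "feature flags"))
```

When the new implementation has fully replaced the old one, removing the flag means deleting one `if` statement and one class, not hunting for checks scattered through low-level components.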

Short-term flags are different. They may need to reside on different levels and are therefore more complex to handle. It is a good idea to use special branches to manage the cleanup of flags. Right when you introduce a new feature flag, you create a cleanup branch that removes the flag again and submit a pull request for it.

Summary

If you do DevOps, then you probably already use some kind of feature flags. You should build a strategy to do feature flags right, so that you do not end up in chaos. Pick a good framework and build a scalable, manageable and extendable engineering system around it. Done right, feature flags are one of the most powerful patterns for DevOps. So take your time and do it right!
