Over the last several years and during my time at Atlassian, I’ve been lucky enough to learn from hundreds of engineering teams about their current development practices, their origins, and where they strive to be from an engineering perspective.
Culturally, one of the most common themes across these organizations was a move toward some form of DevOps model. I also noticed that most of the organizations I spoke to were at a slightly different point in that journey, with different maturity levels in how they had adopted, or planned to adopt, DevOps. Much of where they were today traced back to where they started as an organization. For that reason, it’s worth knowing a little about the history of how we arrived at today’s cloud-native world.
We can define the evolution of the internet over the past few decades as three “acts”. Within each act, both the internet and the way engineering teams operated their applications gradually evolved.
Act 1: Early-to-mid ’90s and the rise of dot-com
A quick history:
These were the early years of the internet, when available bandwidth was still lacking and the complexity of software increased faster than hardware’s ability to handle it. It was critical for businesses to figure out how to get applications onto multiple PCs without consuming too much disk space, while also making them accessible from anywhere. The best way to do that was to host the data on someone else’s computer, in what was then called a data center.
America Online (AOL)
Engineering cycles were extremely long, as three siloed roles (development, operations, and QA) were needed to move applications through the software development lifecycle and into production. Every one of these applications was deployed into a dedicated data center, monitored by operations teams who were familiar with the systems in case anything went wrong.
Act 2: Late ’90s, early 2000s and the rise of internet hyper-growth superstars
A quick history:
With Salesforce.com, Amazon, and Google leading the way, the internet and SaaS became more mainstream and new business models began emerging. This era not only ushered in an untold number of new internet businesses, but saw entirely new categories of software created and launched seemingly overnight. To stay competitive, this entire wave of enormous, fast-moving, hyper-growth companies began re-examining how their software was built, tested, and released. In doing so, many adopted hints of what we see in today’s modern DevOps and agile development practices.
Amazon, Google, & Salesforce.com
Applications were still deployed and monitored in their own very large data centers, but the massive scale and growth of some of these companies made it sensible to build a central infrastructure team that could manage the common infrastructure every service depended on. This is also when the Site Reliability Engineer and Systems Engineer roles were born.
Act 3: The 2010s, beginnings of cloud-native, and high-availability cloud
A quick history:
During this era, internet and SaaS applications are developed on top of one of the big three public cloud providers: AWS, GCP, or Azure. The emergence of cloud-native also came with more competition, so building faster, more scalable, and more resilient applications that contributed directly to business growth was key. For many of these organizations, the introduction of Kubernetes created huge efficiency gains, like reclaiming unused capacity during off-peak hours and reducing build times.
Airbnb, Box, Pinterest
Software iterations happen at lightning speed, and getting product into the hands of customers as fast as possible is the priority. The technology available out of the box from the big three public cloud providers allows much more flexibility and freedom for organizations born during this “act”. Act 3 marked the transition away from dedicated operations teams focused solely on keeping infrastructure running, especially early in a company’s growth stages. Now, resources can be poured into engineers who handle both development and operations -- leading to what we now call a “You build it, you run it” model.
Why DevOps works today and what to keep in mind
The big three public cloud providers have not only lowered barriers for start-ups, but also provided an amazing foundational infrastructure for companies of any size to build on. This has helped tear down the walls between development and operations, because infrastructure itself now behaves a lot like software. Engineers can easily grasp its API-driven nature, and their infrastructure starts to look a lot like code -- which is exactly what they’re used to.
Now that engineers are even closer to what’s going on in their infrastructure, operations comes more naturally as part of the day-to-day. Most of the rudimentary operational tasks of Acts 1 and 2 have been automated to the point where engineers can own operations while shipping product faster. While some may think the move to cloud-native will make the traditional operations engineer obsolete, this couldn’t be further from the truth. Even with all the latest advancements in infrastructure, engineers who focus on operations and reliability still have critical roles and skill sets needed at different stages of a scaling organization.
All in all, if you’re wondering what the right model is for your organization -- the answer is: it depends. Based on size, scale, growth, and what makes sense for the engineering organization, some may go the way of the site reliability or systems engineering model. For many companies, the right way might be a pure DevOps, “you build it, you run it” model built on the public cloud, taking into account the number of engineers, the reliability/SLOs required, and how fast the team needs to iterate and ship code.
What doesn’t change, though, is the need to ship code as fast, confidently, and reliably as possible. Whichever model you choose, the dynamics and ecosystem for engineering teams are becoming extremely complex. All the moving pieces around your infrastructure, services, metadata, components, ownership, teams, deploys, on-call schedules, incidents, etc. occupy a level of mindshare in the back of every engineer’s head. We need to set that free. effx helps the world’s best engineering teams do that -- you should take it for a spin here for free.