DevOps: a decade of confusion and frustration
“What is DevOps?” is a question I’ve heard a lot, and one I’ve often asked myself when reading job ads for “DevOps Engineers”.
According to Patrick Debois, a Belgian “agile” consultant and former sysadmin who coined the term in January of 2009, the term “Devops” (not “DevOps”) was supposed to mean “Agile System Administrator”.[1] It is a compound of “Developers” + “Operations” + “Days”[2] and was not intended to be a methodology by that name, or even a job title.[1] Some point to the talk “10+ Deploys Per Day”, given by John Allspaw and Paul Hammond of Flickr in 2009, as the true origin of the DevOps methodology and the primary launch pad for the name. The talk recounts the challenges in contemporary companies, especially young ones. Operations staff were often instructed: “Do not break the site or let it get broke. We consider slowness is broke-ness”, while Development staff were told the opposite: “Move fast, deliver value”. Both groups were measured on the success of these objectives.
Now, of course, moving fast means breaking things, a tradeoff Facebook fully embraced in its old mantra, “Move fast and break things” (though we did not presume they were talking about the very fabric of democracy). But even they came back to the notion that reliability is important, and changed their mantra to “Move Fast With Stable Infra”.
“But what if…” said many people who did not enjoy this fighting between reliability and feature development, “…we put developers and operations staff together to make DevOps!?” And thus the silo-slash-wall was broken down and everyone rejoiced.
Except, of course, that’s not what happened.
Operations staff were renamed to be devops.
Developers were renamed to be devops.
What we might call “release engineering” was also renamed devops.
Whole teams of people with a singular discipline: devops
All engineering: Devops
Now you could be a junior sysadmin or a senior backend programmer fluent only in the most arcane elements of .NET, and somehow you’re both “DevOps”, at least to someone.
There are also other methodologies based on this: DevSecOps, ArchOps, TestOps, CloudOps.
Lack of Cohesive Vision
DevOps as a movement has no truly coherent definition or mantra other than removing the wall and a few images of a horizontal lemniscate.[3] People make their own definitions, and these differ from person to person. Additionally, the keen-eyed (or ancient) among you may have noticed that “the wall” was not always inherently bad: as long as teams had shared ownership of the product and a clear contract of concerns, velocity could be achieved.
This brings me to SRE,[4] which, ironically, had existed long before DevOps. SRE was founded as “Production” in 2003 at Google, and is described by its founder, Benjamin Treynor Sloss, as “What happens when you send Developers to do Operations”. The whole video is worth a watch, but tl;dr: he codifies the contract between Feature Development and SRE (who are, as stated, other developers doing Operations work). This does not break “the wall”. It promotes shared ownership and shared responsibility, but SRE is not embedded in development and does not put perpetual Ops support onto staff who are focused on developing features. SRE only codifies and formalises the contracts for what it means to hand something over, or to control releases. In my mind, this is a true evolution of Operations. But it should be noted that this was business as usual for a lot of established companies: tenured sysadmins generally grew to be quite close in ability to feature developers, many even becoming developers themselves. The inverse was also true, with feature developers joining what was often referred to as “Platform Ops”.
Benjamin’s talk is fantastic for other reasons too: he makes the critical point that you need people who are free to focus on reliability. He indicates that developers/coders are the people best placed to do this, and I tend to agree with him. We should all be comfortable scripting/programming.
I was headed in a direction and I’m just going to go there now:
SRE is a nice concept and adds some good ideas to Operations, but it’s still Operations. You’re still a sysadmin. Yes, sysadmins could code *gasps from the audience*. In fact, sysadmins generally code some of the more gnarly stuff that keeps a business running: database migration systems, failover systems. ChatOps as a concept, for example, originated not in Development but in Operations, who used IRC to instrument changes. At one of my previous jobs, it was Operations who pushed the idea of message queues for reliability reasons. Developers at that company would have used MySQL for everything.
And devops as a job title? Well, my opinion is:
If you’re a devops who can’t sort a binary tree, you’re probably a sysadmin.[5]
If you’re a devops who can’t forcefully unload a stuck file descriptor from a running process, you’re probably a developer.[6]
If your job is to maintain CI pipelines, you’re a release engineer.
And there is no shame in these job descriptions or titles. It’s just saying what the focus is. As I mentioned to my directors when they were keen on me hiring “devops engineers”:
DevOps are not developers who can install Apache and read infra docs.
DevOps are not sysadmins who learned Python.
DevOps is not the people who run your Jenkins pipelines.
Personally, I’m sick to absolute death of hearing “DevOps Engineer”. It tells me nothing about what you do, how you do it, how you approach problems or what your real responsibilities are. It doesn’t even tell me what tools you use, although there are definitely some “DEVOPS TOOLS!!1!” which are trying to make you believe you need to be using them if your title is devops engineer. It’s akin to a doctor titled “BodyFixing Engineer”. Doctors specialise because the problem scope of a human body is wide, and when they generalise they misdiagnose, unfortunately quite often. Even amongst doctors, generalists have a specific title indicating their focus (General Practitioners).
There also seem to be misconceptions floating around, especially on Hacker News and Lobsters:
Sysadmins did know how to code back in the day. They were coders, and it’s weird to assume they didn’t code. If your sysadmin didn’t know your program, it’s because they weren’t allowed to (compliance) or there was a cultural problem in your company. If your company was one of these, it’s unlikely that your new “DevOps Engineers” know the codebase either.
Developers knew a fair bit about infrastructure back in the day, if only to get their staging/dev environments to work. It is weird to assume they didn’t.
Today, life is a little bit easier on either side of these camps, but please stop assuming you can do everything. You will only do both poorly.
And if you’re a hiring manager and you want to hire “DevOps” only from infrastructure backgrounds or only from software engineering backgrounds, you are actually hiring sysadmins or developers. Using the term “DevOps Engineer” only serves to remove information about the focus of the role, and it is likely increasing the amount of noise in your hiring pipeline.
[2] Devopsdays is a worldwide series of technical conferences covering topics of software development, IT infrastructure operations, and the intersection between them. Each event is run by volunteers from the local area.
[4] Site reliability engineering is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems.
[5] Not specifically this (“why would anybody want to do this!?”), but distributed debugging, consistency of data, continuity of service and scalability are difficult problems to get right. If you know more about these than about the intricacies of memory padding, then this applies to you.
[6] Overly simplistic and arbitrary, but the point is you probably know more algorithms and coding standards than infra standards.