As a senior engineer, part of the role is helping to onboard new engineers and mentoring them through not just their first few months but their ongoing career. DevOps is no different, and over my time I've onboarded and mentored several engineers. So, in this post, I want to explore some of the skills and tools new engineers might want to consider learning as they get started in DevOps.

Up front, I want to say that this is not a definitive list of skills and tools you need to learn to be a DevOps Engineer. It's just a list of things that I've found useful in my career and that I've seen other engineers find useful. I've also tried to keep the list as short as possible, so it's not exhaustive. Also, much like for a carpenter, builder, or any other trade, everything we're going to look at here is a tool. There are often many tools to achieve the same goal, and as an engineer it's your job to pick the right tool for the job. A skilled engineer knows how to use their tools and when to use them. And with that, let's get started!


For those short on time or who just want a shopping list of things to learn:

  • Linux and the Linux command line
  • Communication
  • Version control with Git
  • Networking fundamentals
  • Infrastructure as Code (Terraform)
  • Configuration Management (Ansible)
  • Containers and orchestration (Docker and Kubernetes)
  • A cloud provider (start with AWS)
  • Programming with Bash and Python

And for them all… practice, practice, practice!


The Basics

In this section we’ll look at some skills and tools that I would consider the core, basic skills that you’ll need. They’re less focused on a given technology and more focused on what you’ll need to be successful regardless of the specific platform or technology you’re working with.


Practice, Practice, Practice

Right up front let’s get this one out of the way. The best way to get better at anything is to practice. Whether it’s communicating with colleagues, working on the command line, or writing code, the more you do it the better you’ll get. Sure, you can get by with search skills (insert Google, Bing, StackOverflow, etc. here) but developing your own skills and proficiency will get you further faster. So, practice, practice, practice!



Linux

While not exclusive to DevOps, and not to say that Windows systems never fall under a DevOps team, the reality of working in a modern DevOps environment is that the majority of the systems you'll work with will be Linux based. So, it's important to have a basic understanding of Linux, and getting comfortable with the command line is a key part of that. Even if you're working with automation tools like Ansible, Chef, or Puppet, you'll still need the command line to troubleshoot issues, and understanding the commands that underlie the tasks you're asking automation to perform will help you understand how to achieve them. In my opinion, a working understanding of Linux is an absolute must for any DevOps Engineer.

Check out my knowledge articles on Linux for some resources to get started.

I also highly recommend checking out Dave Cohen, aka tutorialinux on YouTube. He has some great video tutorials including a whole playlist series on Linux Basics. I’ll add the first video below to get you started.


Linux Command Line

A vital subset of understanding Linux overall is developing and honing your skills on the command line. This isn't some kind of techie, nerdy flex, as some people might make it out to be. Being comfortable and skilled on the command line will help you work through tasks and issues more quickly and efficiently. It will also help you understand how the systems you work with actually function. Coming from a Windows background, as I did, I took a long time to get started on the command line, and I wish I had learnt sooner. Even on Windows, command line skills can be really useful for getting stuff done. So, if you're not already comfortable on the command line, I'd recommend getting started now! See the series from tutorialinux above for a great place to start.
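To make this concrete, here's the kind of small pipeline that command line fluency unlocks. The sample file is just made-up data for practice; in real life the input might be an access log or an inventory export:

```shell
# Build a small sample file to practise on
printf 'alpha\nbeta\nalpha\ngamma\n' > /tmp/pipeline-demo.txt

# Classic pipeline: sort the lines, collapse duplicates with a count,
# then order by count (highest first) and keep the top entry
sort /tmp/pipeline-demo.txt | uniq -c | sort -rn | head -n 1
```

Because `alpha` appears twice, it comes out on top. Swap the `printf` for a real log file and the same four commands answer questions like "which IP hits us most often?".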



Communication

DevOps is a team sport; we don't practice it in a vacuum. We're constantly working with our own DevOps teammates, software engineers, product management, operations teams, and likely many more. Depending on the role, you might also be called upon to talk to customers too. So, communication is key. That said, communication is a skill we can learn and develop, so don't be afraid to practice.

Some top tips for communication:

  1. Listen first, and practice Active Listening
    • It’s important to understand what those around us are actually asking of us. By actively listening and asking questions to clarify we can ensure we’re on the same page and we’re not missing anything.
    • Don’t jump to conclusions, ask questions, and make sure you understand what’s being asked of you.
    • By doing this we'll have a clear understanding of the tasks we need to complete, reducing the risk of redundant or incorrect work.
  2. Learn to say “I don’t know”
    • It’s ok to not know the answer to a question and to say “I don’t know”. We’re not all experts on everything on day one!
    • It’s not ok to make something up or to guess. If you don’t know the answer, say so and ask for help or say that you’ll find out and get back to them.
    • You’ll be surprised how often people will be happy to help you out even if they’re the one who asked you to do something in the first place.
  3. Be clear and concise, but also detailed
    • It’s important to be clear and concise when communicating. This will help you get your point across quickly and efficiently.
    • It’s also important to be detailed. This will help you avoid misunderstandings and ensure that you’re on the same page as the person you’re communicating with.
    • My personal approach here is when in doubt, for example when working with a team for the first time, tend towards being more detailed.
      • Once you’re more settled with the team then you can adjust the level of detail based on your shared understanding and experience
    • It’s a balance, so don’t be afraid to ask for feedback and to practice.
  4. Learn how others communicate
    • We all have different communication styles. Some people prefer to communicate via email, others prefer to talk in person, and others prefer to communicate via chat.
    • It’s important to learn how your colleagues communicate and to adapt your communication style to suit them.
    • This will help you get your point across more effectively and will help you build better relationships with your colleagues.
    • Try not to force your preferred communication style on others
      • Particularly early on in a working relationship, adapting to your colleagues' preferred communication style goes a long way
      • As you develop your relationship with them you can then start to introduce your preferred communication style and find a common approach that works for both of you


Version Control

Any DevOps team will be using some form of version control. It's a core part of both the DevOps process and the DevOps Engineer's role. Version control is a way of tracking changes to files and code over time: it keeps a history of changes so you can roll back to a previous version if something goes wrong, and it lets you collaborate with other engineers and share code. There are many different version control systems out there, but the best known are Git and Subversion. I've used both and I prefer Git. It's a more modern, more flexible system, and it's the one I've seen most DevOps teams use. So, if you're looking to get started in DevOps, I'd recommend learning Git.
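As a taste of what Git gives you, this sketch creates a throwaway repository in a temporary directory, records one change, and views the history. The identity details passed with `-c` are placeholders so the example works even on a machine with no global Git config:

```shell
# Create a throwaway repository in a temporary directory
repo=$(mktemp -d)
cd "$repo"
git init -q

# Record a first change; -c sets identity just for this one command
echo "hello" > notes.txt
git add notes.txt
git -c user.email=you@example.com -c user.name="You" commit -q -m "Add notes"

# History is the payoff: every change is recorded and recoverable
git log --oneline
```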

Check out my Getting Started with Git blog post and Git Knowledge Article to get started.



Networking

When we talk about modern software, and in particular cloud infrastructure, we're talking about distributed systems. These systems are made up of many different components connected together across a network. We also typically need to allow external systems and users, our customers, to connect and interact with the systems we build. So, it's important to understand how networks work and how to configure them.

Understanding concepts such as IP addressing, routing, DNS, and load balancing is important for any DevOps Engineer. I'd recommend learning these concepts and how they work. You don't need to be an expert, but you should have a good understanding of how they work and how they fit together.
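You can experiment with IP addressing without any real network at all. For example, Python's standard `ipaddress` module makes subnet questions concrete (the addresses below are just private-range examples):

```python
import ipaddress

# A /24 network: 256 addresses in total
net = ipaddress.ip_network("10.0.1.0/24")
print(net.num_addresses)  # 256

# Membership tests answer "is this host on that subnet?" -- the kind of
# question that comes up constantly when debugging routing rules and
# security groups
print(ipaddress.ip_address("10.0.1.42") in net)  # True
print(ipaddress.ip_address("10.0.2.42") in net)  # False
```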


Infrastructure as Code

A key skill for any DevOps Engineer is the ability to build and deploy environments. In the "old days" this was manual "racking and stacking" work: you'd physically build and configure servers and then deploy software to them. Even in slightly more modern times, where virtual machines replaced physical servers for most daily workloads, the manual process stayed the same, only without the physical lifting and carrying. We'd still have to build and configure the virtual machines and then deploy software to them. Whether physical or virtual, this is a time-consuming and error-prone process; it's not typically repeatable, and therefore it's neither scalable nor efficient. So, we need a better way.

Another desirable requirement of our tooling is something called idempotency: the ability for a tool to be run multiple times against the same environment and ensure that the state is exactly as we declare it. As a simple example, assume the starting state of a value is 0 and we want it to be 2. We could declare that we want to add 2 to the value, but if we run that tool repeatedly against the same environment it will add 2 each time, so the result will be 2, then 4, then 6, and so on. This is not idempotent. However, if we instead declare that the value should be 2, and set it to 2, then no matter how many times we run the tool the value will always be 2. This is idempotent. Idempotency is a key concept, and requirement, in DevOps tools such as configuration management and IaC tools.
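The counter example above can be sketched in a few lines of Python. The two toy "tools" here are purely illustrative, not any real CM or IaC tool:

```python
def not_idempotent(state):
    # Declares a *change* ("add 2"), so every run moves the state further
    state["value"] += 2

def idempotent(state):
    # Declares a *desired end state* ("the value is 2"), so extra runs
    # change nothing once the state is correct
    state["value"] = 2

drift = {"value": 0}
for _ in range(3):
    not_idempotent(drift)
print(drift["value"])  # 6 -- drifts further with every run

stable = {"value": 0}
for _ in range(3):
    idempotent(stable)
print(stable["value"])  # 2 -- stable no matter how often it runs
```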

Enter Infrastructure as Code (IaC)! IaC is a way of building and deploying environments in a repeatable, scalable, and efficient way. It’s a way of automating the process of building and deploying environments. Essentially we define what we want the environment to “look” like in some form of a code format and then have software tools build and deliver what we’ve defined. These techniques also offer us ways to:

  • Include our environment definitions in version control
  • Spin multiple, identical environments up and down quickly and easily
    • For example, delivering dev, test, and prod environments that all look and function the same
  • Port our environments to multiple platforms
    • For example, building local dev environments on VMware, Hyper-V, or our local workstations, but then deploying staging and production environments to AWS, Azure, or both.

There are a number of different IaC tools out there. Some are platform specific, such as AWS CloudFormation or the Cloud Development Kit (CDK), or Microsoft's Azure Resource Manager (ARM) templates and Bicep for Azure. Others are more universal, such as HashiCorp Terraform, which includes providers to target multiple platforms. They all have their pros and cons, of course. Terraform is more flexible due to its intentional universality; however, platform specific tools such as CDK and Bicep are often more tightly integrated with their respective platforms and offer quicker access to new features.

There is a growing trend in the industry towards something referred to as “Terransible”; a horrible amalgamation of two great tools, Terraform and Ansible. We’ll cover Ansible later (in Configuration Management), but this should give a clue towards considering Terraform as a good place to start when learning IaC tools.

One challenge to note with Terraform is state management. Where CDK and Bicep store the state of the declared environment in the native tooling (CloudFormation and Azure Resource Manager respectively), and therefore need nothing extra, Terraform by default stores the state of the environment in a file on your local machine. This means that if you want to share your environment with other engineers, or build it from multiple machines, you need to store the state file in a shared location. There are several ways to achieve this, such as a shared file system, a cloud storage service such as AWS S3 or Azure Blob Storage, or a Terraform Cloud account. However, it is an additional step you need to consider when using Terraform.
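For illustration, a shared remote state backend in Terraform looks something like the following sketch. The bucket, key, and table names are placeholders, not real resources, and the bucket and lock table must already exist:

```hcl
# Keep Terraform's state in an S3 bucket so the whole team shares one
# source of truth, rather than a file on one engineer's laptop.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"    # pre-existing bucket (placeholder name)
    key            = "dev/network/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "example-terraform-locks"    # optional: locking to prevent concurrent runs
    encrypt        = true
  }
}
```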


Configuration Management

In a similar vein to IaC, Configuration Management (CM) is a way of automating how environments are set up. However, where IaC focuses on building the environment itself, CM focuses on configuring it. We also want our configuration to be idempotent: if we run our CM tool multiple times against the same environment, it will always ensure that the environment is configured as we've defined it.

There are multiple tools to help us with CM, but the most popular are Ansible, Chef, and Puppet. While they all have their own syntax and quirks, they share the same core concepts: each has a way of defining the state of the environment in a code format, and software tooling to deliver what we've defined. As with everything in IT there are many opinions and everyone has their favourite tool. For my 2 cents, I prefer Ansible. I like the declarative YAML syntax and the fact that it's agentless, and so can be run from anywhere. However, this is just my opinion.
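To give a flavour of that declarative YAML syntax, here's a minimal illustrative playbook declaring that nginx should be installed and running. The `webservers` host group is a placeholder; because each task declares a desired state, re-running the playbook is safe:

```yaml
# Illustrative Ansible playbook: declare state, let the tool converge to it.
- name: Configure web servers
  hosts: webservers          # placeholder inventory group
  become: true
  tasks:
    - name: Ensure nginx is installed
      ansible.builtin.package:
        name: nginx
        state: present       # "present", not "install" -- a state, not an action

    - name: Ensure nginx is running and enabled at boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```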

As I mentioned in the Infrastructure as Code section, there is a growing trend in the industry towards something referred to as “Terransible”; and so along with Terraform, Ansible is a good place to start when learning CM tools.

Some people try to overlap Terraform and Ansible, performing IaC work with Ansible and configuration management with Terraform. To a point this is possible, but it's not the best approach. Terraform's strengths are in building and deploying environments, and Ansible's strengths are in configuring them. So, if you're going to use Terraform to build and deploy your environments, use Ansible to configure them. As always, pick the best tool for the job; don't try to force a tool to do something it's not designed to do. You wouldn't hammer in a nail with a screwdriver, after all… would you?


Application Workloads

The direction of travel in this area is pretty clear at this point: the industry is moving towards looser coupling of application components, interacting through Application Programming Interfaces (APIs). This is not the only design pattern around, but it is an increasingly common one. What this means for DevOps Engineers is containerised workloads. Containers are a way of packaging up an application and its dependencies into a single unit that can be deployed and run anywhere, which means we can build our application once and then deploy it to any environment that supports containers. There are now several options when it comes to containers, but the ubiquitous name in the field is Docker. Docker is a containerisation platform that allows us to build, run, and share containers. It's a great tool, but it's not the only one; there are other containerisation platforms, such as Red Hat's Podman.

Honestly, while Docker tries to offer a lot of functionality, it isn't the best tool for every job. It's great for building and running containers, but the industry seems to agree it's not the best tool for managing them: Docker Swarm has seen far less uptake for orchestration (managing and maintaining container deployments at scale) than Kubernetes. Kubernetes is a container orchestration platform that allows us to manage and maintain container deployments at scale. Again, it's a great tool but not the only one; there are other container orchestration platforms, such as Red Hat's OpenShift.

That said, my top tips for containerisation and orchestration are:

  • Learn Docker
    • This will help with general understanding and options for building and running containers
  • Learn Kubernetes
    • Here is the power to deploy and manage containerised workloads at scale
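As a first taste of Docker, a minimal Dockerfile for a hypothetical Python app might look like the sketch below. The file names and base image tag are illustrative:

```dockerfile
# Package a (hypothetical) Python app and its dependencies into one image.
FROM python:3.12-slim

WORKDIR /app

# Copy and install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Then add the application code itself
COPY app.py .

# The resulting image runs identically on a laptop, a CI runner,
# or a Kubernetes cluster
CMD ["python", "app.py"]
```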


Cloud Providers

The overwhelming majority of DevOps roles have a strong focus on cloud. It's not that DevOps is explicitly a cloud role, but the industry is moving towards cloud to provide on-demand services and scale, and so DevOps Engineers need to be able to work with cloud providers. There are a number of cloud providers out there, but the big three are Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). Each of these providers offers a wide range of services that can be used to build and deploy applications, along with a wide range of tools to help us manage and maintain our environments.

The 800-pound gorilla in the room is Amazon Web Services (AWS), and so for many it's the "no-brainer" (or, as I prefer to think of it, the thinking engineer's) place to start. If you survey the job market you're more likely to find demand for AWS skills than for any other cloud provider. However, this doesn't mean you should ignore the others. Microsoft Azure and Google Cloud Platform are both great platforms and have a lot to offer. Also, beyond the names, the tools and services are very similar across the cloud providers in many cases. For example, where AWS has EC2 and ECS (Elastic Container Service), Microsoft Azure has Virtual Machines and Azure Container Instances (ACI). The concepts are the same, only the names differ; the skills are transferable.

Some employers and recruiters will be specific and look for one particular platform, but most will be more open to candidates with experience on any of the major cloud providers. So, just be aware that you may need to learn more than one cloud provider.

In short though, start with AWS and you won’t go far wrong.



Core AWS Services

Some of the core AWS services that you should look to be familiar with are:

  • EC2
    • Virtual machines on AWS
  • ECS
    • Container services on AWS
  • EKS
    • Managed Kubernetes on AWS
  • Lambda
    • Serverless compute on AWS
  • S3
    • Object storage on AWS

I’d also strongly recommend becoming familiar with IAM (Identity and Access Management) and CloudFormation (IaC for AWS) so that you’re familiar with the security model and deploying environments using IaC techniques.
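For a flavour of CloudFormation, here's a minimal illustrative template declaring a single private S3 bucket. The bucket name is a placeholder and would need to be globally unique:

```yaml
# Minimal CloudFormation template: one private S3 bucket.
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal starter template - a single S3 bucket

Resources:
  ArtifactBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: example-devops-artifacts-12345   # placeholder, must be globally unique
      PublicAccessBlockConfiguration:              # keep the bucket private
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
```

Deploying a template like this with `aws cloudformation deploy` creates the bucket, and deleting the stack cleans it up again, which is exactly the repeatable lifecycle IaC promises.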

There are, of course, hundreds more services and options from AWS, but these core services, in line with my other recommendations, are a good place to start.


Programming Languages

While you can get along perfectly well with IaC and Config Management tools, you'll find that you need at least one programming language to get the most out of your role. The most common languages in DevOps roles are Bash and Python.

Bash ties in well with the Linux command line: if you know the commands you want to run, a quick Bash script makes them repeatable across multiple machines and multiple runs. Bash is a great tool for "quick and dirty" scripting and is powerful enough to get the job done on many occasions. However, it should not be confused with a fully featured programming language; advanced and complex logic, detailed logging, and error handling are not its strong areas.
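For example, a few lines of Bash turn a manual checklist into something repeatable. The host names below are placeholders, and the real health check is left as a comment:

```shell
#!/usr/bin/env bash
# "Quick and dirty" but repeatable: iterate over a list of hosts instead
# of typing the same check by hand for each one.
set -euo pipefail

hosts="app01 app02 db01"   # placeholder host names

for host in $hosts; do
    # In real use this might be: curl -fsS "http://$host/health"
    echo "checking $host"
done
```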

For where Bash stops, Python is a great tool to pick up. Python is a general purpose programming language that is easy to learn and has a wide range of libraries and tools available, making it ideal for building and maintaining automation and a valuable addition to any DevOps Engineer's tool belt. Some will tell you that Python is slow and might try to direct you to other tools, and, like everything, in the right context they may, or may not, be right. However, for the vast majority of tasks Python provides a quick and easy way to get the job done while also offering advanced features and a cross-platform approach (use configuration management to install Python on Windows and you can use it there too).
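To show where Python takes over from Bash, this sketch parses a made-up JSON deployment report and flags failures; structured data like this is painful to handle in shell but trivial with Python's standard library:

```python
import json

# A made-up deployment report, standing in for real tool output
report = json.loads("""
{
  "deployments": [
    {"service": "api",    "status": "ok"},
    {"service": "worker", "status": "failed"},
    {"service": "web",    "status": "ok"}
  ]
}
""")

# Real data structures instead of text munging: filter with a comprehension
failed = [d["service"] for d in report["deployments"] if d["status"] != "ok"]

if failed:
    print(f"{len(failed)} deployment(s) failed: {', '.join(failed)}")
else:
    print("all deployments healthy")
```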


Other Resources

[Update 2024-02-03]

Since I wrote this article I’ve come across a great resource for learning and building skills across the whole range of common DevOps, SRE, and Platform/Cloud Engineer skills. You absolutely must check out bregman-arie/devops-exercises.



Conclusion

So, that's a lot of information to take in. I've tried to keep it as concise as possible, but there is a lot to cover. Remember, you'll learn a lot as you go, particularly if you embrace practice, practice, practice. The key is to get started and to keep going. You'll find that you learn a lot more by doing than by reading alone.

I'd advise starting with an area that interests you and growing out from there. For me, as a former sysadmin, I started with core Linux skills before moving to Ansible and then jumping into a DevOps role, where I found I really enjoy working with Kubernetes.

There’s a lot more to learn than what is listed here including continuous integration and continuous delivery (CI/CD), monitoring, logging, security, networking, and so on. However, these are the core skills that I’d recommend to get started with. Once you’ve got these down, you can start to explore the other areas.

If this article helped inspire you please consider sharing this article with your friends and colleagues, or let me know via LinkedIn or X / Twitter. If you have any ideas for further content you might like to see please let me know too.
