
Third-party tools help manage DevOps with AWS, multi-cloud

Many IT teams struggle with the cultural and technical shift to DevOps. Nathaniel Felsen, a DevOps professional, sorts through some tough choices.

DevOps sounds like an ideal application deployment methodology, but enterprises that want to run DevOps with AWS can struggle to put the puzzle pieces together.

Administrators must decide which AWS-native and third-party tools to run, which can have major long- and short-term effects on a cloud environment. There's no one-size-fits-all answer for these AWS questions. Businesses run different types of software and have different end goals, so admins must understand the infrastructure and how to apply the available tools and concepts. For example, your enterprise's automation needs might extend beyond native CloudFormation functionality, or you might prefer a different configuration management system.

Nonetheless, some best practices can help you structure DevOps with AWS. In this Q&A, Nathaniel Felsen, Tesla staff software engineer and author of Effective DevOps with AWS, addresses some of these automation and orchestration questions. You can also read a chapter excerpt from his book that discusses how to assess and boost app performance.

What factors impact an AWS customer's choice in a configuration management system, such as Chef or Ansible, for an AWS environment?

Nathaniel Felsen, Tesla engineer and DevOps author

Nathaniel Felsen: You do have some Chef support if you're using OpsWorks; they didn't try too hard to work on that concept because it's kind of a trend to use containers nowadays, [where] more and more you want your infrastructure to be completely immutable. It goes a little against config management systems, and I think that's why they were not super on board with the idea of better supporting Puppet, Chef or Ansible for most services. That said, I like all those config management tools that are well-used and have good support for AWS.

In the book, I use Ansible. It's one of the easiest and most used tools nowadays to get started. And they do have a good interaction, a good connection with AWS, because it's easy to rely on the AWS API to find VPCs [Virtual Private Clouds] and interact with them. I think it's a very good tool to get started with config management because you can quickly make changes to your infrastructure in a safer way than if you had to do them by hand.

But as you start and grow in terms of scale and number of engineers … your infrastructure should be immutable [as] more and more applications move to containers and micro[service] architectures. You don't have as much configuration to do on those systems usually. [For example, with] ECS [Elastic Container Service], when you create your container and define the metadata that goes with your container -- [JSON] or YAML -- you're able to provide all the information that the container will need at one time to be part of the right environment.
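The container metadata Felsen describes can be sketched as an ECS task definition. The sketch below is a hypothetical example (family, image URI and environment values are made up) built as a plain Python dict, in the shape boto3's `register_task_definition` accepts:

```python
# Hypothetical ECS task definition: all the information the container
# needs -- image, resources, ports, environment -- declared in one place.
task_definition = {
    "family": "web-app",  # example name
    "containerDefinitions": [
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web:latest",
            "memory": 512,
            "cpu": 256,
            "portMappings": [{"containerPort": 80}],
            "environment": [{"name": "ENVIRONMENT", "value": "production"}],
        }
    ],
}

# To actually register it you would call (requires AWS credentials):
# import boto3
# boto3.client("ecs").register_task_definition(**task_definition)
```

Because the definition is just data, it can be versioned alongside the application code, which is part of what makes the immutable-infrastructure approach practical.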

I think, as much as config management [tools] are useful nowadays, I imagine in the near future those tools will be less and less used.

What automation and orchestration tools can help enterprises provision and manage resources across multiple clouds, particularly among AWS customers?

Felsen: I would look at tools like HashiCorp … [and] Terraform, and that's a good tool to create images. You can create AMI [Amazon Machine Image]; you can create containers; you can create [other public cloud resources]. It's a good way to create the base of your resources. Obviously, containers like Docker will also run on those container systems; using Kubernetes to orchestrate everything sounds like a good plan. Config management [tools] like Ansible support Google Cloud [and Azure], so I think it's a good tool for that.

But [if] you look at monitoring -- Datadog [is] a good tool to handle most cloud providers. In the open source world … Prometheus is pretty good for Kubernetes, if you imagine that you're going to migrate to a multi-cloud system. … Combining Kubernetes and the ecosystem behind Kubernetes and Serverless [Framework] is a good system in my opinion.

What types of CloudFormation issues must users be aware of?

Felsen: CloudFormation only supports JSON or YAML to describe your infrastructure. [As] an example from the book … we go from using one VPC with one [subnet] to breaking out the infrastructure to multiple subnets. One big [problem] in that network [is], if you do that with CloudFormation, you're going to end up with hundreds of lines [of JSON or YAML] when you repeat for each subnet. I use troposphere, a Python library that lets me take advantage of a scripting language to generate those CloudFormation templates.
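The idea Felsen describes -- generating the repetitive per-subnet resources with a scripting language rather than hand-writing them -- can be sketched in plain Python (his book uses the troposphere library; this minimal version uses only the standard library, and the CIDR blocks and the `VPC` resource name are example assumptions):

```python
import json

# Example subnet ranges; in a real template these would match your VPC design.
CIDRS = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]

def build_template(cidrs):
    """Generate one AWS::EC2::Subnet resource per CIDR block,
    instead of repeating near-identical JSON by hand."""
    resources = {}
    for i, cidr in enumerate(cidrs):
        resources[f"Subnet{i}"] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "VPC"},  # assumes a VPC resource named "VPC"
                "CidrBlock": cidr,
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}

print(json.dumps(build_template(CIDRS), indent=2))
```

Adding a tenth subnet becomes a one-line change to the list rather than another copy-pasted resource block, which is the maintainability win troposphere delivers at larger scale.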

How do enterprises approach DevSecOps on AWS? What types of strategy and process hurdles do they run into?

Felsen: [DevSecOps] teams mostly focus on two separate subfields: application security and infrastructure security.

Historically, [application] security teams have been known for blocking the release of software at a very late stage of the development/release cycle, which gave them a very bad reputation. It's not uncommon to see engineering teams trying to avoid going through security reviews for fear of never being able to ship their code. The DevOps movement laid new ground with infrastructure as code [IAC] and deployment pipelines, which security teams can take advantage of to move security to the left of the pipeline. The goal is to automate the discovery of vulnerabilities and bad designs the same way most bugs are detected and fixed.

AWS shines the most when you look at what they provide for infrastructure security. The goal of the DevSecOps work here is to detect intrusions. To help engineers succeed in their DevSecOps journey, AWS has several good options and services. IAM [Identity and Access Management] has a very granular system to configure security policies and ensure that AWS users and services are only allowed to perform certain API calls toward specific resources, and nothing more. It also offers tools such as CloudTrail to audit everything that's happening in your AWS account. Those trails can be used in conjunction [with] CloudWatch Events, which let you execute certain actions based on which events are triggered. For instance, you may decide to send an email notification if a user decides to change his or her security policy. Or, even better, you may trigger a Lambda function to restore the previous policy.
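A minimal sketch of the pattern Felsen describes -- a Lambda function fired by a CloudTrail-sourced CloudWatch event when someone changes an IAM policy. The handler logic and watched action names are illustrative assumptions (the event shape follows the `detail.eventName` / `detail.userIdentity` structure CloudTrail events carry); the actual remediation call is left as a comment because it needs AWS credentials:

```python
# Hypothetical Lambda handler: flag IAM policy changes delivered as
# CloudTrail events via CloudWatch Events / EventBridge.
WATCHED_ACTIONS = {"PutUserPolicy", "AttachUserPolicy", "DeleteUserPolicy"}

def handler(event, context=None):
    detail = event.get("detail", {})
    action = detail.get("eventName", "")
    user = detail.get("userIdentity", {}).get("userName", "unknown")
    if action in WATCHED_ACTIONS:
        # Real remediation would go here, e.g. notify via SNS or use
        # boto3's IAM client to restore the previous policy version.
        return {"alert": True, "action": action, "user": user}
    return {"alert": False, "action": action, "user": user}
```

Because CloudTrail records the API call and the caller identity, the function can both alert on and attribute the change, which is what makes automated rollback of a policy edit feasible.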

AWS also launched a really good service called GuardDuty. This service inspects VPC traffic, Route 53 calls and CloudTrail logs to automatically detect bad behavior. Using machine learning, the service learns over time what's expected and what's not and will trigger alerts when abnormal activities are detected.

Over the course of your career, what technical/cultural barriers have you run into when implementing DevOps with AWS?

Felsen: Most barriers I encountered are cultural barriers. There are a lot of reasons for that.

The first problem is that the term DevOps means too many different things to different people. The DevOps movement aims at improving the relationship between developers and operations by advocating for better communication and collaboration between these two business units. This is done partially with the help of tools, [such as] config management and monitoring, and methodologies, [such as] IAC, automation and CI/CD. But too often, people misunderstand this and will just think that adding a bunch of tools will miraculously increase certain metrics they care about, such as mean time to recovery, and make everyone happy.

The second issue I often see is people resisting change. Here, too, you see very common patterns. Some people start worrying that their jobs are about to disappear; there is that common fear of the unknown and fear of failure.

In terms of the technical barrier, the biggest burden I see is the number of tools and the rate of innovation … which makes it hard to decide which tool to use and makes it hard to reuse the same tools over time or across companies.

The DevOps movement is still new, and people are making new discoveries, leading to the creation of new best practices constantly. On top of that, [as] engineers, we love to create new tools. As someone once said, 'Sometimes, one week of coding can save us from one hour of Googling.' There are probably more DevOps tools out there than actual DevOps engineers.
