3 ways to test in production promptly and productively

As much as Agile and DevOps changed development, they also shifted testers' roles -- to the right. Gerie Owen offers three ways to evaluate apps in production to find failures.

Organizations have overhauled a great deal of their application development practices over the last decade. These changes affected the roles of software testers and drove them increasingly to test in production rather than prior to code release.

First, test automation split the QA community into those with and without scripting skills, and some employers target only testers with that automation know-how. Second, testers needed to adjust to Agile, which transformed their specialized roles into those of general-purpose team members. Agile also imposed more rapid releases than earlier development methodologies prescribed and removed formal testing cycles from the app dev process. This shift leaves some testers scrambling at the end of each sprint to execute the most important test cases and get bugs fixed.

Then came DevOps, which has made a fully automated application delivery pipeline the centerpiece of the development lifecycle. Since that became the norm, QA professionals have needed to figure out how to automate as much essential testing as possible and determine where tests should be placed in the larger automated workflow.

Testing under modern app dev methodologies

Testers find ways to continue good practices in most of these circumstances, but it's often cumbersome to adapt existing tools and practices to work within these newer frameworks.

For example, in many DevOps toolchains, automated tests run as smoke tests after integration and build. Testers use automation, along with a test harness, to execute such checks. Depending on the nature of the application, testers might find time for some exploratory testing after code delivery. The problems with this approach are that smoke tests are generally not comprehensive, and exploratory tests result in expensive fixes when performed late in the app dev process.

With these constraints in mind, I recommend that you search for failures, not bugs. It is incredibly difficult to fit traditional testing practices into modern app dev processes, as formal requirements, test procedures and test cases don't mesh with Agile or DevOps. A host of changes to application architectures and development practices also presents challenges for bug hunters. In addition to the time and test constraints in Agile and DevOps, testers often cannot replicate the scale of the production environment or test every potential use case before release. So, a QA team has to reimagine its approach to enable professionals to test in production.

Prioritize failures

While you should continue to look for bugs during development, the focus of testing should be to identify and investigate failures in production. The following three methods evaluate applications in production, and they can be timelier and more effective than preproduction testing.

1. Monitoring

Monitoring is a means to collect and evaluate data on the health of an application. Monitoring can be extremely simple. You might set up a basic shell script to ping the application once a minute; when the ping fails, you know you have a problem.
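As a rough illustration of that kind of once-a-minute check, here is a minimal sketch in Python rather than shell. It assumes the application exposes an HTTP endpoint to poll; the function names, the example URL and the print-based alert are all hypothetical stand-ins, not part of any particular monitoring product.

```python
import http.client
import time
import urllib.request
from typing import Callable, List, Tuple


def check_once(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are OSError subclasses;
        # ValueError covers a malformed URL.
        return False


def monitor(
    url: str,
    checks: int,
    interval_seconds: float = 60.0,
    check: Callable[[str], bool] = check_once,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[int, bool]]:
    """Run a fixed number of checks and return (attempt, healthy) pairs."""
    results = []
    for attempt in range(checks):
        healthy = check(url)
        results.append((attempt, healthy))
        if not healthy:
            # Hypothetical alert; a real setup would page or notify someone.
            print(f"ALERT: check {attempt} of {url} failed")
        if attempt < checks - 1:
            sleep(interval_seconds)
    return results
```

Calling `monitor("https://app.example.com/health", checks=60)` would poll a hypothetical health endpoint once a minute for an hour. The `check` and `sleep` parameters exist only so the loop can be exercised without a network or real delays.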

Monitoring for applications that support and drive business involves the collection of thousands of data points per day about the availability and responsiveness of the application, servers, data center or cloud environment, and network. It's difficult to evaluate that amount of data. But, when you have a failure or performance issue, the data will help diagnose it. And, if you do trend analysis on that data, you can often spot an upcoming failure before it happens.
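One simple form of trend analysis is to fit a straight line to a metric and estimate when it will cross a limit. The sketch below assumes the metric is response time sampled per minute and that a linear trend is a reasonable approximation; both the function names and the threshold are illustrative assumptions, and real monitoring data usually calls for something more robust.

```python
from typing import List, Optional, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two samples at distinct times")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def minutes_until_threshold(
    samples: List[Tuple[float, float]], threshold: float
) -> Optional[float]:
    """Estimate minutes from the last sample until the trend hits threshold.

    samples holds (minute, value) pairs in time order. Returns None when
    the fitted trend is flat or falling, so no crossing is predicted.
    """
    xs = [minute for minute, _ in samples]
    ys = [value for _, value in samples]
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    crossing_minute = (threshold - intercept) / slope
    return max(0.0, crossing_minute - xs[-1])
```

For example, response times of 100, 110, 120 and 130 ms at minutes 0 through 3 give a slope of 10 ms per minute, so a 200 ms limit would be reached about 7 minutes after the last sample.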

2. Synthetic testing

Synthetic tests are automated tests that run periodically in production and simulate user actions and paths. This method relies on scripts based on data from actual past transactions. Synthetic tests identify problems that real users might have with an application, ideally before those users encounter them.
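A synthetic test can be as small as a scripted list of requests and the responses you expect. The following sketch assumes the user path is a sequence of HTTP calls checked by status code; the `Step` fields, the example paths and the `make_http_send` helper are hypothetical, and in practice the steps would be derived from recorded transactions.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Step:
    name: str
    method: str
    path: str
    expected_status: int


@dataclass(frozen=True)
class StepResult:
    step: Step
    actual_status: Optional[int]  # None when no response was received
    passed: bool


def make_http_send(base_url: str, timeout: float = 10.0) -> Callable[[str, str], int]:
    """Build a sender that issues real HTTP requests against base_url."""
    def send(method: str, path: str) -> int:
        request = urllib.request.Request(base_url + path, method=method)
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                return response.status
        except urllib.error.HTTPError as error:
            return error.code  # 4xx/5xx still carry a status code
    return send


def run_synthetic_test(
    steps: List[Step], send: Callable[[str, str], int]
) -> List[StepResult]:
    """Replay the steps in order, stopping at the first failure."""
    results = []
    for step in steps:
        try:
            status: Optional[int] = send(step.method, step.path)
        except OSError:
            status = None  # connection refused, DNS failure, timeout
        passed = status == step.expected_status
        results.append(StepResult(step, status, passed))
        if not passed:
            break  # later steps in a user path depend on earlier ones
    return results
```

A scheduler such as cron could run `run_synthetic_test(steps, make_http_send("https://app.example.com"))` every few minutes and raise an alert whenever the last result has `passed` set to `False`.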

3. Chaos engineering

Pioneered by Netflix with its Chaos Monkey tool, chaos engineering enables a QA team to perform a disruptive test in production that applies unpredictable and random behavior to an application. Chaos engineering aims to determine an application's resiliency -- its ability to withstand extreme events. This practice can involve a variety of tactics, including removing one or more active data centers from the application's pool of resources or blocking the primary domain name system.

The upshot of deliberately running a disruptive test in production is that, while a failure might occur, you will know its cause and can get the application running again quickly. Where possible, test only a portion of the deployment, and leave the application untouched on other nodes.
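The advice to disrupt only part of the deployment can be sketched as a small experiment runner. This is not how Chaos Monkey itself works; it is a minimal illustration that assumes you can supply your own functions to disrupt a node, restore it and check overall application health, all of which are hypothetical placeholders here.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def pick_targets(nodes: Sequence[str], fraction: float, rng: random.Random) -> List[str]:
    """Choose a random subset of nodes, at least one, at most all."""
    if not nodes:
        raise ValueError("no nodes to choose from")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    count = min(len(nodes), max(1, int(len(nodes) * fraction)))
    return rng.sample(list(nodes), count)


def run_experiment(
    nodes: Sequence[str],
    fraction: float,
    disrupt: Callable[[str], None],
    restore: Callable[[str], None],
    is_healthy: Callable[[], bool],
    seed: Optional[int] = None,
) -> Tuple[List[str], bool]:
    """Disrupt a random portion of nodes and report whether the app coped.

    Returns (targets, survived). Every node that was disrupted is
    restored, even if a disruption or the health check raises.
    """
    rng = random.Random(seed)
    targets = pick_targets(nodes, fraction, rng)
    disrupted: List[str] = []
    try:
        for node in targets:
            disrupt(node)
            disrupted.append(node)
        survived = is_healthy()
    finally:
        for node in disrupted:
            restore(node)
    return targets, survived
```

Because the targets are returned alongside the result, a failed health check points directly at the nodes that were taken out, which is what makes the cause known and recovery quick. Passing a `seed` makes a run repeatable when you need to reproduce a failure.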

These activities all help testing add more value to an application. While testing in production is new to many in the field, testers need to lead the way with this burgeoning form of QA.
