A healthy perspective on software architecture scalability
It's easy to overlook architecture scalability and focus instead on application performance and cost. But architectural scalability is the basis for some important benefits.
Software architecture scalability refers to the ability of software applications to perform continuous, calculated reallocation of application resources in response to changing workloads. Without the ability to scale dependably, it can be hard for an architecture and the applications that run on it to maintain acceptable levels of performance, reliability and operational efficiency.
There are many factors that affect how quickly an architecture can respond to change, the efficiency of resource allocation processes and the degree of control developers can maintain, while still enabling applications to run in production. In this article, we examine some basic software architecture scalability concepts, some practical ways to prepare architectures for continuous workload changes and pertinent metrics to track when measuring an architecture's ability to scale effectively.
Basic software scalability strategies
There are multiple ways for an application to scale capacity. One of the most basic scalability strategies is to manually add or remove standalone application instances in response to changing demands. For example, if traffic to your application increases, you could deploy a new instance of the application to handle the additional workload. Alternatively, you could shut down instances when traffic declines to prevent unnecessary and wasteful provisioning of application resources.
A second way to manage scalability is to scale individual components, rather than an entire application. For example, if an application hosted as a set of microservices experiences a spike in authentication requests, it could be possible to add more instances of the authentication microservice as needed, while leaving other services running at the same capacity as before.
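To make the component-level approach concrete, here is a minimal Python sketch that sizes each service independently from its own load. The function names, the service names and the per-instance capacity figures are all hypothetical, chosen only for illustration; a real system would derive them from load testing and monitoring data.

```python
import math


def instances_needed(load_rps: float, capacity_rps: float, minimum: int = 1) -> int:
    """Return how many instances are needed to serve load_rps.

    capacity_rps is the assumed number of requests per second that a
    single instance can handle (a hypothetical, pre-measured figure).
    """
    if capacity_rps <= 0:
        raise ValueError("capacity_rps must be positive")
    # Round up so capacity is never below the load, and keep at least
    # `minimum` instances running even when traffic is idle.
    return max(minimum, math.ceil(load_rps / capacity_rps))


def plan_scaling(load_by_service: dict, capacity_by_service: dict) -> dict:
    """Size every service on its own, rather than the whole application."""
    return {
        name: instances_needed(load, capacity_by_service[name])
        for name, load in load_by_service.items()
    }


if __name__ == "__main__":
    # Hypothetical spike in authentication traffic; catalog traffic is flat.
    load = {"auth": 900, "catalog": 120}
    capacity = {"auth": 200, "catalog": 200}
    print(plan_scaling(load, capacity))
```

In this example only the authentication service gains instances, while the catalog service stays at its minimum. Scaling the whole application as a single unit would instead have to provision for the busiest component.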
These two approaches to managing scalability entail differing degrees of application architecture complexity and granularity of control. For instance, while a service-oriented architecture (SOA) certainly makes it possible to scale application instances and even individual services independently, it doesn't provide the same degree of granularity that a microservices-based architecture would. The tradeoff, of course, is that a microservices-based architecture also brings a level of management complexity that not all organizations working with SOA are prepared to take on.
Characteristics of a properly scaled architecture
A scalable software architecture lays the foundation for an assortment of benefits related to performance, cost, stability and reliability, as well as automation. For one, a highly scalable architecture makes it possible to respond to sudden changes in workload requirements in real time. Scalable architectures also tend to make more efficient use of hosting resources than more architecturally rigid designs do since they eliminate the need to host parts of the application that you don't need. This could lead to lower hosting costs overall, enabling organizations to deliver the same level of performance from their applications with less overhead.
Scalable architectures also tend to instill heightened levels of application stability and reliability. If developers must scale an entire application instance, it may not always be possible to scale quickly enough when faced with a sudden uptick in demand to avoid application crashes. However, if teams can scale up certain parts of an application in seconds rather than minutes, they stand a much better chance of accommodating sudden surges in application load, while maintaining stable operations.
Although scalable architectures alone don't guarantee automated application operations, a final benefit of highly scalable architecture is the potential to simplify process automation. For example, orchestrators are often used to automatically deploy or remove container instances to host microservices in response to changing workloads. It's also possible to configure node autoscaling within a container environment to provide the hosting capacity necessary.
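As a rough illustration of the kind of decision an orchestrator automates, the Python sketch below applies a proportional scaling rule: multiply the current replica count by the ratio of observed to target utilization and round up. Kubernetes documents a rule of this shape for its Horizontal Pod Autoscaler, but the function name, parameters and default bounds here are assumptions made for this example, not any orchestrator's actual API.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Proportional autoscaling rule (illustrative sketch only).

    Utilization values can be in any unit (for example, percent of CPU),
    as long as both use the same one.
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")

    ratio = current_utilization / target_utilization
    # Ignore small deviations so the replica count does not flap.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Four replicas at 90% CPU against a 60% target: scale out.
    print(desired_replicas(4, 90, 60))
    # Four replicas at 15% CPU against a 60% target: scale in.
    print(desired_replicas(4, 15, 60))
```

The tolerance band and the minimum and maximum bounds are the parts most worth tuning. The tolerance keeps small fluctuations from triggering a scaling action, and the upper bound caps hosting costs during a surge.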
How to measure architecture scalability
Unfortunately, scalability can't be measured through the same lens for every architecture. A line-of-business application used internally by a fixed group of employees, for example, should be viewed differently than a public-facing app with a fluctuating user base. These two apps have very different performance and reliability goals, and each must meet those goals at an acceptable cost, which means that their requirements for granular scalability also differ.
With that distinction in mind, there are some basic metrics teams can examine as general indications of an architecture's scalability level:
- Average response time. Although response time is a function of many factors, including the efficiency of the code and orchestration configuration, a high average response time to incoming requests could suggest poor application scalability.
- Overall resource utilization. The share of total available CPU and memory that an application consumes can be an indicator of how efficiently it scales. Ideally, total resource utilization should consistently approach, but never exceed, total available resources.
- Total number of microservices instances. If you use a microservices architecture, track how many instances of each microservice are running at a given time. In a scalable app, the number of instances of each microservice changes regularly in response to shifts in demand for different types of application functionality.
- Application startup time. Tracking the time required for application instances to become available, or for new containers or pods to be ready, can help measure how quickly the application is able to complete scaling operations. The less time it takes to get a new instance, or part thereof, up and running, the faster the application can respond to shifts in demand.
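Several of the metrics above can be summarized from raw samples with very little code. The following Python sketch assumes the samples have already been collected by a monitoring tool; the `ScalabilitySnapshot` class, the `summarize` function and the sample values are hypothetical and serve only to show how the numbers relate to one another.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ScalabilitySnapshot:
    avg_response_ms: float   # average response time, in milliseconds
    cpu_utilization: float   # fraction of available CPU in use (0.0 to 1.0)
    avg_startup_s: float     # average time for a new instance to be ready


def summarize(response_times_ms, cpu_used_cores, cpu_total_cores, startup_times_s):
    """Condense raw monitoring samples into a few scalability indicators."""
    if cpu_total_cores <= 0:
        raise ValueError("cpu_total_cores must be positive")
    return ScalabilitySnapshot(
        avg_response_ms=mean(response_times_ms),
        cpu_utilization=cpu_used_cores / cpu_total_cores,
        avg_startup_s=mean(startup_times_s),
    )


if __name__ == "__main__":
    # Hypothetical samples from one observation window.
    snapshot = summarize(
        response_times_ms=[100, 200, 300],
        cpu_used_cores=6.0,
        cpu_total_cores=8.0,
        startup_times_s=[2.0, 4.0],
    )
    print(snapshot)
```

A single snapshot says little on its own. Comparing snapshots taken before, during and after a load spike gives a better indication of whether the architecture scaled in time.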