Why do Azure Stack appliances have to be certified?

Azure MVP Marius Sandbu gives us a lesson on Azure Stack.

When Microsoft announced Azure Stack back in 2015, there was huge interest in the idea that we would finally be able to take the software and services that power Azure and run them in our own data centers. The idea is attractive: use modern cloud concepts, but keep your data local. It would also give Microsoft a competitive edge in the public cloud space, as they could provide the same ecosystem running on-premises.

Azure Stack requires the use of certified hardware, and you may wonder why. Jack Madden was asking just this question, so I will answer it and share a bit of background.

The road to Azure Stack

Looking at the public cloud space, we see that all of the large providers, such as Google, AWS, and Azure, take a software-defined approach in their data centers, with virtualized networking and storage and no dependency on specific hardware. OpenStack, which bears a resemblance to Azure Stack and has been around for some years already, also allows enterprises to adopt a cloud offering on their own infrastructure, again with a software-defined approach using platforms such as GlusterFS for storage and Neutron for networking.

So that brings us to our question—why can’t we use our own hardware with Azure Stack?

First, some background. Azure Stack was not the first time that Microsoft ventured into bringing a copy of Azure down to your own data center. Their first attempt was in 2014 with CPS (Cloud Platform System), which was based on Azure Pack, System Center 2012 R2, and Windows Server 2012 R2, running on a converged platform provided by Dell. That earlier effort saw limited adoption.

CPS came with the software preinstalled and configured on a rack directly from the factory, which took away the complexity of building a cloud solution on your own. Even so, Microsoft concluded that they needed to take a different approach with Azure Stack.

One of the design principles Microsoft set for Azure Stack was that it would look and behave as much like Azure as possible, to provide consistency: the expected performance of a virtual machine in public Azure should be similar to that of a virtual machine in Azure Stack. That obviously requires a certain class of hardware in order to guarantee performance.

When Microsoft publicly announced in 2017 that Azure Stack would be launching, they introduced the Azure Stack Development Kit (single-node), which could run on any type of hardware as long as it met the minimum requirements. The full Integrated Systems version (for multi-node deployments), however, was announced with a handful of OEMs that would each deliver it on a set of certified hardware.

Technical details

The underlying storage solution for Azure Stack uses S2D (Storage Spaces Direct) in Windows Server. For S2D to scale, it needs low-latency, high-bandwidth connections to ensure proper performance, so Microsoft recommends RDMA (Remote Direct Memory Access)-based networking. RDMA allows data to move directly between the system memory of different hosts, bypassing much of the operating system's network stack. So, Azure Stack appliances must have RDMA-capable network cards and switches in the core.
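Azure Stack configures all of this for you during deployment, but as a rough sketch of what that dependency looks like on plain Windows Server, you could check for RDMA-capable adapters and then enable S2D on an already validated failover cluster with standard PowerShell cmdlets like these (the exact steps Azure Stack runs internally are not exposed):

# List network adapters and show whether RDMA is enabled on each one
Get-NetAdapterRdma | Format-Table Name, InterfaceDescription, Enabled

# Confirm that SMB Direct (SMB 3 over RDMA) sees RDMA-capable interfaces
Get-SmbClientNetworkInterface | Format-Table FriendlyName, RdmaCapable

# On a validated failover cluster, turn on Storage Spaces Direct
Enable-ClusterStorageSpacesDirect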

In addition, Microsoft wanted lifecycle management capabilities as part of Azure Stack, so that any patches and updates to the platform could be handled from the management plane. During an OS update, the Hyper-V hosts have to be restarted and booted on the new image. This process requires PXE boot, where the host boots from the network to have the new operating system installed and configured.
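Azure Stack's lifecycle management handles that network boot internally, but as a generic illustration of what PXE boot requires (not Azure Stack's actual mechanism), pointing DHCP clients at a boot server and boot file on Windows typically looks like the following, with placeholder values:

# Generic PXE wiring: option 66 = boot server host name, option 67 = boot file name
# (the server name and boot file below are placeholders)
Set-DhcpServerv4OptionValue -OptionId 66 -Value "deploy01.contoso.local"
Set-DhcpServerv4OptionValue -OptionId 67 -Value "boot\x64\wdsnbp.com"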

From a security standpoint, it also makes sense to provide Azure Stack as an appliance, as the platform is locked down by default and can only be managed from the management plane. Microsoft also makes use of much of the new security functionality in Windows Server, such as Credential Guard and Device Guard, which require specific hardware features such as TPM 2.0. Overall, the main idea is to have Azure Stack hardened by design and therefore make it easier to meet compliance standards.
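As an illustration of the kind of hardware dependency involved, here is a small PowerShell check (my own sketch, not an Azure Stack tool) for whether virtualization-based security is running and whether a TPM 2.0 chip is present on a host:

# Device Guard / virtualization-based security status (2 = enabled and running)
Get-CimInstance -Namespace root\Microsoft\Windows\DeviceGuard -ClassName Win32_DeviceGuard |
    Select-Object VirtualizationBasedSecurityStatus, SecurityServicesRunning

# TPM details; SpecVersion should report 2.0 for Azure Stack hosts
Get-CimInstance -Namespace root\cimv2\Security\MicrosoftTpm -ClassName Win32_Tpm |
    Select-Object IsEnabled_InitialValue, SpecVersion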

The final piece is that Microsoft wanted to deliver Azure Stack as an appliance, a model that has become more and more popular with the growth of hyper-converged infrastructure vendors.

Conclusions

To answer the original question, you can see that in order to meet this long and stringent list of requirements, Microsoft chose the route of working with partners to produce certified Azure Stack appliances. They wanted to remove the guessing game for customers, and creating prebuilt systems is the way to do that. You can also take a closer look at the validation work that Microsoft does for software-defined data center hardware.

Since the release of Azure Stack, there has been interest from both service providers and enterprise customers looking to use it for disconnected and edge scenarios. Some are also looking into using Azure Stack to build their own services and provide them from within their own country's borders for compliance purposes.

Of course, looking at the market today and seeing more and more vendors delivering prebuilt integrated systems to allow for easier support and improved lifecycle management, I feel Microsoft made the right choice. The remaining question is whether Microsoft will keep its promise to bring more of Azure's functionality to Azure Stack; up to this point, many of the important items on the roadmap have already been delayed. We will keep watching to see what happens.
