Understanding Windows kernel structure and why it matters
The kernel is key to Windows desktops, and when something goes wrong with it, the whole OS can break. Learn how the kernel works so you can fix issues when they arise.
For a computer OS like Windows, the kernel is the essential foundation on which the OS rests, so it's critical for Windows administrators to understand its structure to inform their work.
The kernel is almost always running and it resides almost exclusively in protected memory. That's because the kernel coordinates between the OS and underlying computer hardware while scheduling and handling tasks, assigning and managing processes, and orchestrating system resources such as memory, storage, communications, device access and more.
Why is the kernel important for a computer?
The kernel is such an important part of OSes -- including Windows -- that if it crashes or throws a serious error, it will prevent a PC from operating. Microsoft calls such errors "stop errors" because they force Windows to stop running and reboot. They're also known as blue screen or blue screen of death errors because they display white text against a solid blue background.
When CrowdStrike pushed a faulty update to its Falcon cybersecurity software in July 2024, the update triggered a bug that forced Windows to fail. Why? Because the software's driver runs in kernel mode, where a sufficiently serious error brings Windows to a halt. Unfortunately, each time Windows started, the same error recurred as the driver loaded and hit the same bug, leaving Windows unusable and inaccessible.
The CrowdStrike debacle left up to 8.5 million Windows PCs inoperable for anywhere from hours to days. Repairing affected devices required booting such PCs from alternate media, uninstalling the offending update, then restarting the PC. This proved particularly problematic for PCs in remote locations or in places where trained technicians couldn't easily access them. It also illustrates that ordinary users are most likely to encounter kernel errors as a result of an update or installation for software that partially runs in kernel mode.
Figure 1. A chart showing the levels of privilege within an OS, with the kernel at the center of the rings, indicating the most privilege.
Understanding Windows kernel architecture
Computer scientists like visual models to explain how things work. Intel's x86 processor architecture supports a well-known ring model of privilege levels -- it organizes code from the kernel outward in rings numbered 0 through 3, as shown in Figure 1.
In this model, levels of privilege and access decrease as one moves from the center out to the edge. Thus, the kernel in ring 0 has unrestricted access to everything, including CPU, memory, storage, devices and services. In the classic model, ring 1 hosts device drivers, which must interact with and through the kernel to provide device inputs and outputs, and ring 2 hosts system services that mediate between applications in ring 3 and device drivers in ring 1. Users and applications operate in the outermost ring -- often called user mode -- and must request any system resources they consume. In practice, Windows uses only two of these levels: kernel mode (ring 0) and user mode (ring 3). Kernel-mode drivers, such as the one in the CrowdStrike example, therefore run with the same privileges as the kernel itself, so bugs or incomplete testing can wreak havoc.
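The privilege rule in the ring model can be sketched in a few lines of Python. This is a toy illustration only, not how the CPU or Windows actually enforces privilege. The names `Ring`, `can_access` and `request_resource` are hypothetical and exist only for this example.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Privilege rings from the model above; lower numbers mean more privilege."""
    KERNEL = 0
    DRIVERS = 1
    SERVICES = 2
    USER = 3


def can_access(caller: Ring, required: Ring) -> bool:
    """Code may touch a resource directly only if its ring is at least as privileged."""
    return caller <= required


def request_resource(caller: Ring, required: Ring) -> str:
    """Hypothetical helper: less-privileged callers must go through the kernel."""
    if can_access(caller, required):
        return "direct access"
    return "request via kernel"


if __name__ == "__main__":
    print(request_resource(Ring.KERNEL, Ring.DRIVERS))  # kernel reaches everything
    print(request_resource(Ring.USER, Ring.KERNEL))     # user mode must ask
```

The single comparison in `can_access` is the whole idea: code in ring 0 passes every check, which is why a bug in kernel-mode code has nothing above it to contain the damage.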
What does the Windows kernel actually do?
At a minimum, the Windows kernel is responsible for three important roles or tasks:
- Provides interfaces through which users and applications can interact with the OS (Figure 2).
- Launches and manages applications, and coordinates sharing and scheduling when multiple applications run simultaneously.
- Manages underlying system hardware devices and system services of all kinds.
Figure 2. A chart that explains how kernels interact with apps and computing resources.
To make all these things possible, the kernel handles a slew of computing tasks that include at least the following:
- Loading and managing OS components such as device drivers, system services and file systems, and handling user I/O.
- Organizing processes needed to execute tasks for applications as they run, which may include multiple individual threads for sub-tasks within any given process.
- Scheduling application threads to receive time slices on the CPU, and supervising and policing their interactions with the kernel as they occur.
- Assigning non-protected memory for each application process to use.
- Handling conflicts and errors related to memory assignments and use.
- Managing and organizing the use of resources such as CPU, cache, file system operation, network access and so forth.
- Managing and handling I/O devices, such as keyboards, mice, storage volumes, USB ports, network adapters, displays and cameras. Each kind of device has its own Windows subsystem, which brings together drivers, services, interfaces and related APIs.
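The two memory-related tasks in the list above, assigning memory to each process and handling conflicts over it, can be illustrated with a deliberately simplified Python sketch. It assumes one contiguous region per process and a flat address range. Windows itself gives each process its own virtual address space, which this sketch does not model. The names `ToyMemoryManager` and `AccessViolation` are hypothetical.

```python
class AccessViolation(Exception):
    """Raised when a process touches memory it does not own."""


class ToyMemoryManager:
    """Hands out non-overlapping address ranges and polices access to them."""

    def __init__(self, size: int) -> None:
        self.size = size          # total addresses available
        self.next_free = 0        # first address not yet assigned
        self.regions: dict[str, range] = {}

    def allocate(self, process: str, length: int) -> range:
        """Assign the next free block to a process (one region per process)."""
        if length <= 0:
            raise ValueError("length must be positive")
        if process in self.regions:
            raise ValueError(f"{process} already has a region")
        if self.next_free + length > self.size:
            raise MemoryError("out of memory")
        region = range(self.next_free, self.next_free + length)
        self.regions[process] = region
        self.next_free += length
        return region

    def check_access(self, process: str, address: int) -> None:
        """Raise AccessViolation unless the address lies in the caller's region."""
        region = self.regions.get(process)
        if region is None or address not in region:
            raise AccessViolation(f"{process} may not access address {address}")


if __name__ == "__main__":
    manager = ToyMemoryManager(size=100)
    manager.allocate("app1", 40)
    manager.allocate("app2", 40)
    manager.check_access("app1", 10)      # allowed: inside app1's region
    try:
        manager.check_access("app1", 50)  # belongs to app2
    except AccessViolation as err:
        print(err)
```

In user mode, a violation like this ends only the offending process. The same kind of invalid access in kernel mode is what produces a stop error.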
The kernel's primary function is to schedule and manage things: processes, resources, devices and everything else inside the PC. Its job is to juggle OS components, services and applications, along with the processes and threads that implement or use them. Figure 3 shows a snippet from the Performance tab in the Windows 11 Task Manager with almost 700 processes and over 11,000 threads, all simultaneously active.
Figure 3. A screenshot of the active processes of a device.
Each CPU core can execute only one thread at a time, so the kernel takes responsibility for organizing, scheduling and executing individual processes and threads. It gives each active item a small time slice and lets it do some work before moving on to the next item in the job queue. This cycle repeats constantly: old processes complete and leave the job queue even as new processes start up and enter it.
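The time-slicing loop described above can be sketched as a simple round-robin simulation in Python. This is a minimal illustration that assumes a single core and equal priority for every item. The real Windows scheduler is preemptive and priority-based, which this sketch ignores. The function name `round_robin` and the thread names are hypothetical.

```python
from collections import deque


def round_robin(jobs: dict[str, int], time_slice: int) -> list[str]:
    """Simulate round-robin scheduling on one core.

    jobs maps a hypothetical thread name to the units of work it still needs.
    Returns the order in which threads were given the CPU.
    """
    queue = deque(jobs.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        timeline.append(name)                # this thread runs for one slice
        remaining -= time_slice
        if remaining > 0:
            queue.append((name, remaining))  # unfinished: back of the queue
        # finished threads simply leave the queue
    return timeline


if __name__ == "__main__":
    # Thread A needs 3 units of work, B needs 1 and C needs 2.
    print(round_robin({"A": 3, "B": 1, "C": 2}, time_slice=1))
    # prints ['A', 'B', 'C', 'A', 'C', 'A']
```

Thread B finishes after a single slice and drops out, while A and C keep cycling through the queue until their work is done. Multiply this by hundreds of processes and thousands of threads and the need for a central scheduler becomes clear.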
How to address issues with the Windows kernel
As the CrowdStrike incident clearly illustrates, admins must get involved when kernel-mode errors occur. Ordinary users won't know how to get recovery underway and most likely won't have the tools or privileges needed to complete the repairs.
Microsoft announced a new Windows Resiliency Initiative at its Ignite 2024 conference. A key part of this announcement was Quick Machine Recovery, a tool designed to simplify and speed recovery in the face of future kernel errors.
The tool will help IT administrators fix, with greater speed and precision, complicated OS issues that could prevent booting or other basic functions. It will be available for public preview in 2025.
Ed Tittel is a 30-plus year IT veteran who has worked as a developer, networking consultant, technical trainer and writer.