runbook
What is a runbook?
A runbook is a set of standardized written procedures for completing repetitive information technology (IT) processes within a company.
Runbooks are part of IT Infrastructure Library (ITIL) protocols, which incorporate information from IT processes, such as knowledge management and problem management.
What is the purpose of a runbook?
Runbooks provide IT teams with contextual documents that increase consistency and efficiency through standardization. They act as a walkthrough or step-by-step guide for both new and experienced IT professionals within the team.
They are typically used for optimizing routine IT operations and troubleshooting. Runbooks also act as documentation in incident management and reduce system downtime.
What are the different types of runbooks?
There are three types of runbooks:
- Manual. Containing step-by-step instructions to be followed by an operator.
- Semiautomatic. Composed of a combination of manual and automated steps.
- Automatic. Requiring no manual intervention.
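As an illustration of the three types above, the following minimal sketch models a runbook as a list of steps and classifies it by how many of those steps are automated. The `Step` class and `classify` function are hypothetical names chosen for this example, not part of any particular runbook tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    description: str
    # An automated step carries a callable; a manual step leaves this as None.
    action: Optional[Callable[[], None]] = None


def classify(steps: List[Step]) -> str:
    """Return 'manual', 'semiautomatic' or 'automatic' for a list of steps."""
    automated = sum(1 for step in steps if step.action is not None)
    if automated == 0:
        return "manual"
    if automated == len(steps):
        return "automatic"
    return "semiautomatic"


# Hypothetical example: one step for an operator, one step run by a script.
backup = [
    Step("Confirm the maintenance window with the on-call lead"),
    Step("Rotate the log directory", action=lambda: None),
]
print(classify(backup))  # semiautomatic
```

In this sketch, a runbook counts as semiautomatic as soon as it mixes at least one manual step with at least one automated step.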
Depending on their functions, runbooks can also be categorized as:
- General runbooks. For routine IT department activities, such as reviewing audit logs, performing daily backups or monitoring system performance.
- Specialized runbooks. For more complex operations processes, such as disaster recovery (DR), network outages and DevOps.
Runbooks vs. playbooks
Although the terms are sometimes used interchangeably, there are significant differences between runbooks and playbooks.
Runbooks tend to focus on single-process workflows, whereas playbooks usually deal with overarching responses to more significant issues. A playbook has a broader scope and may incorporate multiple runbooks.
When should you create a runbook?
Organizations can create detailed runbooks once their IT team has established effective operational tasks. Runbooks can also be written proactively, in anticipation of potential IT system failures, or after analyzing incident reports and post-mortems.
System administrators should regularly maintain and update runbooks once they have been created.
What elements are included in an effective runbook?
A successful runbook will have the following five attributes:
- Actionable. It documents what needs to be done during an incident.
- Accessible. Team members know where to find it.
- Accurate. It contains up-to-date, error-free information.
- Authoritative. Only one runbook exists for each IT process.
- Adaptable. It is easy to modify to prevent future redundancy.
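The "authoritative" attribute can be checked mechanically if the team keeps an index of its runbooks. The sketch below assumes a hypothetical index of (runbook name, process) pairs and reports any process that is covered by more than one runbook; the function name and sample data are illustrative only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def duplicated_processes(index: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each process that has more than one runbook to those runbooks."""
    by_process: Dict[str, List[str]] = defaultdict(list)
    for runbook_name, process in index:
        by_process[process].append(runbook_name)
    return {p: names for p, names in by_process.items() if len(names) > 1}


# Hypothetical index: two runbooks claim the same process.
index = [
    ("backup-v1", "daily backup"),
    ("backup-new", "daily backup"),
    ("dns-failover", "DNS failover"),
]
print(duplicated_processes(index))
# {'daily backup': ['backup-v1', 'backup-new']}
```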
A comprehensive runbook typically includes all the details required to operate the documented systems efficiently. A sample runbook template is given below:
- Overview. Give an overview of the process or service that is documented.
- Authorization. Identify key personnel or roles who can access the runbook.
- Process steps. Enter information about all required protocols, including installation and deployment.
- Monitoring system information. Outline all possible monitoring system alerts and step-by-step instructions for responding to them.
- DR plans. Include all service-level agreements, escalation protocols, and required incident response reporting and communications.
- Technical documentation. Reference or include any configuration, metrics or other critical system information.
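One way to keep drafts consistent with the template above is to represent its six sections as a data structure and flag the ones that are still empty. This is a minimal sketch; the `RunbookTemplate` class, its field names and the `missing_sections` helper are assumptions made for illustration, not a standard schema.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class RunbookTemplate:
    # One field per section of the sample template.
    overview: str = ""
    authorization: List[str] = field(default_factory=list)
    process_steps: List[str] = field(default_factory=list)
    monitoring: List[str] = field(default_factory=list)
    dr_plans: List[str] = field(default_factory=list)
    technical_docs: List[str] = field(default_factory=list)


def missing_sections(runbook: RunbookTemplate) -> List[str]:
    """Return the names of sections that are still empty."""
    return [f.name for f in fields(runbook) if not getattr(runbook, f.name)]


# Hypothetical draft with only two sections filled in.
draft = RunbookTemplate(
    overview="Nightly database backup",
    process_steps=["Stop writes", "Snapshot the volume"],
)
print(missing_sections(draft))
# ['authorization', 'monitoring', 'dr_plans', 'technical_docs']
```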
How to write an effective runbook
Writing an effective runbook requires prior research and analysis. The steps typically include the following:
- Planning. This includes prioritizing which processes need to be documented and choosing the runbook templates and style guides that will be used.
- Research. Talk to subject matter experts, and identify critical steps. Screenshots, diagrams and flow charts can assist with the documentation.
- Writing. Runbooks may be created manually or automatically. Runbook automation software limits human intervention to predetermined points in the creation process.
- Testing. Runbooks must be thoroughly tested by different team members. Missing, extraneous or unclear content is edited and corrected.
- Updating. A system is established to ensure documented processes are updated regularly.
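The updating step can be supported by a simple staleness check. The sketch below assumes each runbook records the date it was last reviewed and that the team has agreed on a maximum age; the 90-day default, the function name and the sample data are all hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List


def overdue_for_review(
    last_reviewed: Dict[str, date], today: date, max_age_days: int = 90
) -> List[str]:
    """Return, in sorted order, runbooks last reviewed more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, reviewed in last_reviewed.items() if reviewed < cutoff)


# Hypothetical review log.
reviews = {
    "daily-backup": date(2024, 5, 20),
    "dns-failover": date(2023, 11, 2),
}
print(overdue_for_review(reviews, today=date(2024, 6, 1)))
# ['dns-failover']
```

Passing `today` explicitly, rather than reading the system clock inside the function, keeps the check easy to test and to run against historical data.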