Decision trees
Decision trees add branching logic to your playbooks. Instead of a linear list of steps, a decision tree lets responders follow different paths based on the situation they are facing.
Why decision trees
Not every incident follows the same path. A database outage might require different actions depending on whether it is caused by a connection pool issue, disk space exhaustion, or a query performance problem. Decision trees guide responders to the right course of action without requiring them to figure out the branching logic on their own.
How they work
A decision tree consists of conditions and branches:
- Condition -- a question or check that determines which branch to follow.
- Branch -- the path taken for a given answer to the condition. It leads either to another condition or to a set of steps.
Is the database responding to health checks?
|
+-- YES --> Is query latency below 100ms?
|           |
|           +-- YES --> Database is healthy. Monitor for 30 minutes.
|           |
|           +-- NO --> Investigate slow queries. Check for long-running transactions.
|
+-- NO --> Is the database process running?
           |
           +-- YES --> Check connection pool settings. Restart the pool.
           |
           +-- NO --> Restart the database process. Escalate to DBA team.
Creating a decision tree
1. Open your playbook in the editor.
2. Add a Decision Tree block.
3. Define the root condition (the first question).
4. For each possible answer, define the next condition or a set of action steps.
5. Continue branching until every path leads to a resolution or an escalation point.
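To make the structure concrete, the steps above can be sketched as a small data model: a root condition, one branch per answer, and leaves that hold action steps. This is only an illustration. The names `Condition`, `Action`, and `walk` are hypothetical and are not the editor's API or storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Action:
    """Leaf node: the steps a responder carries out (hypothetical type)."""
    steps: List[str]


@dataclass
class Condition:
    """Internal node: a question plus one branch per possible answer."""
    question: str
    branches: Dict[str, Union["Condition", Action]] = field(default_factory=dict)


def walk(node, answers):
    """Follow the given answers from the root and return the steps reached."""
    answers = iter(answers)
    while isinstance(node, Condition):
        node = node.branches[next(answers)]
    return node.steps


# The database tree from "How they work", built root-first.
tree = Condition(
    "Is the database responding to health checks?",
    {
        "YES": Condition(
            "Is query latency below 100ms?",
            {
                "YES": Action(["Database is healthy.", "Monitor for 30 minutes."]),
                "NO": Action(["Investigate slow queries.",
                              "Check for long-running transactions."]),
            },
        ),
        "NO": Condition(
            "Is the database process running?",
            {
                "YES": Action(["Check connection pool settings.", "Restart the pool."]),
                "NO": Action(["Restart the database process.", "Escalate to DBA team."]),
            },
        ),
    },
)

# Health checks fail, but the process is running.
print(walk(tree, ["NO", "YES"]))
```

Every path in this sketch ends in an `Action`, which mirrors the last step: branching continues until each path reaches a resolution or an escalation point.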
Condition types
| Condition type | Description |
|---|---|
| Text match | Check if a value equals a specific string |
| Numeric threshold | Check if a metric is above or below a threshold |
| Status check | Check the status of a service or component |
| Manual choice | The responder selects the answer based on observation |
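As a rough illustration of how the four condition types differ, each one can be thought of as a small check that produces an answer. The function names, parameters, and the `"healthy"` status value below are assumptions made for this sketch, not part of the product.

```python
import operator

# Hypothetical evaluators, one per condition type in the table above.

def text_match(value: str, expected: str) -> bool:
    """Text match: true when the value equals the expected string."""
    return value == expected


_COMPARATORS = {"above": operator.gt, "below": operator.lt}


def numeric_threshold(metric: float, threshold: float, direction: str) -> bool:
    """Numeric threshold: true when the metric is above or below the threshold."""
    return _COMPARATORS[direction](metric, threshold)


def status_check(statuses: dict, component: str, expected: str = "healthy") -> bool:
    """Status check: true when the component reports the expected status."""
    return statuses.get(component) == expected


def manual_choice(selected: str, options: list) -> str:
    """Manual choice: accept the responder's answer if it is a valid option."""
    if selected not in options:
        raise ValueError(f"{selected!r} is not one of {options}")
    return selected


# "Is query latency below 100ms?" with an observed latency of 85 ms.
print(numeric_threshold(85.0, 100.0, "below"))
# "Is the database responding to health checks?" with a reported status.
print(status_check({"database": "down"}, "database"))
```

The first three types can be evaluated from data, while a manual choice depends on what the responder observes, so the sketch only validates the selected answer.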
Best practices
- Keep decision trees as shallow as possible. Three to four levels of branching is usually sufficient.
- Always include an escalation path for scenarios that the tree does not cover.
- Test decision trees during mock incidents to identify gaps.
- Use clear, unambiguous conditions. "Is the service returning 5xx errors?" is better than "Is something wrong?"
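Some of these practices can be checked mechanically before a mock incident. The sketch below assumes a tree exported as nested dictionaries with `question`, `branches`, and `steps` keys. That layout, the helper names, and the deploy question are hypothetical and used only for illustration.

```python
MAX_DEPTH = 4  # upper end of the "three to four levels" guideline


def depth(node: dict) -> int:
    """Number of condition levels on the longest path below this node."""
    branches = node.get("branches")
    if not branches:
        return 0
    return 1 + max(depth(child) for child in branches.values())


def dead_ends(node: dict, path=()) -> list:
    """Return answer paths that end in a leaf with no steps defined."""
    branches = node.get("branches")
    if not branches:
        return [] if node.get("steps") else [path]
    found = []
    for answer, child in branches.items():
        found.extend(dead_ends(child, path + (answer,)))
    return found


tree = {
    "question": "Is the service returning 5xx errors?",
    "branches": {
        "YES": {
            "question": "Did a deploy finish in the last hour?",
            "branches": {
                "YES": {"steps": ["Roll back the deploy."]},
                "NO": {"steps": []},  # gap: no action or escalation defined
            },
        },
        "NO": {"steps": ["Escalate to the on-call lead."]},
    },
}

print(depth(tree) <= MAX_DEPTH)  # True: two levels of branching
print(dead_ends(tree))           # [('YES', 'NO')]
```

A check like this only finds structural gaps, such as a path that stops without any steps. Whether the conditions are clear and unambiguous still has to be judged by running the tree in a mock incident.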
Example: security incident decision tree
Was customer data exposed?
|
+-- YES --> Notify legal team immediately.
|           Initiate incident response plan.
|           Document all findings.
|
+-- NO --> Was the attack successful?
           |
           +-- YES --> Patch the vulnerability.
           |           Review access logs for lateral movement.
           |           Update firewall rules.
           |
           +-- NO --> Document the attempt.
                      Block the source IP.
                      Close the incident.