---
layout: markdown_page
title: "Monitor workflow - Improve"
---

- TOC
{:toc}

# Improve

This page describes the GitLab **Improve** workflow vision as part of our [Monitor](https://about.gitlab.com/handbook/engineering/development/ops/monitor/) stage.

## Why Improve?

Improve is the process of reviewing all events that happened around (before, during, and after) an incident and identifying how to change processes, system behavior, and human behavior to prevent future incidents. Conducting an effective Post Incident Review requires preparation.

## User Journey

### Preparation

Preparing for a Post Incident Review begins during the [Triage](https://about.gitlab.com/direction/monitor/workflows/triage/) process by documenting events and actions as they take place. This makes Post Incident Reviews much more effective. In addition to capturing events and actions taken by team members, it is helpful to collect metric visualizations that show when and how a system changed at the time of the incident.

### Post Incident Review

Effective Post Incident Reviews are [blameless](https://opensource.com/article/19/4/psychology-behind-blameless-retrospective). It should be stated at the beginning of the review that everyone involved acted with good intent and made the best decision they could with the information they had. Setting this tone at the beginning of a review helps the team discover all system flaws and potential improvements. The review walks through the event timeline of the incident, asking why each step happened. The [Five whys method](https://en.wikipedia.org/wiki/Five_whys) is an iterative interrogative technique used by the GitLab Infrastructure Team to uncover the true root cause.

### Action Items

Once the root causes of all critical events that happened during the incident have been uncovered and understood, the team brainstorms improvements to change or prevent those events. The goal is to prevent the incident from happening again, or to prepare better response plans in case a similar incident occurs. Action items should be written down, prioritized, and scheduled. Every action item should be assigned a DRI (directly responsible individual) to ensure completion.

### Follow-up

Action items are no good if the team does not follow up with the DRI to ask about progress and completion. Follow-up may occur during daily or weekly stand-ups.

## Today

### What's possible

We have not yet enabled the entire workflow detailed above. However, there are a few features you can take advantage of today to simplify your **Improve** processes:

* Create an [Issue Template](https://docs.gitlab.com/ee/user/project/description_templates.html#using-the-templates) called **Post Incident Review** to make it quick to set up an issue for discussion following a fire-fight. You can easily link this to your incident issue using [Related Issues](https://docs.gitlab.com/ee/user/project/issues/related_issues.html).
* Leverage [GitLab Flavored Markdown](https://docs.gitlab.com/ee/user/markdown.html#gitlab-flavored-markdown-gfm) to add more detail to the Post Incident Review issue. Add a [flowchart in markdown](https://docs.gitlab.com/ee/user/markdown.html#diagrams-and-flowcharts) using [Mermaid](https://mermaidjs.github.io/#/) to visualize the timeline, and use [Task Lists](https://docs.gitlab.com/ee/user/markdown.html#task-lists) to track action items as DRIs check them off the list.
* Track completion of action items by [linking related Merge Requests](https://docs.gitlab.com/ee/user/project/issues/crosslinking_issues.html#from-merge-requests) to the Post Incident Review issue, making it simple to review progress.

### Maturity

This workflow is currently at the **Planned** stage. Workflows in the Operations section are graded on the same [maturity scale](https://about.gitlab.com/direction/maturity/) as categories.

## What's next

We plan to empower teams with a guided Post Incident Review experience that makes it simple to feed system and process improvements back into the Plan stage, completing the DevOps loop. Work supporting this vision is captured in this [epic](https://gitlab.com/groups/gitlab-org/-/epics/1973).
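
### Example: Post Incident Review template

Until the guided experience exists, the features listed under **What's possible** can be combined into a single description template. The following is a minimal sketch, not an official GitLab template. It assumes the file is saved in your project as `.gitlab/issue_templates/Post Incident Review.md`. The section names, the timeline steps, and the `@username` and date placeholders are hypothetical and should be adapted to your team.

````markdown
## Summary

<!-- One or two sentences describing what happened and the impact. -->

Related incident: <!-- link to the incident issue -->

## Timeline

<!-- Hypothetical steps; replace with the events documented during Triage. -->

```mermaid
graph TD
  A[Alert fired] --> B[Incident declared]
  B --> C[Mitigation applied]
  C --> D[Incident resolved]
```

## Five whys

1. Why did the incident occur?
2. Why did that happen?
3. Why did that happen?
4. Why did that happen?
5. Why did that happen?

## Action items

<!-- One task per action item, each with a DRI. -->

- [ ] Describe the improvement, DRI: @username, due: YYYY-MM-DD
- [ ] Describe the improvement, DRI: @username, due: YYYY-MM-DD
````

Keeping the timeline, the five whys, and the task list in one issue means the DRIs can check off action items where the discussion happened, and merge requests that mention the issue appear alongside it.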