Commit 6ca682c

cperrykestherk15
andauthored
INA-7891: Extract off-topic sections from "Describe incident" documentation (#30917)
* Create new Response Team page in Incident Management docs
* Update table of contents
* Retitle new page
* Update content/en/service_management/incident_management/declare.md
  Co-authored-by: Esther Kim <[email protected]>
* Update content/en/service_management/incident_management/response_team.md
  Co-authored-by: Esther Kim <[email protected]>
* Update content/en/service_management/incident_management/response_team.md
  Co-authored-by: Esther Kim <[email protected]>
* Update content/en/service_management/incident_management/response_team.md
  Co-authored-by: Esther Kim <[email protected]>
* Update content/en/service_management/incident_management/response_team.md
  Co-authored-by: Esther Kim <[email protected]>
* Update content/en/service_management/incident_management/response_team.md
  Co-authored-by: Esther Kim <[email protected]>
* Update content/en/service_management/incident_management/response_team.md
  Co-authored-by: Esther Kim <[email protected]>
* Update content/en/service_management/incident_management/describe.md
  Co-authored-by: Esther Kim <[email protected]>

---------

Co-authored-by: Esther Kim <[email protected]>
1 parent fe568d8 commit 6ca682c

4 files changed, +77 -72 lines changed

config/_default/menus/main.en.yaml

Lines changed: 11 additions & 6 deletions
@@ -2418,16 +2418,21 @@ menu:
       parent: incidents
       identifier: incident_describe
       weight: 2
+    - name: Response Team
+      url: service_management/incident_management/response_team
+      parent: incidents
+      identifier: response_team
+      weight: 3
     - name: Notification
       url: service_management/incident_management/notification
       parent: incidents
       identifier: incident_notification
-      weight: 3
+      weight: 4
     - name: Investigate an Incident
       url: service_management/incident_management/investigate
       parent: incidents
       identifier: incident_investigate
-      weight: 4
+      weight: 5
     - name: Timeline
       url: service_management/incident_management/investigate/timeline
       parent: incident_investigate
@@ -2437,7 +2442,7 @@ menu:
       url: service_management/incident_management/incident_settings
       parent: incidents
       identifier: incidents_settings
-      weight: 5
+      weight: 6
     - name: Information
       url: service_management/incident_management/incident_settings/information
       parent: incidents_settings
@@ -2472,17 +2477,17 @@ menu:
       url: service_management/incident_management/analytics
       parent: incidents
       identifier: analytics
-      weight: 6
+      weight: 7
     - name: Datadog Clipboard
       url: service_management/incident_management/datadog_clipboard
       parent: incidents
       identifier: incidents_clipboard
-      weight: 7
+      weight: 8
     - name: Guides
       url: service_management/incident_management/guides
       parent: incidents
       identifier: incidents_guide
-      weight: 10
+      weight: 9
     - name: On-Call
       url: service_management/on-call/
       pre: on-call
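
Review note: the renumbering in this file keeps the `weight` values of the entries under `parent: incidents` unique once Response Team is inserted at 3. A quick sanity check for that kind of change is sketched below in Python. The identifiers and weights are copied from the new side of this diff. The function name `weight_problems` is hypothetical, and treating a gap between weights as a problem is an assumption made for illustration only: Hugo orders menu entries by weight and does not require consecutive values.

```python
# Weights of the entries with `parent: incidents`, copied from the new side of the diff.
# Entries not shown in the diff (for example, anything with weight 1) are omitted.
new_weights = {
    "incident_describe": 2,
    "response_team": 3,
    "incident_notification": 4,
    "incident_investigate": 5,
    "incidents_settings": 6,
    "analytics": 7,
    "incidents_clipboard": 8,
    "incidents_guide": 9,
}


def weight_problems(weights):
    """Return human-readable problems: duplicate weights and gaps in the sequence.

    Hypothetical helper for illustration; gaps are flagged only as a style
    preference, not because the site generator rejects them.
    """
    problems = []
    seen = {}  # weight -> first identifier that used it
    for identifier, weight in weights.items():
        if weight in seen:
            problems.append(
                f"duplicate weight {weight}: {seen[weight]} and {identifier}"
            )
        else:
            seen[weight] = identifier

    ordered = sorted(seen)
    for lower, upper in zip(ordered, ordered[1:]):
        if upper - lower > 1:
            problems.append(f"gap between weights {lower} and {upper}")
    return problems


print(weight_problems(new_weights))
```

If Response Team had been added at 3 without renumbering the entries after it, the same function would report the collision with `incident_notification`.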

content/en/service_management/incident_management/declare.md

Lines changed: 13 additions & 0 deletions
@@ -11,6 +11,19 @@ In the Datadog paradigm, any of the following are appropriate situations for dec
 
 You can declare an incident from multiple places within the Datadog platform, such as a graph widget on a dashboard, the Incidents UI, or any alert reporting into Datadog.
 
+## Declaration modal
+
+When you declare an incident, a declaration modal appears. This modal has several core elements:
+
+| Incident elements | Description |
+| ------------------ | ----------- |
+| Title | (Required) A descriptive title for the incident. |
+| Severity Level | (Required) By default, severity ranges from SEV-1 (most severe) to SEV-5 (least severe). You can customize the number of severities and their descriptions in Incident Management settings. |
+| Incident Commander | The person assigned to lead the incident response. |
+
+You can configure [Incident Management Settings][2] to include more fields in the incident declaration modal or require certain fields.
+
+
 ## From the Incident page
 
 In the [Datadog UI][1], click **Declare Incident** to create an incident.

content/en/service_management/incident_management/describe.md

Lines changed: 5 additions & 66 deletions
@@ -16,22 +16,13 @@ No matter where you [declare an incident][1], it’s important to describe it as
 - Why it happened
 - Attributes associated with the incident
 
-## Incident elements
-
-When you declare an incident, an incident modal comes up. This modal has several core elements:
-
-| Incident elements | Description |
-| ------------------ | ----------- |
-| Title | (Required) Give your incident a descriptive title. |
-| Severity Level | (Required) Denotes the severity of your incident, from SEV-1 (most severe) to SEV-5 (least severe). If your incident is under initial investigation, and you do not know the severity yet, select UNKNOWN. <br> **Note**: You can customize the description of each severity level to fit the requirements of your organization.|
-| Incident Commander | (Required) This person is assigned as the leader of the incident investigation. |
-| Attributes (Teams) | Assign the appropriate group of users to an incident using [Datadog Teams][2]. Members of the assigned team are automatically invited to the Slack channels. |
-
 ## Incident details
 
-An incident's status and details can be updated on the incident's Overview tab. Within an incident, fill out the Overview tab with relevant details—including incident description, customer impact, affected services, incident responders, root cause, detection method, and severity—to give your teams all the information they need to investigate and resolve an incident.
+An incident's status and details can be updated on the incident's Overview tab. Within an incident, fill out the Overview tab with relevant details—including incident summary, customer impact, affected services, incident responders, root cause, detection method, and severity—to give your teams all the information they need to investigate and resolve an incident.
 
-Update the impact section to specify customer impact, the start and end times of the impact, and whether the incident is still active. This section also requires a description of the scope of impact to be completed.
+Update the impact section to record customer impacts, including their start and end times. These impacts influence incident analytics to help your organization analyze the impact of incidents on your business.
+
+You can define your own custom incident fields on the [Incident Settings Property Fields][2] page.
 
 ### Status levels
 
@@ -42,61 +33,9 @@ The default statuses are **Active**, **Stable**, and **Resolved**. You can add t
 * Resolved: Incident no longer affecting others and investigations complete.
 * Completed: All remediation complete.
 
-As the status of an incident changes, Datadog tracks time-to-resolution as follows:
-
-| Status Transition | Resolved Timestamp |
-| ------------------ | -----------|
-| `Active` to `Resolved`, `Active` to `Completed` | Current time |
-| `Active` to `Resolved` to `Completed`, `Active` to `Completed` to `Resolved` | Unchanged |
-| `Active` to `Completed` to `Active` to `Resolved` | Overridden on last transition |
-
-### Response team
-
-Form your response team by adding other users and assigning them roles to carry out in the process of resolving an incident. There are two default responder types provided by Datadog.
-
-<div class="alert alert-info">Responder roles are unrelated to the <a href="/account_management/rbac/?tab=datadogapplication">Role Based Access Control (RBAC)</a> system. The Responder Type in Incident Management does not change a user's permissions in any capacity. </a></div>
-
-Incident Commander
-: The individual responsible for leading the response team
-
-Responder
-: An individual that actively contributes to investigating an incident and resolving its underlying issue
-
-*Responders* are notified through the email associated with their Datadog account. Anyone is able to change the role of a responder, but to remove an individual from an incident's Response Team you must have the general `Responder` role assigned and have no activity in the incident. If there is already an `Incident Commander` assigned to an incident, assigning another individual as the `Incident Commander` transfers that role over to them. The previous `Incident Commander` is reassigned the general `Responder` role. A similar reassignment happens whenever you reassign one of your custom one person roles.
-
-The **Response Team** tab saves the date and time when an individual was originally added to the response team of an incident, as well as the date and time when they last contributed something to the Incident Timeline.
-
-#### Custom responder role
-
-You can create custom responder roles in the [Incident Settings for Responder Types][3]. This allows you to create new responder types with custom names and descriptions. It also allows you to choose if a responder type should be a one person role or a multi person role.
-
-## Attributes
-
-Attributes are the metadata and context that you can define for each incident. These fields are [key:value metric tags][4]. Add these field keys on the [Incident Settings Property Fields][5] page. The values you add are then available when you are assessing the impact of an incident on the Overview tab. The following fields are available for assessment in all incidents:
-
-Root Cause
-: This text field allows you to enter the description of the root cause, triggers, and contributing factors of the incident.
-
-Detection Method
-: Specify how the incident was detected with these default options: customer, employee, monitor, other, or unknown.
-
-Services
-: If you have APM configured, your APM services are available for incident assessment. To learn more about configuring your services in APM, see [the docs][5].<br><ul><li>If you are not using Datadog APM, you can upload service names as a CSV. Any values uploaded through CSV are only available within Incident Management for incident assessment purposes.</li><li>Datadog deduplicates service names case-insensitively, so if you use "My Service" or "my service", only the manually added one is shown.</li><li>Datadog overrides APM service names in favor of the manually uploaded list.</li><li>If the service is an APM service and no metrics are posted in the past seven days, it does not appear in the search results.</li><li>Further integrate with Datadog products and accurately assess service impact. The Services property field is automatically populated with APM services for customers using Datadog APM.</li></ul>
-
-Teams
-: Choose from the [teams][2] defined in your organization. It is not necessary to upload a list of teams from a CSV file.
-
-## Notifications
-
-Configure incident notifications to share incident updates with all stakeholders and keep all involved members aware of the current investigation. For more information, see the [Notification][6] page.
-
 ## Further reading
 
 {{< partial name="whats-next/whats-next.html" >}}
 
 [1]: /service_management/incident_management/declare
-[2]: /account_management/teams/
-[3]: /service_management/incident_management/incident_settings/responder-types
-[4]: /getting_started/tagging/assigning_tags?tab=noncontainerizedenvironments#overview
-[5]: https://app.datadoghq.com/incidents/settings#Property-Fields
-[6]: /service_management/incident_management/notification
+[2]: https://app.datadoghq.com/incidents/settings#Property-Fields

content/en/service_management/incident_management/response_team.md

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+---
+title: Incident Response Team
+further_reading:
+- link: "/service_management/incident_management/incident_settings/responder_types"
+  tag: "Documentation"
+  text: "Customize responder types in Incident Settings"
+---
+
+## Overview
+
+Form your response team by adding other users and assigning them responder types (responder roles) so they know what they should focus on during the incident response.
+
+## Adding responders
+
+A responder is any Datadog user who participates in the response process for a particular incident.
+
+When you add a responder to an incident:
+* Datadog notifies the responder about the incident by email.
+* If the incident is private, the responder can view it in Datadog.
+* If the incident has a Slack channel attached, the responder is automatically added to that channel.
+
+Datadog also automatically adds users as responders when:
+* They perform any action that updates the incident, including writing to the timeline.
+* They are notified about the incident through a notification rule or a manual incident notification.
+
+The **Response Team** tab of the Incident Details page records the time an individual was added to the incident’s response team. It also records the time the responder last took an action affecting the incident in Datadog, such as updating its attributes or writing to its timeline.
+
+You can remove responders if they are not assigned to any responder types and if they have not yet performed any actions updating the incident.
+
+## Assigning responder types
+
+<div class="alert alert-info">Responder types are unrelated to the <a href="/account_management/rbac/?tab=datadogapplication">Role Based Access Control (RBAC)</a> system. The Responder Type in Incident Management does not affect a user’s permissions.</div>
+
+From the **Response Team** tab of the Incident Details page, you can modify the responder types for any responder.
+
+You can define additional single-person or multi-person responder types with custom names and descriptions in [Incident Settings][1].
+
+## Managing responders in Slack
+
+In Slack, you can manage responders and their responder types by entering the command `/dd incident responders` inside an incident channel. You can also click the "Manage Responders" button on the incident action tray.
+
+When you assign a responder type, the assignee is notified about it in Slack.
+
+## Further reading
+
+{{< partial name="whats-next/whats-next.html" >}}
+
+[1]: /service_management/incident_management/incident_settings/responder-types
