Commit aed4e2d

SUMO 264514: SQL Server OpenTelemetry App update (#5608)
* Updating sql server otel doc, new dashboards and monitors
* updating details for supporting app cross platform linux and windows both
* Minor updates in Sql server linux otel app, updating instructions in sql server otel app
* Update docs/integrations/microsoft-azure/opentelemetry/sql-server-linux-opentelemetry.md

Co-authored-by: Kim (Sumo Logic) <[email protected]>

* implementing feedbacks

---------

Co-authored-by: Kim (Sumo Logic) <[email protected]>
1 parent 8c895b4 commit aed4e2d

2 files changed

+105
-41
lines changed

docs/integrations/microsoft-azure/opentelemetry/sql-server-linux-opentelemetry.md

Lines changed: 15 additions & 11 deletions
@@ -11,6 +11,10 @@ import TabItem from '@theme/TabItem';
1111

1212
<img src={useBaseUrl('img/integrations/microsoft-azure/sql.png')} alt="thumbnail icon" width="50"/> <img src={useBaseUrl('img/send-data/otel-color.svg')} alt="Thumbnail icon" width="45"/>
1313

14+
:::note logs only
15+
This is a logs-only app. For collecting metrics and enabling comprehensive monitoring on both Linux and Windows, use the [Microsoft SQL Server - OpenTelemetry App](/docs/integrations/microsoft-azure/opentelemetry/sql-server-opentelemetry).
16+
:::
17+
1418
The Sumo Logic app for Microsoft SQL Server is a logs-based app that provides insight into your SQL Server for Linux. The app consists of predefined dashboards, providing visibility into your environment for real-time or historical analysis of backup, restore, and mirroring activity, as well as the general health and operations of your system.
1519

1620
This app has been tested with following SQL Server versions:
@@ -142,7 +146,7 @@ Following is the query from **Error and warning count** panel from the **SQL Ser
142146

143147
### Overview
144148

145-
The **SQL Server - Overview** dashboard provides a snapshot overview of your SQL Server instance. Use this dashboard to understand CPU, memory, and disk utilization of your SQL Server(s) deployed in your cluster. This dashboard also provides login activities and methods by users.
149+
The **SQL Server Linux - Overview** dashboard provides a snapshot overview of your SQL Server instance. Use this dashboard to understand CPU, memory, and disk utilization of your SQL Server(s) deployed in your cluster. This dashboard also provides login activities and methods by users.
146150

147151
Use this dashboard to:
148152
- Keep track of deadlocks, errors, backup failures, mirroring errors, and insufficient space issue counts.
@@ -152,7 +156,7 @@ Use this dashboard to:
152156

153157
### General Health
154158

155-
The **SQL Server - General Health** dashboard provides you the overall health of SQL Server. Use this dashboard to analyze server events including stopped/up servers and its corresponding down/uptime, monitor disk space percentage utilization, wait time trend, and app-domain issues by SQL server.
159+
The **SQL Server Linux - General Health** dashboard provides you the overall health of SQL Server. Use this dashboard to analyze server events including stopped/up servers and its corresponding down/uptime, monitor disk space percentage utilization, wait time trend, and app-domain issues by SQL server.
156160

157161
Use this dashboard to:
158162

@@ -164,7 +168,7 @@ Use this dashboard to:
164168

165169
### Backup Restore Mirroring
166170

167-
The **SQL Server - Backup Restore Mirroring** dashboard provides information about:
171+
The **SQL Server Linux - Backup Restore Mirroring** dashboard provides information about:
168172

169173
- Transaction log backup events
170174
- Database backup events
@@ -176,7 +180,7 @@ The **SQL Server - Backup Restore Mirroring** dashboard provides information abo
176180

177181
### Operations
178182

179-
The **SQL Server - Operations** dashboard displays recent server configuration changes, number and type of configuration updates, error and warnings, high severity error, and warning trends.
183+
The **SQL Server Linux - Operations** dashboard displays recent server configuration changes, number and type of configuration updates, error and warnings, high severity error, and warning trends.
180184

181185
Use this dashboard to:
182186

@@ -195,10 +199,10 @@ import CreateMonitors from '../../../reuse/apps/create-monitors.md';
195199

196200
| Name | Description | Alert Condition | Recover Condition |
197201
|:--|:--|:--|:--|
198-
| `SQL Server - AppDomain` | This alert is triggered when AppDomain-related issues are detected in your SQL Server instance. | Count `>=` 1 | Count `<` 1 |
199-
| `SQL Server - Backup Fail` | This alert is triggered when the SQL Server backup fails. | Count `>=` 1 | Count `<` 1 |
200-
| `SQL Server - Deadlock` | This alert is triggered when deadlocks are detected in a SQL Server instance. | Count `>` 5 | Count `<=` 5 |
201-
| `SQL Server - Instance Down` | This alert is triggered when the SQL Server instance is down for 5 minutes. | Count `>` 0 | Count `<=` 0 |
202-
| `SQL Server - Insufficient Space` | This alert is triggered when the SQL Server instance cannot allocate a new page for the database due to insufficient disk space in the filegroup. | Count `>` 0 | Count `<=` 0 |
203-
| `SQL Server - Login Fail` | This alert is triggered when the user is unable to login to the SQL Server. | Count `>=` 1 | Count `<` 1 |
204-
| `SQL Server - Mirroring Error` | This alert is triggered when an error occurs in SQL Server mirroring. | Count `>=` 1 | Count `<` 1 |
202+
| `SQL Server Linux - AppDomain` | This alert is triggered when AppDomain-related issues are detected in your SQL Server instance. | Count `>=` 1 | Count `<` 1 |
203+
| `SQL Server Linux - Backup Fail` | This alert is triggered when the SQL Server backup fails. | Count `>=` 1 | Count `<` 1 |
204+
| `SQL Server Linux - Deadlock` | This alert is triggered when deadlocks are detected in a SQL Server instance. | Count `>` 5 | Count `<=` 5 |
205+
| `SQL Server Linux - Instance Down` | This alert is triggered when the SQL Server instance is down for 5 minutes. | Count `>` 0 | Count `<=` 0 |
206+
| `SQL Server Linux - Insufficient Space` | This alert is triggered when the SQL Server instance cannot allocate a new page for the database due to insufficient disk space in the filegroup. | Count `>` 0 | Count `<=` 0 |
207+
| `SQL Server Linux - Login Fail` | This alert is triggered when the user is unable to login to the SQL Server. | Count `>=` 1 | Count `<` 1 |
208+
| `SQL Server Linux - Mirroring Error` | This alert is triggered when an error occurs in SQL Server mirroring. | Count `>=` 1 | Count `<` 1 |

docs/integrations/microsoft-azure/opentelemetry/sql-server-opentelemetry.md

Lines changed: 90 additions & 30 deletions
@@ -2,24 +2,21 @@
22
id: sql-server-opentelemetry
33
title: Microsoft SQL Server - OpenTelemetry Collector
44
sidebar_label: Microsoft SQL Server - OTel Collector
5-
description: Learn about the Sumo Logic OpenTelemetry app for Microsoft SQL Server for Windows.
5+
description: Learn about the Sumo Logic OpenTelemetry app for Microsoft SQL Server.
66
---
77
import useBaseUrl from '@docusaurus/useBaseUrl';
88
import Tabs from '@theme/Tabs';
99
import TabItem from '@theme/TabItem';
1010

1111
<img src={useBaseUrl('img/integrations/microsoft-azure/sql.png')} alt="thumbnail icon" width="50"/> <img src={useBaseUrl('img/send-data/otel-color.svg')} alt="Thumbnail icon" width="45"/>
1212

13-
:::note
14-
The information provided in this page will only support the Sumo Logic OpenTelemetry app for Microsoft SQL Server for Windows.
15-
:::
1613
The SQL Server app is a unified logs and metrics app to help you monitor the availability, performance, health, and resource utilization of your Microsoft SQL Server database clusters. Preconfigured dashboards provide insight into cluster status, performance, and operations, as well as backup and restore operations, along with performance metrics and metrics for transactions and transaction logs.
1714

1815
This app has been tested with following SQL Server versions:
1916

20-
- `Microsoft SQL Server 2016`
17+
- `Microsoft SQL Server 2022`
2118

22-
The diagram below illustrates the components of the SQL Server collection for each database server. OpenTelemetry collector runs on the same host as SQL Server, and uses the [SQL Server receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/sqlserverreceiver) to obtain SQL Server metrics. This receiver grabs metrics about a Microsoft SQL Server instance using the Windows Performance Counters. Because of this, it is a Windows only receiver. Thus metrics for SQL Server can be collected only if its in a windows machine.
19+
The diagram below illustrates the components of the SQL Server collection for each database server. The OpenTelemetry collector runs on the same host as SQL Server and uses the [SQL Server receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/sqlserverreceiver) to obtain SQL Server metrics. This receiver collects metrics about a Microsoft SQL Server instance using Windows Performance Counters (Windows only) and by connecting directly to SQL Server with credentials (both Windows and Linux).
2320
SQL Server logs are sent to Sumo Logic through OpenTelemetry [filelog receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/filelogreceiver).
2421

2522
<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-Schematics.png' alt="Redis Logs dashboards"/>
@@ -40,15 +37,27 @@ Following are the [Fields](/docs/manage/fields/) which will be created as part o
4037

4138
### For metrics collection
4239

43-
The [SQL server receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/sqlserverreceiver/README.md) for OpenTelemetry grabs metrics about a Microsoft SQL Server instance using the Windows Performance Counters.
40+
The [SQL server receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/sqlserverreceiver/README.md) for OpenTelemetry grabs metrics about a Microsoft SQL Server instance using different methods:
41+
42+
**Windows:**
43+
- Uses Windows Performance Counters for collecting system-level metrics
44+
- Connects directly to SQL Server using credentials for database-specific metrics
45+
46+
**Linux:**
47+
- Connects to SQL Server using credentials (Windows Authentication is not available on Linux)
48+
- Requires SQL Server authentication
4449

4550
### For logs collection
4651

4752
Make sure logging is turned on in SQL Server. Follow [this documentation](https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/scm-services-configure-sql-server-error-logs?view=sql-server-ver15) to enable it.
4853

49-
The Microsoft SQL Server App's queries and dashboards depend on logs from the SQL Server ERRORLOG, which is typically found in: `C:\Program Files\Microsoft SQL Server\MSSQL<version>.MSSQLSERVER\MSSQL\Log\ERRORLOG*`.
54+
The Microsoft SQL Server App's queries and dashboards depend on logs from the SQL Server ERRORLOG, which is typically found in:
55+
56+
**Windows:** `C:\Program Files\Microsoft SQL Server\MSSQL<version>.MSSQLSERVER\MSSQL\Log\ERRORLOG*`
5057

51-
The ERRORLOG is typically in UTF-16LE encoding, however, be sure to verify the file encoding used in your SQL Server configuration.
58+
**Linux:** `/var/opt/mssql/log/errorlog*` (default path for SQL Server on Linux)
59+
60+
The ERRORLOG is typically in UTF-16LE encoding on both Windows and Linux. Be sure to verify the file encoding used in your SQL Server configuration.
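
As a rough illustration of how the log path and encoding fit together, a filelog receiver entry for a Linux host might look like the sketch below. The receiver name `filelog/sqlserver` is hypothetical, and the YAML file generated by Sumo Logic may be structured differently; `include` and `encoding` are standard filelog receiver settings.

```yaml
receivers:
  # Hypothetical receiver name, chosen for illustration only.
  filelog/sqlserver:
    include:
      # Default ERRORLOG location on Linux; use the Windows path on Windows hosts.
      - /var/opt/mssql/log/errorlog*
    # Assumes the file is UTF-16LE; verify the actual encoding before relying on this.
    encoding: utf-16le
```

If the configured encoding does not match the file, the ingested log lines are likely to be unreadable, so it is worth checking this value first when the dashboards stay empty.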
5261

5362
**ACL Support**
5463

@@ -68,6 +77,12 @@ $NewAcl.SetAccessRule($fileSystemAccessRule)
6877
Set-Acl -Path "<PATH_TO_LOG_FILE>" -AclObject $NewAcl
6978
```
7079

80+
For Linux systems, ensure the OpenTelemetry collector process has read access to the log files:
81+
```bash
82+
# Make the log files readable by all users, including the collector user (adjust paths as needed)
83+
sudo chmod +r /var/opt/mssql/log/errorlog*
84+
```
85+
7186
## Collection configuration and app installation
7287

7388
import ConfigAppInstall from '../../../reuse/apps/opentelemetry/config-app-install.md';
@@ -86,13 +101,31 @@ This will generate a command you can execute on the machine that you need to mon
86101

87102
### Step 2: Configure integration
88103

89-
1. The Microsoft SQL Server App's queries and dashboards depend on logs from the SQL Server ERRORLOG, which is typically found in:
90-
`C:\Program Files\Microsoft SQL Server\MSSQL<version>.MSSQLSERVER\MSSQL\Log\ERRORLOG*`
91-
2. To collect from a SQL Server with a named instance, both **Computer Name** and **Instance Name** are required. Toggle the `Enable metric collection for SQL Server with a named instance.` button. For a default SQL Server setup, these settings are optional.
92-
* **Computer Name**. The computer name identifies the SQL Server name or IP address of the computer being monitored.
93-
* **Instance Name**. The instance name identifies the specific SQL Server instance being monitored.
94-
3. You can add any custom fields which you want to tag along with the data ingested in Sumo Logic.
95-
4. Click on the **Download YAML File** button to get the yaml file.<br/><img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-YAML.png' style={{border:'1px solid gray'}} alt="YAML" />
104+
1. **Log File Path Configuration**:
105+
- **Windows**: The Microsoft SQL Server App's queries and dashboards depend on logs from the SQL Server ERRORLOG, which is typically found in: `C:\Program Files\Microsoft SQL Server\MSSQL<version>.MSSQLSERVER\MSSQL\Log\ERRORLOG*`
106+
- **Linux**: For SQL Server on Linux, logs are typically located at: `/var/opt/mssql/log/errorlog*`
107+
108+
2. **SQL Server Connection Configuration**: To collect metrics, you'll need to provide connection details:
109+
- **Server Address**: The hostname or IP address of your SQL Server instance (default: 0.0.0.0)
110+
- **Port**: The port number for SQL Server connection (default: 1433)
111+
- **Username**: SQL Server authentication username
112+
- **Password**: SQL Server authentication password
113+
114+
3. **Monitoring a Named SQL Server Instance (Windows Only)**
115+
116+
To collect metrics from a specific named instance of SQL Server on a **Windows** host, enable the `Enable metric collection for SQL Server with a named instance` option. For a default SQL Server setup, these settings are optional.
117+
118+
* **Computer Name**: The computer name identifies the SQL Server name or IP address of the computer being monitored. This is the network name of the machine hosting SQL Server.
119+
* **Instance Name**: The instance name identifies the specific SQL Server instance being monitored. This is required when SQL Server is installed as a named instance (e.g., SQLEXPRESS, INSTANCE01) rather than the default instance.
120+
121+
---
122+
:::note
123+
Metric collection for named instances is not supported on Linux.
124+
:::
125+
126+
4. You can add any custom fields which you want to tag along with the data ingested in Sumo Logic.
127+
128+
5. Click on the **Download YAML File** button to get the yaml file.<br/><img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-YAML.png' style={{border:'1px solid gray'}} alt="YAML" />
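
The connection settings from step 2 roughly correspond to the following SQL Server receiver options. This is a minimal sketch, not the generated file: the `monitoring_user` login and the `MSSQL_PASSWORD` environment variable are placeholders, and the downloaded YAML may contain additional settings.

```yaml
receivers:
  sqlserver:
    # Address and port of the SQL Server instance (the defaults listed in step 2).
    server: 0.0.0.0
    port: 1433
    # Placeholder SQL Server authentication login.
    username: monitoring_user
    # Assumes the password is supplied through an environment variable
    # named MSSQL_PASSWORD rather than stored in the file.
    password: ${env:MSSQL_PASSWORD}
    # Windows only, for named instances (uncomment and adjust):
    # computer_name: <COMPUTER_NAME>
    # instance_name: <INSTANCE_NAME>
```

Reading the password from the environment keeps the credential out of the configuration directory, at the cost of having to set the variable for the collector service.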
96129

97130
### Step 3: Send logs to Sumo Logic
98131

@@ -105,6 +138,7 @@ import LogsIntro from '../../../reuse/apps/opentelemetry/send-logs-intro.md';
105138
defaultValue="Windows"
106139
values={[
107140
{label: 'Windows', value: 'Windows'},
141+
{label: 'Linux', value: 'Linux'},
108142
{label: 'Chef', value: 'Chef'},
109143
{label: 'Ansible', value: 'Ansible'},
110144
{label: 'Puppet', value: 'Puppet'},
@@ -120,6 +154,16 @@ import LogsIntro from '../../../reuse/apps/opentelemetry/send-logs-intro.md';
120154

121155
</TabItem>
122156

157+
<TabItem value="Linux">
158+
159+
1. Copy the YAML file to the `/etc/otelcol-sumo/conf.d/` folder on the machine that needs to be monitored.
160+
2. Restart the collector using:
161+
```sh
162+
sudo systemctl restart otelcol-sumo
163+
```
164+
165+
</TabItem>
166+
123167
<TabItem value="Chef">
124168

125169
import ChefNoEnv from '../../../reuse/apps/opentelemetry/chef-without-env.md';
@@ -256,16 +300,29 @@ Use this dashboard to:
256300

257301
<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-Transaction-And-Transaction-Logs.png' alt="Operations" />
258302

259-
### Performance Counters
303+
### Performance
260304

261-
The **SQL Server - Performance Counters** dashboard shows performance counters related to database activities, SQL statistics, and buffer cache.
305+
The **SQL Server - Performance** dashboard provides a deep dive into the internal workings of the SQL Server query engine. It helps DBAs and developers identify inefficient queries, contention issues, and opportunities for optimization.
262306

263-
Use this dashboard to:
307+
<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-Performance.png' alt="Performance" />
308+
309+
### I/O
310+
311+
The **SQL Server - I/O** dashboard shows the performance of the underlying disk subsystem as it relates to SQL Server database files. It helps answer questions like, "Is slow disk performance the cause of my application slowdown?" and "Which specific files are the hottest or slowest?"
312+
313+
<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-I-O.png' alt="I/O" />
314+
315+
### Replication
316+
317+
The **SQL Server - Replication** dashboard provides dedicated visibility into the health, throughput, and latency of SQL Server's high-availability and disaster recovery (HA/DR) features, such as Availability Groups.
318+
319+
<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-Replication.png' alt="Replication" />
320+
321+
### Windows Host Performance
264322

265-
- Get info for page buffer hit % and page split rate.
266-
- Insight into lock waits rate, page read and write rate along with patch request rate and SQL compilation, and recompilation per sec.
323+
The **SQL Server - Windows Host Performance** dashboard isolates metrics that are only available via Windows Performance Counters. It provides deeper insights into Windows-specific memory management and transaction log behavior. The key use case is to provide continuity for Windows DBAs familiar with these classic counters.
267324

268-
<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-Performance-Counters.png' alt="Performance-Counters" />
325+
<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-OpenTelemetry/SQL-Server-Windows-Host-Performance.png' alt="Windows Host Performance" />
269326

270327
## Create monitors for Microsoft SQL Server app
271328

@@ -275,12 +332,15 @@ import CreateMonitors from '../../../reuse/apps/create-monitors.md';
275332

276333
### Microsoft SQL Server alerts
277334

278-
| Alert Name | Alert Description and conditions | Alert Condition | Recover Condition |
335+
| Name | Description | Alert Condition | Recover Condition |
279336
|:--|:--|:--|:--|
280-
| `SQL Server - AppDomain Alert` | This alert gets triggered when we detect AppDomain related issues in your SQL Server instance. | Count > = 1 | Count < 1 |
281-
| `SQL Server - Backup Fail Alert` | This alert gets triggered when we detect that the SQL Server backup failed. | Count > = 1 | Count < 1 |
282-
| `SQL Server - Instance Down Alert` | This alert gets triggered when we detect that the SQL Server instance is down for 5 minutes. | Count > 0 | Count < = 0 |
283-
| `SQL Server - Insufficient Space Alert` | This alert gets triggered when SQL Server instance could not allocate a new page for database because of insufficient disk space in filegroup. | Count > = 1 | Count < 1 |
284-
| `SQL Server - Login Fail Alert` | This alert gets triggered when we detect that the user cannot login to SQL Server. | Count > = 1 | Count < 1 |
285-
| `SQL Server - Mirroring Error Alert` | This alert gets triggered when we detect that the SQL Server mirroring has error. | Count > = 1 | Count < 1 |
286-
| `SQL Server - Processes Blocked Alert` | This alert gets triggered when we detect that SQL Server has blocked processes. | Count > 1 | Count < = 1 |
337+
| `SQL Server - AppDomain` | This alert is triggered when we detect AppDomain related issues in your SQL Server instance. | Count > = 1 | Count < 1 |
338+
| `SQL Server - Backup Fail` | This alert is triggered when we detect that the SQL Server backup failed. | Count > = 1 | Count < 1 |
339+
| `SQL Server - Buffer Cache Hit Ratio` | This alert is triggered when the Buffer Cache Hit Ratio drops below 95%, indicating significant memory pressure and a potential for slow performance due to increased disk reads. | Count < 95 | Count > = 95 |
340+
| `SQL Server - Deadlock` | This alert is triggered when we detect deadlocks in a SQL Server instance. | Count > 5 | Count < = 5 |
341+
| `SQL Server - Instance Down` | This alert is triggered when we detect that the SQL Server instance is down for 5 minutes. | Count > 0 | Count < = 0 |
342+
| `SQL Server - Insufficient Space` | This alert is triggered when SQL Server instance could not allocate a new page for database because of insufficient disk space in filegroup. | Count > 0 | Count < = 0 |
343+
| `SQL Server - Login Fail` | This alert is triggered when we detect that the user cannot login to SQL Server. | Count > = 1 | Count < 1 |
344+
| `SQL Server - Mirroring Error` | This alert is triggered when we detect that the SQL Server mirroring has error. | Count > = 1 | Count < 1 |
345+
| `SQL Server - Non Operational Database` | This alert is triggered if any database enters a 'suspect' or 'offline' state, indicating it is unavailable. | Count > 0 | Count < = 0 |
346+
| `SQL Server - Processes Blocked` | This alert is triggered when blocked processes are detected in SQL Server. | Count > 0 | Count < = 0 |
