From 35a621a97fe0aa721b116ae7f494bc479fa57fb1 Mon Sep 17 00:00:00 2001 From: Marci W <333176+marciw@users.noreply.github.com> Date: Thu, 24 Jul 2025 17:25:21 -0400 Subject: [PATCH 1/6] Edit and restructure, part 1 --- .../data-streams/downsampling-concepts.md | 105 +++++++++++++ .../downsampling-time-series-data-stream.md | 147 ++---------------- .../data-streams/run-downsampling-manually.md | 5 +- ...ownsampling-using-data-stream-lifecycle.md | 5 +- .../data-streams/run-downsampling-with-ilm.md | 5 +- .../data-streams/run-downsampling.md | 65 ++++++++ .../data-store/data-streams/set-up-tsds.md | 4 +- .../time-series-data-stream-tsds.md | 6 +- manage-data/toc.yml | 9 +- 9 files changed, 208 insertions(+), 143 deletions(-) create mode 100644 manage-data/data-store/data-streams/downsampling-concepts.md create mode 100644 manage-data/data-store/data-streams/run-downsampling.md diff --git a/manage-data/data-store/data-streams/downsampling-concepts.md b/manage-data/data-store/data-streams/downsampling-concepts.md new file mode 100644 index 0000000000..a6e9f4d32d --- /dev/null +++ b/manage-data/data-store/data-streams/downsampling-concepts.md @@ -0,0 +1,105 @@ +--- +applies_to: + stack: ga + serverless: ga +products: + - id: elasticsearch +--- + +# Downsampling concepts [how-downsampling-works] + +:::{warning} +🚧 Work in progress 🚧 +::: + +A [time series](time-series-data-stream-tsds.md#time-series) is a sequence of observations taken over time for a specific entity. The observed samples can be represented as a continuous function, where the time series dimensions remain constant and the time series metrics change over time. + +:::{image} /manage-data/images/elasticsearch-reference-time-series-function.png +:alt: time series function +::: + +In an Elasticsearch index, a single document is created for each timestamp. The document contains the immutable time series dimensions, together with metric names and values. 
Several time series dimensions and metrics can be stored for a single timestamp. + +:::{image} /manage-data/images/elasticsearch-reference-time-series-metric-anatomy.png +:alt: time series metric anatomy +::: + +For your most current and relevant data, the metrics series typically has a low sampling time interval, so it's optimized for queries that require a high data resolution. + +:::{image} /manage-data/images/elasticsearch-reference-time-series-original.png +:alt: time series original +:title: Original metrics series +::: + +Downsampling reduces the footprint of older, less frequently accessed data by replacing the original time series with a data stream of a higher sampling interval, plus statistical representations of the data. For example, if the original metrics samples were taken every 10 seconds, as the data ages you might choose to reduce the sample granularity to hourly or daily. Or you might choose to reduce the granularity of `cold` archival data to monthly or less. + +:::{image} /manage-data/images/elasticsearch-reference-time-series-downsampled.png +:alt: time series downsampled +:title: Downsampled metrics series +::: + + +### The downsampling process [downsample-api-process] + +The downsampling operation traverses the source TSDS index and performs the following steps: + +1. Creates a new document for each value of the `_tsid` field and each `@timestamp` value, rounded to the `fixed_interval` defined in the downsampling configuration. +2. For each new document, copies all [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) from the source index to the target index. Dimensions in a TSDS are constant, so this step happens only once per bucket. +3. For each [time series metric](time-series-data-stream-tsds.md#time-series-metric) field, computes aggregations for all documents in the bucket. 
The set of pre-aggregated results differs by metric field type:
+
+    * `gauge` field type:
+      * `min`, `max`, `sum`, and `value_count` are stored
+      * `value_count` is stored as type `aggregate_metric_double`
+    * `counter` field type:
+      * `last_value` is stored.
+
+4. For all other fields, the most recent value is copied to the target index.
+
+% TODO ^^ consider mini table in step 3; refactor generally
+
+### Source and target index field mappings [downsample-api-mappings]
+
+Fields in the target downsampled index are created based on fields in the original source index, as follows:
+
+1. **Dimensions:** Fields mapped with the `time_series_dimension` parameter are created in the target downsampled index with the same mapping as in the source index.
+2. **Metrics:** Fields mapped with the `time_series_metric` parameter are created in the target downsampled index with the same mapping as in the source index, with one exception: `time_series_metric: gauge` fields are changed to `aggregate_metric_double`.
+3. **Labels:** Label fields (fields that are neither dimensions nor metrics) are created in the target downsampled index with the same mapping as in the source index.
+
+% TODO ^^ make this more concise
+
+## Querying downsampled indices [querying-downsampled-indices]
+
+You can use the [`_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) and [`_async_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-async-search-submit) endpoints to query a downsampled index. Multiple raw data and downsampled indices can be queried in a single request, and a single request can include downsampled indices at different granularities (different bucket timespans). That is, you can query data streams that contain downsampled indices with multiple downsampling intervals (for example, `15m`, `1h`, `1d`).
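Because each summary document's `@timestamp` is rounded down to the start of its downsampling bucket, a coarse downsampled index produces sparse results under a fine-grained histogram. The following is a minimal Python sketch of that effect, using hypothetical timestamps; it is an illustration only, not how {{es}} implements aggregations:

```python
from datetime import datetime, timezone

def date_histogram(timestamps, interval_s):
    """Toy fixed-interval histogram: maps bucket start (epoch seconds) to doc count."""
    buckets = {}
    for ts in timestamps:
        key = int(ts.timestamp()) // interval_s * interval_s
        buckets[key] = buckets.get(key, 0) + 1
    return buckets

# Hypothetical summary documents from an index downsampled at "1h":
# each @timestamp is rounded down to the start of its hourly bucket.
hourly_summaries = [
    datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc),
    datetime(2024, 5, 1, 11, 0, tzinfo=timezone.utc),
]

# A "fixed_interval": "1m" histogram over this data populates only the
# minute-0 bucket of each hour; the remaining 59 buckets per hour stay empty.
minute_buckets = date_histogram(hourly_summaries, 60)
print(sorted(minute_buckets.values()))  # [1, 1] -> two populated buckets
```

The same sketch applies when mixing granularities: each index contributes data only at its own bucket boundaries, while the histogram's bucket size stays uniform.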
+
+The result of a time-based histogram aggregation is in a uniform bucket size, and each downsampled index returns data ignoring the downsampling time interval. For example, if you run a `date_histogram` aggregation with `"fixed_interval": "1m"` on a downsampled index that has been downsampled at an hourly resolution (`"fixed_interval": "1h"`), the query returns one bucket with all of the data at minute 0, then 59 empty buckets, and then a bucket with data again for the next hour.
+
+
+### Notes on downsample queries [querying-downsampled-indices-notes]
+
+There are a few things to note about querying downsampled indices:
+
+* When you run queries in {{kib}} and through Elastic solutions, a normal response is returned without notification that some of the queried indices are downsampled.
+* For [date histogram aggregations](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md), only `fixed_intervals` (and not calendar-aware intervals) are supported.
+* Timezone support comes with caveats:
+
+  * Date histograms at intervals that are multiples of an hour are based on values generated at UTC. This works well for timezones that are on the hour, such as +5:00 or -3:00, but requires offsetting the reported time buckets, for example `2020-03-07T10:30:00.000` instead of `2020-03-07T10:00:00.000` for timezone +5:30 (India), if downsampling aggregates values per hour. In this case, the results include the field `downsampled_results_offset: true` to indicate that the time buckets are shifted. This can be avoided if a downsampling interval of 15 minutes is used, because it allows properly calculating hourly values for the shifted buckets.
+  * Date histograms at intervals that are multiples of a day are similarly affected, in case downsampling aggregates values per day. In this case, the beginning of each day is always calculated at UTC when generating the downsampled values, so the time buckets need to be shifted, e.g. 
reported as `2020-03-07T19:00:00.000` instead of `2020-03-07T00:00:00.000` for timezone `America/New_York`. The field `downsampled_results_offset: true` is added in this case too.
+  * Daylight saving time and similar timezone peculiarities affect reported results, as [documented](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#datehistogram-aggregation-time-zone) for date histogram aggregation. In addition, downsampling at a daily interval makes it harder to track information related to daylight saving time changes.
+
+
+
+## Restrictions and limitations [downsampling-restrictions]
+
+The following restrictions and limitations apply to downsampling:
+
+* Only indices in a [time series data stream](time-series-data-stream-tsds.md) are supported.
+* Data is downsampled based on the time dimension only. All other dimensions are copied to the new index without any modification.
+* Within a data stream, a downsampled index replaces the original index, and the original index is deleted. Only one index can exist for a given time period.
+* A source index must be in read-only mode for the downsampling process to succeed. Check the [Run downsampling manually](./run-downsampling-manually.md) example for details.
+* Downsampling data for the same period multiple times (downsampling of a downsampled index) is supported. The downsampling interval must be a multiple of the interval of the downsampled index.
+* Downsampling is provided as an ILM action. See [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md).
+* The new, downsampled index is created on the data tier of the original index and inherits its settings (for example, the number of shards and replicas).
+* The numeric `gauge` and `counter` [metric types](elasticsearch://reference/elasticsearch/mapping-reference/mapping-field-meta.md) are supported.
+* The downsampling configuration is extracted from the time series data stream [index mapping](./set-up-tsds.md#create-tsds-index-template). The only additional required setting is the downsampling `fixed_interval`. + + diff --git a/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md b/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md index 234c8f41dd..ea575dfb89 100644 --- a/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md +++ b/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md @@ -1,4 +1,5 @@ --- +navigation_title: "Downsample a TSDS" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/downsampling.html applies_to: @@ -8,145 +9,29 @@ products: - id: elasticsearch --- -# Downsampling a time series data stream [downsampling] +# Downsample a time series data stream [downsampling] -Downsampling provides a method to reduce the footprint of your [time series data](time-series-data-stream-tsds.md) by storing it at reduced granularity. - -Metrics solutions collect large amounts of time series data that grow over time. As that data ages, it becomes less relevant to the current state of the system. The downsampling process rolls up documents within a fixed time interval into a single summary document. Each summary document includes statistical representations of the original data: the `min`, `max`, `sum` and `value_count` for each metric. Data stream [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) are stored unchanged. - -Downsampling, in effect, lets you to trade data resolution and precision for storage size. You can include it in an [{{ilm}} ({{ilm-init}})](../../lifecycle/index-lifecycle-management.md) policy to automatically manage the volume and associated cost of your metrics data at it ages. 
-
-Check the following sections to learn more:
-
-* [How it works](#how-downsampling-works)
-* [Running downsampling on time series data](#running-downsampling)
-* [Querying downsampled indices](#querying-downsampled-indices)
-* [Restrictions and limitations](#downsampling-restrictions)
-* [Try it out](#try-out-downsampling)
-
-
-## How it works [how-downsampling-works]
-
-A [time series](time-series-data-stream-tsds.md#time-series) is a sequence of observations taken over time for a specific entity. The observed samples can be represented as a continuous function, where the time series dimensions remain constant and the time series metrics change over time.
-
-:::{image} /manage-data/images/elasticsearch-reference-time-series-function.png
-:alt: time series function
-:::
-
-In an Elasticsearch index, a single document is created for each timestamp, containing the immutable time series dimensions, together with the metrics names and the changing metrics values. For a single timestamp, several time series dimensions and metrics may be stored.
-
-:::{image} /manage-data/images/elasticsearch-reference-time-series-metric-anatomy.png
-:alt: time series metric anatomy
+:::{warning}
+🚧 Work in progress 🚧
 :::
-For your most current and relevant data, the metrics series typically has a low sampling time interval, so it’s optimized for queries that require a high data resolution.
+Downsampling reduces the footprint of your [time series data](time-series-data-stream-tsds.md) by storing it at reduced granularity.
-:::{image} /manage-data/images/elasticsearch-reference-time-series-original.png
-:alt: time series original
-:title: Original metrics series
-:::
+Metrics tools and solutions collect large amounts of time series data over time. As the data ages, it becomes less relevant to the current state of the system. _Downsampling_ lets you reduce the resolution and precision of older data, in exchange for a reduced storage footprint.
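To put rough numbers on that trade-off (the values below are illustrative, not from this page): with one sample every 10 seconds, downsampling to a one-hour `fixed_interval` collapses 360 raw documents per time series into a single summary document.

```python
# Back-of-the-envelope document reduction from downsampling (illustrative only).
raw_sample_interval_s = 10        # hypothetical original sampling period
downsample_interval_s = 60 * 60   # downsampling "fixed_interval": "1h"

docs_per_summary = downsample_interval_s // raw_sample_interval_s
print(docs_per_summary)  # 360 raw documents roll up into one summary document
```

Actual storage savings depend on mappings, compression, and how many fields each document carries, so treat this only as an order-of-magnitude estimate.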
-Downsampling works on older, less frequently accessed data by replacing the original time series with both a data stream of a higher sampling interval and statistical representations of that data. Where the original metrics samples may have been taken, for example, every ten seconds, as the data ages you may choose to reduce the sample granularity to hourly or daily. You may choose to reduce the granularity of `cold` archival data to monthly or less.
+The downsampling process rolls up documents within a fixed time interval into a single summary document. Each summary document includes statistical representations of the original data: the `min`, `max`, `sum`, and `value_count` for each metric. Data stream [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) are stored as is, with no changes.
-:::{image} /manage-data/images/elasticsearch-reference-time-series-downsampled.png
-:alt: time series downsampled
-:title: Downsampled metrics series
+:::{tip}
+You can include downsampling in an [{{ilm}} ({{ilm-init}})](../../lifecycle/index-lifecycle-management.md) policy to automatically manage the volume and associated cost of your metrics data as it ages.
 :::
+This section explains the available downsampling options and helps you understand the process.
-### The downsampling process [downsample-api-process]
-
-The downsampling operation traverses the source TSDS index and performs the following steps:
-
-1. Creates a new document for each value of the `_tsid` field and each `@timestamp` value, rounded to the `fixed_interval` defined in the downsample configuration.
-2. For each new document, copies all [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) from the source index to the target index. Dimensions in a TSDS are constant, so this is done only once per bucket.
-3. For each [time series metric](time-series-data-stream-tsds.md#time-series-metric) field, computes aggregations for all documents in the bucket. 
Depending on the metric type of each metric field a different set of pre-aggregated results is stored: - - * `gauge`: The `min`, `max`, `sum`, and `value_count` are stored; `value_count` is stored as type `aggregate_metric_double`. - * `counter`: The `last_value` is stored. - -4. For all other fields, the most recent value is copied to the target index. - - -### Source and target index field mappings [downsample-api-mappings] - -Fields in the target, downsampled index are created based on fields in the original source index, as follows: - -1. All fields mapped with the `time-series-dimension` parameter are created in the target downsample index with the same mapping as in the source index. -2. All fields mapped with the `time_series_metric` parameter are created in the target downsample index with the same mapping as in the source index. An exception is that for fields mapped as `time_series_metric: gauge` the field type is changed to `aggregate_metric_double`. -3. All other fields that are neither dimensions nor metrics (that is, label fields), are created in the target downsample index with the same mapping that they had in the source index. 
- - -## Running downsampling on time series data [running-downsampling] - -To downsample a time series index, use the [Downsample API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-downsample) and set `fixed_interval` to the level of granularity that you’d like: - -```console -POST /my-time-series-index/_downsample/my-downsampled-time-series-index -{ - "fixed_interval": "1d" -} -``` - -To downsample time series data as part of ILM, include a [Downsample action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) in your ILM policy and set `fixed_interval` to the level of granularity that you’d like: - -```console -PUT _ilm/policy/my_policy -{ - "policy": { - "phases": { - "warm": { - "actions": { - "downsample" : { - "fixed_interval": "1h" - } - } - } - } - } -} -``` - - -## Querying downsampled indices [querying-downsampled-indices] - -You can use the [`_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) and [`_async_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-async-search-submit) endpoints to query a downsampled index. Multiple raw data and downsampled indices can be queried in a single request, and a single request can include downsampled indices at different granularities (different bucket timespan). That is, you can query data streams that contain downsampled indices with multiple downsampling intervals (for example, `15m`, `1h`, `1d`). - -The result of a time based histogram aggregation is in a uniform bucket size and each downsampled index returns data ignoring the downsampling time interval. For example, if you run a `date_histogram` aggregation with `"fixed_interval": "1m"` on a downsampled index that has been downsampled at an hourly resolution (`"fixed_interval": "1h"`), the query returns one bucket with all of the data at minute 0, then 59 empty buckets, and then a bucket with data again for the next hour. 
- - -### Notes on downsample queries [querying-downsampled-indices-notes] - -There are a few things to note about querying downsampled indices: - -* When you run queries in {{kib}} and through Elastic solutions, a normal response is returned without notification that some of the queried indices are downsampled. -* For [date histogram aggregations](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md), only `fixed_intervals` (and not calendar-aware intervals) are supported. -* Timezone support comes with caveats: - - * Date histograms at intervals that are multiples of an hour are based on values generated at UTC. This works well for timezones that are on the hour, e.g. +5:00 or -3:00, but requires offsetting the reported time buckets, e.g. `2020-01-01T10:30:00.000` instead of `2020-03-07T10:00:00.000` for timezone +5:30 (India), if downsampling aggregates values per hour. In this case, the results include the field `downsampled_results_offset: true`, to indicate that the time buckets are shifted. This can be avoided if a downsampling interval of 15 minutes is used, as it allows properly calculating hourly values for the shifted buckets. - * Date histograms at intervals that are multiples of a day are similarly affected, in case downsampling aggregates values per day. In this case, the beginning of each day is always calculated at UTC when generated the downsampled values, so the time buckets need to be shifted, e.g. reported as `2020-03-07T19:00:00.000` instead of `2020-03-07T00:00:00.000` for timezone `America/New_York`. The field `downsampled_results_offset: true` is added in this case too. - * Daylight savings and similar peculiarities around timezones affect reported results, as [documented](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#datehistogram-aggregation-time-zone) for date histogram aggregation. 
Besides, downsampling at daily interval hinders tracking any information related to daylight savings changes. - - - -## Restrictions and limitations [downsampling-restrictions] - -The following restrictions and limitations apply for downsampling: - -* Only indices in a [time series data stream](time-series-data-stream-tsds.md) are supported. -* Data is downsampled based on the time dimension only. All other dimensions are copied to the new index without any modification. -* Within a data stream, a downsampled index replaces the original index and the original index is deleted. Only one index can exist for a given time period. -* A source index must be in read-only mode for the downsampling process to succeed. Check the [Run downsampling manually](./run-downsampling-manually.md) example for details. -* Downsampling data for the same period many times (downsampling of a downsampled index) is supported. The downsampling interval must be a multiple of the interval of the downsampled index. -* Downsampling is provided as an ILM action. See [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md). -* The new, downsampled index is created on the data tier of the original index and it inherits its settings (for example, the number of shards and replicas). -* The numeric `gauge` and `counter` [metric types](elasticsearch://reference/elasticsearch/mapping-reference/mapping-field-meta.md) are supported. -* The downsampling configuration is extracted from the time series data stream [index mapping](./set-up-tsds.md#create-tsds-index-template). The only additional required setting is the downsampling `fixed_interval`. - - -## Try it out [try-out-downsampling] - -To take downsampling for a test run, try our example of [running downsampling manually](./run-downsampling-manually.md). +% TODO add subsection links and conceptual links after restructuring -Downsampling can easily be added to your ILM policy. 
To learn how, try our [Run downsampling with ILM](./run-downsampling-with-ilm.md) example. +## Next steps +% TODO confirm patterns +* Run downsampling +* Downsampling concepts +* Time series data streams overview \ No newline at end of file diff --git a/manage-data/data-store/data-streams/run-downsampling-manually.md b/manage-data/data-store/data-streams/run-downsampling-manually.md index e689bb3ca0..4364c2f43d 100644 --- a/manage-data/data-store/data-streams/run-downsampling-manually.md +++ b/manage-data/data-store/data-streams/run-downsampling-manually.md @@ -9,10 +9,11 @@ products: - id: elasticsearch --- - - # Run downsampling manually [downsampling-manual] +:::{warning} +🚧 Work in progress 🚧 +::: The recommended way to [downsample](./downsampling-time-series-data-stream.md) a [time-series data stream (TSDS)](../data-streams/time-series-data-stream-tsds.md) is [through index lifecycle management (ILM)](run-downsampling-with-ilm.md). However, if you’re not using ILM, you can downsample a TSDS manually. This guide shows you how, using typical Kubernetes cluster monitoring data. diff --git a/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md b/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md index c32ae1e919..454425b413 100644 --- a/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md +++ b/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md @@ -9,10 +9,11 @@ products: - id: elasticsearch --- - - # Run downsampling using data stream lifecycle [downsampling-dsl] +:::{warning} +🚧 Work in progress 🚧 +::: This is a simplified example that allows you to see quickly how [downsampling](./downsampling-time-series-data-stream.md) works as part of a datastream lifecycle to reduce the storage size of a sampled set of metrics. The example uses typical Kubernetes cluster monitoring data. 
To test out downsampling with data stream lifecycle, follow these steps: diff --git a/manage-data/data-store/data-streams/run-downsampling-with-ilm.md b/manage-data/data-store/data-streams/run-downsampling-with-ilm.md index 14e87ee04e..9a48ee18cf 100644 --- a/manage-data/data-store/data-streams/run-downsampling-with-ilm.md +++ b/manage-data/data-store/data-streams/run-downsampling-with-ilm.md @@ -9,10 +9,11 @@ products: - id: elasticsearch --- - - # Run downsampling with ILM [downsampling-ilm] +:::{warning} +🚧 Work in progress 🚧 +::: This is a simplified example that allows you to see quickly how [downsampling](./downsampling-time-series-data-stream.md) works as part of an ILM policy to reduce the storage size of a sampled set of metrics. The example uses typical Kubernetes cluster monitoring data. To test out downsampling with ILM, follow these steps: diff --git a/manage-data/data-store/data-streams/run-downsampling.md b/manage-data/data-store/data-streams/run-downsampling.md new file mode 100644 index 0000000000..e42a1db748 --- /dev/null +++ b/manage-data/data-store/data-streams/run-downsampling.md @@ -0,0 +1,65 @@ +--- +applies_to: + stack: ga + serverless: ga +products: + - id: elasticsearch +--- + +# Run downsampling on time series data [running-downsampling] + +:::{warning} +🚧 Work in progress 🚧 +::: + +% TODO consider retitling to "Downsample time series data" + +To downsample a time series index, you can use the `downsample API`, index lifecycle management (ILM), or a data stream lifecycle. 
+
+
+::::{tab-set}
+:::{tab-item} Downsample API
+
+## Use the downsample API
+
+Issue a [downsample API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-downsample) request, setting `fixed_interval` to your preferred level of granularity:
+
+```console
+POST /my-time-series-index/_downsample/my-downsampled-time-series-index
+{
+  "fixed_interval": "1d"
+}
+```
+:::
+
+:::{tab-item} Index lifecycle
+
+## Downsample with index lifecycle management
+
+To downsample time series data as part of index lifecycle management (ILM), include a [downsample action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) in your ILM policy, setting `fixed_interval` to your preferred level of granularity:
+
+```console
+PUT _ilm/policy/my_policy
+{
+  "policy": {
+    "phases": {
+      "warm": {
+        "actions": {
+          "downsample": {
+            "fixed_interval": "1h"
+          }
+        }
+      }
+    }
+  }
+}
+```
+:::
+
+:::{tab-item} Data stream lifecycle
+
+Move tutorial here
+
+:::
+
+::::
diff --git a/manage-data/data-store/data-streams/set-up-tsds.md b/manage-data/data-store/data-streams/set-up-tsds.md
index c3f0c41cbe..6068e057c5 100644
--- a/manage-data/data-store/data-streams/set-up-tsds.md
+++ b/manage-data/data-store/data-streams/set-up-tsds.md
@@ -11,10 +11,10 @@ products:
 
 
 
-# Set up a TSDS [set-up-tsds]
+# Set up a time series data stream [set-up-tsds]
 
-To set up a [time series data stream (TSDS)](../data-streams/time-series-data-stream-tsds.md), follow these steps:
+To set up a [time series data stream (TSDS)](../data-streams/time-series-data-stream-tsds.md), complete these steps:
 
 1. Check the [prerequisites](#tsds-prereqs).
 2. [Create an index lifecycle policy](#tsds-ilm-policy).
diff --git a/manage-data/data-store/data-streams/time-series-data-stream-tsds.md b/manage-data/data-store/data-streams/time-series-data-stream-tsds.md index 702f1df002..4a3fcf52da 100644 --- a/manage-data/data-store/data-streams/time-series-data-stream-tsds.md +++ b/manage-data/data-store/data-streams/time-series-data-stream-tsds.md @@ -8,7 +8,11 @@ products: - id: elasticsearch --- -# Time series data stream (TSDS) [tsds] +# Time series data streams (TSDS) [tsds] + +:::{warning} +🚧 Work in progress 🚧 +::: A time series data stream (TSDS) models timestamped metrics data as one or more time series. diff --git a/manage-data/toc.yml b/manage-data/toc.yml index bbca9ac4a0..17e608b59b 100644 --- a/manage-data/toc.yml +++ b/manage-data/toc.yml @@ -15,9 +15,12 @@ toc: children: - file: data-store/data-streams/set-up-tsds.md - file: data-store/data-streams/downsampling-time-series-data-stream.md - - file: data-store/data-streams/run-downsampling-with-ilm.md - - file: data-store/data-streams/run-downsampling-manually.md - - file: data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md + children: + - file: data-store/data-streams/run-downsampling.md + - file: data-store/data-streams/run-downsampling-with-ilm.md + - file: data-store/data-streams/run-downsampling-manually.md + - file: data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md + - file: data-store/data-streams/downsampling-concepts.md - file: data-store/data-streams/reindex-tsds.md - file: data-store/data-streams/logs-data-stream.md - file: data-store/data-streams/failure-store.md From a9368d57e9757e8282815bd798b4f52161667af8 Mon Sep 17 00:00:00 2001 From: Marci W <333176+marciw@users.noreply.github.com> Date: Thu, 24 Jul 2025 17:30:34 -0400 Subject: [PATCH 2/6] Breadcrumbs --- manage-data/data-store/data-streams/downsampling-concepts.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/manage-data/data-store/data-streams/downsampling-concepts.md 
b/manage-data/data-store/data-streams/downsampling-concepts.md index a6e9f4d32d..5588d18b11 100644 --- a/manage-data/data-store/data-streams/downsampling-concepts.md +++ b/manage-data/data-store/data-streams/downsampling-concepts.md @@ -67,6 +67,9 @@ Fields in the target downsampled index are created based on fields in the origin % TODO ^^ make this more concise +% first pass edits up to here +% TODO resume editing from this line down + ## Querying downsampled indices [querying-downsampled-indices] You can use the [`_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) and [`_async_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-async-search-submit) endpoints to query a downsampled index. Multiple raw data and downsampled indices can be queried in a single request, and a single request can include downsampled indices at different granularities (different bucket timespan). That is, you can query data streams that contain downsampled indices with multiple downsampling intervals (for example, `15m`, `1h`, `1d`). From f5e7ca5d9d4f52c487f5da2aa0c1adf438f54f5b Mon Sep 17 00:00:00 2001 From: Marci W <333176+marciw@users.noreply.github.com> Date: Thu, 24 Jul 2025 17:33:19 -0400 Subject: [PATCH 3/6] Fix anchors --- .../data-store/data-streams/run-downsampling-with-ilm.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/manage-data/data-store/data-streams/run-downsampling-with-ilm.md b/manage-data/data-store/data-streams/run-downsampling-with-ilm.md index 9a48ee18cf..12a018467d 100644 --- a/manage-data/data-store/data-streams/run-downsampling-with-ilm.md +++ b/manage-data/data-store/data-streams/run-downsampling-with-ilm.md @@ -352,7 +352,7 @@ After the ILM policy has taken effect, the original `.ds-datastream-2022.08.26-0 ... 
``` -Run a search query on the datastream (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-time-series-data-stream.md#querying-downsampled-indices-notes)). +Run a search query on the datastream (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-concepts.md#querying-downsampled-indices-notes)). ```console GET datastream/_search From 601494fd2ede974407312671a0885f956584e49d Mon Sep 17 00:00:00 2001 From: Marci W <333176+marciw@users.noreply.github.com> Date: Thu, 24 Jul 2025 17:34:15 -0400 Subject: [PATCH 4/6] Save your changes before committing --- .../data-store/data-streams/run-downsampling-manually.md | 2 +- .../run-downsampling-using-data-stream-lifecycle.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/manage-data/data-store/data-streams/run-downsampling-manually.md b/manage-data/data-store/data-streams/run-downsampling-manually.md index 4364c2f43d..70896d183f 100644 --- a/manage-data/data-store/data-streams/run-downsampling-manually.md +++ b/manage-data/data-store/data-streams/run-downsampling-manually.md @@ -405,7 +405,7 @@ You can now delete the old backing index. 
But be aware this will delete the orig ## View the results [downsampling-manual-view-results] -Re-run the earlier search query (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-time-series-data-stream.md#querying-downsampled-indices-notes)): +Re-run the earlier search query (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-concepts.md#querying-downsampled-indices-notes)): ```console GET /my-data-stream/_search diff --git a/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md b/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md index 454425b413..21aca001ef 100644 --- a/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md +++ b/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md @@ -351,7 +351,7 @@ After the data stream lifecycle action was executed, original `.ds-datastream-20 ... ``` -Run a search query on the datastream (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-time-series-data-stream.md#querying-downsampled-indices-notes)). +Run a search query on the datastream (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-concepts.md#querying-downsampled-indices-notes)). 
```console GET datastream/_search From 3a0f5155197f2943affc034969adef6feee02bde Mon Sep 17 00:00:00 2001 From: Marci W <333176+marciw@users.noreply.github.com> Date: Thu, 24 Jul 2025 17:35:57 -0400 Subject: [PATCH 5/6] wip banners --- manage-data/data-store/data-streams/reindex-tsds.md | 6 +++--- manage-data/data-store/data-streams/set-up-tsds.md | 5 +++-- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/manage-data/data-store/data-streams/reindex-tsds.md b/manage-data/data-store/data-streams/reindex-tsds.md index a3a95b4cfa..3b420618c9 100644 --- a/manage-data/data-store/data-streams/reindex-tsds.md +++ b/manage-data/data-store/data-streams/reindex-tsds.md @@ -9,11 +9,11 @@ products: - id: elasticsearch --- - - # Reindex a TSDS [tsds-reindex] - +:::{warning} +🚧 Work in progress 🚧 +::: ## Introduction [tsds-reindex-intro] diff --git a/manage-data/data-store/data-streams/set-up-tsds.md b/manage-data/data-store/data-streams/set-up-tsds.md index 6068e057c5..3b37deea9a 100644 --- a/manage-data/data-store/data-streams/set-up-tsds.md +++ b/manage-data/data-store/data-streams/set-up-tsds.md @@ -9,10 +9,11 @@ products: - id: elasticsearch --- - - # Set up a time series data stream [set-up-tsds] +:::{warning} +🚧 Work in progress 🚧 +::: To set up a [time series data stream (TSDS)](../data-streams/time-series-data-stream-tsds.md), complete these steps: From 4cb3d4fb4f0ae05da9d1adb49594063808e823b1 Mon Sep 17 00:00:00 2001 From: Marci W <333176+marciw@users.noreply.github.com> Date: Tue, 12 Aug 2025 19:14:14 -0400 Subject: [PATCH 6/6] Consolidate further; remove tutorial content --- .../data-streams/downsampling-concepts.md | 4 +- .../downsampling-time-series-data-stream.md | 9 +- .../data-streams/run-downsampling-manually.md | 568 ------------------ ...ownsampling-using-data-stream-lifecycle.md | 498 --------------- .../data-streams/run-downsampling-with-ilm.md | 473 --------------- .../data-streams/run-downsampling.md | 110 +++- manage-data/toc.yml | 3 - 7 files 
changed, 108 insertions(+), 1557 deletions(-)
 delete mode 100644 manage-data/data-store/data-streams/run-downsampling-manually.md
 delete mode 100644 manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md
 delete mode 100644 manage-data/data-store/data-streams/run-downsampling-with-ilm.md

diff --git a/manage-data/data-store/data-streams/downsampling-concepts.md b/manage-data/data-store/data-streams/downsampling-concepts.md
index 5588d18b11..a5a065db6c 100644
--- a/manage-data/data-store/data-streams/downsampling-concepts.md
+++ b/manage-data/data-store/data-streams/downsampling-concepts.md
@@ -98,7 +98,7 @@ The following restrictions and limitations apply for downsampling:
 * Only indices in a [time series data stream](time-series-data-stream-tsds.md) are supported.
 * Data is downsampled based on the time dimension only. All other dimensions are copied to the new index without any modification.
 * Within a data stream, a downsampled index replaces the original index and the original index is deleted. Only one index can exist for a given time period.
-* A source index must be in read-only mode for the downsampling process to succeed. Check the [Run downsampling manually](./run-downsampling-manually.md) example for details.
+* A source index must be in read-only mode for the downsampling process to succeed. Check the Run downsampling manually example for details.
+* Downsampling data for the same period many times (downsampling of a downsampled index) is supported. The downsampling interval must be a multiple of the interval of the downsampled index.
+* Downsampling is provided as an ILM action.
See [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md). * The new, downsampled index is created on the data tier of the original index and it inherits its settings (for example, the number of shards and replicas). diff --git a/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md b/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md index ea575dfb89..074ac8b88c 100644 --- a/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md +++ b/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md @@ -11,8 +11,8 @@ products: # Downsample a time series data stream [downsampling] -:::{warning} -🚧 Work in progress 🚧 +:::{admonition} Page status +🟢 Ready for review ::: Downsampling reduces the footprint of your [time series data](time-series-data-stream-tsds.md) by storing it at reduced granularity. @@ -32,6 +32,5 @@ This section explains the available downsampling options and helps you understan ## Next steps % TODO confirm patterns -* Run downsampling -* Downsampling concepts -* Time series data streams overview \ No newline at end of file +* [](run-downsampling.md) +* [](downsampling-concepts.md) \ No newline at end of file diff --git a/manage-data/data-store/data-streams/run-downsampling-manually.md b/manage-data/data-store/data-streams/run-downsampling-manually.md deleted file mode 100644 index 70896d183f..0000000000 --- a/manage-data/data-store/data-streams/run-downsampling-manually.md +++ /dev/null @@ -1,568 +0,0 @@ ---- -navigation_title: Run downsampling manually -mapped_pages: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/downsampling-manual.html -applies_to: - stack: ga - serverless: ga -products: - - id: elasticsearch ---- - -# Run downsampling manually [downsampling-manual] - -:::{warning} -🚧 Work in progress 🚧 -::: - -The recommended way to [downsample](./downsampling-time-series-data-stream.md) a [time-series 
data stream (TSDS)](../data-streams/time-series-data-stream-tsds.md) is [through index lifecycle management (ILM)](run-downsampling-with-ilm.md). However, if you’re not using ILM, you can downsample a TSDS manually. This guide shows you how, using typical Kubernetes cluster monitoring data. - -To test out manual downsampling, follow these steps: - -1. Check the [prerequisites](#downsampling-manual-prereqs). -2. [Create a time series data stream](#downsampling-manual-create-index). -3. [Ingest time series data](#downsampling-manual-ingest-data). -4. [Downsample the TSDS](#downsampling-manual-run). -5. [View the results](#downsampling-manual-view-results). - - -## Prerequisites [downsampling-manual-prereqs] - -* Refer to the [TSDS prerequisites](./set-up-tsds.md#tsds-prereqs). -* It is not possible to downsample a [data stream](../data-streams.md) directly, nor multiple indices at once. It’s only possible to downsample one time series index (TSDS backing index). -* In order to downsample an index, it needs to be read-only. For a TSDS write index, this means it needs to be rolled over and made read-only first. -* Downsampling uses UTC timestamps. -* Downsampling needs at least one metric field to exist in the time series index. - - -## Create a time series data stream [downsampling-manual-create-index] - -First, you’ll create a TSDS. For simplicity, in the time series mapping all `time_series_metric` parameters are set to type `gauge`, but [other values](time-series-data-stream-tsds.md#time-series-metric) such as `counter` and `histogram` may also be used. The `time_series_metric` values determine the kind of statistical representations that are used during downsampling. - -The index template includes a set of static [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension): `host`, `namespace`, `node`, and `pod`. The time series dimensions are not changed by the downsampling process. 
- -```console -PUT _index_template/my-data-stream-template -{ - "index_patterns": [ - "my-data-stream*" - ], - "data_stream": {}, - "template": { - "settings": { - "index": { - "mode": "time_series", - "routing_path": [ - "kubernetes.namespace", - "kubernetes.host", - "kubernetes.node", - "kubernetes.pod" - ], - "number_of_replicas": 0, - "number_of_shards": 2 - } - }, - "mappings": { - "properties": { - "@timestamp": { - "type": "date" - }, - "kubernetes": { - "properties": { - "container": { - "properties": { - "cpu": { - "properties": { - "usage": { - "properties": { - "core": { - "properties": { - "ns": { - "type": "long" - } - } - }, - "limit": { - "properties": { - "pct": { - "type": "float" - } - } - }, - "nanocores": { - "type": "long", - "time_series_metric": "gauge" - }, - "node": { - "properties": { - "pct": { - "type": "float" - } - } - } - } - } - } - }, - "memory": { - "properties": { - "available": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - }, - "majorpagefaults": { - "type": "long" - }, - "pagefaults": { - "type": "long", - "time_series_metric": "gauge" - }, - "rss": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - }, - "usage": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - }, - "limit": { - "properties": { - "pct": { - "type": "float" - } - } - }, - "node": { - "properties": { - "pct": { - "type": "float" - } - } - } - } - }, - "workingset": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - } - } - }, - "name": { - "type": "keyword" - }, - "start_time": { - "type": "date" - } - } - }, - "host": { - "type": "keyword", - "time_series_dimension": true - }, - "namespace": { - "type": "keyword", - "time_series_dimension": true - }, - "node": { - "type": "keyword", - "time_series_dimension": true - }, - "pod": { - "type": "keyword", - "time_series_dimension": true - } - } - } - } - 
} - } -} -``` - - -## Ingest time series data [downsampling-manual-ingest-data] - -Because time series data streams have been designed to [only accept recent data](time-series-data-stream-tsds.md#tsds-accepted-time-range), in this example, you’ll use an ingest pipeline to time-shift the data as it gets indexed. As a result, the indexed data will have an `@timestamp` from the last 15 minutes. - -Create the pipeline with this request: - -```console -PUT _ingest/pipeline/my-timestamp-pipeline -{ - "description": "Shifts the @timestamp to the last 15 minutes", - "processors": [ - { - "set": { - "field": "ingest_time", - "value": "{{_ingest.timestamp}}" - } - }, - { - "script": { - "lang": "painless", - "source": """ - def delta = ChronoUnit.SECONDS.between( - ZonedDateTime.parse("2022-06-21T15:49:00Z"), - ZonedDateTime.parse(ctx["ingest_time"]) - ); - ctx["@timestamp"] = ZonedDateTime.parse(ctx["@timestamp"]).plus(delta,ChronoUnit.SECONDS).toString(); - """ - } - } - ] -} -``` - -Next, use a bulk API request to automatically create your TSDS and index a set of ten documents: - -```console -PUT /my-data-stream/_bulk?refresh&pipeline=my-timestamp-pipeline -{"create": {}} -{"@timestamp":"2022-06-21T15:49:00Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":91153,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":463314616},"usage":{"bytes":307007078,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":585236},"rss":{"bytes":102728},"pagefaults":120901,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:45:50Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":124501,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":982546514},"usage":{"bytes":360035574,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1339884},"rss":{"bytes":381174},"pagefaults":178473,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:44:50Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":38907,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":862723768},"usage":{"bytes":379572388,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":431227},"rss":{"bytes":386580},"pagefaults":233166,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:44:40Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":86706,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":567160996},"usage":{"bytes":103266017,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1724908},"rss":{"bytes":105431},"pagefaults":233166,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:44:00Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":150069,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":639054643},"usage":{"bytes":265142477,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1786511},"rss":{"bytes":189235},"pagefaults":138172,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:42:40Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":82260,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":854735585},"usage":{"bytes":309798052,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":924058},"rss":{"bytes":110838},"pagefaults":259073,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:42:10Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":153404,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":279586406},"usage":{"bytes":214904955,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1047265},"rss":{"bytes":91914},"pagefaults":302252,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:40:20Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":125613,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":822782853},"usage":{"bytes":100475044,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":2109932},"rss":{"bytes":278446},"pagefaults":74843,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:40:10Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":100046,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":567160996},"usage":{"bytes":362826547,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1986724},"rss":{"bytes":402801},"pagefaults":296495,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:38:30Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":40018,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":1062428344},"usage":{"bytes":265142477,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":2294743},"rss":{"bytes":340623},"pagefaults":224530,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -``` - -You can use the search API to check if the documents have been indexed correctly: - -```console -GET /my-data-stream/_search -``` - -Run the following aggregation on the data to calculate some interesting statistics: - 
-```console -GET /my-data-stream/_search -{ - "size": 0, - "aggs": { - "tsid": { - "terms": { - "field": "_tsid" - }, - "aggs": { - "over_time": { - "date_histogram": { - "field": "@timestamp", - "fixed_interval": "1d" - }, - "aggs": { - "min": { - "min": { - "field": "kubernetes.container.memory.usage.bytes" - } - }, - "max": { - "max": { - "field": "kubernetes.container.memory.usage.bytes" - } - }, - "avg": { - "avg": { - "field": "kubernetes.container.memory.usage.bytes" - } - } - } - } - } - } - } -} -``` - - -## Downsample the TSDS [downsampling-manual-run] - -A TSDS can’t be downsampled directly. You need to downsample its backing indices instead. You can see the backing index for your data stream by running: - -```console -GET /_data_stream/my-data-stream -``` - -This returns: - -```console-result -{ - "data_streams": [ - { - "name": "my-data-stream", - "timestamp_field": { - "name": "@timestamp" - }, - "indices": [ - { - "index_name": ".ds-my-data-stream-2023.07.26-000001", <1> - "index_uuid": "ltOJGmqgTVm4T-Buoe7Acg", - "prefer_ilm": true, - "managed_by": "Unmanaged" - } - ], - "generation": 1, - "status": "GREEN", - "next_generation_managed_by": "Unmanaged", - "prefer_ilm": true, - "template": "my-data-stream-template", - "hidden": false, - "system": false, - "allow_custom_routing": false, - "replicated": false, - "rollover_on_write": false, - "time_series": { - "temporal_ranges": [ - { - "start": "2023-07-26T09:26:42.000Z", - "end": "2023-07-26T13:26:42.000Z" - } - ] - } - } - ] -} -``` - -1. The backing index for this data stream. - - -Before a backing index can be downsampled, the TSDS needs to be rolled over and the old index needs to be made read-only. - -Roll over the TSDS using the [rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover): - -```console -POST /my-data-stream/_rollover/ -``` - -Copy the name of the `old_index` from the response. 
In the following steps, replace the index name with that of your `old_index`. - -The old index needs to be set to read-only mode. Run the following request: - -```console -PUT /.ds-my-data-stream-2023.07.26-000001/_block/write -``` - -Next, use the [downsample API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-downsample) to downsample the index, setting the time series interval to one hour: - -```console -POST /.ds-my-data-stream-2023.07.26-000001/_downsample/.ds-my-data-stream-2023.07.26-000001-downsample -{ - "fixed_interval": "1h" -} -``` - -Now you can [modify the data stream](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-modify-data-stream), and replace the original index with the downsampled one: - -```console -POST _data_stream/_modify -{ - "actions": [ - { - "remove_backing_index": { - "data_stream": "my-data-stream", - "index": ".ds-my-data-stream-2023.07.26-000001" - } - }, - { - "add_backing_index": { - "data_stream": "my-data-stream", - "index": ".ds-my-data-stream-2023.07.26-000001-downsample" - } - } - ] -} -``` - -You can now delete the old backing index. But be aware this will delete the original data. Don’t delete the index if you may need the original data in the future. - - -## View the results [downsampling-manual-view-results] - -Re-run the earlier search query (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-concepts.md#querying-downsampled-indices-notes)): - -```console -GET /my-data-stream/_search -``` - -The TSDS with the new downsampled backing index contains just one document. For counters, this document would only have the last value. For gauges, the field type is now `aggregate_metric_double`. 
You see the `min`, `max`, `sum`, and `value_count` statistics based off of the original sampled metrics: - -```console-result -{ - "took": 2, - "timed_out": false, - "_shards": { - "total": 4, - "successful": 4, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 1, - "relation": "eq" - }, - "max_score": 1, - "hits": [ - { - "_index": ".ds-my-data-stream-2023.07.26-000001-downsample", - "_id": "0eL0wC_4-45SnTNFAAABiZHbD4A", - "_score": 1, - "_source": { - "@timestamp": "2023-07-26T11:00:00.000Z", - "_doc_count": 10, - "ingest_time": "2023-07-26T11:26:42.715Z", - "kubernetes": { - "container": { - "cpu": { - "usage": { - "core": { - "ns": 12828317850 - }, - "limit": { - "pct": 0.0000277905 - }, - "nanocores": { - "min": 38907, - "max": 153404, - "sum": 992677, - "value_count": 10 - }, - "node": { - "pct": 0.0000277905 - } - } - }, - "memory": { - "available": { - "bytes": { - "min": 279586406, - "max": 1062428344, - "sum": 7101494721, - "value_count": 10 - } - }, - "majorpagefaults": 0, - "pagefaults": { - "min": 74843, - "max": 302252, - "sum": 2061071, - "value_count": 10 - }, - "rss": { - "bytes": { - "min": 91914, - "max": 402801, - "sum": 2389770, - "value_count": 10 - } - }, - "usage": { - "bytes": { - "min": 100475044, - "max": 379572388, - "sum": 2668170609, - "value_count": 10 - }, - "limit": { - "pct": 0.00009923134 - }, - "node": { - "pct": 0.017700378 - } - }, - "workingset": { - "bytes": { - "min": 431227, - "max": 2294743, - "sum": 14230488, - "value_count": 10 - } - } - }, - "name": "container-name-44", - "start_time": "2021-03-30T07:59:06.000Z" - }, - "host": "gke-apps-0", - "namespace": "namespace26", - "node": "gke-apps-0-0", - "pod": "gke-apps-0-0-0" - } - } - } - ] - } -} -``` - -Re-run the earlier aggregation. Even though the aggregation runs on the downsampled TSDS that only contains 1 document, it returns the same results as the earlier aggregation on the original TSDS. 
- -```console -GET /my-data-stream/_search -{ - "size": 0, - "aggs": { - "tsid": { - "terms": { - "field": "_tsid" - }, - "aggs": { - "over_time": { - "date_histogram": { - "field": "@timestamp", - "fixed_interval": "1d" - }, - "aggs": { - "min": { - "min": { - "field": "kubernetes.container.memory.usage.bytes" - } - }, - "max": { - "max": { - "field": "kubernetes.container.memory.usage.bytes" - } - }, - "avg": { - "avg": { - "field": "kubernetes.container.memory.usage.bytes" - } - } - } - } - } - } - } -} -``` - -This example demonstrates how downsampling can dramatically reduce the number of documents stored for time series data, within whatever time boundaries you choose. It’s also possible to perform downsampling on already downsampled data, to further reduce storage and associated costs, as the time series data ages and the data resolution becomes less critical. - -The recommended way to downsample a TSDS is with ILM. To learn more, try the [Run downsampling with ILM](./run-downsampling-with-ilm.md) example. - diff --git a/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md b/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md deleted file mode 100644 index 21aca001ef..0000000000 --- a/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md +++ /dev/null @@ -1,498 +0,0 @@ ---- -navigation_title: Run downsampling using data stream lifecycle -mapped_pages: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/downsampling-dsl.html -applies_to: - stack: ga - serverless: ga -products: - - id: elasticsearch ---- - -# Run downsampling using data stream lifecycle [downsampling-dsl] - -:::{warning} -🚧 Work in progress 🚧 -::: - -This is a simplified example that allows you to see quickly how [downsampling](./downsampling-time-series-data-stream.md) works as part of a datastream lifecycle to reduce the storage size of a sampled set of metrics. 
The example uses typical Kubernetes cluster monitoring data. To test out downsampling with data stream lifecycle, follow these steps:
-
-1. Check the [prerequisites](#downsampling-dsl-prereqs).
-2. [Create an index template with data stream lifecycle](#downsampling-dsl-create-index-template).
-3. [Ingest time series data](#downsampling-dsl-ingest-data).
-4. [View current state of data stream](#downsampling-dsl-view-data-stream-state).
-5. [Roll over the data stream](#downsampling-dsl-rollover).
-6. [View downsampling results](#downsampling-dsl-view-results).
-
-
-## Prerequisites [downsampling-dsl-prereqs]
-
-Refer to [time series data stream prerequisites](./set-up-tsds.md#tsds-prereqs).
-
-
-## Create an index template with data stream lifecycle [downsampling-dsl-create-index-template]
-
-This creates an index template for a basic data stream. The available parameters for an index template are described in detail in [Set up a time series data stream](set-up-data-stream.md).
-
-For simplicity, in the time series mapping all `time_series_metric` parameters are set to type `gauge`, but the `counter` metric type may also be used. The `time_series_metric` values determine the kind of statistical representations that are used during downsampling.
-
-The index template includes a set of static [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension): `host`, `namespace`, `node`, and `pod`. The time series dimensions are not changed by the downsampling process.
-
-To enable downsampling, this template includes a `lifecycle` section with a [downsampling](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) object. The `fixed_interval` parameter sets the downsampling interval at which you want to aggregate the original time series data. The `after` parameter specifies how much time should pass after the index is rolled over before downsampling is performed.
- -```console -PUT _index_template/datastream_template -{ - "index_patterns": [ - "datastream*" - ], - "data_stream": {}, - "template": { - "lifecycle": { - "downsampling": [ - { - "after": "1m", - "fixed_interval": "1h" - } - ] - }, - "settings": { - "index": { - "mode": "time_series" - } - }, - "mappings": { - "properties": { - "@timestamp": { - "type": "date" - }, - "kubernetes": { - "properties": { - "container": { - "properties": { - "cpu": { - "properties": { - "usage": { - "properties": { - "core": { - "properties": { - "ns": { - "type": "long" - } - } - }, - "limit": { - "properties": { - "pct": { - "type": "float" - } - } - }, - "nanocores": { - "type": "long", - "time_series_metric": "gauge" - }, - "node": { - "properties": { - "pct": { - "type": "float" - } - } - } - } - } - } - }, - "memory": { - "properties": { - "available": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - }, - "majorpagefaults": { - "type": "long" - }, - "pagefaults": { - "type": "long", - "time_series_metric": "gauge" - }, - "rss": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - }, - "usage": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - }, - "limit": { - "properties": { - "pct": { - "type": "float" - } - } - }, - "node": { - "properties": { - "pct": { - "type": "float" - } - } - } - } - }, - "workingset": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - } - } - }, - "name": { - "type": "keyword" - }, - "start_time": { - "type": "date" - } - } - }, - "host": { - "type": "keyword", - "time_series_dimension": true - }, - "namespace": { - "type": "keyword", - "time_series_dimension": true - }, - "node": { - "type": "keyword", - "time_series_dimension": true - }, - "pod": { - "type": "keyword", - "time_series_dimension": true - } - } - } - } - } - } -} -``` - - -## Ingest time series data 
[downsampling-dsl-ingest-data] - -Use a bulk API request to automatically create your TSDS and index a set of ten documents. - -**Important:** Before running this bulk request you need to update the timestamps to within three to five hours after your current time. That is, search `2022-06-21T15` and replace with your present date, and adjust the hour to your current time plus three hours. - -```console -PUT /datastream/_bulk?refresh -{"create": {}} -{"@timestamp":"2022-06-21T15:49:00Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":91153,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":463314616},"usage":{"bytes":307007078,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":585236},"rss":{"bytes":102728},"pagefaults":120901,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:45:50Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":124501,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":982546514},"usage":{"bytes":360035574,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1339884},"rss":{"bytes":381174},"pagefaults":178473,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:44:50Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":38907,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":862723768},"usage":{"bytes":379572388,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":431227},"rss":{"bytes":386580},"pagefaults":233166,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:44:40Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":86706,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":567160996},"usage":{"bytes":103266017,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1724908},"rss":{"bytes":105431},"pagefaults":233166,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:44:00Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":150069,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":639054643},"usage":{"bytes":265142477,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1786511},"rss":{"bytes":189235},"pagefaults":138172,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:42:40Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":82260,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":854735585},"usage":{"bytes":309798052,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":924058},"rss":{"bytes":110838},"pagefaults":259073,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:42:10Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":153404,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":279586406},"usage":{"bytes":214904955,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1047265},"rss":{"bytes":91914},"pagefaults":302252,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:40:20Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":125613,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":822782853},"usage":{"bytes":100475044,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":2109932},"rss":{"bytes":278446},"pagefaults":74843,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:40:10Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":100046,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":567160996},"usage":{"bytes":362826547,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1986724},"rss":{"bytes":402801},"pagefaults":296495,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:38:30Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":40018,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":1062428344},"usage":{"bytes":265142477,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":2294743},"rss":{"bytes":340623},"pagefaults":224530,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -``` - - -## View current state of data stream [downsampling-dsl-view-data-stream-state] - -Now that you’ve created and added documents to the data stream, check to confirm the current state of the new index. - -```console -GET _data_stream -``` - -If the data stream lifecycle policy has not yet been applied, your results will be like the following. Note the original `index_name`: `.ds-datastream-2024.04.29-000001`. 
-
-```console-result
-{
-  "data_streams": [
-    {
-      "name": "datastream",
-      "timestamp_field": {
-        "name": "@timestamp"
-      },
-      "indices": [
-        {
-          "index_name": ".ds-datastream-2024.04.29-000001",
-          "index_uuid": "vUMNtCyXQhGdlo1BD-cGRw",
-          "managed_by": "Data stream lifecycle"
-        }
-      ],
-      "generation": 1,
-      "status": "GREEN",
-      "template": "datastream_template",
-      "lifecycle": {
-        "enabled": true,
-        "downsampling": [
-          {
-            "after": "1m",
-            "fixed_interval": "1h"
-          }
-        ]
-      },
-      "next_generation_managed_by": "Data stream lifecycle",
-      "hidden": false,
-      "system": false,
-      "allow_custom_routing": false,
-      "replicated": false,
-      "rollover_on_write": false,
-      "time_series": {
-        "temporal_ranges": [
-          {
-            "start": "2024-04-29T15:55:46.000Z",
-            "end": "2024-04-29T18:25:46.000Z"
-          }
-        ]
-      }
-    }
-  ]
-}
-```
-
-Next, run a search query:
-
-```console
-GET datastream/_search
-```
-
-The query returns your ten newly added documents.
-
-```console-result
-{
-  "took": 23,
-  "timed_out": false,
-  "_shards": {
-    "total": 1,
-    "successful": 1,
-    "skipped": 0,
-    "failed": 0
-  },
-  "hits": {
-    "total": {
-      "value": 10,
-      "relation": "eq"
-    },
-...
-```
-
-
-## Roll over the data stream [downsampling-dsl-rollover]
-
-The data stream lifecycle will automatically roll over the data stream and perform downsampling. This manual rollover step is needed only so that you can see downsampling results within the scope of this tutorial.
-
-Roll over the data stream using the [rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover):
-
-```console
-POST /datastream/_rollover/
-```
-
-
-## View downsampling results [downsampling-dsl-view-results]
-
-By default, data stream lifecycle actions are executed every five minutes. 
Downsampling takes place after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed, because the source index is still expected to receive writes until then. The index was rolled over in the previous step, but its time series range end is likely still in the future. Once the index's time series range is in the past, re-run the `GET _data_stream` request.
-
-```console
-GET _data_stream
-```
-
-After the data stream lifecycle action has run, the original `.ds-datastream-2024.04.29-000001` index is replaced with a new, downsampled index, in this case `downsample-1h-.ds-datastream-2024.04.29-000001`.
-
-```console-result
-{
-  "data_streams": [
-    {
-      "name": "datastream",
-      "timestamp_field": {
-        "name": "@timestamp"
-      },
-      "indices": [
-        {
-          "index_name": "downsample-1h-.ds-datastream-2024.04.29-000001",
-          "index_uuid": "VqXuShP4T8ODAOnWFcqitg",
-          "managed_by": "Data stream lifecycle"
-        },
-        {
-          "index_name": ".ds-datastream-2024.04.29-000002",
-          "index_uuid": "8gCeSdjUSWG-o-PeEAJ0jA",
-          "managed_by": "Data stream lifecycle"
-        }
-      ],
-...
-```
-
-Run a search query on the datastream (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-concepts.md#querying-downsampled-indices-notes)).
-
-```console
-GET datastream/_search
-```
-
-The new downsampled index contains just one document that includes the `min`, `max`, `sum`, and `value_count` statistics based on the original sampled metrics.
- -```console-result -{ - "took": 26, - "timed_out": false, - "_shards": { - "total": 2, - "successful": 2, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 1, - "relation": "eq" - }, - "max_score": 1, - "hits": [ - { - "_index": "downsample-1h-.ds-datastream-2024.04.29-000001", - "_id": "0eL0wMf38sl_s5JnAAABjyrMjoA", - "_score": 1, - "_source": { - "@timestamp": "2024-04-29T17:00:00.000Z", - "_doc_count": 10, - "kubernetes": { - "container": { - "cpu": { - "usage": { - "core": { - "ns": 12828317850 - }, - "limit": { - "pct": 0.0000277905 - }, - "nanocores": { - "min": 38907, - "max": 153404, - "sum": 992677, - "value_count": 10 - }, - "node": { - "pct": 0.0000277905 - } - } - }, - "memory": { - "available": { - "bytes": { - "min": 279586406, - "max": 1062428344, - "sum": 7101494721, - "value_count": 10 - } - }, - "majorpagefaults": 0, - "pagefaults": { - "min": 74843, - "max": 302252, - "sum": 2061071, - "value_count": 10 - }, - "rss": { - "bytes": { - "min": 91914, - "max": 402801, - "sum": 2389770, - "value_count": 10 - } - }, - "usage": { - "bytes": { - "min": 100475044, - "max": 379572388, - "sum": 2668170609, - "value_count": 10 - }, - "limit": { - "pct": 0.00009923134 - }, - "node": { - "pct": 0.017700378 - } - }, - "workingset": { - "bytes": { - "min": 431227, - "max": 2294743, - "sum": 14230488, - "value_count": 10 - } - } - }, - "name": "container-name-44", - "start_time": "2021-03-30T07:59:06.000Z" - }, - "host": "gke-apps-0", - "namespace": "namespace26", - "node": "gke-apps-0-0", - "pod": "gke-apps-0-0-0" - } - } - } - ] - } -} -``` - -Use the [data stream stats API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-data-streams-stats-1) to get statistics for the data stream, including the storage size. 
- -```console -GET /_data_stream/datastream/_stats?human=true -``` - -```console-result -{ - "_shards": { - "total": 4, - "successful": 4, - "failed": 0 - }, - "data_stream_count": 1, - "backing_indices": 2, - "total_store_size": "37.3kb", - "total_store_size_bytes": 38230, - "data_streams": [ - { - "data_stream": "datastream", - "backing_indices": 2, - "store_size": "37.3kb", - "store_size_bytes": 38230, - "maximum_timestamp": 1714410000000 - } - ] -} -``` - -This example demonstrates how downsampling works as part of a data stream lifecycle to reduce the storage size of metrics data as it becomes less current and less frequently queried. diff --git a/manage-data/data-store/data-streams/run-downsampling-with-ilm.md b/manage-data/data-store/data-streams/run-downsampling-with-ilm.md deleted file mode 100644 index 12a018467d..0000000000 --- a/manage-data/data-store/data-streams/run-downsampling-with-ilm.md +++ /dev/null @@ -1,473 +0,0 @@ ---- -navigation_title: Run downsampling with ILM -mapped_pages: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/downsampling-ilm.html -applies_to: - stack: ga - serverless: ga -products: - - id: elasticsearch ---- - -# Run downsampling with ILM [downsampling-ilm] - -:::{warning} -🚧 Work in progress 🚧 -::: - -This is a simplified example that allows you to see quickly how [downsampling](./downsampling-time-series-data-stream.md) works as part of an ILM policy to reduce the storage size of a sampled set of metrics. The example uses typical Kubernetes cluster monitoring data. To test out downsampling with ILM, follow these steps: - -1. Check the [prerequisites](#downsampling-ilm-prereqs). -2. [Create an index lifecycle policy](#downsampling-ilm-policy). -3. [Create an index template](#downsampling-ilm-create-index-template). -4. [Ingest time series data](#downsampling-ilm-ingest-data). -5. [View the results](#downsampling-ilm-view-results). 
- - -## Prerequisites [downsampling-ilm-prereqs] - -Refer to [time series data stream prerequisites](./set-up-tsds.md#tsds-prereqs). - -Before running this example you may want to try the [Run downsampling manually](./run-downsampling-manually.md) example. - - -## Create an index lifecycle policy [downsampling-ilm-policy] - -Create an ILM policy for your time series data. While not required, an ILM policy is recommended to automate the management of your time series data stream indices. - -To enable downsampling, add a [Downsample action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) and set [`fixed_interval`](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md#ilm-downsample-options) to the downsampling interval at which you want to aggregate the original time series data. - -In this example, an ILM policy is configured for the `hot` phase. The downsample takes place after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has lapsed as the source index is still expected to receive major writes until then. {{ilm-cap}} will not proceed with any action that expects the index to not receive writes anymore until the [index’s end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed. 
The {{ilm-cap}} actions that wait on the end time before proceeding are: - [Delete](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-delete.md) - [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) - [Force merge](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md) - [Read only](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md) - [Searchable snapshot](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) - [Shrink](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md) - -```console -PUT _ilm/policy/datastream_policy -{ - "policy": { - "phases": { - "hot": { - "actions": { - "rollover" : { - "max_age": "5m" - }, - "downsample": { - "fixed_interval": "1h" - } - } - } - } - } -} -``` - - -## Create an index template [downsampling-ilm-create-index-template] - -This creates an index template for a basic data stream. The available parameters for an index template are described in detail in [Set up a time series data stream](set-up-data-stream.md). - -For simplicity, in the time series mapping all `time_series_metric` parameters are set to type `gauge`, but the `counter` metric type may also be used. The `time_series_metric` values determine the kind of statistical representations that are used during downsampling. - -The index template includes a set of static [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension): `host`, `namespace`, `node`, and `pod`. The time series dimensions are not changed by the downsampling process. 
- -```console -PUT _index_template/datastream_template -{ - "index_patterns": [ - "datastream*" - ], - "data_stream": {}, - "template": { - "settings": { - "index": { - "mode": "time_series", - "number_of_replicas": 0, - "number_of_shards": 2 - }, - "index.lifecycle.name": "datastream_policy" - }, - "mappings": { - "properties": { - "@timestamp": { - "type": "date" - }, - "kubernetes": { - "properties": { - "container": { - "properties": { - "cpu": { - "properties": { - "usage": { - "properties": { - "core": { - "properties": { - "ns": { - "type": "long" - } - } - }, - "limit": { - "properties": { - "pct": { - "type": "float" - } - } - }, - "nanocores": { - "type": "long", - "time_series_metric": "gauge" - }, - "node": { - "properties": { - "pct": { - "type": "float" - } - } - } - } - } - } - }, - "memory": { - "properties": { - "available": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - }, - "majorpagefaults": { - "type": "long" - }, - "pagefaults": { - "type": "long", - "time_series_metric": "gauge" - }, - "rss": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - }, - "usage": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - }, - "limit": { - "properties": { - "pct": { - "type": "float" - } - } - }, - "node": { - "properties": { - "pct": { - "type": "float" - } - } - } - } - }, - "workingset": { - "properties": { - "bytes": { - "type": "long", - "time_series_metric": "gauge" - } - } - } - } - }, - "name": { - "type": "keyword" - }, - "start_time": { - "type": "date" - } - } - }, - "host": { - "type": "keyword", - "time_series_dimension": true - }, - "namespace": { - "type": "keyword", - "time_series_dimension": true - }, - "node": { - "type": "keyword", - "time_series_dimension": true - }, - "pod": { - "type": "keyword", - "time_series_dimension": true - } - } - } - } - } - } -} -``` - - -## Ingest time series data 
[downsampling-ilm-ingest-data] - -Use a bulk API request to automatically create your TSDS and index a set of ten documents. - -**Important:** Before running this bulk request you need to update the timestamps to within three to five hours after your current time. That is, search `2022-06-21T15` and replace with your present date, and adjust the hour to your current time plus three hours. - -```console -PUT /datastream/_bulk?refresh -{"create": {}} -{"@timestamp":"2022-06-21T15:49:00Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":91153,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":463314616},"usage":{"bytes":307007078,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":585236},"rss":{"bytes":102728},"pagefaults":120901,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:45:50Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":124501,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":982546514},"usage":{"bytes":360035574,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1339884},"rss":{"bytes":381174},"pagefaults":178473,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:44:50Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":38907,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":862723768},"usage":{"bytes":379572388,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":431227},"rss":{"bytes":386580},"pagefaults":233166,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:44:40Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":86706,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":567160996},"usage":{"bytes":103266017,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1724908},"rss":{"bytes":105431},"pagefaults":233166,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:44:00Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":150069,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":639054643},"usage":{"bytes":265142477,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1786511},"rss":{"bytes":189235},"pagefaults":138172,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:42:40Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":82260,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":854735585},"usage":{"bytes":309798052,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":924058},"rss":{"bytes":110838},"pagefaults":259073,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:42:10Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":153404,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":279586406},"usage":{"bytes":214904955,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1047265},"rss":{"bytes":91914},"pagefaults":302252,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:40:20Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":125613,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":822782853},"usage":{"bytes":100475044,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":2109932},"rss":{"bytes":278446},"pagefaults":74843,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} 
-{"@timestamp":"2022-06-21T15:40:10Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":100046,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":567160996},"usage":{"bytes":362826547,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":1986724},"rss":{"bytes":402801},"pagefaults":296495,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -{"create": {}} -{"@timestamp":"2022-06-21T15:38:30Z","kubernetes":{"host":"gke-apps-0","node":"gke-apps-0-0","pod":"gke-apps-0-0-0","container":{"cpu":{"usage":{"nanocores":40018,"core":{"ns":12828317850},"node":{"pct":2.77905e-05},"limit":{"pct":2.77905e-05}}},"memory":{"available":{"bytes":1062428344},"usage":{"bytes":265142477,"node":{"pct":0.01770037710617187},"limit":{"pct":9.923134671484496e-05}},"workingset":{"bytes":2294743},"rss":{"bytes":340623},"pagefaults":224530,"majorpagefaults":0},"start_time":"2021-03-30T07:59:06Z","name":"container-name-44"},"namespace":"namespace26"}} -``` - - -## View the results [downsampling-ilm-view-results] - -Now that you’ve created and added documents to the data stream, check to confirm the current state of the new index. - -```console -GET _data_stream -``` - -If the ILM policy has not yet been applied, your results will be like the following. Note the original `index_name`: `.ds-datastream--000001`. 
- -```console-result -{ - "data_streams": [ - { - "name": "datastream", - "timestamp_field": { - "name": "@timestamp" - }, - "indices": [ - { - "index_name": ".ds-datastream-2022.08.26-000001", - "index_uuid": "5g-3HrfETga-5EFKBM6R-w" - }, - { - "index_name": ".ds-datastream-2022.08.26-000002", - "index_uuid": "o0yRTdhWSo2pY8XMvfwy7Q" - } - ], - "generation": 2, - "status": "GREEN", - "template": "datastream_template", - "ilm_policy": "datastream_policy", - "hidden": false, - "system": false, - "allow_custom_routing": false, - "replicated": false, - "rollover_on_write": false, - "time_series": { - "temporal_ranges": [ - { - "start": "2022-08-26T13:29:07.000Z", - "end": "2022-08-26T19:29:07.000Z" - } - ] - } - } - ] -} -``` - -Next, run a search query: - -```console -GET datastream/_search -``` - -The query returns your ten newly added documents. - -```console-result -{ - "took": 17, - "timed_out": false, - "_shards": { - "total": 4, - "successful": 4, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 10, - "relation": "eq" - }, -... -``` - -By default, index lifecycle management checks every ten minutes for indices that meet policy criteria. Wait for about ten minutes (maybe brew up a quick coffee or tea ☕ ) and then re-run the `GET _data_stream` request. - -```console -GET _data_stream -``` - -After the ILM policy has taken effect, the original `.ds-datastream-2022.08.26-000001` index is replaced with a new, downsampled index, in this case `downsample-6tkn-.ds-datastream-2022.08.26-000001`. - -```console-result -{ - "data_streams": [ - { - "name": "datastream", - "timestamp_field": { - "name": "@timestamp" - }, - "indices": [ - { - "index_name": "downsample-6tkn-.ds-datastream-2022.08.26-000001", - "index_uuid": "qRane1fQQDCNgKQhXmTIvg" - }, - { - "index_name": ".ds-datastream-2022.08.26-000002", - "index_uuid": "o0yRTdhWSo2pY8XMvfwy7Q" - } - ], -... 
-``` - -Run a search query on the datastream (note that when querying downsampled indices there are [a few nuances to be aware of](./downsampling-concepts.md#querying-downsampled-indices-notes)). - -```console -GET datastream/_search -``` - -The new downsampled index contains just one document that includes the `min`, `max`, `sum`, and `value_count` statistics based off of the original sampled metrics. - -```console-result -{ - "took": 6, - "timed_out": false, - "_shards": { - "total": 4, - "successful": 4, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 1, - "relation": "eq" - }, - "max_score": 1, - "hits": [ - { - "_index": "downsample-6tkn-.ds-datastream-2022.08.26-000001", - "_id": "0eL0wC_4-45SnTNFAAABgtpz0wA", - "_score": 1, - "_source": { - "@timestamp": "2022-08-26T14:00:00.000Z", - "_doc_count": 10, - "kubernetes.host": "gke-apps-0", - "kubernetes.namespace": "namespace26", - "kubernetes.node": "gke-apps-0-0", - "kubernetes.pod": "gke-apps-0-0-0", - "kubernetes.container.cpu.usage.nanocores": { - "min": 38907, - "max": 153404, - "sum": 992677, - "value_count": 10 - }, - "kubernetes.container.memory.available.bytes": { - "min": 279586406, - "max": 1062428344, - "sum": 7101494721, - "value_count": 10 - }, - "kubernetes.container.memory.pagefaults": { - "min": 74843, - "max": 302252, - "sum": 2061071, - "value_count": 10 - }, - "kubernetes.container.memory.rss.bytes": { - "min": 91914, - "max": 402801, - "sum": 2389770, - "value_count": 10 - }, - "kubernetes.container.memory.usage.bytes": { - "min": 100475044, - "max": 379572388, - "sum": 2668170609, - "value_count": 10 - }, - "kubernetes.container.memory.workingset.bytes": { - "min": 431227, - "max": 2294743, - "sum": 14230488, - "value_count": 10 - }, - "kubernetes.container.cpu.usage.core.ns": 12828317850, - "kubernetes.container.cpu.usage.limit.pct": 0.000027790500098490156, - "kubernetes.container.cpu.usage.node.pct": 0.000027790500098490156, - 
"kubernetes.container.memory.majorpagefaults": 0, - "kubernetes.container.memory.usage.limit.pct": 0.00009923134348355234, - "kubernetes.container.memory.usage.node.pct": 0.017700377851724625, - "kubernetes.container.name": "container-name-44", - "kubernetes.container.start_time": "2021-03-30T07:59:06.000Z" - } - } - ] - } -} -``` - -Use the [data stream stats API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-data-streams-stats-1) to get statistics for the data stream, including the storage size. - -```console -GET /_data_stream/datastream/_stats?human=true -``` - -```console-result -{ - "_shards": { - "total": 4, - "successful": 4, - "failed": 0 - }, - "data_stream_count": 1, - "backing_indices": 2, - "total_store_size": "16.6kb", - "total_store_size_bytes": 17059, - "data_streams": [ - { - "data_stream": "datastream", - "backing_indices": 2, - "store_size": "16.6kb", - "store_size_bytes": 17059, - "maximum_timestamp": 1661522400000 - } - ] -} -``` - -This example demonstrates how downsampling works as part of an ILM policy to reduce the storage size of metrics data as it becomes less current and less frequently queried. - -You can also try our [Run downsampling manually](./run-downsampling-manually.md) example to learn how downsampling can work outside of an ILM policy. 
diff --git a/manage-data/data-store/data-streams/run-downsampling.md b/manage-data/data-store/data-streams/run-downsampling.md index e42a1db748..d2ba107fc0 100644 --- a/manage-data/data-store/data-streams/run-downsampling.md +++ b/manage-data/data-store/data-streams/run-downsampling.md @@ -2,27 +2,43 @@ applies_to: stack: ga serverless: ga +navigation_title: "Run downsampling" +mapped_pages: + - https://www.elastic.co/guide/en/elasticsearch/reference/current/downsampling-manual.html + - https://www.elastic.co/guide/en/elasticsearch/reference/current/downsampling-ilm.html products: - id: elasticsearch --- # Run downsampling on time series data [running-downsampling] -:::{warning} -🚧 Work in progress 🚧 +:::{admonition} Page status +🟢 Ready for review ::: -% TODO consider retitling to "Downsample time series data" +% TODO consider retitling (cf. overview) -To downsample a time series index, you can use the `downsample API`, index lifecycle management (ILM), or a data stream lifecycle. +To downsample a time series data stream backing index, you can use the `downsample API`, index lifecycle management (ILM), or a data stream lifecycle. +:::{note} +Downsampling runs on the data stream backing index, not the data stream itself. +::: + +## Prerequisites + +Before you start, make sure your index is a candidate for downsampling: + +* The index must be **read-only**. You can roll over a write index and make it read-only. +* The index must have at least one metric field. + +For more details about the downsampling process, refer to [](downsampling-concepts.md). 
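+
+For example, to prepare the latest backing index of a hypothetical data stream named `my-data-stream`, you might first roll the data stream over and then block writes on the previous write index. This is a sketch only; the backing index name below is illustrative and will differ in your cluster:
+
+```console
+POST /my-data-stream/_rollover/
+
+PUT /.ds-my-data-stream-2025.07.24-000001/_block/write
+```
+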
::::{tab-set} :::{tab-item} Downsample API -## Use the downsample API +## Downsampling with the API -Issue a [downsample API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-downsample) request, setting `fixed_interval` to your preferred level of granularity: +Make a [downsample API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-downsample) request: ```console POST /my-time-series-index/_downsample/my-downsampled-time-series-index @@ -30,13 +46,16 @@ POST /my-time-series-index/_downsample/my-downsampled-time-series-index "fixed_interval": "1d" } ``` + +Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval. + ::: :::{tab-item} Index lifecycle -## Downsample with index lifecycle management +## Downsampling with index lifecycle management -To downsample time series data as part of index lifecycle management (ILM), include a [downsample action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) in your ILM policy, setting `fixed_interval` to your preferred level of granularity: +To downsample time series data as part of index lifecycle management (ILM), include a [downsample action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) in your ILM policy: ```console PUT _ilm/policy/my_policy @@ -54,12 +73,85 @@ PUT _ilm/policy/my_policy } } ``` +Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval. + +% TODO consider restoring removed tutorial-esque content + +In this example, an ILM policy is configured for the `hot` phase. The downsample action runs after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed. 
+
+```console
+PUT _ilm/policy/datastream_policy
+{
+  "policy": {
+    "phases": {
+      "hot": {
+        "actions": {
+          "rollover" : {
+            "max_age": "5m"
+          },
+          "downsample": {
+            "fixed_interval": "1h"
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+
 :::
 
 :::{tab-item} Data stream lifecycle
 
-Move tutorial here
+## Downsampling with data stream lifecycle management
+
+To downsample time series data as part of data stream lifecycle management, create an index template that includes a `lifecycle` section with a [downsampling](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) object.
+
+* Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval.
+* Set `after` to the minimum time to wait after an index rollover before downsampling runs.
+
+```console
+PUT _index_template/datastream_template
+{
+  "index_patterns": [
+    "datastream*"
+  ],
+  "data_stream": {},
+  "template": {
+    "lifecycle": {
+      "downsampling": [
+        {
+          "after": "1m",
+          "fixed_interval": "1h"
+        }
+      ]
+    },
+    "settings": {
+      "index": {
+        "mode": "time_series"
+      }
+    },
+    "mappings": {
+      "properties": {
+        "@timestamp": {
+          "type": "date"
+        },
+        [...]
+      }
+    }
+  }
+}
+```
+
+
+For more details about index templates for time series data streams, refer to [](set-up-tsds.md).
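+
+Once the lifecycle has run, you can verify the result by listing the data stream's backing indices. After the downsampling action completes, the original backing index is replaced by a downsampled index whose name carries a `downsample-` prefix with the configured interval (for example, `downsample-1h-.ds-datastream-2024.04.29-000001`):
+
+```console
+GET _data_stream/datastream
+```
+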
::: :::: + +## Additional resources + +* [](downsampling-concepts.md) +* [](time-series-data-stream-tsds.md) diff --git a/manage-data/toc.yml b/manage-data/toc.yml index a87c9ab237..73bb6059a2 100644 --- a/manage-data/toc.yml +++ b/manage-data/toc.yml @@ -17,9 +17,6 @@ toc: - file: data-store/data-streams/downsampling-time-series-data-stream.md children: - file: data-store/data-streams/run-downsampling.md - - file: data-store/data-streams/run-downsampling-with-ilm.md - - file: data-store/data-streams/run-downsampling-manually.md - - file: data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md - file: data-store/data-streams/downsampling-concepts.md - file: data-store/data-streams/reindex-tsds.md - file: data-store/data-streams/logs-data-stream.md