Commit e594763

apm: Add known issue about HTTP 502
1 parent 3ee64a9 commit e594763

File tree

1 file changed

+29
-8
lines changed

docs/en/observability/apm/known-issues.asciidoc

Lines changed: 29 additions & 8 deletions
@@ -21,6 +21,27 @@ _Versions: XX.XX.XX, YY.YY.YY, ZZ.ZZ.ZZ_
2121
// If applicable, link to fix
2222
////
2323

24+
== APM occasionally returning HTTP 502 "backend connection closed" or "use of closed network connection"
25+
26+
_Elastic Stack versions: >=8.0.0 and <8.18.7 or <8.19.4, >=9.0.0 and <9.0.7 or <9.1.4_
27+
_Environments: ECH, ECE_
28+
29+
Due to a rare race condition, APM Server on ECH and ECE might occasionally return HTTP 502 with the error message "backend connection closed" or "use of closed network connection" for any request.
30+
When this happens to an intake request, Elastic APM agents will log an error but will not retry, leading to data loss.
31+
32+
Note that "backend connection closed" or "use of closed network connection" may have other causes; the workaround and the released bugfix resolve only the case caused by this race condition.
33+
34+
*Workaround*
35+
36+
To work around this issue:
37+
38+
* Go to *Kibana* > *Fleet* > *Elastic Cloud agent policy*.
39+
* Next to *Elastic APM*, select the *...* icon, then *Edit Integration*.
40+
* Under *General*, select *Advanced options*, then change *Idle time before underlying connection is closed* to *200s*.
41+
* Select *Save Integration*.
42+
43+
This bug will be fixed in 8.18.7, 8.19.4, 9.0.7, and 9.1.4 for new deployments, and in 8.18.8, 8.19.5, 9.0.8, 9.1.5, and 9.2.0 for upgraded deployments.
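
The version ranges above can be hard to parse at a glance. The following is a minimal sketch, not an official tool, of one reading of them: each listed minor line (8.18, 8.19, 9.0, 9.1) is affected up to its first fixed patch release, earlier 8.x minors are affected throughout, and 9.2.0 or later is assumed to be fixed for all deployments. The function name `is_affected` and the `upgraded` flag are hypothetical names introduced for illustration.

```python
# First fixed patch release per minor line, taken from the text above.
FIXED_NEW = {(8, 18): 7, (8, 19): 4, (9, 0): 7, (9, 1): 4}
FIXED_UPGRADED = {(8, 18): 8, (8, 19): 5, (9, 0): 8, (9, 1): 5}


def is_affected(version: str, upgraded: bool = False) -> bool:
    """Return True if `version` ("major.minor.patch") falls in the affected range.

    Assumption: 9.2.0 and later are treated as fixed for both new and
    upgraded deployments; versions before 8.0.0 are outside the listed range.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if major < 8:
        return False  # before the listed range starts
    if (major, minor) >= (9, 2):
        return False  # assumed fixed from 9.2.0 onwards
    fixed = FIXED_UPGRADED if upgraded else FIXED_NEW
    if (major, minor) in fixed:
        return patch < fixed[(major, minor)]
    # Minor lines with no listed fix release (for example 8.0 to 8.17)
    # sit inside the affected range.
    return True


if __name__ == "__main__":
    for v in ("8.17.3", "8.18.6", "8.18.7", "9.1.4"):
        print(v, is_affected(v), is_affected(v, upgraded=True))
```

This only classifies a version string. It does not detect whether a given HTTP 502 was caused by the race condition, since the same error messages can have other causes.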
44+
2445
[discrete]
2546
== APM Integration might be unreachable after upgrading to 8.19.0 and 9.1.0
2647

@@ -99,18 +120,18 @@ PUT _component_template/metrics-apm.internal@custom
99120
== `prefer_ilm` required in component templates to create custom lifecycle policies
100121

101122
_Elastic Stack versions: 8.15.1+_
102-
123+
103124
// The conditions in which this issue occurs
104125
The issue occurs when creating a _new_ cluster using version 8.15.1+.
105126
The issue occurs for any APM data streams created in 8.15.1+.
106127
The issue does _not_ occur if a custom component template was created in or before version 8.15.0.
107128

108129
// Describe why it happens
109-
In 8.15.0, APM Server began using the https://github.com/elastic/elasticsearch/tree/main/x-pack/plugin/apm-data[apm-data plugin]
110-
to manage data streams, ingest pipelines, lifecycle policies, and more. In 8.15.1, a fix was introduced to address
111-
unmanaged indices in older clusters using default ILM policies. This fix added a fallback to the default ILM policy
112-
(if it exists) and set the `prefer_ilm` configuration to `false`. This setting impacts clusters where both ILM and
113-
data stream lifecycles (DSL) are in effect—such as when configuring custom ILM policies using `@custom` component
130+
In 8.15.0, APM Server began using the https://github.com/elastic/elasticsearch/tree/main/x-pack/plugin/apm-data[apm-data plugin]
131+
to manage data streams, ingest pipelines, lifecycle policies, and more. In 8.15.1, a fix was introduced to address
132+
unmanaged indices in older clusters using default ILM policies. This fix added a fallback to the default ILM policy
133+
(if it exists) and set the `prefer_ilm` configuration to `false`. This setting impacts clusters where both ILM and
134+
data stream lifecycles (DSL) are in effect—such as when configuring custom ILM policies using `@custom` component
114135
templates, under the conditions mentioned above.
115136

116137
// How to fix it
@@ -122,7 +143,7 @@ to `true` by following the {observability-guide}/apm-ilm-how-to.html[updated gui
122143

123144
_Elastic Stack versions: 8.15.0, 8.15.1, 8.15.2, 8.15.3_ +
124145
_Fixed in Elastic Stack version 8.15.4_
125-
146+
126147
// The conditions in which this issue occurs
127148
The issue only occurs when _upgrading_ the {stack} from 8.12.2 or lower directly to any 8.15.x version prior to 8.15.4.
128149
The issue does _not_ occur when creating a _new_ cluster using any 8.15.x version, or when upgrading
@@ -132,7 +153,7 @@ from 8.12.2 to 8.13.x or 8.14.x and then to 8.15.x.
132153
In APM Server versions prior to 8.13.0, an ingest pipeline exists to perform a check on the version.
133154
The version check would fail any APM document produced by a version of APM Server that differs from the version of the installed APM ingest pipeline.
134155
In 8.13.0 the version check in the ingest pipeline was removed.
135-
Due to the combination of an internal change in how apm data management assets are set up from 8.15 onwards and a bug in Elasticsearch,
156+
Due to the combination of an internal change in how apm data management assets are set up from 8.15 onwards and a bug in Elasticsearch,
136157
related to https://github.com/elastic/elasticsearch/issues/112781[lazy rollover of data streams], the ingestion pipeline conducting the version check is not removed on upgrade and prevents the ingestion of data.
137158

138159
// How to fix it
