
connect_write timeout until restarting Fluentd #157

@henri9813

Description


Describe the bug

Hello,

This is a follow-up to fluent/fluentd#1844.

Environment:

  • 50+ nodes sending logs to OpenSearch through Fluentd.
  • All nodes send only basic systemd logs.

I observe that "sometimes" (at a random time), Fluentd becomes unable to contact the OpenSearch cluster and fails with a connect_write timeout.

All subsequent automatic retries fail in the same way, with the same error, while the other nodes continue, at the same time, to send their logs successfully.

The curious thing: when I restart Fluentd, it begins the shutdown by flushing the buffer and ... it works, whereas all the previous automatic retries failed.

I have tried tuning many timeout-related parameters, but I don't understand why Fluentd suddenly says "I got a timeout while writing to your server" repeatedly (1, 2, up to 40 times!) and then, when I restart it, succeeds at the very same push.

Do you have an idea?

To Reproduce

I don't know precisely how to reproduce it. On my side, the problem occurs randomly, not at a specific time after start, which is disturbing.

Expected behavior

Logs should be flushed successfully, because on shutdown the flush works, and my 50+ other nodes never fail.

Your Environment

- Fluentd version: 1.16.9
- Package version: 5.0.7-1
- Operating system: Rocky Linux 9
- Kernel version: 4.18.0-553.51.1.el8_10.x86_64

Your Configuration

@include conf.d/*.conf

<filter **>
  @type record_transformer
  enable_ruby true
  <record>
    log_type ${tag}
    server_name "#{Socket.gethostname}"
  </record>
</filter>

<match **>
  @type opensearch
  host xxx
  port 443
  scheme https

  user xxx
  password xxxx

  path /es

  logstash_format true

  ssl_verify true

  request_timeout 300s
  <buffer>
    @type file
    path /var/log/fluent/buffer
    flush_interval 5s
    chunk_limit_size 32m
    total_limit_size 1g
  </buffer>
</match>
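For context: a workaround often suggested for OpenSearch/Elasticsearch output connections that go stale until a restart is to make the plugin drop and reopen its HTTP connection on error instead of reusing it. The option names below come from the fluent-plugin-opensearch documentation; the values are assumptions for this setup, not a confirmed fix:

```
<match **>
  @type opensearch
  # ... existing host/auth/buffer settings as above ...

  # Open a fresh connection after a request error instead of
  # retrying on the (possibly dead) cached connection:
  reconnect_on_error true
  # Reload the connection list when a request fails:
  reload_on_failure true
  # Disable periodic node sniffing, which can replace the configured
  # endpoint with unreachable internal cluster addresses:
  reload_connections false
</match>
```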


In the included files:

<source>
  @type systemd
  @id input_systemd
  path /run/log/journal
  tag systemd

  <storage>
    @type local
    path /var/log/fluent/fluentd-systemd.json
  </storage>
</source>

<filter systemd>
  @type grep
  <exclude>
    key _SYSTEMD_UNIT
    pattern /^mega-exporter\.service$/
  </exclude>
</filter>

<filter systemd>
  @type record_transformer
  renew_record true
  keep_keys SYSLOG_IDENTIFIER, MESSAGE
</filter>

<source>
  @type tail
  tag httpd.access
  path /var/log/httpd/*access_log,/var/www/*/logs/*access_log
  pos_file /var/log/fluent/httpd-access.log.pos
  format apache2 
  path_key log_path
</source>

<source>
  @type tail
  tag httpd.errors
  path /var/log/httpd/*error_log,/var/www/*/logs/*error_log
  pos_file /var/log/td-agent/httpd-error.log.pos
  format apache_error
  path_key log_path
</source>

<filter httpd.errors>
  @type record_transformer
  enable_ruby true
  remove_keys pid
  <record>
    client_ip ${record["client"] ? record["client"].split(":")[0] : nil}
  </record>
</filter>

<filter httpd.**>
  @type record_transformer
  enable_ruby true
  <record>
    domain ${record["log_path"] ? record["log_path"].split('/').last.gsub(/-(access|error)_log$/, '') : nil}
  </record>
</filter>

Your Error Log

2025-05-18 06:38:57 +0200 [warn]: #0 failed to flush the buffer. retry_times=15 next_retry_time=2025-05-18 15:20:41 +0200 chunk="63559b64af2d4b9db721c9907294a3cc" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"xxx\", :port=>443, :scheme=>\"https\", :user=>\"xxx\", :password=>\"obfuscated\", :path=>\"/"}): connect_write timeout reached"

Additional context

2025-05-18 03:53:21 +0200 [info]: #0 flushing all buffer forcedly

does not fix the issue.
