Commit f84f5ae

feat: support retry in health-check (#131)
1 parent 60e8de4 commit f84f5ae

7 files changed: +272 -191 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -57,3 +57,5 @@ t/servroot
 utils/lj-releng
 [\.]*
 !.github/
+*.etcd
+*.log

health_check.md

Lines changed: 7 additions & 4 deletions
@@ -20,7 +20,7 @@ Initializes the health check object, overiding default params with the given one

 `syntax: health_check.report_failure(etcd_host)`

-Reports a health failure which will count against the number of occurrences required to make a target "fail".
+Reports a health failure which will count against the number of occurrences required to make a target "fail".

 ### get_target_status

@@ -35,6 +35,7 @@ Get the current status of the target.
 | shm_name | string | required | | the declarative `lua_shared_dict` is used to store the health status of endpoints. |
 | fail_timeout | integer | optional | 10s | sets the time during which the specified number of unsuccessful attempts to communicate with the endpoint should happen to marker the endpoint unavailable, and also sets the period of time the endpoint will be marked unavailable. |
 | max_fails | integer | optional | 1 | sets the number of failed attempts that must occur during the `fail_timeout` period for the endpoint to be marked unavailable. |
+| retry | bool | optional | false | automatically retry another endpoint when operations failed. |

 lua example:

@@ -43,16 +44,17 @@ local health_check, err = require("resty.etcd.health_check").init({
 shm_name = "healthcheck_shm",
 fail_timeout = 10,
 max_fails = 1,
+retry = false,
 })
 ```

-In a `fail_timeout`, if there are `max_fails` consecutive failures, the endpoint is marked as unhealthy, the unhealthy endpoint will not be choosed to connect for a `fail_timeout` time in the future.
+In a `fail_timeout`, if there are `max_fails` consecutive failures, the endpoint is marked as unhealthy, the unhealthy endpoint will not be choosed to connect for a `fail_timeout` time in the future.

 Health check mechanism would switch endpoint only when the previously choosed endpoint is marked as unhealthy.

 The failure counter and health status of each etcd endpoint are shared across workers and by different etcd clients.

-Also note that the `fail_timeout` and `max_fails` of the health check cannot be changed once it has been created.
+Also note that the `fail_timeout`, `max_fails` and `retry` of the health check cannot be changed once it has been created.

 ## Synopsis

@@ -69,12 +71,13 @@ http {
 shm_name = "healthcheck_shm",
 fail_timeout = 10,
 max_fails = 1,
+retry = false,
 })

 local etcd, err = require("resty.etcd").new({
 protocol = "v3",
 http_host = {
-"http://127.0.0.1:12379",
+"http://127.0.0.1:12379",
 "http://127.0.0.1:22379",
 "http://127.0.0.1:32379",
 },
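
The health_check.md text in this diff states the marking rule: once `max_fails` failures occur within one `fail_timeout` window, the endpoint is marked unhealthy and is skipped for a further `fail_timeout`. The following pure-Lua sketch models that rule for illustration only. It is not the library's implementation: `new_checker` and the injected `now` clock are hypothetical names, state lives in plain tables instead of a `lua_shared_dict`, and the exact window handling is an assumption about the documented behaviour.

```lua
-- Hypothetical in-memory model of the documented fail_timeout / max_fails rule.
-- `now` is an injected clock function so the behaviour can be checked without waiting.
local function new_checker(opts, now)
    local self = {
        fail_timeout = opts.fail_timeout or 10, -- seconds, same default as the docs
        max_fails = opts.max_fails or 1,
        fails = {},            -- endpoint -> { count = n, first = time of first failure }
        unhealthy_until = {},  -- endpoint -> time at which it may be chosen again
    }

    function self.report_failure(endpoint)
        local t = now()
        local rec = self.fails[endpoint]
        -- Start a new counting window if none exists or the old one has expired.
        if not rec or t - rec.first >= self.fail_timeout then
            rec = { count = 0, first = t }
            self.fails[endpoint] = rec
        end
        rec.count = rec.count + 1
        if rec.count >= self.max_fails then
            self.unhealthy_until[endpoint] = t + self.fail_timeout
            self.fails[endpoint] = nil
        end
    end

    -- Returns true when the endpoint is currently considered healthy.
    function self.get_target_status(endpoint)
        local deadline = self.unhealthy_until[endpoint]
        return not (deadline and now() < deadline)
    end

    return self
end

-- Usage with a fake clock: two failures inside one window mark the endpoint unhealthy.
local clock = 0
local hc = new_checker({ fail_timeout = 10, max_fails = 2 }, function() return clock end)
local endpoint = "http://127.0.0.1:12379"
hc.report_failure(endpoint)
hc.report_failure(endpoint)
local healthy_now = hc.get_target_status(endpoint)   -- false while the timeout runs
clock = 11
local healthy_later = hc.get_target_status(endpoint) -- true once fail_timeout has passed
```

Injecting the clock keeps the sketch independent of `ngx.now()` and makes the window logic easy to exercise in a plain Lua interpreter.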

lib/resty/etcd/health_check.lua

Lines changed: 1 addition & 0 deletions
@@ -77,6 +77,7 @@ function _M.init(opts)
 conf.shm_name = opts.shm_name
 conf.fail_timeout = opts.fail_timeout or 10 -- 10 sec
 conf.max_fails = opts.max_fails or 1
+conf.retry = opts.retry or false
 _M.conf = conf
 return _M, nil
 end
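
This hunk only stores `conf.retry`; the code that acts on the flag lives in files not shown in this excerpt. As a rough illustration of what "retry another endpoint when operations failed" could look like, here is a small pure-Lua sketch. The helper name `with_retry`, its signature, and the simple in-order iteration are all assumptions made for the example; the real client may also consult the health status and report failures, which this sketch leaves out.

```lua
-- Hypothetical helper: run `op` against endpoints in order.
-- With retry disabled, only the first endpoint is tried.
-- With retry enabled, the next endpoint is tried after each failure.
local function with_retry(endpoints, retry, op)
    local last_err
    for _, endpoint in ipairs(endpoints) do
        local res, err = op(endpoint)
        if res then
            return res, nil
        end
        last_err = err
        if not retry then
            break
        end
    end
    return nil, last_err
end

-- Usage: the first endpoint fails, the second succeeds.
local endpoints = { "http://127.0.0.1:12379", "http://127.0.0.1:22379" }
local function op(endpoint)
    if endpoint == endpoints[1] then
        return nil, "connection refused"
    end
    return "ok from " .. endpoint, nil
end

local res_no_retry, err_no_retry = with_retry(endpoints, false, op)
local res_retry, err_retry = with_retry(endpoints, true, op)
```

Because the default is `opts.retry or false`, callers that omit the option get the single-attempt behaviour.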

lib/resty/etcd/utils.lua

Lines changed: 4 additions & 0 deletions
@@ -86,6 +86,10 @@ function _M.has_value(arr, val)
     return false
 end

+function _M.starts_with(str, start)
+    return str:sub(1, #start) == start
+end
+
 local ngx_log = ngx.log
 local ngx_ERR = ngx.ERR
 local ngx_INFO = ngx.INFO
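
The new `starts_with` helper compares the first `#start` bytes of `str` with `start`. The snippet below copies the function body into a standalone local so it can be run in a plain Lua interpreter; the sample URLs are illustrative, since the call sites are not part of this excerpt.

```lua
-- Standalone copy of the helper added in utils.lua.
local function starts_with(str, start)
    return str:sub(1, #start) == start
end

local is_https = starts_with("https://127.0.0.1:2379", "https://")  -- prefix present
local is_http_only = starts_with("http://127.0.0.1:2379", "https://") -- prefix absent
local empty_prefix = starts_with("abc", "")   -- str:sub(1, 0) is "", so this matches
local too_short = starts_with("ht", "http://") -- str shorter than the prefix
```

The comparison is a plain byte comparison, so pattern characters in `start` (for example `.` or `%`) carry no special meaning.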
