mirror of
https://github.com/coredns/coredns.git
synced 2025-11-30 07:34:02 -05:00
* reload: use OnRestart
Close the listener on OnRestart for health and metrics so the default
setup function can set up the listener when the plugin is "starting up".
Lightly tested with some SIGUSR1-ing. Also checked the reload plugin with
this; it seems fine:
.com.:1043
.:1043
2018/04/20 15:01:25 [INFO] CoreDNS-1.1.1
2018/04/20 15:01:25 [INFO] linux/amd64, go1.10,
CoreDNS-1.1.1
linux/amd64, go1.10,
2018/04/20 15:01:25 [INFO] Running configuration MD5 = aa8b3f03946fb60546ca1f725d482714
2018/04/20 15:02:01 [INFO] Reloading
2018/04/20 15:02:01 [INFO] Running configuration MD5 = b34a96d99e01db4015a892212560155f
2018/04/20 15:02:01 [INFO] Reloading complete
^C2018/04/20 15:02:06 [INFO] SIGINT: Shutting down
With this corefile:
.com {
proxy . 127.0.0.1:53
prometheus :9054
whoami
reload
}
. {
proxy . 127.0.0.1:53
prometheus :9054
whoami
reload
}
The prometheus port was 9053, changed that to 54 so reload would pick it
up.
From a cursory look it seems this also fixes:
Fixes #1604 #1618 #1686 #1492
* At least make it test
* Use onfinalshutdown
* reload: add reload test
This tests #1604 and right now fails.
* Address review comments
* Add bug section explaining things a bit
* compile tests
* Fix tests
* fixes
* slightly less crazy
* try to make prometheus setup less confusing
* Use ephemeral port for test
* Don't use the listener
* These are shared between goroutines, just use the boolean in the main
structure.
* Fix text in the reload README
* Set addr to TODO once stopping it
* Morph fturb's comment into test, to test reload and scrape health and
metric endpoint
85 lines
2.0 KiB
Markdown
# health

## Name

*health* - enables a health check endpoint.

## Description

By enabling *health*, any plugin that implements the
[health.Healther interface](https://godoc.org/github.com/coredns/coredns/plugin/health#Healther)
will be queried for its health. The combined health is exported, by default, on port 8080 at `/health`.

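As a rough illustration of what "combined health" means, the sketch below declares a local copy of a `Healther`-style interface and reports healthy only when every plugin does. It assumes the interface consists of a single `Health() bool` method. The `staticPlugin` type and the `combined` function are hypothetical names used for illustration, not CoreDNS code.

```go
package main

import "fmt"

// Healther mirrors the shape assumed for health.Healther: a single
// method reporting whether the plugin considers itself healthy.
// This is a local stand-in, not the CoreDNS type itself.
type Healther interface {
	Health() bool
}

// staticPlugin is a hypothetical plugin whose health is a fixed value.
type staticPlugin struct {
	name    string
	healthy bool
}

// Health implements Healther.
func (p staticPlugin) Health() bool { return p.healthy }

// combined returns true only if every Healther reports healthy.
// With no Healthers at all, the result is healthy.
func combined(hs []Healther) bool {
	for _, h := range hs {
		if !h.Health() {
			return false
		}
	}
	return true
}

func main() {
	plugins := []Healther{
		staticPlugin{name: "a", healthy: true},
		staticPlugin{name: "b", healthy: false},
	}
	// One unhealthy plugin makes the combined result unhealthy.
	fmt.Println("healthy:", combined(plugins))
}
```
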
## Syntax

~~~
health [ADDRESS]
~~~

Optionally takes an address; the default is `:8080`. The health path is fixed to `/health`. The
health endpoint returns a 200 response code and the word "OK" when this server is healthy. It returns
a 503 when it is not. *health* periodically (every 1s) polls plugins that export health information.
If any of the plugins signals that it is unhealthy, the server will go unhealthy too. Each plugin
that supports health checks has a section "Health" in its README.

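A monitoring script could interpret the endpoint roughly as sketched below. Treating 200 plus "OK" as healthy and anything else as unhealthy follows the description above. The `isHealthy` and `probe` helpers are hypothetical names, and the test server only stands in for a running CoreDNS.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// isHealthy applies the rule described above: a 200 status with the
// body "OK" means healthy; anything else (such as a 503) does not.
func isHealthy(status int, body string) bool {
	return status == http.StatusOK && strings.TrimSpace(body) == "OK"
}

// probe fetches url and reports whether the response counts as healthy.
func probe(url string) (bool, error) {
	resp, err := http.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return isHealthy(resp.StatusCode, string(body)), nil
}

func main() {
	// Stand-in for CoreDNS: answers /health with 200 and "OK".
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "OK")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	ok, err := probe(srv.URL + "/health")
	fmt.Println(ok, err)
}
```

Against a real server with the default address, the URL passed to `probe` would be `http://localhost:8080/health`.
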
More options can be set with this extended syntax:

~~~
health [ADDRESS] {
    lameduck DURATION
}
~~~

* `lameduck` makes the process unhealthy, then *waits* for **DURATION** before the process
  shuts down.

If you have multiple Server Blocks and need to export health for each of the plugins, you must run
health endpoints on different ports:

~~~ corefile
com {
    whoami
    health :8080
}

net {
    erratic
    health :8081
}
~~~

## Plugins

Any plugin that implements the Healther interface will be used to report health.

## Metrics

If monitoring is enabled (via the *prometheus* directive) then the following metric is exported:

* `coredns_health_request_duration_seconds{}` - duration to process a /health query. As this should
  be a local operation it should be fast. A (large) increase in this duration indicates the
  CoreDNS process is having trouble keeping up with its query load.

Note that this metric *does not* have a `server` label, because being overloaded is a symptom of
the running process, *not* a specific server.

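To watch for the increase mentioned above, a script could derive a mean duration from scraped metrics output. The sketch below assumes the metric is exposed as a Prometheus histogram, so that `_sum` and `_count` series exist in the text format. The `meanDuration` helper is a hypothetical name and the sample values are made up.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

const metric = "coredns_health_request_duration_seconds"

// meanDuration scans Prometheus text-format output for the metric's
// _sum and _count series and returns sum/count in seconds.
// ok is false when either series is missing or count is zero.
func meanDuration(exposition string) (mean float64, ok bool) {
	var sum, count float64
	var haveSum, haveCount bool
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			continue // comment lines and other non-sample lines
		}
		switch fields[0] {
		case metric + "_sum":
			sum, haveSum = v, true
		case metric + "_count":
			count, haveCount = v, true
		}
	}
	if !haveSum || !haveCount || count == 0 {
		return 0, false
	}
	return sum / count, true
}

func main() {
	// Illustrative sample only; these values are made up.
	sample := `# TYPE coredns_health_request_duration_seconds histogram
coredns_health_request_duration_seconds_bucket{le="+Inf"} 4
coredns_health_request_duration_seconds_sum 0.002
coredns_health_request_duration_seconds_count 4
`
	mean, ok := meanDuration(sample)
	fmt.Println(mean, ok)
}
```

Because the metric has no `server` label, the exact-name match on the `_sum` and `_count` lines is enough here; a labeled metric would need a real text-format parser.
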
## Examples

Run another health endpoint on http://localhost:8091.

~~~ corefile
. {
    health localhost:8091
}
~~~

Set a lameduck duration of 1 second:

~~~ corefile
. {
    health localhost:8092 {
        lameduck 1s
    }
}
~~~