Miek Gieben 804f745951 plugin/health: make reload work (#1585)
* plugin/health: make reload work

Remove the once.Do from the startup, so we can re-bind the HTTP
listener. Also clarify the usage of health in multiple server blocks
(this is not the best approach - but there isn't a generic solution at
this point).

Manual tested as we lack testing infra, i.e kill -SIGUSR1 and some
CURLing of the health endpoint.

* Readme test fix

* update

* dont need this
2018-03-02 21:40:14 -08:00

health

Name

health - enables a health check endpoint.

Description

With health enabled, any plugin that implements the health.Healther interface will be queried for its health. The combined health is exported, by default, on port 8080 at /health.

Syntax

health [ADDRESS]

Optionally takes an address; the default is :8080. The health path is fixed to /health. The health endpoint returns a 200 response code and the word "OK" when CoreDNS is healthy. It returns a 503 when CoreDNS is unhealthy. health periodically (every 1s) polls plugins that export health information. If any of the plugins signals that it is unhealthy, the server will go unhealthy too. Each plugin that supports health checks has a section "Health" in its README.

More options can be set with this extended syntax:

health [ADDRESS] {
    lameduck DURATION
}
  • Where lameduck will make the process unhealthy, then wait for DURATION before the process shuts down.

If you have multiple Server Blocks and need to export health for each of the plugins, you must run health endpoints on different ports:

com {
    whoami
    health :8080
}

net {
    erratic
    health :8081
}

Plugins

Any plugin that implements the Healther interface will be used to report health.
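A minimal Go sketch of how plugins might report health and how the results could be combined. The Healther interface is declared locally here and is assumed to have a single Health() bool method; the cache and upstream types and the combined function are hypothetical and exist only for this example.

```go
package main

import "fmt"

// Healther is declared locally for this sketch. It is assumed to
// match the shape of the interface in the health plugin.
type Healther interface {
	// Health returns false when the plugin is unhealthy.
	Health() bool
}

// cache is a hypothetical plugin that is unhealthy when full.
type cache struct{ full bool }

func (c cache) Health() bool { return !c.full }

// upstream is a hypothetical plugin that is unhealthy when its
// backend cannot be reached.
type upstream struct{ reachable bool }

func (u upstream) Health() bool { return u.reachable }

// combined returns true only if every plugin reports healthy,
// mirroring the rule that one unhealthy plugin makes the server
// unhealthy.
func combined(hs ...Healther) bool {
	for _, h := range hs {
		if !h.Health() {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(combined(cache{}, upstream{reachable: true}))
	fmt.Println(combined(cache{full: true}, upstream{reachable: true}))
}
```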

Metrics

If monitoring is enabled (via the prometheus directive) then the following metric is exported:

  • coredns_health_request_duration_seconds{} - duration to process a /health query. As this should be a local operation it should be fast. A (large) increase in this duration indicates the CoreDNS process is having trouble keeping up with its query load.

Examples

Run another health endpoint on http://localhost:8091.

. {
    health localhost:8091
}

Set a lameduck duration of 1 second:

. {
    health localhost:8092 {
        lameduck 1s
    }
}