# forward
## Name
*forward* - facilitates proxying DNS messages to upstream resolvers.
## Description
The *forward* plugin re-uses already opened sockets to the upstreams. It supports UDP, TCP and
DNS-over-TLS and uses in-band health checking.

When it detects an error a health check is performed. This check runs in a loop, performing each
check at a *0.5s* interval for as long as the upstream reports unhealthy. Once healthy we stop
health checking (until the next error). The health checks use a recursive DNS query (`. IN NS`)
to get upstream health. Any response that is not a network error (so REFUSED, NOTIMPL, SERVFAIL, etc.
all count) is taken as a healthy upstream. The health check uses the same protocol as specified in **TO**. If
`max_fails` is set to 0, no checking is performed and upstreams will always be considered healthy.

When *all* upstreams are down it assumes health checking as a mechanism has failed and will try to
connect to a random upstream (which may or may not work).
## Syntax
In its most basic form, a simple forwarder uses this syntax:
~~~
forward FROM TO...
~~~
* **FROM** is the base domain to match for the request to be forwarded. Domains using CIDR notation
that expand to multiple reverse zones are not fully supported; only the first expanded zone is used.
* **TO...** are the destination endpoints to forward to. The **TO** syntax allows you to specify
  a protocol: `tls://9.9.9.9` for DNS-over-TLS, or `dns://` (or no protocol) for plain DNS. The number of upstreams is
  limited to 15.

Multiple upstreams are randomized (see `policy`) on first use. When a healthy proxy returns an error
during the exchange the next upstream in the list is tried.

Extra knobs are available with an expanded syntax:
~~~
forward FROM TO... {
    except IGNORED_NAMES...
    force_tcp
    prefer_udp
    expire DURATION
    max_fails INTEGER
    tls CERT KEY CA
    tls_servername NAME
    policy random|round_robin|sequential
    health_check DURATION [no_rec] [domain FQDN]
    max_concurrent MAX
    next RCODE_1 [RCODE_2] [RCODE_3...]
}
~~~
* **FROM** and **TO...** as above.
* **IGNORED_NAMES** in `except` is a space-separated list of domains to exclude from forwarding.
Requests that match none of these names will be passed through.
* `force_tcp`, use TCP even when the request comes in over UDP.
* `prefer_udp`, try UDP first even when the request comes in over TCP. If the response is truncated
  (TC flag set in the response), make another attempt over TCP. If both `force_tcp` and
  `prefer_udp` are specified, `force_tcp` takes precedence.
* `max_fails` is the number of consecutive failed health checks that are needed before considering
  an upstream to be down. If 0, the upstream will never be marked as down (nor health checked).
  Default is 2.
* `expire` **DURATION**, expire (cached) connections after this time, the default is 10s.
* `tls` **CERT** **KEY** **CA** define the TLS properties for the TLS connection. From 0 to 3 arguments can be
  provided, with the meaning described below:
  * `tls` - no client authentication is used, and the system CAs are used to verify the server certificate
  * `tls` **CA** - no client authentication is used, and the file CA is used to verify the server certificate
  * `tls` **CERT** **KEY** - client authentication is used with the specified cert/key pair.
    The server certificate is verified with the system CAs
  * `tls` **CERT** **KEY** **CA** - client authentication is used with the specified cert/key pair.
    The server certificate is verified using the specified CA file
* `tls_servername` **NAME** allows you to set a server name in the TLS configuration; for instance 9.9.9.9
  needs this to be set to `dns.quad9.net`. Multiple upstreams are still allowed in this scenario,
  but they have to use the same `tls_servername`. E.g. mixing 9.9.9.9 (Quad9) with 1.1.1.1
  (Cloudflare) will not work. Using TLS forwarding without setting `tls_servername` allows anyone
  to man-in-the-middle your connection to the DNS server you are forwarding to. Because of this,
  it is strongly recommended to set this value when using TLS forwarding.
* `policy` specifies the policy to use for selecting upstream servers. The default is `random`.
  * `random` is a policy that implements random upstream selection.
  * `round_robin` is a policy that selects hosts based on round robin ordering.
  * `sequential` is a policy that selects hosts based on sequential ordering.
* `health_check` configures the behaviour of health checking of the upstream servers.
  * `<duration>` - use a different duration for health checking, the default duration is 0.5s.
  * `no_rec` - optional argument that sets the RecursionDesired flag of the DNS query used in health checking to `false`.
    The flag is `true` by default.
  * `domain FQDN` - set the domain name used for health checks to **FQDN**.
    If not configured, the domain name used for health checks is `.`.
* `max_concurrent` **MAX** will limit the number of concurrent queries to **MAX**. Any new query that would
  raise the number of concurrent queries above **MAX** will result in a REFUSED response. This
  response does not count as a health failure. When choosing a value for **MAX**, pick a number
  at least greater than the expected *upstream query rate* × *latency* of the upstream servers.
  As an upper bound for **MAX**, consider that each concurrent query will use about 2kb of memory.
* `next` **RCODE_1** [**RCODE_2**...], if one of the listed RCODEs (e.g. `NXDOMAIN`) is returned by the remote, execute the next plugin. If no next plugin is defined, or the next plugin is not a `forward` plugin, this setting is ignored.

Also note the TLS config is "global" for the whole forwarding proxy; if you need a different
`tls_servername` for different upstreams you're out of luck.

On each endpoint, the timeouts for communication are set as follows:

* The dial timeout by default is 30s, and can decrease automatically down to 1s based on early results.
* The read timeout is static at 2s.
## Metadata
The forward plugin will publish the following metadata, if the *metadata*
plugin is also enabled:
* `forward/upstream`: the upstream used to forward the request
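As a sketch of how this metadata might be consumed, the selected upstream can be added to each query log line. This assumes the *log* plugin's `{/LABEL}` metadata placeholder; the upstream addresses and the log format are placeholders chosen for illustration:

~~~ corefile
. {
    # metadata must be enabled for forward/upstream to be published
    metadata
    log . "{name} {type} {rcode} {/forward/upstream}"
    forward . 10.0.0.10 10.0.0.11
}
~~~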
## Metrics
If monitoring is enabled (via the *prometheus* plugin) then the following metrics are exported:

* `coredns_forward_healthcheck_broken_total{}` - count of when all upstreams are unhealthy,
  and we are randomly (this always uses the `random` policy) spraying to an upstream.
* `coredns_forward_max_concurrent_rejects_total{}` - count of queries rejected because the
  number of concurrent queries was at the maximum.
* `coredns_proxy_request_duration_seconds{proxy_name="forward", to, rcode}` - histogram of request durations per upstream and RCODE.
* `coredns_proxy_healthcheck_failures_total{proxy_name="forward", to, rcode}` - count of failed health checks per upstream.
* `coredns_proxy_conn_cache_hits_total{proxy_name="forward", to, proto}` - count of connection cache hits per upstream and protocol.
* `coredns_proxy_conn_cache_misses_total{proxy_name="forward", to, proto}` - count of connection cache misses per upstream and protocol.

Where `to` is one of the upstream servers (**TO** from the config), `rcode` is the returned RCODE
from the upstream, `proto` is the transport protocol like `udp`, `tcp`, `tcp-tls`.

The following metrics have recently been deprecated:

* `coredns_forward_healthcheck_failures_total{to, rcode}`
  * Can be replaced with `coredns_proxy_healthcheck_failures_total{proxy_name="forward", to, rcode}`
* `coredns_forward_requests_total{to}`
  * Can be replaced with `sum(coredns_proxy_request_duration_seconds_count{proxy_name="forward", to})`
* `coredns_forward_responses_total{to, rcode}`
  * Can be replaced with `coredns_proxy_request_duration_seconds_count{proxy_name="forward", to, rcode}`
* `coredns_forward_request_duration_seconds{to, rcode}`
  * Can be replaced with `coredns_proxy_request_duration_seconds{proxy_name="forward", to, rcode}`
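
A minimal sketch for exposing the metrics listed above, assuming the *prometheus* plugin is given its conventional `localhost:9153` listen address; the two upstream addresses are placeholders:

~~~ corefile
. {
    # metrics are only exported when the prometheus plugin is enabled
    prometheus localhost:9153
    forward . 10.0.0.10 10.0.0.11
}
~~~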
## Examples
Proxy all requests within `example.org.` to a nameserver running on a different port:
~~~ corefile
example.org {
    forward . 127.0.0.1:9005
}
~~~
Send all requests within `lab.example.local.` to `10.20.0.1`, all requests within `example.local.` (and not in
`lab.example.local.`) to `10.0.0.1`, all other requests to the servers defined in `/etc/resolv.conf`, and
cache the results. Note that a CoreDNS server configured with multiple _forward_ plugins in a server block will evaluate those
forward plugins in the order they are listed when serving a request. Therefore, subdomains should be
placed before parent domains, otherwise subdomain requests will be forwarded to the parent domain's upstream.
Accordingly, in this example `lab.example.local` is before `example.local`, and `example.local` is before `.`.
~~~ corefile
. {
    cache
    forward lab.example.local 10.20.0.1
    forward example.local 10.0.0.1
    forward . /etc/resolv.conf
}
~~~
The example above is almost equivalent to the following example, except that the example below defines three separate plugin
chains (and thus 3 separate instances of _cache_).
~~~ corefile
lab.example.local {
    cache
    forward . 10.20.0.1
}
example.local {
    cache
    forward . 10.0.0.1
}
. {
    cache
    forward . /etc/resolv.conf
}
~~~
Load balance all requests between three resolvers, one of which has an IPv6 address:
~~~ corefile
. {
    forward . 10.0.0.10:53 10.0.0.11:1053 [2003::1]:53
}
~~~
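As a sketch of a primary/backup setup (the addresses and values are illustrative assumptions, not recommendations), use the `sequential` policy so that upstreams are selected in the order listed, and consider an upstream down after 3 failed health checks run at a 1s interval:

~~~ corefile
. {
    forward . 10.0.0.10 10.0.0.11 {
        # select upstreams in the order they are listed
        policy sequential
        # failed health checks needed before an upstream is considered down
        max_fails 3
        # health check interval (default is 0.5s)
        health_check 1s
    }
}
~~~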
Forward everything except requests to `example.org`:
~~~ corefile
. {
    forward . 10.0.0.10:1234 {
        except example.org
    }
}
~~~
Proxy everything except `example.org` using the host's `resolv.conf`'s nameservers:
~~~ corefile
. {
    forward . /etc/resolv.conf {
        except example.org
    }
}
~~~
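As a further sketch (the upstream address and the 30s value are assumptions chosen for illustration), always use TCP towards the upstream and keep cached connections for 30s instead of the default 10s:

~~~ corefile
. {
    forward . 10.0.0.10 {
        # use TCP even when the request comes in over UDP
        force_tcp
        # expire cached connections after 30s
        expire 30s
    }
}
~~~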
Proxy all requests to 9.9.9.9 using the DNS-over-TLS (DoT) protocol, and cache every answer for up to 30
seconds. Note the `tls_servername` is mandatory if you want a working setup, as 9.9.9.9 can't be
used in the TLS negotiation. Also set the health check duration to 5s to not completely swamp the
service with health checks.
~~~ corefile
. {
    forward . tls://9.9.9.9 {
        tls_servername dns.quad9.net
        health_check 5s
    }
    cache 30
}
~~~
Or configure a different domain name for health check requests:
~~~ corefile
. {
    forward . tls://9.9.9.9 {
        tls_servername dns.quad9.net
        health_check 5s domain example.org
    }
    cache 30
}
~~~
Or with multiple upstreams from the same provider:
~~~ corefile
. {
    forward . tls://1.1.1.1 tls://1.0.0.1 {
        tls_servername cloudflare-dns.com
        health_check 5s
    }
    cache 30
}
~~~
Or when you have multiple DoT upstreams with different `tls_servername`s, you can do the following:
~~~ corefile
. {
    forward . 127.0.0.1:5301 127.0.0.1:5302
}
.:5301 {
    forward . tls://8.8.8.8 tls://8.8.4.4 {
        tls_servername dns.google
    }
}
.:5302 {
    forward . tls://1.1.1.1 tls://1.0.0.1 {
        tls_servername cloudflare-dns.com
    }
}
~~~
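When the upstream requires client authentication, the three-argument form of `tls` can be used. The following is only a sketch: the upstream address, the file paths and the server name are hypothetical and must be replaced with your own values:

~~~ corefile
. {
    forward . tls://10.0.0.10 {
        # CERT KEY CA: client cert/key pair, plus the CA file used to verify the server
        tls /etc/coredns/client.pem /etc/coredns/client-key.pem /etc/coredns/ca.pem
        tls_servername dns.example.org
        health_check 5s
    }
}
~~~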
The following would try 1.2.3.4 first. If the response is `NXDOMAIN` , try 5.6.7.8. If the response from 5.6.7.8 is `NXDOMAIN` , try 9.0.1.2.
~~~ corefile
. {
    forward . 1.2.3.4 {
        next NXDOMAIN
    }
    forward . 5.6.7.8 {
        next NXDOMAIN
    }
    forward . 9.0.1.2 {
    }
}
~~~
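As a worked sketch of sizing `max_concurrent` (the traffic figures are assumptions for illustration only): with an expected 2000 queries per second to the upstreams and an upstream latency of 50ms, about 2000 * 0.05 = 100 queries are in flight at any time, so **MAX** should be above 100. Choosing 1000 leaves headroom while using roughly 1000 * 2kb = 2MB of memory:

~~~ corefile
. {
    forward . 10.0.0.10 10.0.0.11 {
        # assumed load: 2000 qps * 0.05s latency = 100 in-flight queries
        # 1000 leaves headroom; queries beyond that get a REFUSED response
        max_concurrent 1000
    }
}
~~~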
## See Also
[RFC 7858](https://tools.ietf.org/html/rfc7858) for DNS over TLS.