# loop

## Name

*loop* - detect simple forwarding loops and halt the server.

## Description
The *loop* plugin will send a random probe query to ourselves and will then keep track of how many times
we see it. If we see it more than twice, we assume CoreDNS has seen a forwarding loop and we halt the process.

The plugin will try to send the query for up to 30 seconds. This is done to give CoreDNS enough time
to start up. Once a query has been successfully sent, *loop* disables itself to prevent a query of
death.

The query sent is `<random number>.<random number>.zone` with type set to HINFO.
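
For example, for a server block for the root zone listening on port 1053, you can send an
equivalent probe by hand with `dig` (the numbers below are taken from the example log later in
this document; real probes use freshly generated random numbers):

~~~ txt
$ dig @127.0.0.1 -p 1053 4547991504243258144.3688648895315093531. HINFO
~~~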
## Syntax
~~~ txt
loop
~~~
## Examples
Start a server on the default port and load the *loop* and *forward* plugins. The *forward* plugin
forwards to itself.
~~~ txt
. {
loop
forward . 127.0.0.1
}
~~~
After CoreDNS has started, it stops the process while logging:
~~~ txt
plugin/loop: Loop (127.0.0.1:55953 -> :1053) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting . Query: "HINFO 4547991504243258144.3688648895315093531."
~~~
## Limitations

This plugin only attempts to find simple static forwarding loops at startup time. To detect a loop,
the following must be true:

* The loop must be present at startup time.
* The loop must occur for the `HINFO` query type.
## Troubleshooting

When CoreDNS logs contain the message `Loop ... detected ...`, this means that the `loop` detection
plugin has detected an infinite forwarding loop in one of the upstream DNS servers. This is a fatal
error because operating with an infinite loop will consume memory and CPU until the process
eventually runs out of memory and is killed by the host.
A forwarding loop is usually caused by:

* Most commonly, CoreDNS forwarding requests directly to itself, e.g. via a loopback address such as `127.0.0.1`, `::1` or `127.0.0.53`.
* Less commonly, CoreDNS forwarding to an upstream server that, in turn, forwards requests back to CoreDNS.
To troubleshoot this problem, look in your Corefile for any `forward`s to the zone
in which the loop was detected. Make sure that they are not forwarding to a local address or
to another DNS server that is forwarding requests back to CoreDNS. If `forward` is
using a file (e.g. `/etc/resolv.conf`), make sure that file does not contain local addresses.
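
For example, if the loop was reported for zone `.` and your Corefile forwards to a loopback
address, pointing `forward` at a real upstream resolver fixes it (the upstream address below is a
placeholder):

~~~ txt
. {
    loop
    # Forwarding to a loopback address sends queries back to this server:
    # forward . 127.0.0.1
    # Forward to a real upstream resolver instead (placeholder address):
    forward . 192.0.2.53
}
~~~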
### Troubleshooting Loops In Kubernetes Clusters
When a CoreDNS Pod deployed in Kubernetes detects a loop, the CoreDNS Pod will start to "CrashLoopBackOff".
This is because Kubernetes will try to restart the Pod every time CoreDNS detects the loop and exits.
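
To confirm that loop detection is what is crashing the Pods, check the Pod status and logs. The
commands below assume a standard deployment where CoreDNS Pods carry the `k8s-app=kube-dns` label;
the Pod name and output are illustrative:

~~~ txt
$ kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME                       READY   STATUS             RESTARTS   AGE
coredns-5c98db65d4-9nm6g   0/1     CrashLoopBackOff   5          10m

$ kubectl -n kube-system logs -l k8s-app=kube-dns
plugin/loop: Loop ( ... ) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting ...
~~~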
A common cause of forwarding loops in Kubernetes clusters is an interaction with a local DNS cache
on the host node (e.g. `systemd-resolved`). For example, in certain configurations `systemd-resolved`
will put the loopback address `127.0.0.53` as a nameserver into `/etc/resolv.conf`. Kubernetes (via
`kubelet`) by default will pass this `/etc/resolv.conf` file to all Pods using the `default` dnsPolicy,
rendering them unable to make DNS lookups (this includes CoreDNS Pods). CoreDNS uses this
`/etc/resolv.conf` as a list of upstreams to forward requests to. Since it contains a loopback
address, CoreDNS ends up forwarding requests to itself.
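
On such a node, the stub `/etc/resolv.conf` written by `systemd-resolved` typically looks like
this:

~~~ txt
# /etc/resolv.conf (systemd-resolved stub)
nameserver 127.0.0.53
options edns0
~~~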
There are many ways to work around this issue; some are listed here:

* Add the following to your `kubelet` config yaml: `resolvConf: <path-to-your-real-resolv-conf-file>`
(or use the command line flag `--resolv-conf`, deprecated in 1.10). Your "real"
`resolv.conf` is the one that contains the actual IPs of your upstream servers, and no local/loopback address.
This setting tells `kubelet` to pass an alternate `resolv.conf` to Pods. For systems using `systemd-resolved`,
`/run/systemd/resolve/resolv.conf` is typically the location of the "real" `resolv.conf`,
although this can be different depending on your distribution. A sketch of such a `kubelet` config
appears after this list.
* Disable the local DNS cache on host nodes, and restore `/etc/resolv.conf` to the original.
* A quick and dirty fix is to edit your Corefile, replacing `forward . /etc/resolv.conf` with
the IP address of your upstream DNS, for example `forward . 8.8.8.8`. But this only fixes the issue for CoreDNS;
kubelet will continue to forward the invalid `resolv.conf` to all `default` dnsPolicy Pods, leaving them unable to resolve DNS.
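
As a sketch of the first workaround, a `kubelet` config file pointing at the real `resolv.conf`
might look like the following (the file path and the `resolvConf` value shown assume a
`systemd-resolved` system and may differ on your distribution):

~~~ txt
# /var/lib/kubelet/config.yaml (path may vary by distribution)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
resolvConf: /run/systemd/resolve/resolv.conf
~~~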