After several experiments at SoundCloud we found that the current
minimum read timeout of 10ms is too low. A single request against a
slow/unavailable authoritative server can cause all TCP connections to
get closed. We record a 50th percentile forward/proxy latency of <5ms,
and a 99th percentile latency of 60ms. Using a minimum timeout of 200ms
seems to be a fair trade-off between avoiding unnecessarily high
connection churn and reacting to upstream failures in a timely manner.
This change also renames hcDuration to hcInterval to reflect its usage,
and removes the duplicated timeout constant to make code comprehension
easier.
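A minimal sketch of the clamping idea described above, not the plugin's actual code. Only the 200ms minimum comes from this message; the names `minTimeout`, `maxTimeout`, `clampTimeout` and the 2s upper bound are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// minTimeout is the 200ms lower bound proposed in this change.
	minTimeout = 200 * time.Millisecond
	// maxTimeout is a hypothetical upper bound, chosen only for this sketch.
	maxTimeout = 2 * time.Second
)

// clampTimeout keeps a latency-derived read timeout within
// [minTimeout, maxTimeout], so that a few fast replies cannot push the
// timeout so low that one slow upstream closes every connection.
func clampTimeout(d time.Duration) time.Duration {
	if d < minTimeout {
		return minTimeout
	}
	if d > maxTimeout {
		return maxTimeout
	}
	return d
}

func main() {
	observed := []time.Duration{
		5 * time.Millisecond,  // roughly the reported 50th percentile
		60 * time.Millisecond, // the reported 99th percentile
		500 * time.Millisecond,
		5 * time.Second,
	}
	for _, d := range observed {
		fmt.Printf("observed %v -> read timeout %v\n", d, clampTimeout(d))
	}
}
```

With these bounds both reported percentiles fall below the minimum, so the effective read timeout stays at 200ms for normal traffic.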
The default dns.Response implementation of a dns.ResponseWriter will
panic if RemoteAddr() is called after the connection to the client has
been closed already. The current cache implementation doesn't create a
new request+responsewriter during an asynchronous prefetch, but
piggybacks on the request triggering the prefetch.
This change copies the RemoteAddr first, so that it's safe to use it
later during the actual prefetch request.
A better implementation would be to completely decouple the prefetch
request from the client triggering a request.
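The copy-first approach can be sketched as follows. This is a standard-library-only illustration: `addrWriter`, `connWriter`, `snapshotWriter`, `newConnWriter` and `snapshot` are hypothetical stand-ins, not types from the dns package or the cache plugin.

```go
package main

import (
	"fmt"
	"net"
)

// addrWriter is a hypothetical stand-in for the one method of a
// response writer that matters here.
type addrWriter interface {
	RemoteAddr() net.Addr
}

// connWriter mimics a writer whose RemoteAddr panics once the client
// connection has been closed.
type connWriter struct {
	addr   net.Addr
	closed bool
}

func (c *connWriter) RemoteAddr() net.Addr {
	if c.closed {
		panic("connection closed")
	}
	return c.addr
}

// newConnWriter builds a connWriter for a UDP client address.
func newConnWriter(host string, port int) *connWriter {
	return &connWriter{addr: &net.UDPAddr{IP: net.ParseIP(host), Port: port}}
}

// snapshotWriter holds a copy of the remote address taken while the
// connection was still open, so later calls never touch the connection.
type snapshotWriter struct {
	remote net.Addr
}

func (s snapshotWriter) RemoteAddr() net.Addr { return s.remote }

// snapshot must be called before the asynchronous work starts.
func snapshot(w addrWriter) snapshotWriter {
	return snapshotWriter{remote: w.RemoteAddr()}
}

func main() {
	w := newConnWriter("192.0.2.1", 5353)
	safe := snapshot(w) // copy the address first
	w.closed = true     // the client goes away before the prefetch runs

	// Using w.RemoteAddr() here would panic; the snapshot is safe.
	fmt.Println(safe.RemoteAddr())
}
```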
Add a test to see if we copy the rcode correctly. Some minor cleanup in
import ordering, and rename NewUpstream to New as we are already in the
upstream package.
* plugin/file: fix local CNAME lookup
Issue #1864 explains it well: when we serve the child zone as well, we
should just recurse into ourselves (upstream self). Thus relax the
IsSubDomain check in file/lookup.go and just query (even if the query
will hit a remote server).
I've looped over all other plugins that do something similar (CNAME
resolving) and they didn't do the IsSubDomain check; therefore I've
removed it from *file* as well.
Added a test in file_upstream_test that shows this failed before but now
results in a reply.
Fixes #1864
* self does not need to be exported
* Fix test
We don't know if we had a valid reply. Check this.
Remove the code and remove the call in etcd and kubernetes handlers.
This does mean we should not add dups in the first place, which means
adding maps in backend_lookup to prevent dups from being added.
This should cut down on allocations: dnsutil.Dedup is very expensive
because it converts everything to strings, which we now avoid.
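A rough sketch of the map-based approach, assuming a simplified record type. `record` and `appendUnique` are hypothetical names for illustration and do not mirror the real backend_lookup code.

```go
package main

import "fmt"

// record is a hypothetical, simplified stand-in for a DNS resource
// record. It is a comparable struct, so it can be used as a map key
// directly without converting anything to a string.
type record struct {
	Name   string
	Target string
}

// appendUnique appends r to out only if it has not been seen before.
// Duplicates are rejected at insertion time, so no separate dedup pass
// over the finished slice is needed.
func appendUnique(out []record, seen map[record]struct{}, r record) []record {
	if _, ok := seen[r]; ok {
		return out
	}
	seen[r] = struct{}{}
	return append(out, r)
}

func main() {
	seen := make(map[record]struct{})
	var out []record

	input := []record{
		{"a.example.org.", "10.0.0.1"},
		{"a.example.org.", "10.0.0.1"}, // duplicate, skipped
		{"a.example.org.", "10.0.0.2"},
	}
	for _, r := range input {
		out = appendUnique(out, seen, r)
	}
	fmt.Println(len(out), "unique records")
}
```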
* Current stage of the log files. Tests need to be done as well as formatting of times.
* Finished testing. All tests for the altered classes pass, along with my additions.
* Updated the replacer package to print the units as well. May take out.
* Changed the time units to be within the rules. Fixed the test as well.
* Fixed some tests, updated the readme, fixed the replacer class.
* Updates of standardizing only to seconds in response duration. Need to revert README.
* Reverted readme.
* Added a small test in new replacer.
* Changed replacer to inline the strconv for duration.
* Docker: drop alpine
Create a multistage docker build image that uses debian to install certs
and then create the final image by using FROM scratch. This creates a
(slightly) smaller image and drops busybox and alpine.
* Even less copying
Uppercase all these test errors as well, and extend the presubmit to
check for these in the future. Also do a slightly smarter grep to only
match t.<something>, as (because of the dumb regexp) this also greps
over non-test files.
* plugin/forward: erase expired connection by timer
- In the previous implementation, expired connections stayed in the
cache until a new request to the same upstream/protocol came in. If
the upstream was unhealthy, a new request might come much later or
not at all. All this time the expired connections held system
resources (file descriptors, ephemeral ports). With this fix, expired
connections and their resources are released by a timer.
- Decreased the complexity of taking a connection from the cache. The
list of connections is treated as a stack (LIFO queue), i.e. a
connection is taken from the end of the queue (the freshest
connection) and returned to the end (as was implemented before). As
a result, all connections in the stack are ordered by the 'used'
field.
- The cleanup() method finds the first good (not expired) connection
in the stack with a binary search, since all connections are ordered
by the 'used' field.
* fix race conditions
* minor enhancement
* add comments
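The stack-plus-binary-search cleanup from the forward change above can be sketched like this. It is an illustration under assumptions: `conn`, `firstFresh`, `cleanup`, `pop` and `demo` are hypothetical names, and the 10s expiry is made up for the example.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// conn is a hypothetical cached connection; only 'used' matters here.
type conn struct {
	id   int
	used time.Time
}

// firstFresh returns the index of the first connection that has not
// expired. It relies on stack being ordered by ascending 'used', which
// holds when connections are only pushed to and popped from the end.
func firstFresh(stack []conn, now time.Time, expire time.Duration) int {
	deadline := now.Add(-expire)
	return sort.Search(len(stack), func(i int) bool {
		return stack[i].used.After(deadline)
	})
}

// cleanup splits the stack into fresh connections (kept) and expired
// ones (which the caller would close to release their resources).
func cleanup(stack []conn, now time.Time, expire time.Duration) (fresh, expired []conn) {
	i := firstFresh(stack, now, expire)
	return stack[i:], stack[:i]
}

// pop takes the most recently used connection from the end of the stack.
func pop(stack []conn) (conn, []conn, bool) {
	if len(stack) == 0 {
		return conn{}, stack, false
	}
	last := len(stack) - 1
	return stack[last], stack[:last], true
}

// demo builds a stack with ages 30s, 12s, 5s and 1s and cleans it up
// with a 10s expiry.
func demo() (fresh, expired []conn) {
	now := time.Now()
	ages := []time.Duration{30 * time.Second, 12 * time.Second, 5 * time.Second, time.Second}
	var stack []conn
	for i, age := range ages {
		stack = append(stack, conn{id: i + 1, used: now.Add(-age)})
	}
	return cleanup(stack, now, 10*time.Second)
}

func main() {
	fresh, expired := demo()
	fmt.Println("expired:", len(expired), "fresh:", len(fresh))
	if c, _, ok := pop(fresh); ok {
		fmt.Println("next connection to reuse:", c.id)
	}
}
```

In a real implementation a timer would call cleanup periodically, so expired connections are released even when no new request arrives.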
* Implement deprecation notice for 1.1.4
This still allows all the config to be parsed, but noops it:
* -log; always set the log to stdout; no matter what.
* https_google; removed from the proxy implementation.
* reverse plugin: set to deprecated.
* Whole of reverse can go
* Remove test for deprecated plugin
Enable alias and add one, so that "/plugin: forward" adds a label
called plugin-forward to the issue.
Enable branches, which automatically deletes merged branches.
* ADD: ignoreemptyservice option for kubernetes plugin
* Modify documentation and rename option to add space
* UPD: Add unit tests
* UPD: gofmt
* Add unit test for ignore emptyservice
* gofmt
* xfr tests failed
* Rename emptyservice to empty_service
The DoH work (#1619) made changes to pkg/nonwriter.Writer that in
hindsight were not backwards compatible; it added overrides for
LocalAddr() and RemoteAddr(). Instead of rolling back that PR, this PR
reverts those changes and creates a DoHWriter for use in the
https-server.go side of things.
This was only caught in the integration tests, which makes it hard to
catch, so we add an upstream_file_test.go that tries (doesn't work yet)
to test this in the unit tests as well. Esp. helpful when 'git bisecting'.
Fixes #1826
* WIP: make CoreDNS DoH Server
* It works
* Fix tests
* Review from Tom - on diff. PR
* correct mime type
* Cleanups and use the pkg/nonwriter
* rename and updates
* implement get
* implement GET
* Code review comments
* correct context
* tweaks
* code review
While invoking `make check` from a fresh new environment
the following failure occurred:
```
[ec2-user@..... coredns]$ docker run -i -t --rm -v $PWD:/go/src/github.com/coredns/coredns -w /go/src/github.com/coredns/coredns golang:1.10
root@e2d6a6c17132:/go/src/github.com/coredns/coredns# make check
** presubmit/context
** presubmit/test-lowercase
( gometalinter --deadline=2m --disable-all --enable=goimports --vendor --exclude=^pb/ ./... || true )
/bin/sh: 1: gometalinter: not found
go generate coredns.go
```
This fix updates the Makefile so that deps are installed first.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
- connManager() goroutine will stop when Proxy is about to be
garbage collected. This means that no queries are in progress,
and no queries are going to come
A bit meh, but we *need* hardcoded addresses in these tests, because
we can't get them from a running coredns. These may be in-use and this
fails the tests then. Do an ugly err.Error() string match if this is the
case to prevent failing the test for something not in our control.
A better fix would be to retrieve the listening address from coredns via
some api, so we could listen on :0 for these as well. No such API exists
as of yet.
This fix is a vendor update. Both ugorji and thrift have to be pinned
to compile. The ugorji is from etcd and thrift is from zipkin.
This fix fixes #1802.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* vendor: don't vendor the context stuff
We don't need to vendor this anymore as we moved to the std lib for
these.
* new stuff showing up with dep ensure
* remove go-shlex
* Probe simplification
- the main reason for the rework is that the previous implementation
hung when calling Do() after Stop()
* replace atomics with mutex
* access Probe.interval under lock
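A minimal sketch of a mutex-guarded probe in which Do() after Stop() returns immediately instead of hanging. This is an illustration, not the actual Probe code: the field names and the boolean return value of Do are assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Probe runs a check function in the background until it succeeds.
// All fields, including interval, are accessed under mu.
type Probe struct {
	mu       sync.Mutex
	inflight bool
	stopped  bool
	interval time.Duration
}

// Do starts f in a goroutine unless a run is already in flight or the
// probe has been stopped. It never blocks, so calling it after Stop is
// safe. The boolean result (hypothetical) reports whether a run started.
func (p *Probe) Do(f func() error) bool {
	p.mu.Lock()
	if p.inflight || p.stopped {
		p.mu.Unlock()
		return false
	}
	p.inflight = true
	interval := p.interval
	p.mu.Unlock()

	go func() {
		for {
			if err := f(); err == nil {
				break
			}
			time.Sleep(interval)
			p.mu.Lock()
			stopped := p.stopped
			p.mu.Unlock()
			if stopped {
				break
			}
		}
		p.mu.Lock()
		p.inflight = false
		p.mu.Unlock()
	}()
	return true
}

// Stop prevents new runs and makes a retrying run exit after its
// current attempt.
func (p *Probe) Stop() {
	p.mu.Lock()
	p.stopped = true
	p.mu.Unlock()
}

func main() {
	p := &Probe{interval: 10 * time.Millisecond}
	done := make(chan struct{})
	started := p.Do(func() error { close(done); return nil })
	<-done
	p.Stop()
	fmt.Println("first Do started:", started)
	fmt.Println("Do after Stop started:", p.Do(func() error { return nil }))
}
```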