Start documenting xds

Signed-off-by: Miek Gieben <miek@miek.nl>
Miek Gieben
2020-01-10 09:52:29 +01:00
parent d5c5ba010c
commit 99c828c787


@@ -7,30 +7,28 @@
## Description ## Description
The *traffic* plugin is a load balancer that allows traffic steering, weighted responses and The *traffic* plugin is a load balancer that allows traffic steering, weighted responses and
draining of endpoints. Endpoints are IP:port pairs. *Traffic* works as an overlay on top of other draining of endpoints. It discovers the endpoints via the Envoy xDS protocol, specifically messages
plugins, it does not mandate any storage by itself. of the type "envoy.api.v2.ClusterLoadAssignment", these contain endpoints and an (optional) weight
for each. The `cluster_name` or `service_name` for a service must be a domain name.
*Traffic* receives (via gRPC?) *assignments* that define the weight of the endpoints in services. The plugin takes care of handing out responses that adhere to these assignments.
The plugin takes care of handing out responses that adhere to these assignments. Assignments will Assignments will need to be updated frequently, as discussed in the [Envoy xDS
need to be updated frequently, without new updates *traffic* will hand out responses according to protocol](https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol) documentation. Each
the last received assignment. When there are no assignments for a service name (yet), the responses response will contain one address record, which *traffic* considers the optimal one.
will also be modified (see below).
An assignment covers a "service name", which is a domain name. For each service a number of backends When there are no assignments for a service name (yet), the responses will also be modified (see
are expected. A backend is defined as an IP:port pair. Each backend comes with an integer indicating below).
its relative weight. A zero means the backend exists, but should not be handed out (drain it).
*Traffic* will load balance A and AAAA queries. known to the plugin. It will return precisely one *Traffic* will load balance A and AAAA queries. As said, it will return precisely one record in a
record in a response, which is the optimal record according to the assignments and previously handed response. If a service should be load balanced, but no assignment can be found a random record from
out responses. If a service should be load balanced, but no assignment can be found a random record the *answer section* will be chosen.
from the *answer section* will be chosen.
Every message that is handled by the *traffic* plugin will have all its TTLs set to 5 seconds, Every message that is handled by the *traffic* plugin will have its TTLs set to 5 seconds, the
any authority section is removed and all RRSIGs are removed from it. authority section, and all RRSIGs are removed from it.
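The rewrite described above (5 second TTLs, no authority section, no RRSIGs) can be sketched as follows. This is a minimal illustration only: `Record`, `Message` and `Rewrite` are hypothetical stand-ins invented for this example, not the plugin's types or any DNS library's API, and the sketch does not cover picking a single address record.

```go
package main

// Record is a hypothetical, simplified resource record used only in this sketch.
type Record struct {
	Name string
	Type string // "A", "AAAA", "RRSIG", ...
	TTL  uint32
}

// Message is a hypothetical, simplified DNS message used only in this sketch.
type Message struct {
	Answer    []Record
	Authority []Record
}

// rewriteTTL is the TTL in seconds that the text above says is set on every record.
const rewriteTTL = 5

// Rewrite returns a copy of m in which RRSIG records are dropped from the
// answer section, every remaining answer record has its TTL set to
// rewriteTTL, and the authority section is left empty.
func Rewrite(m Message) Message {
	out := Message{}
	for _, r := range m.Answer {
		if r.Type == "RRSIG" {
			continue // signatures are stripped
		}
		r.TTL = rewriteTTL // r is a copy, so the input is not modified
		out.Answer = append(out.Answer, r)
	}
	// out.Authority stays nil: the authority section is removed.
	return out
}
```

Returning a new message instead of mutating the input keeps the sketch easy to test; a real implementation would operate on the response in whatever form the server hands it over.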
The *traffic* plugin has no notion of draining, drop overload and anything that advanced, *it just The *traffic* plugin has no notion of draining, drop overload and anything that advanced, *it just
acts upon assignments*. This means that if a backend goes down and *traffic* has not seen a new acts upon assignments*. This means that if an endpoint goes down and *traffic* has not seen a new
assignment yet, it will still include this backend in responses. assignment yet, it will still include this endpoint address in responses.
## Syntax ## Syntax
@@ -54,62 +52,36 @@ This will add load balancing for domains under example.org; the upstream informa
## Assignments ## Assignments
Assignments are given in protobuf format, but here is an example in YAML conveying the same Assignments are streamed for a service that implements the xDS protocol, *traffic* will bla bla.
information. This is an example assignment for the service "www.example.org". TODO.
~~~ yaml Picking an endpoint is done as follows: (still true for xDS - check after implementing things)
assignments:
- service: www.example.org
- backend: 192.168.1.1:443
assign: 4
backend: 192.168.1.2:443
assign: 6
backend: 192.168.1.3:443
assign: 0
~~~
This particular one has 3 backends, one of which is to be drained (192.168.1.3). The two remaining * include spiffy algorithm.
ones have a non zero weighted assignment. We use "Weighted Random Selection" to select a backend:
* Add up all the weights for all the items in the list (here 10).
* Pick a number at random between 1 and the sum of the weights.
* Iterate over the items
* For the current item, subtract the item's weight from the random number.
* If less or zero pick this item, otherwise continue with the next item.
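The "Weighted Random Selection" steps listed above can be sketched in Go roughly as follows. This is an illustration under assumptions, not the plugin's implementation: the `Backend` type and the `pickWith` and `Pick` functions are hypothetical names made up for this example, and weights are assumed to be non-negative.

```go
package main

import "math/rand"

// Backend is a hypothetical type for this sketch; it is not the plugin's API.
type Backend struct {
	Addr   string
	Weight int // assumed non-negative; 0 means "drain", never handed out
}

// pickWith walks the list for a given number r, which is expected to lie
// between 1 and the sum of all weights. It subtracts each weight from r and
// returns the index of the first backend where r drops to zero or below.
// It returns -1 when r is outside the expected range.
func pickWith(backends []Backend, r int) int {
	if r < 1 {
		return -1
	}
	for i, b := range backends {
		r -= b.Weight
		if r <= 0 {
			return i
		}
	}
	return -1
}

// Pick selects one backend using weighted random selection. The boolean is
// false when no backend can be selected (all weights are zero or the list
// is empty).
func Pick(backends []Backend) (Backend, bool) {
	sum := 0
	for _, b := range backends {
		sum += b.Weight
	}
	if sum <= 0 {
		return Backend{}, false
	}
	// rand.Intn(sum) returns a value in [0, sum), so add 1 to get [1, sum].
	i := pickWith(backends, rand.Intn(sum)+1)
	return backends[i], true
}
```

Because the random number is at least 1, a backend with weight 0 can never bring it to zero or below, so drained backends are skipped without special casing. With the weights 4, 6 and 0 from the example assignment, the numbers 1 to 4 select the first backend and 5 to 10 the second.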
On seeing a query for a service, *traffic* will track the reply. When it returns with an answer On seeing a query for a service, *traffic* will track the reply. When it returns with an answer
*traffic* will rewrite it (and discard any RRSIGs). Using the assignments the answer section will *traffic* will rewrite it (and discard any RRSIGs). Using the assignments the answer section will
be rewritten as such: be rewritten as such:
* A backend will be picked using the algorithm from above. * An endpoint will be picked using the algorithm from above.
* The TTL on the response will be 5s for all included records. * The TTL on the response will be 5s for all included records.
* According to previous responses for this service and the relative weights of each backend, the * According to previous responses for this service and the relative weights of each endpoint, the
best backend will be put in the response. best endpoint will be put in the response.
* If after the selection *no* backends are available a NODATA response will be sent. An SOA * If after the selection *no* endpoints are available a NODATA response will be sent. An SOA
record will be synthesised, and a low TTL (and negative TTL) of 5 seconds will be set. record will be synthesised, and a low TTL (and negative TTL) of 5 seconds will be set.
TTL rewriting always? TODO.
Authority section will be removed. Authority section will be removed.
If no assignment, randomly pick an address If no assignment, randomly pick an address
Other types than A and AAAA, like SRV - do the same selection. Other types than A and AAAA, like SRV - do the same selection.
## Bugs ## Bugs
This plugin does not play nice with DNSSEC - if the backend returns signatures with the answer; they This plugin does not play nice with DNSSEC - if the endpoint returns signatures with the answer; they
will be stripped. You can optionally sign responses on the fly by using the *dnssec* plugin. will be stripped. You can optionally sign responses on the fly by using the *dnssec* plugin.
## Also See ## Also See
This is a [post on weighted random * https://github.com/envoyproxy/go-control-plane
selection](https://medium.com/@peterkellyonline/weighted-random-selection-3ff222917eb6). * https://blog.christianposta.com/envoy/guidance-for-building-a-control-plane-to-manage-envoy-proxy-based-infrastructure/
* https://github.com/envoyproxy/envoy/blob/442f9fcf21a5f091cec3fe9913ff309e02288659/api/envoy/api/v2/discovery.proto#L63
## TODO * This is a [post on weighted random selection](https://medium.com/@peterkellyonline/weighted-random-selection-3ff222917eb6).
Should we add source address information (geographical load balancing) to the assignment? This can
be handled by having each backend specify an optional source range where this record should be used.
For IPv4 this must be a /24, for IPv6 a /64.
Other points that require more attention:
* deleting assignments?
* last known good assignment (esp with deleting assignments)?