diff --git a/plugin/traffic/README.md b/plugin/traffic/README.md index 16789776d..0e269ae24 100644 --- a/plugin/traffic/README.md +++ b/plugin/traffic/README.md @@ -7,7 +7,7 @@ ## Description The *traffic* plugin is a balancer that allows traffic steering, weighted responses and draining -of clusters. A cluster in Envoy is defined as: "A group of logically similar endpoints that Envoy +of clusters. A cluster is defined as: "A group of logically similar endpoints that Envoy connects to." Each cluster has a name, which *traffic* extends to be a domain name. See "Naming Clusters" below. @@ -17,37 +17,35 @@ be upgraded, so all traffic to it is drained. Or the entire Kubernetes needs to endpoints need to be drained from it. The cluster information is retrieved from a service discovery manager that implements the service -discovery [protocols from Envoy implements](https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol). +discovery [protocols from Envoy](https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol). It connects to the manager using the Aggregated Discovery Service (ADS) protocol. Endpoints and clusters are discovered every 10 seconds. The plugin hands out responses that adhere to these assignments. Only endpoints that are *healthy* are handed out. Note that the manager *itself* is also a cluster that is managed *by the management server*. This is the *management cluster* (see `cluster` below in "Syntax"). By default the name for cluster is `xds`. -When bootstrapping *traffic* tries to retrieve the cluster endpoints for the management cluster. -This continues in the background and *traffic* is smart enough to reconnect on failures or updates -cluster configuration. If the `xds` management cluster can't be found on start up, *traffic* returns a -fatal error. +When bootstrapping *traffic* tries to retrieve the cluster endpoints for the management cluster, +when the cluster is not found *traffic* will return a fatal error. + +The *traffic* plugin handles A, AAAA and SRV queries. Queries for non-existent clusters get a +NXDOMAIN, where the minimal TTL is also set to 5s. For A and AAAA queries each DNS response contains a single IP address that's considered the best one. The TTL on these answer is set to 5s. It will only return successful responses either with an -answer or otherwise a NODATA response. Queries for non-existent clusters get a NXDOMAIN, where the -minimal TTL is also set to 5s. +answer or, otherwise, a NODATA response. -For SRV queries all healthy backends will be returned - assuming the client doing the query is smart -enough to select the best one. When SRV records are returned, the endpoint DNS names are synthesized -`endpoint-..` that carries the IP address. Querying for these synthesized names -works as well. +For SRV queries *all* healthy backends will be returned - assuming the client doing the query +is smart enough to select the best one. When SRV records are returned, the endpoint DNS names +are synthesized `endpoint-..` that carries the IP address. Querying for these +synthesized names works as well. [gRPC LB SRV records](https://github.com/grpc/proposal/blob/master/A5-grpclb-in-dns.md) are supported and returned by the *traffic* plugin for all clusters. The returned endpoints are, -however, the ones from the management cluster as these must implement gRPC LB. +however, the ones from the management cluster. *Traffic* implements version 3 of the xDS API. It works with the management server as written in . -If *traffic*'s `locality` has been set the answers can be localized. - ## Syntax ~~~ @@ -66,7 +64,6 @@ The extended syntax is available if you want more control. traffic TO... { cluster CLUSTER id ID - locality REGION[,ZONE[,SUBZONE]] [REGION[,ZONE[,SUBZONE]]]... tls CERT KEY CA tls_servername NAME ignore_health @@ -75,16 +72,7 @@ traffic TO... { * `cluster` **CLUSTER** define the name of the management cluster. By default this is `xds`. - * `id` **ID** is how *traffic* identifies itself to the control plane. This defaults to - `coredns`. - - * `locality` has a list of **REGION,ZONE,SUBZONE** sets. These tell *traffic* where its running - and what should be considered local traffic. Each **REGION,ZONE,SUBZONE** set will be used - to match clusters against while generating responses. The list should descend in proximity. - **ZONE** or **ZONE** *and* **SUBZONE** may be omitted. This signifies a wild card match. - I.e. when there are 3 regions, US, EU, ASIA, and this CoreDNS is running in EU, you can use: - `locality EU US ASIA`. Each list must be separated using spaces. The elements within a set - should be separated with only a comma. + * `id` **ID** is how *traffic* identifies itself to the control plane. This defaults to `coredns`. * `tls` **CERT** **KEY** **CA** define the TLS properties for gRPC connection. If this is omitted an insecure connection is attempted. From 0 to 3 arguments can be provided with the meaning as @@ -118,39 +106,22 @@ and "cluster-v0" is one of the load balanced cluster, *traffic* will respond to For SRV queries all endpoints are returned, the SRV target names are synthesized: `endpoint-.web.lb.example.org` to take the example from above. *N* is an integer starting with 0. -For the management cluster `_grpclb._tcp..` will also be resolved in the same way as -normal SRV queries. This special case is done because gRPC lib - -the gRPC LBs are. For each **TO** in the configuration *traffic* will return a SRV record. The -target name in the SRV are synthesized as well, using `grpclb-N` to prefix the zone from the Corefile, -i.e. `grpclb-0.lb.example.org` will be the gRPC name when using `lb.example.org` in the configuration. -Each `grpclb-N` target will have one address record, namely the one specified in the configuration. - -## Localized Endpoints - -Endpoints can be grouped by location, this location information is used if the `locality` property -is used in the configuration. +The gRPC load balancer name: `_grpclb._tcp..` will also be resolved in the same way +as normal SRV queries. gRPC uses this to find load balancers. Note that the addresses returned in +this care are from the management cluster. ## Matching Algorithm -How are clients match against the data we receive from xDS endpoint? Ignoring `locality` for now, it -will go through the following steps: +How are clients match against the data we receive from xDS endpoint? 1. Does the cluster exist? If not return NXDOMAIN, otherwise continue. 2. Run through the endpoints, discard any endpoints that are not HEALTHY. If we are left with no endpoint return a NODATA response, otherwise continue. -3. If weights are assigned use those to pick an endpoint, otherwise randomly pick one and return a +3. If weights are assigned, use those to pick an endpoint, otherwise randomly pick one and return a response to the client. -If `locality` *has* been specified there is an extra step between 2 and 3. - -2a. Match the endpoints using the locality that groups several of them, it's the most specific -match from left to right in the `locality` list; if no **REGION,ZONE,SUBZONE** matches then try -**REGION,ZONE** and then **REGION**. If still not match, move on the to next one. If we found none, -we continue with step 4 above, ignoring any locality. - ## Metrics If monitoring is enabled (via the *prometheus* plugin) then the following metric are exported: @@ -207,8 +178,9 @@ endpoint-2.xds.lb.example.org. 5 IN A 10.0.2.1 ## Bugs -Priority and locality information from ClusterLoadAssignments is not used. Credentials are not -implemented. +Credentials are not implemented. Bootstrapping is not fully implemented, *traffic* will connect to +the first working **TO** address, but then stops short of re-connecting to he endpoints is received +for the management **CLUSTER**. Load reporting is not supported for the following reason: A DNS query is done by a resolver. Behind this resolver (which can also cache) there may be many clients that will use this reply. The @@ -216,9 +188,6 @@ responding server (CoreDNS) has no idea how many clients use this resolver. So r +1 on the CoreDNS side can results in anything from 1 to 1000+ of queries on the endpoint, making the load reporting from *traffic* highly inaccurate. -Bootstrapping is not fully implemented, *traffic* will connect to the first working **TO** addresss, -but then stops short of re-connecting to he endpoints is received for the management **CLUSTER**. - ## Also See A Envoy management server and command line interface can be found on diff --git a/plugin/traffic/setup.go b/plugin/traffic/setup.go index d3961efa8..91cfaf0ba 100644 --- a/plugin/traffic/setup.go +++ b/plugin/traffic/setup.go @@ -149,57 +149,3 @@ func parseTraffic(c *caddy.Controller) (*Traffic, error) { t.hosts = toHosts return t, nil } - -// parseLocality parses string s into loc's. Each loc must be space separated from the other, inside -// a loc we have region,zone,subzone, where subzone or subzone and zone maybe empty. If specified -// they must be comma separate (not spaces or anything). -func parseLocality(s string) ([]xds.Locality, error) { - sets := strings.Fields(s) - if len(sets) == 0 { - return nil, nil - } - - locs := []xds.Locality{} - for _, s := range sets { - l := strings.Split(s, ",") - switch len(l) { - default: - return nil, fmt.Errorf("too many location specifiers: %q", s) - case 1: - l0 := strings.TrimSpace(l[0]) - if l0 == "" { - return nil, fmt.Errorf("empty location specifer: %q", l[0]) - } - locs = append(locs, xds.Locality{Region: l0}) - continue - case 2: - l0 := strings.TrimSpace(l[0]) - if l0 == "" { - return nil, fmt.Errorf("empty location specifer: %q", l[0]) - } - l1 := strings.TrimSpace(l[1]) - if l1 == "" { - return nil, fmt.Errorf("empty location specifer: %q", l[1]) - } - locs = append(locs, xds.Locality{Region: l0, Zone: l1}) - continue - case 3: - l0 := strings.TrimSpace(l[0]) - if l0 == "" { - return nil, fmt.Errorf("empty location specifer: %q", l[0]) - } - l1 := strings.TrimSpace(l[1]) - if l1 == "" { - return nil, fmt.Errorf("empty location specifer: %q", l[1]) - } - l2 := strings.TrimSpace(l[2]) - if l2 == "" { - return nil, fmt.Errorf("empty location specifer: %q", l[2]) - } - locs = append(locs, xds.Locality{Region: l0, Zone: l1, SubZone: l2}) - continue - } - - } - return locs, nil -} diff --git a/plugin/traffic/setup_test.go b/plugin/traffic/setup_test.go index 14b0f5037..8649801de 100644 --- a/plugin/traffic/setup_test.go +++ b/plugin/traffic/setup_test.go @@ -9,7 +9,7 @@ import ( func TestSetup(t *testing.T) { c := caddy.NewTestController("dns", `traffic grpc://127.0.0.1`) if err := setup(c); err != nil { - t.Fatalf("Test 1, expected no errors, but got: %q", err) + t.Fatalf("Expected no errors, but got: %q", err) } } @@ -37,10 +37,10 @@ func TestParseTraffic(t *testing.T) { c := caddy.NewTestController("dns", test.input) _, err := parseTraffic(c) if test.shouldErr && err == nil { - t.Errorf("Test %v: Expected error but found nil", i) + t.Errorf("Test %d: Expected error, but got: nil", i) continue } else if !test.shouldErr && err != nil { - t.Errorf("Test %v: Expected no error but found error: %v", i, err) + t.Errorf("Test %d: Expected no error, but got: %v", i, err) continue } @@ -49,32 +49,3 @@ func TestParseTraffic(t *testing.T) { } } } - -func testParseLocality(t *testing.T) { - s := "region" - locs, err := parseLocality(s) - if err != nil { - t.Fatal(err) - } - if locs[0].Region != "region" { - t.Errorf("Expected %s, but got %s", "region", locs[0].Region) - } - - s = "region1,zone,sub region2" - locs, err = parseLocality(s) - if err != nil { - t.Fatal(err) - } - if locs[0].Zone != "zone" { - t.Errorf("Expected %s, but got %s", "zone", locs[1].Zone) - } - if locs[0].SubZone != "sub" { - t.Errorf("Expected %s, but got %s", "sub", locs[1].SubZone) - } - if locs[1].Region != "region2" { - t.Errorf("Expected %s, but got %s", "region2", locs[1].Region) - } - if locs[1].Zone != "" { - t.Errorf("Expected %s, but got %s", "", locs[1].Zone) - } -} diff --git a/plugin/traffic/traffic.go b/plugin/traffic/traffic.go index 3cee43519..f5a8407ca 100644 --- a/plugin/traffic/traffic.go +++ b/plugin/traffic/traffic.go @@ -27,7 +27,6 @@ type Traffic struct { id string health bool origins []string - loc []xds.Locality Next plugin.Handler } @@ -49,7 +48,7 @@ func (t *Traffic) ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg m.SetReply(r) m.Authoritative = true - sockaddr, ok := t.c.Select(cluster, t.loc, t.health) + sockaddr, ok := t.c.Select(cluster, t.health) if !ok { // ok this cluster doesn't exist, potentially due to extra labels, which may be garbage or legit queries: // legit is: @@ -70,7 +69,7 @@ func (t *Traffic) ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg if strings.HasPrefix(strings.ToLower(labels[0]), "endpoint-") { // recheck if the cluster exist. cluster = labels[1] - sockaddr, ok = t.c.Select(cluster, t.loc, t.health) + sockaddr, ok = t.c.Select(cluster, t.health) if !ok { m.Ns = soa(state.Zone) m.Rcode = dns.RcodeNameError @@ -87,9 +86,9 @@ func (t *Traffic) ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg return 0, nil } // OK, _grcplb._tcp query; we need to return the endpoint for the mgmt cluster *NOT* the cluster - // we got the query for. This should exist, but we'll check later anyway + // we got the query for. This should exist, but we'll check later anyway. cluster = t.mgmt - sockaddr, _ = t.c.Select(cluster, t.loc, t.health) + sockaddr, _ = t.c.Select(cluster, t.health) break default: m.Ns = soa(state.Zone) @@ -121,7 +120,7 @@ func (t *Traffic) ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg } m.Answer = []dns.RR{&dns.AAAA{Hdr: dns.RR_Header{Name: state.QName(), Rrtype: dns.TypeAAAA, Class: dns.ClassINET, Ttl: 5}, AAAA: sockaddr.Address()}} case dns.TypeSRV: - sockaddrs, _ := t.c.All(cluster, t.loc, t.health) + sockaddrs, _ := t.c.All(cluster, t.health) m.Answer = make([]dns.RR, 0, len(sockaddrs)) m.Extra = make([]dns.RR, 0, len(sockaddrs)) for i, sa := range sockaddrs { @@ -168,7 +167,7 @@ func (t *Traffic) serveEndpoint(ctx context.Context, state request.Request, endp return 0, nil } - sockaddrs, _ := t.c.All(cluster, t.loc, t.health) + sockaddrs, _ := t.c.All(cluster, t.health) if len(sockaddrs) < nr { m.Ns = soa(state.Zone) m.Rcode = dns.RcodeNameError diff --git a/plugin/traffic/traffic_test.go b/plugin/traffic/traffic_test.go index f43d9568f..a0c566161 100644 --- a/plugin/traffic/traffic_test.go +++ b/plugin/traffic/traffic_test.go @@ -152,58 +152,6 @@ func TestTraffic(t *testing.T) { } } -func TestTrafficLocality(t *testing.T) { - c, err := xds.New("127.0.0.1:0", "test-id", grpc.WithInsecure()) - if err != nil { - t.Fatal(err) - } - tr := &Traffic{c: c, origins: []string{"lb.example.org."}} - - tests := []struct { - cla *endpointpb.ClusterLoadAssignment - loc xds.Locality // where we run - answer string - }{ - { - cla: &endpointpb.ClusterLoadAssignment{ - ClusterName: "web", - Endpoints: append( - // IPs here should be different, but locality isn't implemented. Make - // them identical so the test doesn't fail...(for now) - endpointsWithLocality([]EndpointHealth{ - {"127.0.1.1", 18008, corepb.HealthStatus_HEALTHY}, - {"127.0.1.1", 18008, corepb.HealthStatus_HEALTHY}}, - xds.Locality{Region: "us"}), - endpointsWithLocality([]EndpointHealth{ - {"127.0.1.1", 18008, corepb.HealthStatus_HEALTHY}}, - xds.Locality{Region: "eu"})..., - ), - }, - answer: "127.0.1.1", - loc: xds.Locality{Region: "eu"}, // our location - }, - } - - ctx := context.TODO() - - for i, tc := range tests { - a := xds.NewAssignment() - a.SetClusterLoadAssignment("web", tc.cla) - c.SetAssignments(a) - - m := new(dns.Msg).SetQuestion(dnsutil.Join("web", tr.origins[0]), dns.TypeA) - - rec := dnstest.NewRecorder(&test.ResponseWriter{}) - _, err := tr.ServeDNS(ctx, rec, m) - if err != nil { - t.Errorf("Test %d: Expected no error, but got %q", i, err) - } - if x := rec.Msg.Answer[0].(*dns.A).A.String(); x != tc.answer { - t.Fatalf("Test %d: Expected %s, but got %s", i, tc.answer, x) - } - } -} - func TestTrafficSRV(t *testing.T) { c, err := xds.New("127.0.0.1:0", "test-id", grpc.WithInsecure()) if err != nil { @@ -291,6 +239,9 @@ func TestTrafficManagement(t *testing.T) { if _, err := tr.ServeDNS(ctx, rec, m); err != nil { t.Errorf("Expected no error, but got %q", err) } + if len(rec.Msg.Answer) == 0 { + t.Fatalf("Expected answer section, got none") + } if x := rec.Msg.Answer[0].(*dns.SRV).Target; x != "endpoint-0.xds.lb.example.org." { t.Errorf("Expected %s, got %s", "endpoint-0.xds.lb.example.org.", x) } diff --git a/plugin/traffic/xds/assignment.go b/plugin/traffic/xds/assignment.go index dcb803841..5d127a0e9 100644 --- a/plugin/traffic/xds/assignment.go +++ b/plugin/traffic/xds/assignment.go @@ -71,20 +71,20 @@ func (a *assignment) clusters() []string { } // Select selects a endpoint from cluster load assignments, using weighted random selection. It only selects endpoints that are reporting healthy. -func (a *assignment) Select(cluster string, locality []Locality, ignore bool) (*SocketAddress, bool) { +func (a *assignment) Select(cluster string, ignore bool) (*SocketAddress, bool) { cla := a.ClusterLoadAssignment(cluster) if cla == nil { return nil, false } - total := 0 + weight := 0 healthy := 0 for _, ep := range cla.Endpoints { for _, lb := range ep.GetLbEndpoints() { if !ignore && lb.GetHealthStatus() != corepb.HealthStatus_HEALTHY { continue } - total += int(lb.GetLoadBalancingWeight().GetValue()) + weight += int(lb.GetLoadBalancingWeight().GetValue()) healthy++ } } @@ -92,8 +92,8 @@ func (a *assignment) Select(cluster string, locality []Locality, ignore bool) (* return nil, true } - if total == 0 { - // all weights are 0, randomly select one of the endpoints. + // all weights are 0, randomly select one of the endpoints, + if weight == 0 { r := rand.Intn(healthy) i := 0 for _, ep := range cla.Endpoints { @@ -110,7 +110,7 @@ func (a *assignment) Select(cluster string, locality []Locality, ignore bool) (* return nil, true } - r := rand.Intn(total) + 1 + r := rand.Intn(healthy) + 1 for _, ep := range cla.Endpoints { for _, lb := range ep.GetLbEndpoints() { if !ignore && lb.GetHealthStatus() != corepb.HealthStatus_HEALTHY { @@ -126,7 +126,7 @@ func (a *assignment) Select(cluster string, locality []Locality, ignore bool) (* } // All returns all healthy endpoints. -func (a *assignment) All(cluster string, locality []Locality, ignore bool) ([]*SocketAddress, bool) { +func (a *assignment) All(cluster string, ignore bool) ([]*SocketAddress, bool) { cla := a.ClusterLoadAssignment(cluster) if cla == nil { return nil, false diff --git a/plugin/traffic/xds/client.go b/plugin/traffic/xds/client.go index 461b69439..071bf8d08 100644 --- a/plugin/traffic/xds/client.go +++ b/plugin/traffic/xds/client.go @@ -199,22 +199,23 @@ func (c *Client) receive(stream adsStream) error { // Select returns an address that is deemed to be the correct one for this cluster. The returned // boolean indicates if the cluster exists. -func (c *Client) Select(cluster string, locality []Locality, ignore bool) (*SocketAddress, bool) { +func (c *Client) Select(cluster string, ignore bool) (*SocketAddress, bool) { if cluster == "" { return nil, false } - return c.assignments.Select(cluster, locality, ignore) + return c.assignments.Select(cluster, ignore) } // All returns all endpoints. -func (c *Client) All(cluster string, locality []Locality, ignore bool) ([]*SocketAddress, bool) { +func (c *Client) All(cluster string, ignore bool) ([]*SocketAddress, bool) { if cluster == "" { return nil, false } - return c.assignments.All(cluster, locality, ignore) + return c.assignments.All(cluster, ignore) } // Locality holds the locality for this server. It contains a Region, Zone and SubZone. +// Currently this is not used. type Locality struct { Region string Zone string diff --git a/plugin/traffic/xds/metrics.go b/plugin/traffic/xds/metrics.go index 41c440795..e775e452d 100644 --- a/plugin/traffic/xds/metrics.go +++ b/plugin/traffic/xds/metrics.go @@ -18,6 +18,6 @@ var ( Namespace: plugin.Namespace, Subsystem: "traffic", Name: "endpoints_tracked", - Help: "Gauge of all tracked endpoints.", + Help: "Gauge of tracked endpoints.", }) )