# Use Envoy for Rate Limiting
To protect the apiserver service from a misbehaving tenant, the REST APIs should be rate limited. This will allow us to better reason about how to scale the service to a large number of tenants, as there will be a limit to the amount of apiserver resources each tenant can consume.
This also translates to a limit on the number of devices that a tenant can enroll before they start to hit rate limits. There should be a way to increase the limits per tenant.
Using an Envoy proxy has the following benefits:
- Fast proxy used by many projects and supported by many organizations
- We could possibly move more functionality to it in the future (like oauth processing)
- Some K8s platforms are moving to Envoy as the ingress gateway, so in the future we may be able to move this functionality into the ingress gateway (avoiding a proxy hop).
- Aligned with service mesh deployment models.
Web-based UI interactions will store the user's OAuth token in a Redis-backed session. Envoy will use the External Authorization filter to set the Authorization header to the AccessToken obtained from the OAuth login. This gives the Envoy proxy easy access to the AccessToken for all API requests sent to the apiserver.
The JWT AccessToken will then be validated in Envoy and its claims passed to Limitador to enforce per-user rate limits.
Envoy can then send 429 responses for any requests that have been rate limited.
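As a rough sketch, the Envoy side of this flow could be wired up as below. All names here (the `auth-server` and `limitador` clusters, the issuer and JWKS URLs, the `jwt_payload` metadata key, the `apiserver` rate limit domain) are illustrative assumptions, not actual Nexodus deployment values:

```yaml
# HTTP filter chain: validate the JWT, then query the rate limit service.
http_filters:
  - name: envoy.filters.http.jwt_authn
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
      providers:
        auth-server:
          issuer: https://auth.example.com        # assumed issuer
          remote_jwks:
            http_uri:
              uri: https://auth.example.com/jwks  # assumed JWKS endpoint
              cluster: auth-server
              timeout: 5s
          # Store the decoded claims in dynamic metadata so the rate
          # limit filter can build a per-user descriptor from `sub`.
          payload_in_metadata: jwt_payload
      rules:
        - match: { prefix: /api }
          requires: { provider_name: auth-server }
  - name: envoy.filters.http.ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
      domain: apiserver
      rate_limit_service:
        transport_api_version: V3
        grpc_service:
          # Limitador implements Envoy's Rate Limit Service gRPC protocol.
          envoy_grpc: { cluster_name: limitador }

# Route-level action: build the descriptor from the validated `sub` claim.
rate_limits:
  - actions:
      - metadata:
          descriptor_key: sub
          metadata_key:
            key: envoy.filters.http.jwt_authn
            path: [ { key: jwt_payload }, { key: sub } ]
```

When the rate limit service reports that a descriptor is over its limit, Envoy returns the 429 on its own; the apiserver never sees the request.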
Since a valid AccessToken is required, we will be able to identify the tenant that is the source of the traffic even if a client tries to create multiple sessions or change source IPs.
The user's JWT `sub` claim will be used to identify a tenant.
Requests for resources that do not require an AccessToken (for example, HTML and JS resources, or requests to authenticate against the auth server) should be rate limited by source IP address.
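For those unauthenticated paths, Envoy's built-in `remote_address` rate limit action can key the descriptor on the downstream client address instead of a claim; a minimal sketch of the route-level config:

```yaml
# Rate limit unauthenticated assets by client IP rather than JWT claim.
rate_limits:
  - actions:
      - remote_address: {}
```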
## Future Option: Run Envoy as a Sidecar
- More secure since the proxy and apiserver communicate on the loopback interface reducing the possibility of packet snooping.
- More efficient since the proxy and apiserver are guaranteed to run on the same worker node.
- Easier to reason about scaling with traffic load since both get scaled up as unit.
- Downside: Telepresence can't be used to debug pods that run sidecars in this way.
## Alternative: Live Without Rate Limiting
If you run a custom Nexodus service deployment and don't share it with multiple tenants, you may not need rate limiting.