User story #9796
Relay Load balancing with HA (closed)
Description
Hi,
A large environment is only manageable if you deploy multiple relays, so that thousands of nodes do not all hammer the root server. It also lets you secure the root server better, since only a small set of IPs needs to be able to access it.
You can assign relays according to security zone, geo-location, and the number of nodes that need to connect to them.
However, nodes are almost never evenly distributed across these criteria, so there are peaks of similar "types" of nodes on a few relays.
For example, this is a real distribution (you see one part of the relay UUID vs. the number of nodes connected to it):
6a81cf67da3d 486
e110a8eac053 13
16992e66f373 172
afb7081228de 3350
4a9f6afea17c 133
2555476521fb 2304
89ec081228de 856
76b2081228de 689
089053ddd718 588
As you can see, there are big peaks of 2k+ nodes connected to a single relay, because those nodes are "logically identical" based on the requirements.
The two outliers are also basically in the same "zone"; we only separated them by something we could automate, but in effect 5600+ nodes would require the same relay.
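To make the skew concrete, here is a small Python sketch that recomputes the totals from the figures listed above. The variable names are illustrative only; the node counts are copied from the list.

```python
# Node counts per relay (partial UUID -> connected nodes), copied from the list above.
relay_nodes = {
    "6a81cf67da3d": 486,
    "e110a8eac053": 13,
    "16992e66f373": 172,
    "afb7081228de": 3350,
    "4a9f6afea17c": 133,
    "2555476521fb": 2304,
    "89ec081228de": 856,
    "76b2081228de": 689,
    "089053ddd718": 588,
}

total = sum(relay_nodes.values())

# The two busiest relays, which serve the same logical "zone".
top_two = sorted(relay_nodes.values(), reverse=True)[:2]
top_two_share = sum(top_two) / total

# What each relay would carry if the load were spread perfectly evenly.
even_share = total / len(relay_nodes)

print(f"total nodes:            {total}")
print(f"nodes on top two:       {sum(top_two)} ({top_two_share:.0%})")
print(f"even split per relay:   {even_share:.0f}")
```

Two of the nine relays carry roughly two thirds of all nodes, which is the imbalance the proposal below tries to address.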
- Create the concept of "zones" that you can use to describe an area like "TEST", "QA", "DMZ-1", "REGULAR", and also put the IP Subnet restrictions on this level. The default could be "default" :-)
- Assign the relays to these zones, so they inherit the description and IP subnet.
- The node registers with an existing policy server, which puts it in a zone, or, if #7876 is implemented, you can define the "zone" on allocation (otherwise it gets "default"), and the response would contain the relay that has the fewest nodes on it.
- Sync the policy to all relays in a zone.
- The node receives all possible relays in the "zone" as a list, and randomly chooses one of them that is available (ports 443 and 5309 open?)
- Ensure you can do rolling upgrades and that the nodes can handle the outage of relays as long as there are others in the "zone"
- Distribute the load (hopefully) evenly among the relays in the same "zone"
- Only extend the current setup, where you have one relay for a node: you just extend '1' to 'n'
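The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Rudder code: `Relay`, `Zone`, `port_open` and `pick_relay` are hypothetical names, and the only values taken from this ticket are the zone idea, the "fewest nodes" rule, the random choice, and ports 443 and 5309.

```python
import ipaddress
import random
import socket
from dataclasses import dataclass, field


@dataclass
class Relay:
    uuid: str
    host: str
    node_count: int = 0  # nodes currently attached to this relay


@dataclass
class Zone:
    """Hypothetical zone: a name, its IP subnet restrictions, and its relays."""
    name: str
    allowed_subnets: list  # ipaddress network objects
    relays: list = field(default_factory=list)

    def allows(self, node_ip: str) -> bool:
        # A node belongs to the zone if its IP is inside any allowed subnet.
        addr = ipaddress.ip_address(node_ip)
        return any(addr in net for net in self.allowed_subnets)

    def least_loaded(self) -> Relay:
        # Server side: answer a registration with the relay that has the fewest nodes.
        return min(self.relays, key=lambda r: r.node_count)


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_relay(relays, ports=(443, 5309), is_up=None, rng=random):
    """Node side: try the zone's relays in random order, return the first
    available one, or None if every relay in the zone is down."""
    check = is_up or (lambda r: all(port_open(r.host, p) for p in ports))
    candidates = list(relays)  # copy so the caller's list is not reordered
    rng.shuffle(candidates)
    for relay in candidates:
        if check(relay):
            return relay
    return None
```

The `is_up` parameter is there so the availability check can be replaced, for example in tests or if a different health check than "both ports accept TCP connections" turns out to be appropriate. Because a node falls back to any other relay in the list, taking one relay down for an upgrade would not cut off its nodes as long as another relay in the same zone stays reachable.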
Updated by Benoît PECCATTE over 7 years ago
- Category set to Performance and scalability
Updated by François ARMAND almost 3 years ago
- Status changed from New to Backlog
This is not tracked in our roadmap; we will change the status if it is prioritized.