User story #10551
Updated by François ARMAND over 5 years ago
We want policy generation to run node by node, so that:
* a faulty node does not block the policy generation for other nodes,
* when generation takes a very long time (>30 min), nodes can start receiving their new policies without waiting for the whole generation to finish,
* errors are reported on a node-by-node basis,
* we can have a meaningful progress bar for the generation ("7 nodes out of 25"...)
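To make the goals above concrete, here is a minimal sketch of a node-by-node loop that isolates failures, collects errors per node, and reports progress. It is only an illustration: @generate_node_policies@, @generate_all@, and the node names are hypothetical and do not reflect the actual Rudder implementation.

```shell
#!/bin/sh
# Hypothetical stand-in for the real per-node generation step.
# For the demo, it fails for any node whose name starts with "bad".
generate_node_policies() {
  case "$1" in
    bad*) echo "cannot resolve policy server" >&2; return 1 ;;
    *)    return 0 ;;
  esac
}

# Generate policies for each node independently: a failure is recorded
# for that node only, and the loop continues with the next node.
generate_all() {
  total=$#
  done_count=0
  failed_nodes=""
  for node in "$@"; do
    if ! err=$(generate_node_policies "$node" 2>&1); then
      failed_nodes="$failed_nodes $node"
      echo "error on $node: $err"
    fi
    done_count=$((done_count + 1))
    echo "progress: $done_count nodes out of $total"
  done
  # Strip the leading space left by the concatenation above.
  failed_nodes=${failed_nodes# }
}

generate_all node1 bad-node2 node3
echo "failed: $failed_nodes"
```

In this sketch the faulty node does not stop the loop, and the list of failed nodes is available at the end for reporting.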
This, of course, raises a number of questions, for example:
* how do we manage dependencies (typically between a node and its policy server, if the hostname changes)? What happens if only one of the two updates breaks?
* how do we make errors understandable and discoverable? Imagine if 7000 nodes are in error.
(and certainly a number of others).
Moreover, the parallelism of policy generation can be controlled in a more fine-grained way through the API, along with the JS timeout, dynamic group computation at the start of generation, and change computation:
<pre>
# for max parallelism, either use '1x' to mean "1 time the number of CPU / 2" or '3' to mean '3 threads'
curl -k -H "X-API-Token: xxx" -X POST 'https://.../rudder/api/latest/settings/rudder_generation_max_parallelism' -d "value=1x"
# value is in seconds
curl -k -H "X-API-Token: xxx" -X POST 'https://.../rudder/api/latest/settings/rudder_generation_js_timeout' -d "value=10"
# use 'false' or 'true'
curl -k -H "X-API-Token: xxx" -X POST 'https://.../rudder/api/latest/settings/rudder_generation_compute_dyngroups' -d "value=false"
curl -k -H "X-API-Token: xxx" -X POST 'https://.../rudder/api/latest/settings/rudder_compute_changes' -d "value=false"
</pre>
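If several of these settings need to be changed together, the calls can be wrapped in a small helper. The following is a sketch only: the @set_rudder_setting@ function and the @RUDDER_URL@, @RUDDER_TOKEN@, and @DRY_RUN@ variables are assumptions made for illustration, and the elided server address and token placeholder are kept exactly as in the commands above. By default the helper only prints the command it would run.

```shell
#!/bin/sh
# Assumed variables (not part of Rudder): RUDDER_URL, RUDDER_TOKEN, DRY_RUN.
RUDDER_URL=${RUDDER_URL:-https://.../rudder}
RUDDER_TOKEN=${RUDDER_TOKEN:-xxx}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the command instead of sending it

# Hypothetical helper: POST one setting value, using the same request
# shape as the curl commands shown above.
set_rudder_setting() {
  name=$1
  value=$2
  url="$RUDDER_URL/api/latest/settings/$name"
  if [ "$DRY_RUN" = "1" ]; then
    echo "curl -k -H 'X-API-Token: $RUDDER_TOKEN' -X POST '$url' -d 'value=$value'"
  else
    # --fail makes curl exit non-zero on HTTP error responses.
    curl -k --fail -H "X-API-Token: $RUDDER_TOKEN" -X POST "$url" -d "value=$value"
  fi
}

set_rudder_setting rudder_generation_max_parallelism 1x
set_rudder_setting rudder_generation_js_timeout 10
```

Set @DRY_RUN=0@ to actually send the requests once the printed commands look right.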