User story #5641 (open)
Make the agent policy update a state machine with integrity checks
For now, we update policies on the node side in a fairly simple way: if the policies on the server are more recent than the ones on the node, the agent copies them and then uses them.
We need a clearly defined state machine to make that update more robust and resilient.
The basic machine (to be clearly specified) is:
- 1/ we have a current state of policies used by the agent.
- 2/ [policies copy on the node] the agent gets new policies from the server into a dedicated, jailed directory.
- 3/ [integrity checks] the agent runs a series of checks on the new policies. Among these checks, we can think of, at least:
  - some integrity check: are the files OK? (think of signed content, checksums, etc.)
  - some age check: are the new files more recent than the ones I have?
  - some parsing check: cf-promises
  - why not, some user-defined checks
- 4/ if all the checks pass, the old policies are backed up and replaced by the new ones.
- 5/ a new run of the agent is done with the new set of policies.
The most important part is the logic in steps 3, 4 and 5. It gives more confidence that the new policies are correct and will run OK.
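To make steps 3 and 4 more concrete, here is a minimal Python sketch of the machine. It is only an illustration, not the agent's actual implementation: the state names, the `MANIFEST.sha256` and `TIMESTAMP` files, and the `update_policies` function are all hypothetical. Step 2 (the copy into the staging directory) is assumed to be already done, and step 5 (the new agent run) is left to the caller.

```python
import hashlib
import shutil
from enum import Enum, auto
from pathlib import Path
from typing import Callable, List


class State(Enum):
    CURRENT = auto()    # step 1: policies currently used by the agent
    FETCHED = auto()    # step 2: new policies sit in the staging directory
    VERIFIED = auto()   # step 3: every check passed
    ACTIVATED = auto()  # step 4: old policies backed up, new ones in place
    REJECTED = auto()   # a check failed, current policies left untouched


# A check receives (staging_dir, current_dir) and returns True when it passes.
Check = Callable[[Path, Path], bool]


def checksum_check(staging: Path, current: Path) -> bool:
    """Integrity check against a hypothetical MANIFEST.sha256 ("<digest> <file>" per line)."""
    manifest = staging / "MANIFEST.sha256"
    if not manifest.is_file():
        return False
    for line in manifest.read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines in this sketch
        digest, name = parts
        target = staging / name
        if not target.is_file():
            return False
        if hashlib.sha256(target.read_bytes()).hexdigest() != digest:
            return False
    return True


def age_check(staging: Path, current: Path) -> bool:
    """Age check based on a hypothetical TIMESTAMP file holding an integer."""
    def stamp(directory: Path) -> int:
        f = directory / "TIMESTAMP"
        return int(f.read_text().strip()) if f.is_file() else -1
    return stamp(staging) > stamp(current)


def update_policies(staging: Path, current: Path, backup: Path,
                    checks: List[Check]) -> List[State]:
    """Run steps 3 and 4 and return the list of states visited."""
    trace = [State.CURRENT, State.FETCHED]
    # Step 3: any failing check rejects the new policies.
    if not all(check(staging, current) for check in checks):
        trace.append(State.REJECTED)
        return trace
    trace.append(State.VERIFIED)
    # Step 4: back up the old policies, then replace them.
    if backup.exists():
        shutil.rmtree(backup)
    if current.exists():
        shutil.copytree(current, backup)
        shutil.rmtree(current)
    shutil.copytree(staging, current)
    trace.append(State.ACTIVATED)
    return trace
```

A parsing check (cf-promises) or a user-defined check would simply be one more callable in the `checks` list. Note that the replacement in this sketch is not atomic; a real implementation would probably want a rename-based swap.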
Note that step 2 allows the policies to be copied by means other than cf-agent, separating the way policies arrive from the server and the actual fact of starting to use them, and this is a good thing (tm).
That allows step 2 to be optimized independently of the agent logic, or even the current transport layer to be replaced completely.
It also allows a special command to be implemented for that part, for example: DO TAKE the promises on the server now with checksums, whatever your optimization logic is.
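One possible way to picture that decoupling is a small transport interface that only knows how to fill the staging directory. The sketch below is an assumption for discussion, not an existing API: `PolicyTransport`, `DirectoryTransport`, the `force` flag and the `TIMESTAMP` file are all hypothetical names. The `force` flag stands for the "DO TAKE now" command: it bypasses whatever optimization the transport applies.

```python
import shutil
from pathlib import Path
from typing import Optional, Protocol


class PolicyTransport(Protocol):
    """Anything able to put new policies into the staging directory (step 2)."""

    def fetch(self, staging: Path, force: bool = False) -> bool:
        """Return True if new content was copied into staging."""
        ...


class DirectoryTransport:
    """Stand-in transport: a local directory plays the role of the server."""

    def __init__(self, source: Path) -> None:
        self.source = source
        self._last_stamp: Optional[str] = None

    def fetch(self, staging: Path, force: bool = False) -> bool:
        # Hypothetical TIMESTAMP file used as a cheap "has anything changed?" test.
        stamp = (self.source / "TIMESTAMP").read_text().strip()
        if not force and stamp == self._last_stamp:
            return False  # optimization: nothing new, skip the copy
        if staging.exists():
            shutil.rmtree(staging)
        shutil.copytree(self.source, staging)
        self._last_stamp = stamp
        return True
```

The state machine never needs to know which transport ran: it only looks at what is in the staging directory, so the transport can be swapped without touching the checks.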
Policies can also be corrupted on the agent side (neutrino rain or something). That's why we should check at some points that cf-promises still works. If it does not, promises should be downloaded again.
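That self-repair loop could look like the following sketch. The function names are hypothetical, and the validator assumes that `cf-promises` is on the PATH, accepts `-f <file>`, and exits with 0 when the policies parse; the entry point name `promises.cf` is also an assumption.

```python
import subprocess
from pathlib import Path
from typing import Callable


def cf_promises_ok(current: Path) -> bool:
    """Assumed validator: run cf-promises on the policy entry point."""
    try:
        result = subprocess.run(
            ["cf-promises", "-f", str(current / "promises.cf")],
            capture_output=True,
        )
    except FileNotFoundError:
        return False  # cf-promises itself is missing: treated as a failure here
    return result.returncode == 0


def ensure_valid_policies(current: Path,
                          validate: Callable[[Path], bool],
                          refetch: Callable[[], None]) -> str:
    """Check the policies in use; download them again if they no longer validate."""
    if validate(current):
        return "ok"
    refetch()  # e.g. a forced fetch followed by the usual checks and swap
    return "repaired" if validate(current) else "broken"
```

Passing the validator and the re-download as callables keeps this check independent of the transport, in the same spirit as step 2.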