User story #5641

open

Make the agent policies update a state machine with integrity check

Added by François ARMAND over 9 years ago. Updated about 6 years ago.

Status:
New
Priority:
2
Assignee:
-
Category:
System techniques

Description

For now, we update policies on the node side in a fairly simple way (basically: if the policies on the server are more recent than the ones on the node, copy them, and then use them).

We need a clear and defined state machine to make that update more robust and resilient.

The basic machine (to be clearly specified) is:

- 1/ we have a current state of the policies used by the agent.

- 2/ [policies copy on the node] the agent gets new policies from the server into a dedicated, jailed directory

- 3/ [integrity checks] the agent runs a series of checks on the new policies. Among the checks, we can think of, at least:
  - some integrity check: are the files OK? (think of signed content, checksums, etc.)
  - some age check: are the new files more recent than the ones I have?
  - some parsing check: cf-promises
  - why not, some user-defined check

- 4/ if all the checks pass, then the old policies are backed up and replaced by the new ones
- 5/ a new run of the agent is done with the new set of policies.

The most important part is the logic in steps 3, 4 and 5. It lets us be more confident that the new policies are correct and will run OK.
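
The steps above can be sketched as a small state machine. This is only an illustration under assumptions, not the agent's actual implementation: the `MANIFEST` file (one `<sha256> <relative path>` entry per line), the `TIMESTAMP` file, and every function and state name are hypothetical. A parsing check based on cf-promises, or a user-defined check, would be passed in as one more callable.

```python
import hashlib
import shutil
from enum import Enum, auto
from pathlib import Path
from typing import Callable, Iterable, List


class State(Enum):
    CURRENT = auto()   # step 1: the agent uses its current policies
    FETCHED = auto()   # step 2: new policies sit in the jailed staging directory
    VERIFIED = auto()  # step 3: every check passed
    SWAPPED = auto()   # step 4: old policies backed up, new ones in place
    REJECTED = auto()  # a check failed, current policies are left untouched


# A check receives (staging_dir, current_dir) and returns True when it passes.
Check = Callable[[Path, Path], bool]


def checksum_check(staging: Path, current: Path) -> bool:
    """Integrity check: each file listed in the (hypothetical) MANIFEST matches its SHA-256."""
    manifest = staging / "MANIFEST"
    if not manifest.is_file():
        return False
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        target = staging / name
        if not target.is_file():
            return False
        if hashlib.sha256(target.read_bytes()).hexdigest() != digest:
            return False
    return True


def age_check(staging: Path, current: Path) -> bool:
    """Age check: the staged (hypothetical) TIMESTAMP must be strictly newer."""
    def stamp(directory: Path) -> int:
        stamp_file = directory / "TIMESTAMP"
        return int(stamp_file.read_text().strip()) if stamp_file.is_file() else -1
    return stamp(staging) > stamp(current)


def update_policies(current: Path, staging: Path, backup: Path,
                    checks: Iterable[Check]) -> List[State]:
    """Run steps 3 and 4 and return the states that were traversed."""
    trace = [State.CURRENT, State.FETCHED]
    # Step 3: run every check against the staged copy.
    for check in checks:
        if not check(staging, current):
            trace.append(State.REJECTED)
            return trace
    trace.append(State.VERIFIED)
    # Step 4: back up the old policies, then put the new ones in place.
    if backup.exists():
        shutil.rmtree(backup)
    if current.exists():
        shutil.move(str(current), str(backup))
    shutil.move(str(staging), str(current))
    trace.append(State.SWAPPED)
    return trace
```

Step 5 (a new agent run) is left to the caller, which would trigger it only when the last state returned is `State.SWAPPED`. The two moves in step 4 are not atomic, so a real implementation would also need to handle a crash between them.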

Note that step 2 allows the policies to be copied by means other than cf-agent, decoupling the way policies arrive on the node from the act of starting to use them, and this is a good thing (tm).

That makes it possible to optimize step 2 independently of the agent logic, or even to replace the current transport layer completely.
It also makes it possible to implement special commands for that part, for example: DO TAKE the promises from the server now, with checksums, whatever your optimization logic is.
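
One possible way to picture this decoupling is a small transport interface whose only job is to fill the staging directory. Everything here is an assumption made for illustration: the `Transport` protocol, the `DirectoryTransport` class, the `force` flag (standing in for the "DO TAKE" command) and the `TIMESTAMP` file are hypothetical names, not existing Rudder or CFEngine APIs.

```python
import shutil
from pathlib import Path
from typing import Protocol


class Transport(Protocol):
    """Anything able to put a copy of the server policies into a staging directory."""

    def fetch(self, staging: Path, force: bool = False) -> bool:
        """Return True when new content was copied into staging."""
        ...


class DirectoryTransport:
    """Hypothetical transport that copies from a locally reachable directory."""

    def __init__(self, source: Path) -> None:
        self.source = source

    def fetch(self, staging: Path, force: bool = False) -> bool:
        source_stamp = (self.source / "TIMESTAMP").read_text().strip()
        staged_stamp_file = staging / "TIMESTAMP"
        # Optimization logic: skip the copy when the staged copy is already up to date.
        # force=True bypasses it, whatever the optimization would have decided.
        if (not force and staged_stamp_file.is_file()
                and staged_stamp_file.read_text().strip() == source_stamp):
            return False
        if staging.exists():
            shutil.rmtree(staging)
        shutil.copytree(self.source, staging)
        return True
```

Because the agent logic would only see the staging directory, another transport could be swapped in without touching the checks or the swap step.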

Policies can also be corrupted on the agent side (neutrino rain or something). That's why we should check from time to time whether cf-promises still passes on the local policies. If it does not, the promises should be downloaded again.
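
A minimal sketch of that self-repair loop, assuming the validation and the download are supplied by the caller. The names `command_succeeds` and `ensure_policies_usable` are hypothetical; in practice `validate` would wrap a cf-promises run on the local policies, whose exact command line is deliberately left out here.

```python
import subprocess
from typing import Callable, Sequence


def command_succeeds(cmd: Sequence[str]) -> bool:
    """Return True when the command exits with status 0."""
    try:
        return subprocess.run(cmd, capture_output=True).returncode == 0
    except OSError:
        # The binary is missing or not executable: treat it as a failed check.
        return False


def ensure_policies_usable(validate: Callable[[], bool],
                           redownload: Callable[[], None]) -> str:
    """Validate the local policies and download them again if validation fails."""
    if validate():
        return "ok"
    redownload()
    # Validate once more so the caller knows whether the repair worked.
    return "repaired" if validate() else "broken"
```

A "broken" result means that even a fresh copy does not validate, which points at the server side rather than at local corruption.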


Related issues 6 (3 open, 3 closed)

Related to Rudder - Architecture #4427: cf-promises check on ALL generated promises leads to huge generation time | New | Nicolas CHARLES
Related to Rudder - Bug #5650: promises can become invalid if copies fail rendering the agent unusable | Released | Benoît PECCATTE
Related to Rudder - User story #751: Test the typed variables in Directives on a test node | Rejected | Nicolas CHARLES | 2011-02-01
Related to Rudder - User story #6847: Separate updating of failsafe.cf and update.cf | New
Related to Rudder - Bug #9704: As for Rudder 3.2.9, promises calculation is still too slow | Rejected
Related to Rudder - Architecture #7831: Simplify usage and copy of ncf directories | New
