User story #3679
closed
Make the agent run schedule configurable from 5 minutes to 6 hours, with configurable offset and splay time
Added by Nicolas CHARLES over 11 years ago.
Updated over 9 years ago.
Category:
Web - Config management
Description
Many users report that they would like to change the schedule of the agent execution, either to every minute or to every hour.
There is no easy way to change that for the moment (except by changing the executor control in system/common/1.0/promises.st), and there is no proof that everything will work perfectly.
What could go wrong (untested, and probably incomplete/invalid list):
- reporting on no answer relies on the 5-minute schedule
- some classes may be set to persist for 5 minutes
- default lock is one minute
- database will get HUGE for one minute schedule
- Status changed from New to Discussion
- Assignee set to Nicolas CHARLES
- Priority changed from N/A to 3
I would also add the possibility to schedule the reports differently too, which might prove useful in network-constrained environments like 3G networks.
This is clearly a big topic, with different approaches:
- define a global frequency, via a web parameter, and use it to compute the reporting. Caution: when this value changes (especially when reduced), we can be led to believe that some nodes don't answer, when they are simply still on the previous schedule. So we should keep a history of all frequencies
- have a per-node frequency (which could also be defined when installing rudder-agent on the node). Same type of constraints as the previous point
- don't send compliance reports, only non-compliance, when in "non compliance mode". This would CLEARLY lower the amount of data over network and stored in database
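To illustrate why the first approach above needs a history of frequencies, here is a minimal sketch in Python. It is not Rudder code: the `IntervalHistory` class, the `is_missing` function and the 5-minute grace period are hypothetical names and values chosen for the example. The idea is that a node is only flagged as "no answer" against the interval that was in force when it last reported, not against the current one.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


class IntervalHistory:
    """Hypothetical store of run-interval changes, ordered by date."""

    def __init__(self):
        self._changes = []  # list of (effective_from, interval), kept sorted

    def record(self, effective_from, interval):
        self._changes.append((effective_from, interval))
        self._changes.sort(key=lambda change: change[0])

    def interval_at(self, when):
        """Return the interval that was in force at the given time."""
        times = [change[0] for change in self._changes]
        index = bisect_right(times, when)
        if index == 0:
            raise LookupError("no interval recorded before %s" % when)
        return self._changes[index - 1][1]


def is_missing(history, last_report, now, grace=timedelta(minutes=5)):
    """A node is missing only if it has been silent for longer than the
    interval in force at its last report, plus an (assumed) grace period."""
    return now - last_report > history.interval_at(last_report) + grace


history = IntervalHistory()
history.record(datetime(2014, 1, 1, 0, 0), timedelta(hours=6))
history.record(datetime(2014, 1, 1, 12, 0), timedelta(minutes=5))

# The node last reported at 11:00, while the 6-hour interval applied.
last_report = datetime(2014, 1, 1, 11, 0)
print(is_missing(history, last_report, datetime(2014, 1, 1, 12, 30)))  # False
print(is_missing(history, last_report, datetime(2014, 1, 1, 17, 30)))  # True
```

Without the history, the check at 12:30 would compare 1 hour 30 minutes of silence against the new 5-minute interval and wrongly report the node as not answering.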
- Target version changed from Ideas (not version specific) to 2.10.0~beta1
This is the meta ticket for making the schedule of the agent configurable
All nodes will have the same schedule, between 5 minutes and 6 hours
Please bear in mind that this applies only to nodes, and not to the relay server
- Subject changed from Have the schedule of agent execution configurable to Make the agent run schedule configurable from 5 minutes to 6 hours, with configurable offset and splay time
Updated the ticket title to reflect the exact change we are implementing:
- Configurable run "frequency" amongst a pre-defined list of choices: 5, 10, 15, 20, 30 minutes, 1, 2, 3, 4, 6 hours
- Configurable "offset" (time of the 'first' run), can be any hour/minute combination that is lower than the "frequency" above. For example, given a "frequency" of 2 hours, you could set the offset to be anything from 0 minutes to 1:59. In the latter case, the agent would be run at 01:59, 03:59, 05:59 and so on.
- Configurable "splay time" (maximum delay for a run after its scheduled time; each node chooses a random integer in the [0, splaytime] interval). This can be any hour/minute combination that is lower than the "frequency" above.
These three parameters will be configurable in the web interface.
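The rules above can be sketched as follows. This is only an illustration in Python, not the actual implementation: the function names are hypothetical, all values are expressed in minutes, and the splay is drawn with a plain random generator, which is an assumption about how each node picks its delay.

```python
import random

# The pre-defined list of run frequencies from this ticket, in minutes.
ALLOWED_INTERVALS_MIN = (5, 10, 15, 20, 30, 60, 120, 180, 240, 360)


def validate(interval_min, offset_min, splay_min):
    """Check the three parameters against the constraints described above."""
    if interval_min not in ALLOWED_INTERVALS_MIN:
        raise ValueError("interval must be one of %r" % (ALLOWED_INTERVALS_MIN,))
    if not 0 <= offset_min < interval_min:
        raise ValueError("offset must be lower than the interval")
    if not 0 <= splay_min < interval_min:
        raise ValueError("splay time must be lower than the interval")


def scheduled_runs(interval_min, offset_min):
    """Return the (hour, minute) of every scheduled run within one day."""
    validate(interval_min, offset_min, 0)
    return [divmod(m, 60) for m in range(offset_min, 24 * 60, interval_min)]


def splay_delay(splay_min, rng=random):
    """Pick a random integer delay in the [0, splay_min] interval (inclusive)."""
    return rng.randint(0, splay_min)


# Frequency of 2 hours with an offset of 1:59 (119 minutes).
print(scheduled_runs(120, 119)[:3])  # [(1, 59), (3, 59), (5, 59)]
```

Every allowed frequency divides 24 hours evenly, so the same list of run times repeats each day.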
Note: the changed schedule will only apply to managed nodes, not the root server or relay servers.
The reasons for this are that:
- Some actions run by cf-agent on the root and relay servers need to happen pretty frequently, in particular sending inventories to the endpoint (and given the recent change on queue size in the endpoint, this would simply not work if the frequency was too high)
- It is imperative that relay servers copy promises for the nodes that report to them from the root server before the run window starts. Therefore, if we changed the frequency on relay servers too and the interval was, for example, 6 hours, it might take up to 12 hours for a change to apply to some managed nodes. We don't want that :)
- There may be hidden side-effects we haven't noted here yet.
- Status changed from Discussion to 12
- Description updated (diff)
So, to be consistent everywhere on naming, we will use:
- agent_run_interval
- agent_run_splaytime
- agent_run_schedule
- agent_run_starthour
- agent_run_startminute
- Status changed from 12 to Pending release
- Category changed from Web - Config management to 14
- Status changed from Pending release to Released
- Category changed from 14 to Web - Config management
- Related to Bug #7154: Agent schedule is not historised, so we can't know what was the agent run interval in the past added
- Related to Bug #18330: Agent run frequency must not be configurable on policy servers added