Bug #14258
closed
Cron job checking rudder agent health, is ran every 5 minutes exactly, causing resource usage spike
Description
check-rudder-agent is triggered by a cron job every 5 minutes.
When this happens on a server hosting 400 VMs, we get 400 instances of this script, all running at the same time.
This is a problem because the script also runs cf-promises on the whole promise set, so each instance uses a non-trivial amount of resources.
We ought to:
- have a splay on this script: it could sleep for a random (or deterministic) time based on the currently defined splaytime
- not run cf-promises at each run: we should not do it more often than the agent run frequency, and it could be done much less often (once per day), as this is a failsafe after the failsafe
- (optionally) run this script at the agent run frequency, rather than every 5 minutes
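The first two points can be sketched in plain shell. This is only an illustration and not the code from the actual fix: the helper names splay_delay and check_is_due, the stamp-file approach, and the 300-second window are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: derive a stable delay in [0, max) seconds from a key
# (for example the node name), so that each node always picks the same offset
# while different nodes are spread across the window.
splay_delay() {
  key="$1"
  max="$2"
  # cksum is a POSIX utility; the first field of its output is a CRC of stdin.
  crc=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
  echo $(( crc % max ))
}

# Hypothetical helper: succeed only if the stamp file is missing or holds an
# epoch timestamp older than min_age seconds, so that an expensive check
# (such as validating the whole promise set) can be skipped most of the time.
check_is_due() {
  stamp="$1"
  min_age="$2"
  [ -f "$stamp" ] || return 0
  now=$(date +%s)
  last=$(cat "$stamp")
  [ $(( now - last )) -ge "$min_age" ]
}

# Example: print the delay this node would use for a 300-second window.
echo "delay: $(splay_delay "$(uname -n)" 300)s"
```

A deterministic delay keeps each node on a fixed offset from run to run, whereas a random delay changes the offset on every run. Either one avoids having all instances start in the same second.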
Updated by Nicolas CHARLES almost 6 years ago
- Related to Bug #4768: check-rudder-agent should take splaytime into account when checking the last input update file added
Updated by François ARMAND almost 6 years ago
- Tag list set to Sponsored
- Severity set to Major - prevents use of part of Rudder | no simple workaround
- User visibility set to Operational - other Techniques | Rudder settings | Plugins
- Priority changed from 0 to 84
Linked to support ticket S10748
Updated by François ARMAND almost 6 years ago
- Related to Bug #11919: rudder agent check runs synchronously on all nodes, causing CPU spikes added
Updated by François ARMAND almost 6 years ago
- Assignee set to Nicolas CHARLES
We "just" need to add a sleep at the beginning of "rudder agent check", for the length of the splay time. We need to check whether we are in interactive mode, to avoid sleeping in that mode. Alternatively, it could be a new parameter passed by the cron job (like --cron).
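As a rough sketch of the parameter-based variant: the --cron flag is only the suggestion made in this comment, and the function names and the SPLAY_SECONDS variable are hypothetical stand-ins, not part of the real "rudder agent check" command.

```shell
#!/bin/sh
# Hypothetical: succeed when the (assumed) --cron flag is among the arguments.
wants_splay() {
  for arg in "$@"; do
    if [ "$arg" = "--cron" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical: sleep for SPLAY_SECONDS only when the cron job asked for it.
# SPLAY_SECONDS stands in for however the real splay time would be obtained.
maybe_sleep() {
  if wants_splay "$@"; then
    sleep "${SPLAY_SECONDS:-0}"
  fi
}
```

With this shape, only the cron entry would pass the flag, so a person running the check by hand never waits.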
Updated by François ARMAND almost 6 years ago
- Effort required set to Small
- Priority changed from 84 to 100
Updated by Nicolas CHARLES almost 6 years ago
The best way to detect whether we are in interactive mode seems to be: if [ -t 0 ];
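A minimal illustration of that test follows. The operand -t 0 of the standard test utility is true when file descriptor 0 (stdin) is open on a terminal. The function names are hypothetical and the messages are placeholders for the real behaviour.

```shell
#!/bin/sh
# Succeed when stdin is attached to a terminal, which usually means a person
# launched the command rather than cron.
is_interactive() {
  [ -t 0 ]
}

# Hypothetical wrapper that only reports which branch would be taken.
describe_mode() {
  if is_interactive; then
    echo "interactive"
  else
    echo "non-interactive"
  fi
}

describe_mode
```

Note that this also reports "non-interactive" when stdin is redirected or piped by hand, so a manual run such as "rudder agent check < /dev/null" would be treated like a cron run.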
Updated by Nicolas CHARLES almost 6 years ago
- Status changed from New to In progress
Updated by Nicolas CHARLES almost 6 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from Nicolas CHARLES to Benoît PECCATTE
- Pull Request set to https://github.com/Normation/rudder-agent/pull/205
Updated by Rudder Quality Assistant over 5 years ago
- Assignee changed from Benoît PECCATTE to Nicolas CHARLES
- Priority changed from 100 to 99
Updated by Nicolas CHARLES over 5 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder-agent|a94650a7e2e4c7b4c222646e56f4a0c36e7f3f64.
Updated by François ARMAND over 5 years ago
- Target version changed from 4.1.20 to 4.1.21
Updated by Vincent MEMBRÉ over 5 years ago
- Subject changed from check-rudder-agent runs every 5 minutes exactly by cron, and can cause spike in resource usage to Cron job checking rudder agent health, is ran every 5 minutes exactly, causing resource usage spike
- Priority changed from 99 to 98
Updated by François ARMAND over 5 years ago
- Related to Bug #14644: When installing rudder-agent, there's a long wait of run interval/2, so up to several hours added
Updated by Vincent MEMBRÉ over 5 years ago
- Status changed from Pending release to Released
- Priority changed from 98 to 97
This bug has been fixed in Rudder 4.1.21, 4.3.11 and 5.0.9 which were released today.