Cron job checking rudder agent health runs every 5 minutes exactly, causing resource usage spikes
check-rudder-agent is triggered by a cron job, every 5 minutes.
When this happens on a server with 400 VMs, we get 400 instances of this script running at the same time.
This is an issue because the script also runs cf-promises on the whole promise set, so it uses a fair bit of resources.
We ought to:
- add a splay to this script: it could sleep for a random (or deterministic) time, based on the splaytime actually defined
- don't run cf-promises at each run: we should not do it more often than the agent run frequency, and it could be done much less often (once per day), as this is a failsafe after the failsafe
- (optionally) run this script at the agent run frequency, rather than every 5 minutes
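
To make the first two points more concrete, here is a minimal sketch in Python. It is not the actual check-rudder-agent code (which is a separate script), and every name in it (splay_delay, check_is_due, record_check, the stamp file) is hypothetical. It assumes the splay window and the minimum interval between cf-promises runs are passed in as seconds; how the real splaytime would be read is left open.

```python
import hashlib
import os
import time
from typing import Optional


def splay_delay(host_id: str, max_splay_seconds: int) -> int:
    """Return a stable delay in [0, max_splay_seconds) derived from host_id.

    A hash of the host identifier gives each node its own offset, and the
    same node always gets the same offset (deterministic splay).
    """
    if max_splay_seconds <= 0:
        return 0
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % max_splay_seconds


def check_is_due(stamp_path: str, min_interval_seconds: int,
                 now: Optional[float] = None) -> bool:
    """Return True if the expensive check has not run within the interval.

    The time of the last run is taken from the mtime of a stamp file
    (hypothetical path); a missing stamp file means the check is due.
    """
    if now is None:
        now = time.time()
    try:
        last_run = os.path.getmtime(stamp_path)
    except OSError:
        return True
    return now - last_run >= min_interval_seconds


def record_check(stamp_path: str, now: Optional[float] = None) -> None:
    """Create the stamp file if needed and set its mtime to now."""
    with open(stamp_path, "a"):
        pass
    os.utime(stamp_path, None if now is None else (now, now))
```

A wrapper could sleep for splay_delay(hostname, 300) seconds before doing anything, so that the 400 VMs spread over the 5-minute window, and only run cf-promises when check_is_due(stamp, 86400) is true, calling record_check(stamp) afterwards. The 300 and 86400 values simply mirror the "every 5 minutes" and "once per day" figures above.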