Bug #4768

closed

check-rudder-agent should take splaytime into account when checking the last input update file

Added by Vincent MEMBRÉ about 10 years ago. Updated over 4 years ago.

Status: Rejected
Priority: 2
Assignee: -
Category: System integration
Target version:
Severity: Minor - inconvenience | misleading | easy workaround
UX impact:
User visibility: Operational - other Techniques | Technique editor | Rudder settings
Effort required:
Priority: 0
Name check:
Fix check:
Regression:

Description

Following #4766, check-rudder-agent now takes into account the frequency of agent execution.

We decided to double the run interval so that the splay time is handled correctly and we can be sure that we have not missed any execution.

But doubling is only needed when the splaytime is very close to the run interval; otherwise the check is made really late and could have been done earlier:

If the run interval is 2 hours and the splaytime is 5 minutes, and the agent ran at 0:00, the next agent run can occur between 2:00 and 2:05.
If the lock happens here, it will only be detected at ~4:00, so we have to wait at least 1h55 before the issue is fixed.

The check should look every "run interval + splaytime + 5 minutes" (to ensure that the agent has finished running), so you only have to wait 5 minutes after the splaytime to fix tcdb issues.
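The difference between the two thresholds can be worked through with the numbers above. This is only an illustrative sketch in Python, not the check-rudder-agent script itself; the function names are hypothetical.

```python
from datetime import timedelta

# Hypothetical helpers, for illustration only.

def doubled_threshold(run_interval):
    """Current behaviour described above: wait twice the run interval."""
    return 2 * run_interval

def splay_aware_threshold(run_interval, splaytime, grace=timedelta(minutes=5)):
    """Proposed behaviour: run interval + splaytime + 5 minutes of grace."""
    return run_interval + splaytime + grace

interval = timedelta(hours=2)
splay = timedelta(minutes=5)

# Latest moment the next run may start, counted from the last run at 0:00.
latest_start = interval + splay

print(doubled_threshold(interval))                            # 4:00:00
print(splay_aware_threshold(interval, splay))                 # 2:10:00
print(doubled_threshold(interval) - latest_start)             # 1:55:00
print(splay_aware_threshold(interval, splay) - latest_start)  # 0:05:00
```

With a 2 hour interval and a 5 minute splaytime, the doubled threshold detects a stuck agent 1h55 after the latest possible start, while the splay-aware one detects it 5 minutes after.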

To do that we need a file where the splaytime is stored, like the run interval.
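As a rough sketch of how such a file could be used, the Python below reads the run interval and the splaytime from two files and compares the age of the last update file against the proposed threshold. Everything here is an assumption made for illustration: the function names, the idea that both files hold a plain number of minutes, and the fallback values are hypothetical and are not taken from Rudder.

```python
import os
import time

def read_minutes(path, default):
    """Read an integer number of minutes from a file, or return default."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return default

def agent_looks_stuck(last_update_file, interval_file, splaytime_file,
                      grace_minutes=5, now=None):
    """Return True if the last update file is older than
    run interval + splaytime + grace (all in minutes, by assumption).

    Raises OSError if last_update_file does not exist.
    """
    interval = read_minutes(interval_file, default=5)
    # Assumption: without a splaytime file, use the interval itself as the
    # splaytime, which stays close to the "double the interval" behaviour.
    splay = read_minutes(splaytime_file, default=interval)
    threshold_seconds = (interval + splay + grace_minutes) * 60
    current = time.time() if now is None else now
    age_seconds = current - os.path.getmtime(last_update_file)
    return age_seconds > threshold_seconds
```

Falling back to the interval when the splaytime file is missing keeps the check conservative on nodes where the file has not been written yet.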


Related issues 1 (0 open, 1 closed)

Related to Rudder - Bug #14258: Cron job checking rudder agent health, is ran every 5 minutes exactly, causing resource usage spike (Released, Nicolas CHARLES)