Bug #16859
Closed
rudder-agent check sleep and process pile-up
Description
Hello,
What is the purpose of the sleep in /opt/rudder/share/commands/agent-check
when the check is still started every 5 minutes?
- run agent every 1h
- first run time 0
- maximum delay 15m
Rudder server OS: CentOS 8
Rudder server version: 6.0.4
Rudder client OS: CentOS 8, Debian 10
Rudder client version: 6.0.3 and 6.0.4
This results in the following behavior: the agent check is still started every 5 minutes and the server is flooded with rudder processes:
root 30127 77 0 06:50 ? 00:00:00 /usr/sbin/CROND -n
root 30128 30127 0 06:50 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 30129 30128 0 06:50 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
root 30174 30129 0 06:50 ? 00:00:00 sleep 1663
root 30249 77 0 06:55 ? 00:00:00 /usr/sbin/CROND -n
root 30250 30249 0 06:55 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 30251 30250 0 06:55 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
root 30296 30251 0 06:55 ? 00:00:00 sleep 1663
root 30367 77 0 07:00 ? 00:00:00 /usr/sbin/CROND -n
root 30368 30367 0 07:00 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 30369 30368 0 07:00 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
root 30414 30369 0 07:00 ? 00:00:00 sleep 1663
root 30950 77 0 07:05 ? 00:00:00 /usr/sbin/CROND -n
root 30951 30950 0 07:05 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 30953 30951 0 07:05 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
root 31003 30953 0 07:05 ? 00:00:00 sleep 1663
root 31577 77 0 07:10 ? 00:00:00 /usr/sbin/CROND -n
root 31578 31577 0 07:10 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 31579 31578 0 07:10 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
root 31624 31579 0 07:10 ? 00:00:00 sleep 1663
root 31923 77 0 07:15 ? 00:00:00 /usr/sbin/CROND -n
root 31924 31923 0 07:15 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 31925 31924 0 07:15 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
Thanks
Updated by Nicolas CHARLES over 4 years ago
Hi
When called from the cron, this script should sleep for a random time to avoid hammering the host when there are plenty of VMs.
However, these processes should not pile up - we should not start a new check when one is already sleeping/running.
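The behavior described above (a random sleep, but never more than one instance at a time) could look roughly like the following. This is only a minimal sketch and not the actual agent-check code or the fix that was eventually merged: the lock path, `MAX_DELAY`, and all function names are hypothetical, and the `mkdir`-based lock is just one portable way to get mutual exclusion.

```shell
#!/bin/sh
# Sketch only: LOCK_DIR, MAX_DELAY and the function names are hypothetical,
# they are not taken from the real agent-check script.
LOCK_DIR="${LOCK_DIR:-/tmp/agent-check-sketch.lock}"
MAX_DELAY="${MAX_DELAY:-900}"   # upper bound of the random delay, in seconds

# mkdir is atomic: it fails if another instance already created the directory.
acquire_lock() {
  mkdir "${LOCK_DIR}" 2>/dev/null
}

release_lock() {
  rmdir "${LOCK_DIR}" 2>/dev/null
}

# Print a pseudo-random delay in the range [0, MAX_DELAY).
random_delay() {
  n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')
  echo $(( ${n} % ${MAX_DELAY} ))
}

# Sleep for a random time, then run the checks, unless another
# instance is already sleeping or running.
check_with_splay() {
  if ! acquire_lock; then
    echo "skipped: another check is already sleeping or running"
    return 1
  fi
  sleep "$(random_delay)"
  # Placeholder for the real health checks.
  echo "check done"
  release_lock
  return 0
}
```

A cron entry would then call `check_with_splay` once per invocation; a second invocation arriving during the sleep returns immediately instead of stacking another `sleep` process. A real script would also need to handle a stale lock left behind by a killed process, for example with a `trap` or by using `flock` from util-linux where available.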
Updated by Marek Haluška over 4 years ago
Still the same issue in 6.0.5; output from Debian 10:
root 10876 198 0 09:20 ? 00:00:00 /usr/sbin/CRON -f
root 10877 10876 0 09:20 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 10878 10877 0 09:20 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
root 10926 10878 0 09:20 ? 00:00:00 sleep 1397
root 12158 198 0 09:30 ? 00:00:00 /usr/sbin/CRON -f
root 12159 12158 0 09:30 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 12160 12159 0 09:30 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
root 12208 12160 0 09:30 ? 00:00:00 sleep 1258
root 13624 198 0 09:40 ? 00:00:00 /usr/sbin/CRON -f
root 13625 13624 0 09:40 ? 00:00:00 /bin/sh -c /opt/rudder/bin/rudder agent check -q >> /var/log/rudder/agent-check/check.log 2>&1
root 13626 13625 0 09:40 ? 00:00:00 /bin/sh /opt/rudder/share/commands/agent-check -q
root 13674 13626 0 09:40 ? 00:00:00 sleep 1119
Updated by Bernd Wolf over 4 years ago
We use an agent-check interval of 6h.
This code has vanished in 6.0.6 (/opt/rudder/share/commands/agent-check):
"...
# Get the value of rudder-agent run interval from file /var/rudder/cfengine-community/inputs/run_interval
if [ -f "${CFE_DIR}/inputs/run_interval" ]; then
  RUN_INTERVAL=`cat "${CFE_DIR}/inputs/run_interval"`
  # If the value is not a number, reset to 5
  if ! test "${RUN_INTERVAL}" -gt 0 2>/dev/null
  then
    RUN_INTERVAL=5
  fi
else
  # File does not exists, use default value 5
  RUN_INTERVAL=5
fi
..."
Updated by François ARMAND over 4 years ago
- Tracker changed from Question to Bug
- Subject changed from rudder-agent check sleep to rudder-agent check sleep and process pile-up
- Severity set to Major - prevents use of part of Rudder | no simple workaround
- User visibility set to Operational - other Techniques | Rudder settings | Plugins
- Priority set to 49
- Name check set to To do
- Fix check set to To do
Updated by Benoît PECCATTE over 4 years ago
- Target version set to 6.2.0~beta1
- Priority changed from 49 to 24
Updated by Benoît PECCATTE over 4 years ago
- Status changed from New to In progress
- Assignee set to Benoît PECCATTE
Updated by Benoît PECCATTE over 4 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from Benoît PECCATTE to Alexis Mousset
- Pull Request set to https://github.com/Normation/rudder-techniques/pull/1615
Updated by Benoît PECCATTE over 4 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder-techniques|15780ae0600ecdfdd8f21b10e2c2c17b54739c4a.
Updated by Vincent MEMBRÉ about 4 years ago
- Target version changed from 6.2.0~beta1 to 7.0.0~beta1
Updated by Vincent MEMBRÉ about 3 years ago
- Status changed from Pending release to Released
- Priority changed from 24 to 43
This bug has been fixed in Rudder 7.0.0~beta1 which was released today.