Bug #7189
closed
Issues with process management on physical hosts running LXC containers
Description
I seem to have a process issue similar to Bug #4498. I have several physical servers, each with rudder-agent installed and each hosting several LXC containers that also have rudder-agent installed.
Every 5 minutes, I receive the stdout of the Rudder cron job:
WARNING: No disable file detected and no CFEngine process neither. Relaunching CFEngine processes... Done
I think the problem comes from the physical host, which sees all cf-execd and cf-agent processes (its own and those of the LXC containers).
When the cron job runs on the physical host, it finds too many CFEngine processes running, kills all of them, and then restarts only its local processes.
As a result, it also kills all the CFEngine processes of the LXC containers.
- if [ -e /opt/rudder/bin/check-rudder-agent ]; then /opt/rudder/bin/check-rudder-agent; else if [ ! -e /opt/rudder/etc/disable-agent -a `ps -efww | grep -E "(cf-execd|cf-agent)" | grep -E "/var/rudder/cfengine-community/bin/(cf-execd|cf-agent)" | grep -v grep | wc -l` -eq 0 ]; then /var/rudder/cfengine-community/bin/cf-agent -f failsafe.cf >/dev/null 2>\&1 \&\& /var/rudder/cfengine-community/bin/cf-agent >/dev/null 2>\&1; if [ $? != 0 ]; then if [ -f /opt/rudder/etc/rudder-restart-message.txt ]; then cat /opt/rudder/etc/rudder-restart-message.txt; else echo "Rudder agent was unable to restart on $(hostname)."; fi; fi; fi; fi
WARNING: Too many instance of CFEngine cf-execd processes running. Killing them... Done
WARNING: No disable file detected and no CFEngine process neither. Relaunching CFEngine processes...
Done
root@libra1 ~#
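One possible way to avoid the host counting container processes, as a minimal sketch only: keep just the PIDs whose PID namespace matches the one the check itself runs in. This assumes the LXC containers run in their own PID namespaces, that the kernel exposes /proc/<pid>/ns/pid (Linux 3.8 or later), and that the check runs as root. The helper names same_pidns and count_local_pids and the PROC_ROOT variable are hypothetical and are not part of check-rudder-agent.

```shell
#!/bin/sh
# Hypothetical helpers, not part of check-rudder-agent.
# PROC_ROOT can be overridden for testing; it defaults to the real /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# same_pidns PID [REF]
# Succeeds if PID is in the same PID namespace as REF (default: self).
same_pidns() {
    sp_pid="$1"
    sp_ref="${2:-self}"
    # The ns/pid entry is a symlink such as "pid:[4026531836]";
    # two processes share a namespace when the link targets are equal.
    sp_mine=$(readlink "$PROC_ROOT/$sp_ref/ns/pid" 2>/dev/null) || return 1
    sp_theirs=$(readlink "$PROC_ROOT/$sp_pid/ns/pid" 2>/dev/null) || return 1
    [ -n "$sp_mine" ] && [ "$sp_mine" = "$sp_theirs" ]
}

# count_local_pids [REF]
# Reads PIDs on stdin (one per line) and prints how many of them
# share the PID namespace of REF (default: self).
count_local_pids() {
    cl_ref="${1:-self}"
    cl_count=0
    while read -r cl_pid; do
        [ -n "$cl_pid" ] || continue
        if same_pidns "$cl_pid" "$cl_ref"; then
            cl_count=$((cl_count + 1))
        fi
    done
    echo "$cl_count"
}
```

On the physical host, the count of local cf-execd processes could then be obtained with something like `pgrep -f /var/rudder/cfengine-community/bin/cf-execd | count_local_pids`, so that processes belonging to the containers are ignored both when counting and when deciding what to kill.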