Bug #7189

issues with process management on physical hosting LXC containers

Added by LibrA LinuX over 4 years ago. Updated over 4 years ago.

Status: Released
Priority: 2
Assignee: Matthieu CERDA
Category: Packaging
Target version:
Severity:
User visibility:
Effort required:

Description

I seem to have a process issue similar to Bug #4498. I have several physical servers, each with rudder-agent installed and each hosting several LXC containers that also have rudder-agent installed.
Every 5 minutes, I receive the stdout of the Rudder cron job:
WARNING: No disable file detected and no CFEngine process neither. Relaunching CFEngine processes... Done

I think the problem comes from the physical host, which sees all cf-execd|cf-agent processes (its own and those of the LXC containers).
When the cron job runs on the physical host, it finds too many CFEngine processes running, so it kills all of them before restarting only its local processes.
It therefore also kills all the CFEngine processes of the LXC containers...

    if [ -e /opt/rudder/bin/check-rudder-agent ]; then /opt/rudder/bin/check-rudder-agent; else if [ ! -e /opt/rudder/etc/disable-agent -a `ps -efww | grep -E "(cf-execd|cf-agent)" | grep -E "/var/rudder/cfengine-community/bin/(cf-execd|cf-agent)" | grep -v grep | wc -l` -eq 0 ]; then /var/rudder/cfengine-community/bin/cf-agent -f failsafe.cf >/dev/null 2>\&1 \&\& /var/rudder/cfengine-community/bin/cf-agent >/dev/null 2>\&1; if [ $? != 0 ]; then if [ -f /opt/rudder/etc/rudder-restart-message.txt ]; then cat /opt/rudder/etc/rudder-restart-message.txt; else echo "Rudder agent was unable to restart on $(hostname)."; fi; fi; fi; fi
    WARNING: Too many instance of CFEngine cf-execd processes running. Killing them... Done
    WARNING: No disable file detected and no CFEngine process neither. Relaunching CFEngine processes...
    Done
    root@libra1 ~#
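
To illustrate the suspected cause, here is a minimal sketch of how a check could count only the CFEngine processes that belong to the host itself, by comparing each process's PID namespace with the caller's. This is not the code from the actual fix: the function name `local_cfengine_pids` and its optional `proc_root` argument are hypothetical, and the sketch assumes a Linux kernel that exposes `/proc/<pid>/ns/pid` and `/proc/<pid>/comm`, and that it runs as root.

```shell
# Hypothetical helper (illustration only, not taken from check-rudder-agent):
# print the PIDs of cf-execd/cf-agent processes that share the caller's
# PID namespace, so that processes inside LXC containers are ignored.
# The optional first argument overrides the /proc location (useful for testing).
local_cfengine_pids() {
    proc_root="${1:-/proc}"

    # The namespace link reads like "pid:[4026531836]"; fail if unsupported.
    self_ns=$(readlink "$proc_root/self/ns/pid" 2>/dev/null) || return 1

    for dir in "$proc_root"/[0-9]*; do
        [ -d "$dir" ] || continue

        # Skip processes whose namespace cannot be read (exited, no permission).
        ns=$(readlink "$dir/ns/pid" 2>/dev/null) || continue
        [ "$ns" = "$self_ns" ] || continue

        comm=$(cat "$dir/comm" 2>/dev/null) || continue
        case "$comm" in
            cf-execd|cf-agent) echo "${dir##*/}" ;;
        esac
    done
}
```

With such a filter, `local_cfengine_pids | wc -l` run on the physical host would count only the host's own cf-execd and cf-agent processes, whereas the `ps -efww | grep` pipeline above also matches those running inside the containers.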

Related issues

Related to Rudder - Bug #4498: Several issues with process management on Proxmox host (and container) (Rejected)
Related to Rudder - Bug #7335: check-rudder-agent silently fails if namespaces are not supported (Released, 2015-10-30, Vincent MEMBRÉ)
Related to Rudder - Bug #7338: All reports are missing (totally orange) for a node due to multiple cf-execd processes (Released, 2015-10-30, Nicolas CHARLES)
Related to Rudder - Bug #7381: Process management issues on nodes hosting LXC containers (Released, Alexis MOUSSET)
#1

Updated by Nicolas CHARLES over 4 years ago

  • Assignee set to Benoît PECCATTE
  • Priority changed from N/A to 2
  • Target version changed from 3.1.2 to 2.11.14

Thank you for the bug report. This is something Benoît will be able to fix!

#2

Updated by Nicolas CHARLES over 4 years ago

  • Related to Bug #4498: Several issues with process management on Proxmox host (and container) added
#3

Updated by Benoît PECCATTE over 4 years ago

  • Status changed from New to In progress
#4

Updated by Benoît PECCATTE over 4 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Benoît PECCATTE to Matthieu CERDA
  • Pull Request set to https://github.com/Normation/rudder-packages/pull/747
#5

Updated by Benoît PECCATTE over 4 years ago

  • Status changed from Pending technical review to Pending release
  • % Done changed from 0 to 100
#6

Updated by Matthieu CERDA over 4 years ago

#7

Updated by Vincent MEMBRÉ over 4 years ago

  • Category changed from Agent to Packaging
#8

Updated by Vincent MEMBRÉ over 4 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 2.10.17, 2.11.14, 3.0.9 and 3.1.2 which were released today.

#9

Updated by Jonathan CLARKE over 4 years ago

  • Related to Bug #7335: check-rudder-agent silently fails if namespaces are not supported added
#10

Updated by Nicolas CHARLES over 4 years ago

  • Related to Bug #7338: All reports are missing (totally orange) for a node due to multiple cf-execd processes added
#11

Updated by Alexis MOUSSET over 4 years ago

  • Related to Bug #7381: Process management issues on nodes hosting LXC containers added
