Bug #9622
closed
Fusioninventory is not tracked by check-rudder-health
Description
Rudder has process management in the health checker script.
Using that, it can detect and kill superfluous processes.
E.g. if you have 5 running cf-agent processes, it'll sort that out.
The problem is that it doesn't also kill FusionInventory.
A dangerous real-world scenario that can happen is:
The system's IPMI/iDRAC module hangs a little bit.
That will hang FusionInventory, too.
It's still killable, as it isn't stuck in kernel I/O.
In the end, you can have several thousand of those processes.
The worst thing is if someone fixes the IPMI issue.
Then you have load >2000 and the system becomes unusable.
Can you please extend check-rudder-health to also make sure there's not more than a few FusionInventory processes?
It obviously doesn't know locking, so we need to help it. ;-)
root 37597 0.0 0.0 11732 1116 ? S Oct01 0:00 /bin/sh /opt/rudder/bin/run-inventory --local=/var/rudder/tmp/inventory --scan-homedirs
root 37750 0.0 0.0 83784 24188 ? S Oct01 0:00  \_ /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /opt/rudder/bin/fusioninventory-agent --config=none --local=/var/rudder/tmp/inventory --scan-homedirs
root 37759 0.0 0.0 11316 868 ? S Oct01 0:00      \_ sh -c ipmitool lan print 2>/dev/null
root 37760 0.0 0.0 20228 1152 ? S Oct01 0:00          \_ ipmitool lan print
root 816 0.0 0.0 11732 892 ? S Oct01 0:00 /bin/sh /opt/rudder/bin/run-inventory --local=/var/rudder/tmp/inventory --scan-homedirs
root 965 0.0 0.0 83784 8676 ? S Oct01 0:00  \_ /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /opt/rudder/bin/fusioninventory-agent --config=none --local=/var/rudder/tmp/inventory --scan-homedirs
root 974 0.0 0.0 11316 712 ? S Oct01 0:00      \_ sh -c ipmitool lan print 2>/dev/null
root 975 0.0 0.0 20228 1148 ? S Oct01 0:00          \_ ipmitool lan print
root 14720 0.0 0.0 11732 976 ? S Oct01 0:00 /bin/sh /opt/rudder/bin/run-inventory --local=/var/rudder/tmp/inventory --scan-homedirs
root 14876 0.0 0.0 83784 14012 ? S Oct01 0:00  \_ /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /opt/rudder/bin/fusioninventory-agent --config=none --local=/var/rudder/tmp/inventory --scan-homedirs
root 14892 0.0 0.0 11316 732 ? S Oct01 0:00      \_ sh -c ipmitool lan print 2>/dev/null
root 14893 0.0 0.0 20228 936 ? S Oct01 0:00          \_ ipmitool lan print
root 15289 0.0 0.0 11732 1096 ? S Oct02 0:00 /bin/sh /opt/rudder/bin/run-inventory --local=/var/rudder/tmp/inventory --scan-homedirs
root 15400 0.0 0.0 83784 8448 ? S Oct02 0:00  \_ /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /opt/rudder/bin/fusioninventory-agent --config=none --local=/var/rudder/tmp/inventory --scan-homedirs
root 15418 0.0 0.0 11316 704 ? S Oct02 0:00      \_ sh -c ipmitool lan print 2>/dev/null
root 15419 0.0 0.0 20228 1148 ? S Oct02 0:00          \_ ipmitool lan print
root 29446 0.0 0.0 11732 904 ? S Oct02 0:00 /bin/sh /opt/rudder/bin/run-inventory --local=/var/rudder/tmp/inventory --scan-homedirs
root 29543 0.0 0.0 83784 9868 ? S Oct02 0:00  \_ /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /opt/rudder/bin/fusioninventory-agent --config=none --local=/var/rudder/tmp/inventory --scan-homedirs
root 29558 0.0 0.0 11316 728 ? S Oct02 0:00      \_ sh -c ipmitool lan print 2>/dev/null
root 29559 0.0 0.0 20228 896 ? S Oct02 0:00          \_ ipmitool lan print
root 46064 0.0 0.0 11732 964 ? S Oct02 0:00 /bin/sh /opt/rudder/bin/run-inventory --local=/var/rudder/tmp/inventory --scan-homedirs
root 46173 0.0 0.0 83784 24172 ? S Oct02 0:00  \_ /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /opt/rudder/bin/fusioninventory-agent --config=none --local=/var/rudder/tmp/inventory --scan-homedirs
root 46186 0.0 0.0 11316 876 ? S Oct02 0:00      \_ sh -c ipmitool lan print 2>/dev/null
root 46187 0.0 0.0 20228 936 ? S Oct02 0:00          \_ ipmitool lan print
root 5262 0.0 0.0 11732 1100 ? S Oct03 0:00 /bin/sh /opt/rudder/bin/run-inventory --local=/var/rudder/tmp/inventory --scan-homedirs
root 5551 0.0 0.0 83784 24156 ? S Oct03 0:00  \_ /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /opt/rudder/bin/fusioninventory-agent --config=none --local=/var/rudder/tmp/inventory --scan-homedirs
root 5566 0.0 0.0 11316 876 ? S Oct03 0:00      \_ sh -c ipmitool lan print 2>/dev/null
root 5567 0.0 0.0 20228 1148 ? S Oct03 0:00          \_ ipmitool lan print
root 18636 0.0 0.0 11732 1108 ? S Oct03 0:00 /bin/sh /opt/rudder/bin/run-inventory --local=/var/rudder/tmp/inventory --scan-homedirs
root 18789 0.0 0.0 83784 9180 ? S Oct03 0:00  \_ /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /opt/rudder/bin/fusioninventory-agent --config=none --local=/var/rudder/tmp/inventory --scan-homedirs
root 18803 0.0 0.0 11316 704 ? S Oct03 0:00      \_ sh -c ipmitool lan print 2>/dev/null
root 18805 0.0 0.0 20228 1148 ? S Oct03 0:00          \_ ipmitool lan print
We thought we had reported something about this, but I couldn't find it.
It's a critical issue: the load spike will cause short outages.
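The request above (keep no more than a few FusionInventory processes alive) could be sketched roughly as follows. This is only an illustration in Python, not the actual check-rudder-health code: the threshold `MAX_INVENTORY_PROCESSES`, the match pattern, and the helper names are assumptions, and it relies on the `etimes` column that procps-ng `ps` provides.

```python
import os
import signal
import subprocess

MAX_INVENTORY_PROCESSES = 2  # hypothetical threshold, not taken from Rudder


def parse_ps(output):
    """Parse lines of 'PID ELAPSED_SECONDS COMMAND' into (pid, age, command) tuples."""
    processes = []
    for line in output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        pid, age, command = fields
        processes.append((int(pid), int(age), command))
    return processes


def select_excess(processes, pattern="fusioninventory-agent",
                  limit=MAX_INVENTORY_PROCESSES):
    """Return the PIDs to kill: every matching process except the `limit` youngest."""
    matching = [p for p in processes if pattern in p[2]]
    matching.sort(key=lambda p: p[1])  # youngest (smallest age) first
    return [pid for pid, _age, _command in matching[limit:]]


def kill_excess():
    """List processes with ps and send SIGTERM to the superfluous ones."""
    # Assumption: a procps-ng ps that supports the 'etimes' output column.
    output = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in select_excess(parse_ps(output)):
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # the process exited in the meantime
```

Keeping the youngest processes and terminating the older ones is a design choice made for this sketch; with the listing above, the agents started on Oct01 and Oct02 would be the first to go.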
Updated by Alexis Mousset about 8 years ago
- Tag list set to Next minor release
- Target version set to 3.1.17
Which Rudder version was running on this agent? If it was a 3.2, we might have recently fixed the root cause of the process accumulation.
Anyway, our cron script should not leave those running; we should kill them as we do CFEngine processes.
Updated by Alexis Mousset about 8 years ago
- Has duplicate Bug #7726: Agent watchdog (check-rudder-agent) doesn't kill fusion added
Updated by Nicolas CHARLES about 8 years ago
As discussed internally, we will ensure that when we kill the agent, we kill its children also.
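Killing an agent together with its children could look something like the sketch below. It is written in Python for illustration only and is not the code from the eventual changeset; the helper names and the (pid, ppid) input format are assumptions.

```python
import os
import signal
from collections import defaultdict


def descendants(pairs, root):
    """Return all descendant PIDs of `root`, given (pid, ppid) pairs, nearest first."""
    children = defaultdict(list)
    for pid, ppid in pairs:
        children[ppid].append(pid)
    found = []
    queue = [root]
    while queue:
        current = queue.pop(0)
        for child in children[current]:
            found.append(child)
            queue.append(child)
    return found


def kill_tree(pairs, root, sig=signal.SIGTERM):
    """Send `sig` to `root` and then to each of its descendants."""
    for pid in [root] + descendants(pairs, root):
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # already gone, nothing to do
```

With the first process group from the description, `descendants` called on 37597 would return the perl, `sh -c` and `ipmitool` processes (37750, 37759, 37760), so the hung `ipmitool` is not left behind.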
Updated by Nicolas CHARLES about 8 years ago
- Status changed from New to In progress
- Assignee set to Nicolas CHARLES
Updated by Nicolas CHARLES about 8 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from Nicolas CHARLES to Benoît PECCATTE
- Pull Request set to https://github.com/Normation/rudder-packages/pull/1142
Updated by Benoît PECCATTE about 8 years ago
- Assignee changed from Benoît PECCATTE to Nicolas CHARLES
Updated by Nicolas CHARLES about 8 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder-packages|862f4dab103961a026bf1630310b28c483aa95fc.
Updated by Nicolas CHARLES about 8 years ago
- Has duplicate Bug #7285: Only run inventory collection when no other is running added
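Running an inventory only when no other one is running is commonly done with an advisory file lock. A minimal Python sketch, assuming a lock file whose path is chosen by the caller (the path in the usage note is hypothetical, and this is not how Rudder necessarily implements it):

```python
import fcntl
import os


def try_lock(path):
    """Return a file descriptor holding an exclusive lock on `path`, or None if it is taken."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

A wrapper could call `try_lock("/var/rudder/tmp/inventory.lock")` and exit straight away when it gets `None`. The lock is released when the descriptor is closed or the process exits, so a killed inventory run does not leave a stale lock behind.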
Updated by Vincent MEMBRÉ about 8 years ago
- Status changed from Pending release to Released
Updated by Benoît PECCATTE almost 2 years ago
- Subtask deleted (#11102)
- Priority set to 0