Bug #14190 closed
Inventory may never finish if there is a disk issue or invalid mountpoint
Description
Inventory can fail when some mount points are NFS and the NFS server is failing. In this case, inventories pile up and never finish (they are simply killed).
We should include a timeout within inventory for disk exploration.
Updated by Alexis Mousset almost 6 years ago
- Target version changed from 4.3.9 to 4.3.10
Updated by François ARMAND almost 6 years ago
- Target version changed from 4.3.10 to 4.3.11
Updated by Vincent MEMBRÉ almost 6 years ago
- Target version changed from 4.3.11 to 4.3.12
Updated by François ARMAND almost 6 years ago
- Related to Bug #14476: Inventory can not complete on an hypervisor if one of the guest machine is not accessible any more added
Updated by François ARMAND over 5 years ago
- Tags set to Sponsored
- Subject changed from Inventory may not finish if there is a disk issue or invalid mountpoint to Inventory may never finish if there is a disk issue or invalid mountpoint
- Description updated
- Target version changed from 4.3.12 to 5.0.10
- Severity set to Major - prevents use of part of Rudder | no simple workaround
- User visibility set to Getting started - demo | first install | Technique editor and level 1 Techniques
- Priority changed from 0 to 98
Updated by François ARMAND over 5 years ago
- Category changed from Packaging to Web - Nodes & inventories
Updated by Vincent MEMBRÉ over 5 years ago
- Target version changed from 5.0.10 to 5.0.11
Updated by Nicolas CHARLES over 5 years ago
The issue is in
[debug] Running FusionInventory::Agent::Task::Inventory::Linux::Drives
[debug2] executing df -P -T -k
Setting up an NFS share and then shutting down the NFS server reproduces it.
There is already a timeout, set with "alarm" through --backend-collect-timeout (default 180s), but it fails in this case.
alarm doesn't seem to work correctly with filehandles in non-interactive scripts (see https://perldoc.perl.org/functions/alarm.html and https://www.perlmonks.org/?node_id=999179)
Changing getFileHandle in Tools.pm, for the command case, to the following doesn't solve the issue:
    eval {
        local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
        alarm 5;
        if (!open $handle, '-|', $params{command} . " 2>$nowhere") {
            $params{logger}->error(
                "Can't run command $params{command}: $ERRNO"
            ) if $params{logger};
            return;
        }
    };
Updated by Nicolas CHARLES over 5 years ago
Actually, the open call returns; it locks afterwards, during the read: my $line = <$handle>;
Updated by Nicolas CHARLES over 5 years ago
Doing
    my $line;
    # get headers line first
    eval {
        local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
        alarm 5;
        $line = <$handle>;
    };
gets past this step, but it still locks afterwards and the command doesn't end, probably because the handle is not released.
Updated by Nicolas CHARLES over 5 years ago
An easier solution could be to check if timeout is present on the system: if so, run "timeout 5 df"; if not, run df without a timeout.
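For illustration, the check described in this comment could look roughly like the following in POSIX shell. This is only a sketch, not the code from the actual fix: the function name with_timeout is hypothetical, and it assumes that a timeout command found on the PATH behaves like the coreutils one (time limit first, then the command to run).

```shell
#!/bin/sh
# Hypothetical helper, not taken from the Rudder or FusionInventory sources.
# Runs a command under a time limit when `timeout` is available on the PATH,
# and runs it unchanged otherwise.
with_timeout() {
    limit="$1"   # time limit in seconds
    shift        # the remaining arguments are the command to run
    if command -v timeout >/dev/null 2>&1; then
        # `timeout` is present: let it stop the command after $limit seconds
        timeout "$limit" "$@"
    else
        # no `timeout` on this system: fall back to the plain command
        "$@"
    fi
}
```

With this helper, the df call from the debug log would become: with_timeout 5 df -P -T -k. The sketch only covers the presence check; whether the signal sent by timeout is enough to stop a df that is blocked on a dead NFS mount is not something it verifies.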
Updated by Nicolas CHARLES over 5 years ago
- Status changed from New to In progress
Updated by Nicolas CHARLES over 5 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from Nicolas CHARLES to Benoît PECCATTE
- Pull Request set to https://github.com/Normation/rudder-packages/pull/1909
Updated by Nicolas CHARLES over 5 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder-packages|62e0b4d33e00945f29f571d952cfe17895b38547.
Updated by Vincent MEMBRÉ over 5 years ago
- Status changed from Pending release to Released
- Priority changed from 98 to 97
This bug has been fixed in Rudder 5.0.11 which was released today.
Updated by Nicolas CHARLES almost 4 years ago
- Related to Bug #18832: Rudder Agent consumes complete Memory because of fdisk added