Bug #4314
closedInventories containing very long (> 4096) process name cannot be send to rudder server via CFEngine
Description
This issue has been encountered while deploying v2.8.1, but evidently affects all the versions shipped with cfengine-community 3.5.2 at least.
The initial deployment of rudder-agent package depends on the initial policy included within the rudder-agent rpm. One of the first steps would be to run the inventory and submit the *.ocs file to the server. Until the file is accepted to the server, the client could not get any updates to the policy, so the issue could not be corrected by modifying the centralized policy.
In our environment there are quite a few hosts running java processes with very long command line args list. In fact, they are way longer than 4096 bytes, so the /proc/<pid>/cmdline actually truncates them at 4096. The FusionInventory process collects them as-is, and properly encloses the (truncated) command line in <CMD></CMD> tags and indents them, so the lines get even bigger (a bit over 4100 bytes). Unfortunately, subsequent actions in the initial policy try to edit this file in various ways; as the result of these editions, the lines get truncated at 4096, thus losing the trailing </CMD>. The file is then transmitted successfully to the rudder server, and the task is marked as successfully completed (my understanding is that it is now should be repeated not earlier than after 8hrs).
The transmitted to the server file is rejected because of XMS syntax error, so it is impossible to accept the host into Rudder.
This bug is apparently related to cfengine bug https://cfengine.com/dev/issues/3882: "Editing a file containing a line longer than 4096 chars will cause the text to be truncated". While it is not considered to be very serious in cfengine (i.e. one of the comments says: "Clarifying the title to be a bit less scary - even though this sucks, it doesn't break big files, just files with big lines..." in rudder it seems to be a blocker, as it breaks the deployment process.
It may also be related to https://cfengine.com/dev/issues/3852#change-15347, which implicitly puts the blame on CF_BUFSIZE, limited to 4096.
Also this issue may be related to http://www.rudder-project.org/redmine/issues/3838 - if those long processes are intermittent, one of the inventory collections may fly. Unfortunately in our case those processes run for months, and there is no chance to slip by them.
As a workaround, I have modified the perl file, /opt/rudder/share/fusioninventory/lib/FusionInventory/Agent/Tools/Unix.pm, to make the command line a bit shorter, so addition of xml tags and indentations would not trigger the cfengine bug:
--- /opt/rudder/share/fusioninventory/lib/FusionInventory/Agent/Tools/Unix.pm.orig 2014-01-02 18:46:15.252993839 0000
++ /opt/rudder/share/fusioninventory/lib/FusionInventory/Agent/Tools/Unix.pm 2014-01-02 18:50:23.954332253 +0000@ -288,6 +288,9
@
my $time = $10;
my $cmd = $11;
+ # Alex: a workaround for the cfengine bug truncating lines longer than 4096 in edit_line
+ $cmd = substr $cmd,0,4076 if length($cmd) > 4076;
+
# try to get a consistant time format
my $begin;
if ($started =~ /^(\d{1,2}):(\d{2})/) {
Whether this fix would lead to the information loss is a separate issue (as the command line is truncated anyway by the OS), but at least it allows to accept the host to the server.
Updated by Matthieu CERDA almost 11 years ago
- Category set to 26
- Status changed from New to 8
- Assignee set to Matthieu CERDA
- Priority changed from N/A to 1 (highest)
- Target version set to 2.6.10
Wow, I love the idea, the proposed fix would be much more flexible than having to wait for a CFEngine patch to be tested and validated...
Updated by François ARMAND almost 11 years ago
Alex, I will let Matthieu handle the ticket, but I just would like to thank you for such a detailled, nice, thoughtfull bug report - the kind all open source project love to have. So thank you, it's a marvelous gift for us!
On a side note, would you please sign the CLA so that we can merge you patch as is ? It's a couple of click here: https://secure.echosign.com/public/hostedForm?formid=ECPMWD7B3BU3H
(there is explanation about the CLA raison d'être here: http://www.rudder-project.org/foswiki/Development/HowToContribute).
Thanks again for that ticket.
Updated by Matthieu CERDA almost 11 years ago
- Status changed from 8 to Pending technical review
- Assignee changed from Matthieu CERDA to Jonathan CLARKE
- % Done changed from 0 to 100
- Pull Request set to https://github.com/Normation/rudder-packages/pull/204
PR ready to review !
Updated by Alex Tkachenko almost 11 years ago
I have signed the CLA.
As for the review - I take it there are no actions on my part are expected. However:
- the patch was submitted against one file, but there are two identical files, one under /opt/rudder/share/fusioninventory/lib/FusionInventory and another under /opt/rudder/lib/perl5/FusionInventory - I am not sure why the share is actually working while the other does not, as /opt/rudder/bin/run-inventory includes /opt/rudder/lib/perl5 earlier - anyway, you may want to apply the fix to both;
- does making the target v2.6.10 cover v2.8.x too? How long will it take to make it into actual RPMs? I thought I could do the rebuild myself, but it looks like the source rpms provided at the download location are not enough - is there a build setup description available anywhere?
- in theory, the proper place to fix this would be in addField, but as this is just a temp workaround (until cfe 3.6 which is due in March, I hope) it may be OK. We just need to remember to remove it in the future.
The patch was garbled a bit during the submission, and I am glad you have noticed that.
Thank you so much for the agile response!
Updated by François ARMAND almost 11 years ago
Alex Tkachenko wrote:
I have signed the CLA.
We saw that, thanks!
Just some more info on that part:
- does making the target v2.6.10 cover v2.8.x too? How long will it take to make it into actual RPMs?
Yes, making the target 2.6 will cover all (supported) version to the most recent. We have as policy to correct bug in the oldest supported version they appeared into, and then make the correction pop-up to newer versions.
As soon as the pull request is validated and merged, the code will go into our continuous integration infrastructure, and then, it should be present in the next nightly packages, and then in the next minor release (bug fix) of each supported version.
Now, Jon, it's up to you with the technical review.
Updated by Jonathan CLARKE almost 11 years ago
I agree with this fix as a temporary workaround - it is short and simple, can't really go wrong here. I will merge it in 2.6 branch, and it will be merged up into 2.7, 2.8 and 2.9, and be available in tomorrow's nightly builds of rudder-agent. Then, it will be in the next minor release on each of these branches.
However, I think that we can implement a cleaner workaround for the next major version. Currently, the only reason we have CFEngine editing the XML inventory file is to add some lines. How about we have CFEngine spit these lines out into a separate file, then just concatenate them together (we may need to be clever to get them in the right place in the XML, but tail/head/cat should manage just fine). This way, we no longer have any edit_line on this file in CFEngine, and the 4096 char limit is no longer a problem!
What do you guys think?
Updated by Matthieu CERDA almost 11 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset packages:commit:92c011866e562c97ba8682fd367280581035da0b.
Updated by Jonathan CLARKE almost 11 years ago
Applied in changeset packages:commit:a2b7f7b5d332bd257c11dfe666f2930ee70a1965.
Updated by Matthieu CERDA almost 11 years ago
Looks like a good idea, we have to make sure that this doesn't break anything since we add quite a bunch of info, os-specifically.
Updated by François ARMAND almost 11 years ago
Jonathan CLARKE wrote:
I agree with this fix as a temporary workaround - it is short and simple, can't really go wrong here. I will merge it in 2.6 branch, and it will be merged up into 2.7, 2.8 and 2.9, and be available in tomorrow's nightly builds of rudder-agent. Then, it will be in the next minor release on each of these branches.
However, I think that we can implement a cleaner workaround for the next major version. Currently, the only reason we have CFEngine editing the XML inventory file is to add some lines. How about we have CFEngine spit these lines out into a separate file, then just concatenate them together (we may need to be clever to get them in the right place in the XML, but tail/head/cat should manage just fine). This way, we no longer have any edit_line on this file in CFEngine, and the 4096 char limit is no longer a problem!
What do you guys think?
That seems far clearer, and even more futur-proof: it will be easier to integrate a Fusion-Rudder plugin if we already manage Rudder specific things without CFEngine (or at least, with the minimum implication of CFEngine in the middle).
Updated by Jonathan CLARKE almost 11 years ago
François ARMAND wrote:
That seems far clearer, and even more futur-proof: it will be easier to integrate a Fusion-Rudder plugin if we already manage Rudder specific things without CFEngine (or at least, with the minimum implication of CFEngine in the middle).
Actually that's a good point - we already contributed a lot of work upstream to FusionInventory for this (see http://forge.fusioninventory.org/issues/1689#change-6905 and https://www.rudder-project.org/foswiki/Development/FusionInventory). It should be in the latest releases. So, just upgrading our FusionInventory, and fixing rudder-inventory-endpoint to handle the new fields would do this.
Updated by Jonathan CLARKE almost 11 years ago
We can follow up on this in #2479.
Updated by Vincent MEMBRÉ almost 11 years ago
- Subject changed from A bug in cfengine prevents agent hosts from being added to the server to Inventories containing very long (> 4096) process name cannot be send to rudder server via CFEngine
Updated by Vincent MEMBRÉ almost 11 years ago
- Status changed from Pending release to Released
This bug has been fixed in Rudder 2.6.10 which was released today.
Check out:
- The release announcement: http://www.rudder-project.org/pipermail/rudder-announce/2014-January/000070.html
- The full ChangeLog: http://www.rudder-project.org/foswiki/bin/view/System/Documentation:ChangeLog26
- Download information: https://www.rudder-project.org/site/get-rudder/downloads/
Updated by Benoît PECCATTE almost 10 years ago
- Category changed from 26 to Web - Nodes & inventories