Bug #3052
closed
Having an error with a Directive based on Download from a shared folder using Generic Variable Definition, will lead to all the Directives using Generic Variable to be in error
Added by Nicolas PERRON about 12 years ago. Updated almost 12 years ago.
Description
- A Directive based on Download From A Shared Folder with a source using a Generic Variable
- Another Directive based on Download From A Shared Folder with a source using a Generic Variable but with a typo like "$(generic_variable_definiton.myvar)"
Both Directives will be in error, and the agent execution on the client will report a connection failure.
Updated by François ARMAND about 12 years ago
- Subject changed from Having an error with a Directive based on Download from a shared folder using Generic Variable Definition, will lead to all the Directives using Generic Variable to Having an error with a Directive based on Download from a shared folder using Generic Variable Definition, will lead to all the Directives using Generic Variable to be in error
Updated by François ARMAND about 12 years ago
- Category set to Web - Compliance & node report
- Status changed from New to 2
- Assignee set to Nicolas CHARLES
- Target version set to 2.4.0~rc2
Updated by Nicolas CHARLES about 12 years ago
- Status changed from 2 to In progress
Updated by Nicolas CHARLES about 12 years ago
The problem is technique/CFEngine related, not reporting:
I reproduced it in a test environment. What happens is that the server denies further connections because the request containing the unexpanded $() is invalid:
rudder> Allowing 192.168.110.21 to connect without (re)checking ID
rudder> Non-verified Host ID is 192.168.110.21 (Using skipverify)
rudder> Non-verified User ID seems to be root (Using skipverify)
rudder> -> Public key identity of host "192.168.110.21" is "MD5=f0318b7cb678e7f03a586ca784110555"
rudder> -> Last saw -MD5=f0318b7cb678e7f03a586ca784110555 (alias 192.168.110.21) at Thu Nov 29 12:37:06 2012
rudder> A public key was already known from 192.168.110.21/192.168.110.21 - no trust required
rudder> Adding IP 192.168.110.21 to SkipVerify - no need to check this if we have a key
rudder> The public key identity was confirmed as root@192.168.110.21
rudder> -> Strong authentication of client 192.168.110.21/192.168.110.21 achieved
rudder> -> Receiving session key from client (size=256)...
rudder> Filename /var/rudder/configuration-repository/shared-files/$(generic_variable_definiton.def2) is resolved to /var/rudder/configuration-repository/shared-files/$(generic_variable_definiton.def2)
rudder> Couldn't stat filename /var/rudder/configuration-repository/shared-files/$(generic_variable_definiton.def2) requested by host 192.168.110.21
rudder> !!! System error for lstat: "No such file or directory"
rudder> Access control in sync
rudder> From (host=192.168.110.21,user=root,ip=192.168.110.21)
rudder> REFUSAL of request from connecting host: (SYNCH 1354189026 STAT /var/rudder/configuration-repository/shared-files/$(generic_variable_definiton.def2))
rudder> -> Accepting a connection
rudder> Denying repeated connection from "192.168.110.21"
On the client side:
rudder> Comment: Enforce content of /tmp/two based on the content on the Rudder server with mtime method
rudder> .........................................................
rudder>
rudder> -> Copy file /tmp/two from /var/rudder/configuration-repository/shared-files/$(generic_variable_definiton.def2) check
rudder> No existing connection to 192.168.110.20 is established...
rudder> Set cfengine port number to 5309 = 5309
rudder> Set connection timeout to 10
rudder> -> Connect to 192.168.110.20 = 192.168.110.20 on port 5309
rudder> -> Matched IP 192.168.110.20 to key MD5=e82b35316903e3400a840a83fae1d295
rudder> .....................[.h.a.i.l.].................................
rudder> Strong authentication of server=192.168.110.20 connection confirmed
rudder> -> Public key identity of host "192.168.110.20" is "MD5=e82b35316903e3400a840a83fae1d295"
rudder> -> Last saw +MD5=e82b35316903e3400a840a83fae1d295 (alias 192.168.110.20) at Thu Nov 29 12:37:06 2012
rudder> Server returned error: Unspecified server refusal (see verbose server output)
rudder> Can't stat /var/rudder/configuration-repository/shared-files/$(generic_variable_definiton.def2) in files.copyfrom promise
rudder> ?> defining promise result class copy_file_1_failed
(snip)
rudder> -> Handling file existence constraints on /tmp/one
rudder> -> File permissions on /tmp/one as promised
rudder> ?> defining promise result class copy_file_2_kept
rudder> -> Handling file existence constraints on /tmp/one
rudder> -> File permissions on /tmp/one as promised
rudder> ?> defining promise result class copy_file_2_kept
rudder> -> Copy file /tmp/one from /var/rudder/configuration-repository/shared-files/def1 check
rudder> Existing connection to 192.168.110.20 seems to be active...
rudder> Set cfengine port number to 5309 = 5309
rudder> Set connection timeout to 10
rudder> -> Connect to 192.168.110.20 = 192.168.110.20 on port 5309
rudder> -> Matched IP 192.168.110.20 to key MD5=e82b35316903e3400a840a83fae1d295
rudder> Couldn't send
rudder> !!! System error for send: "Broken pipe"
rudder> Couldn't send
rudder> !!! System error for send: "Broken pipe"
rudder> Couldn't send
rudder> !!! System error for send: "Broken pipe"
rudder> Challenge response from server 192.168.110.20/192.168.110.20 was incorrect!
rudder> I: Report relates to a promise with handle ""
rudder> I: Made in version 'not specified' of '/var/rudder/cfengine-community/inputs/copyGitFile/1.3/copyFileFromSharedFolder.cf' near line 90
rudder> I: Comment: Enforce content of file /tmp/one based on the content on the Rudder server with mtime method
rudder> !! Authentication dialogue with 192.168.110.20 failed
rudder> Unable to establish connection with 192.168.110.20
rudder> ?> defining promise result class copy_file_2_failed
Updated by Nicolas CHARLES about 12 years ago
- Status changed from In progress to Discussion
Having only one connection available per node is clearly limiting for the Download From A Shared Folder technique.
Adding an "allowallconnects" attribute in the server promises ( http://cfengine.com/manuals/cf3-Reference#allowallconnects-in-server ) solved the issue.
It allows each node to have several connections with the server. The obvious benefit concerns long copies: today, while one is running, other agent executions cannot connect to the server to fetch new promises, and with this attribute they can. Apparently, if a copy fails, the connection is also released late.
The risk is that if there are a lot of agents running on a specific node, they can hammer the policy server (but I'm not sure it would really hammer it, as they would still start every 5 minutes).
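For illustration only, here is a minimal sketch of where such an attribute sits in a CFEngine 3 server control body. It is not the actual patch: the host value is an assumption taken from the test client in the logs above, and the real DistributePolicy technique most likely uses its own list of allowed hosts.

```cf3
body server control
{
      # Hosts allowed to connect to cf-serverd at all.
      # "192.168.110.21" is the test client from the logs above (illustrative value).
      allowconnects    => { "192.168.110.21" };

      # Hosts allowed to hold more than one connection to the server at a time.
      # Without this, a second connection from the same host is denied as a
      # "repeated connection".
      allowallconnects => { "192.168.110.21" };
}
```

With both lists naming the same hosts, every node that may connect may also open several connections in parallel.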
Should we implement this fix in 2.3 and/or 2.4?
Updated by Nicolas CHARLES about 12 years ago
- Assignee changed from Nicolas CHARLES to Jonathan CLARKE
Jon, can we implement this change in 2.3 and 2.4? It's a one-line modification in the PT/Technique DistributePolicy.
Updated by Jonathan CLARKE about 12 years ago
- Target version changed from 2.4.0~rc2 to 2.3.10
Yes, this seems like a good fix to me. I note that we have already set the max number of connections quite high (1000), so this shouldn't be a problem.
Of course, it must be fixed in 2.3 and 2.4, since this bug affects both versions.
Updated by Jonathan CLARKE about 12 years ago
- Assignee changed from Jonathan CLARKE to Nicolas CHARLES
Updated by Nicolas CHARLES about 12 years ago
- Status changed from Discussion to Pending technical review
The pull request is here
https://github.com/Normation/rudder-techniques/pull/6
Updated by Jonathan CLARKE about 12 years ago
- Status changed from Pending technical review to Released
Nicolas CHARLES wrote:
The pull request is here
https://github.com/Normation/rudder-techniques/pull/6
Looks good to me, merged.
Updated by Jonathan CLARKE about 12 years ago
- Status changed from Released to Pending release
Updated by Nicolas PERRON almost 12 years ago
- Status changed from Pending release to Released