Bug #5925
closed
On SLES, upgrading rudder-server-root while a node is copying its promises corrupts the copy, and files are lost
Description
When upgrading rudder-server-root on SLES, something seems to break on the client side: some nodes ended up with corrupted promises (namely, missing files).
At that time, the output files in the outputs folder on the node contained:
2014-11-05T10:29:13+0000 error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: Timeout - remote end did not respond with the expected amount of data (received=0, expecting=8). (recv: Resource temporarily unavailable)
2014-11-05T10:29:13+0000 error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: Protocol transaction broken off (1). (ReceiveTransaction: Resource temporarily unavailable)
2014-11-05T10:29:13+0000 error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: Authentication dialogue with '192.168.249.141' failed
2014-11-05T10:29:13+0000 error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: No suitable server responded to hail
2014-11-05T10:31:12+0000 error: /default/update/methods/'update'[0]: Method 'update_action' failed in some repairs
R: @@Common@@log_info@@hasPolicyServer-root@@common-root@@379@@common@@StartRun@@2014-11-05 10:31:15+00:00##8b4a4e31-3241-42bb-be63-8d917d3ee9c7@#Start execution
and
2014-11-05T10:33:49+0000 error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: Timeout - remote end did not respond with the expected amount of data (received=0, expecting=8). (recv: Resource temporarily unavailable)
2014-11-05T10:35:06+0000 error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/cfengine-community/inputs'[0]: Couldn't receive. (recv: Connection reset by peer)
2014-11-05T10:35:06+0000 error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/cfengine-community/inputs'[0]: Failed receive. (ReceiveTransaction: Connection reset by peer)
2014-11-05T10:35:06+0000 error: /default/update/methods/'update'[0]: Method 'update_action' failed in some repairs
2014-11-05T10:35:06+0000 error: Can't stat file '/var/rudder/cfengine-community/inputs/common/1.0/cf-served.cf' for parsing. (stat: No such file or directory)
2014-11-05T10:35:06+0000 error: Policy failed validation with command '"/var/rudder/cfengine-community/bin/cf-promises" -c "/var/rudder/cfengine-community/inputs/promises.cf"'
2014-11-05T10:35:06+0000 error: CFEngine was not able to get confirmation of promises from cf-promises, so going to failsafe
2014-11-05T10:35:06+0000 error: Can't stat file '/var/rudder/cfengine-community/inputs/common/1.0/update.cf' for parsing. (stat: No such file or directory).
This happened on SLES, while upgrading from Rudder 2.11.1 to 2.11.4.
It may happen on other systems as well.
Note: I did not run /etc/init.d/rudder-server-root stop before upgrading.
Updated by Vincent MEMBRÉ almost 10 years ago
- Target version changed from 2.11.5 to 2.11.6
Updated by François ARMAND almost 10 years ago
The only things I can see that could be done are:
- forcing the server to stop before upgrading, so that connections are shut down cleanly
- testing on the client side whether the copied promises are OK => see #5641
Are there any other ideas on that?
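To illustrate the second option (a client-side check of the copied promises), here is a minimal Python sketch. It is only an assumption about what such a check could look like, not how Rudder or #5641 implements it. The names `REQUIRED_FILES`, `missing_files` and `copy_is_usable` are hypothetical; the file paths are the ones that appear in the agent output above.

```python
import os
import subprocess

# Hypothetical list of files that must be present after a copy.
# promises.cf is the entry point passed to cf-promises in the agent output;
# the two other paths are the ones reported as missing in this ticket.
REQUIRED_FILES = [
    "promises.cf",
    "common/1.0/cf-served.cf",
    "common/1.0/update.cf",
]


def missing_files(inputs_dir, required=REQUIRED_FILES):
    """Return the required files that are absent or empty under inputs_dir."""
    missing = []
    for rel in required:
        path = os.path.join(inputs_dir, rel)
        # An empty file is treated like a missing one (assumption: a file
        # truncated by an interrupted copy is as unusable as no file at all).
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(rel)
    return missing


def copy_is_usable(inputs_dir, validator=None):
    """Return True if no required file is missing and the validator exits 0.

    validator is an optional argv list for an external syntax check, for
    example the cf-promises command quoted in the agent output above.
    """
    if missing_files(inputs_dir):
        return False
    if validator is not None:
        return subprocess.run(validator).returncode == 0
    return True
```

A node could run such a check on a freshly copied directory and keep its previous promises when `copy_is_usable` returns False, instead of falling back to failsafe on a half-copied tree.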
Updated by Vincent MEMBRÉ almost 10 years ago
- Target version changed from 2.11.6 to 2.11.7
Updated by Vincent MEMBRÉ almost 10 years ago
- Target version changed from 2.11.7 to 2.11.8
Updated by Vincent MEMBRÉ almost 10 years ago
- Target version changed from 2.11.8 to 2.11.9
Updated by Benoît PECCATTE almost 10 years ago
- Category changed from 14 to Web - Config management
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 2.11.9 to 2.11.10
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 2.11.10 to 2.11.11
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 2.11.11 to 2.11.12
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 2.11.12 to 2.11.13
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 2.11.13 to 2.11.14
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 2.11.14 to 2.11.15
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 2.11.15 to 2.11.16
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 2.11.16 to 2.11.17
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 2.11.17 to 2.11.18
Updated by Vincent MEMBRÉ almost 9 years ago
- Target version changed from 2.11.18 to 2.11.19
Updated by Vincent MEMBRÉ almost 9 years ago
- Target version changed from 2.11.19 to 2.11.20
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 2.11.20 to 2.11.21
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 2.11.21 to 2.11.22
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 2.11.22 to 2.11.23
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 2.11.23 to 2.11.24
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 2.11.24 to 308
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 308 to 3.1.14
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 3.1.14 to 3.1.15
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 3.1.15 to 3.1.16
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 3.1.16 to 3.1.17
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 3.1.17 to 3.1.18
Updated by Vincent MEMBRÉ almost 8 years ago
- Target version changed from 3.1.18 to 3.1.19
Updated by François ARMAND over 7 years ago
- Severity set to Major - prevents use of part of Rudder | no simple workaround
- User visibility set to Operational - other Techniques | Technique editor | Rudder settings
- Priority set to 30
Updated by Vincent MEMBRÉ over 7 years ago
- Target version changed from 3.1.19 to 3.1.20
Updated by Vincent MEMBRÉ over 7 years ago
- Target version changed from 3.1.20 to 3.1.21
Updated by Vincent MEMBRÉ over 7 years ago
- Target version changed from 3.1.21 to 3.1.22
Updated by Vincent MEMBRÉ over 7 years ago
- Target version changed from 3.1.22 to 3.1.23
Updated by Vincent MEMBRÉ over 7 years ago
- Target version changed from 3.1.23 to 3.1.24
Updated by Vincent MEMBRÉ about 7 years ago
- Target version changed from 3.1.24 to 3.1.25
Updated by Vincent MEMBRÉ about 7 years ago
- Target version changed from 3.1.25 to 387
Updated by Vincent MEMBRÉ almost 7 years ago
- Target version changed from 387 to 4.1.10
Updated by Vincent MEMBRÉ almost 7 years ago
- Target version changed from 4.1.10 to 4.1.11
- Priority changed from 43 to 44
Updated by Vincent MEMBRÉ over 6 years ago
- Target version changed from 4.1.11 to 4.1.12
- Priority changed from 44 to 45
Updated by Vincent MEMBRÉ over 6 years ago
- Target version changed from 4.1.12 to 4.1.13
Updated by Vincent MEMBRÉ over 6 years ago
- Target version changed from 4.1.13 to 4.1.14
- Priority changed from 45 to 46
Updated by Benoît PECCATTE over 6 years ago
- Target version changed from 4.1.14 to 4.1.15
Updated by Nicolas CHARLES about 6 years ago
- Effort required set to Very Small
- Priority changed from 46 to 73
I'm quite sure the node can recover, given that rudder agent health monitors the state of policies.
Setting this to Very Small, to try to reproduce the issue and check that it does indeed recover.
Updated by Vincent MEMBRÉ about 6 years ago
- Target version changed from 4.1.15 to 4.1.16
Updated by Vincent MEMBRÉ about 6 years ago
- Target version changed from 4.1.16 to 4.1.17
- Priority changed from 73 to 74
Updated by Nicolas CHARLES about 6 years ago
- Status changed from New to Rejected
- Priority changed from 74 to 0
This has been fixed via #12265: if policies are corrupted, rudder agent check will fix them.
Updated by Nicolas CHARLES about 6 years ago
- Related to Bug #12265: rudder agent check should trigger failsafe run when promises are broken added