Bug #5925

closed

On SLES, upgrading rudder-server-root while a node is copying its promises corrupts the copy, and files are lost

Added by Nicolas CHARLES almost 10 years ago. Updated about 6 years ago.

Status: Rejected
Priority: 1 (highest)
Category: Web - Config management
Target version:
Severity: Major - prevents use of part of Rudder | no simple workaround
UX impact:
User visibility: Operational - other Techniques | Technique editor | Rudder settings
Effort required: Very Small
Priority: 0
Name check:
Fix check:
Regression:

Description

Upgrading rudder-server-root on SLES seems to have broken something on the client side: some nodes ended up with corrupted promises (namely, missing files).

At that time, the output file in the outputs folder on the node contained:

2014-11-05T10:29:13+0000    error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: Timeout - remote end did not respond with the expected amount of data (received=0, expecting=8). (recv: Resource temporarily unavailable)
2014-11-05T10:29:13+0000    error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: Protocol transaction broken off (1). (ReceiveTransaction: Resource temporarily unavailable)
2014-11-05T10:29:13+0000    error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: Authentication dialogue with '192.168.249.141' failed
2014-11-05T10:29:13+0000    error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: No suitable server responded to hail
2014-11-05T10:31:12+0000    error: /default/update/methods/'update'[0]: Method 'update_action' failed in some repairs
R: @@Common@@log_info@@hasPolicyServer-root@@common-root@@379@@common@@StartRun@@2014-11-05 10:31:15+00:00##8b4a4e31-3241-42bb-be63-8d917d3ee9c7@#Start execution

and
2014-11-05T10:33:49+0000    error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/ncf/common'[0]: Timeout - remote end did not respond with the expected amount of data (received=0, expecting=8). (recv: Resource temporarily unavailable)
2014-11-05T10:35:06+0000    error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/cfengine-community/inputs'[0]: Couldn't receive. (recv: Connection reset by peer)
2014-11-05T10:35:06+0000    error: /default/update/methods/'update'/default/update_action/files/'/var/rudder/cfengine-community/inputs'[0]: Failed receive. (ReceiveTransaction: Connection reset by peer)
2014-11-05T10:35:06+0000    error: /default/update/methods/'update'[0]: Method 'update_action' failed in some repairs
2014-11-05T10:35:06+0000    error: Can't stat file '/var/rudder/cfengine-community/inputs/common/1.0/cf-served.cf' for parsing. (stat: No such file or directory)
2014-11-05T10:35:06+0000    error: Policy failed validation with command '"/var/rudder/cfengine-community/bin/cf-promises" -c "/var/rudder/cfengine-community/inputs/promises.cf"'
2014-11-05T10:35:06+0000    error: CFEngine was not able to get confirmation of promises from cf-promises, so going to failsafe
2014-11-05T10:35:06+0000    error: Can't stat file '/var/rudder/cfengine-community/inputs/common/1.0/update.cf' for parsing. (stat: No such file or directory).

This happened on SLES while upgrading from Rudder 2.11.1 to 2.11.4.
It may also happen on other systems.

Note: I did not run /etc/init.d/rudder-server-root stop before upgrading
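
To see at a glance which policy files went missing, the "Can't stat file" errors in the agent output above can be collected with a small script. This is only a sketch: the function name `missing_policy_files` is hypothetical, and the pattern is based solely on the two error lines quoted in this report, so other message variants would not be matched.

```python
import re

# Pattern taken from the "Can't stat file '<path>' for parsing" errors quoted
# in this report; other wordings of the error are not covered (assumption).
MISSING_FILE = re.compile(r"error: Can't stat file '([^']+)' for parsing")


def missing_policy_files(log_text):
    """Return the paths reported as missing, in order, without duplicates."""
    seen = []
    for match in MISSING_FILE.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen
```

Feeding it the second output excerpt would list `cf-served.cf` and `update.cf` under `/var/rudder/cfengine-community/inputs/common/1.0/`.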


Related issues 1 (0 open, 1 closed)

Related to Rudder - Bug #12265: rudder agent check should trigger failsafe run when promises are broken (Released, Alexis Mousset)
Actions #1

Updated by Vincent MEMBRÉ almost 10 years ago

  • Target version changed from 2.11.5 to 2.11.6
Actions #2

Updated by François ARMAND almost 10 years ago

The only things I can see that can be done are:

- forcing the server to stop before upgrading, so that connections are shut down cleanly
- testing client-side whether the copied promises are OK => see #5641

Are there any other ideas on that?
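
The second option (checking the copied promises client-side before using them) could look roughly like the sketch below. It is not how Rudder or #5641 implements it: the function name, the `required_files` list and the `.previous` backup suffix are all hypothetical, and a plain file-presence check stands in for whatever validation would actually be chosen. The idea is simply to copy into a staging directory first and only swap it into place once the copy looks complete.

```python
import os
import shutil


def install_policies_if_complete(staging_dir, live_dir, required_files):
    """Swap staging_dir into live_dir only if every required file is present.

    Returns True when the swap happened, False when the staged copy was
    incomplete and live_dir was left untouched.
    All names here are illustrative, not part of Rudder.
    """
    missing = [name for name in required_files
               if not os.path.isfile(os.path.join(staging_dir, name))]
    if missing:
        # Incomplete copy: keep the policies that are currently in use.
        return False

    backup = live_dir + ".previous"  # hypothetical backup location
    if os.path.exists(backup):
        shutil.rmtree(backup)
    if os.path.exists(live_dir):
        os.rename(live_dir, backup)
    # Rename is a cheap operation when both directories share a filesystem.
    os.rename(staging_dir, live_dir)
    return True
```

With this approach an interrupted copy, like the one in the description, would leave the previous promises in place instead of a partially updated inputs directory.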

Actions #3

Updated by Vincent MEMBRÉ almost 10 years ago

  • Target version changed from 2.11.6 to 2.11.7
Actions #4

Updated by Vincent MEMBRÉ almost 10 years ago

  • Target version changed from 2.11.7 to 2.11.8
Actions #5

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.11.8 to 2.11.9
Actions #6

Updated by Benoît PECCATTE over 9 years ago

  • Category changed from 14 to Web - Config management
Actions #7

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.11.9 to 2.11.10
Actions #8

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.11.10 to 2.11.11
Actions #9

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.11.11 to 2.11.12
Actions #10

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.11.12 to 2.11.13
Actions #11

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.11.13 to 2.11.14
Actions #12

Updated by Vincent MEMBRÉ about 9 years ago

  • Target version changed from 2.11.14 to 2.11.15
Actions #13

Updated by Vincent MEMBRÉ about 9 years ago

  • Target version changed from 2.11.15 to 2.11.16
Actions #14

Updated by Vincent MEMBRÉ about 9 years ago

  • Target version changed from 2.11.16 to 2.11.17
Actions #15

Updated by Vincent MEMBRÉ almost 9 years ago

  • Target version changed from 2.11.17 to 2.11.18
Actions #16

Updated by Vincent MEMBRÉ almost 9 years ago

  • Target version changed from 2.11.18 to 2.11.19
Actions #17

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 2.11.19 to 2.11.20
Actions #18

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 2.11.20 to 2.11.21
Actions #19

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 2.11.21 to 2.11.22
Actions #20

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 2.11.22 to 2.11.23
Actions #21

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 2.11.23 to 2.11.24
Actions #22

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 2.11.24 to 308
Actions #23

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 308 to 3.1.14
Actions #24

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.1.14 to 3.1.15
Actions #25

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.1.15 to 3.1.16
Actions #26

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.1.16 to 3.1.17
Actions #27

Updated by Vincent MEMBRÉ almost 8 years ago

  • Target version changed from 3.1.17 to 3.1.18
Actions #28

Updated by Vincent MEMBRÉ almost 8 years ago

  • Target version changed from 3.1.18 to 3.1.19
Actions #29

Updated by François ARMAND over 7 years ago

  • Severity set to Major - prevents use of part of Rudder | no simple workaround
  • User visibility set to Operational - other Techniques | Technique editor | Rudder settings
  • Priority set to 30
Actions #30

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.1.19 to 3.1.20
Actions #31

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.1.20 to 3.1.21
Actions #32

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.1.21 to 3.1.22
Actions #33

Updated by Benoît PECCATTE over 7 years ago

  • Priority changed from 30 to 43
Actions #34

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.1.22 to 3.1.23
Actions #35

Updated by Vincent MEMBRÉ about 7 years ago

  • Target version changed from 3.1.23 to 3.1.24
Actions #36

Updated by Vincent MEMBRÉ about 7 years ago

  • Target version changed from 3.1.24 to 3.1.25
Actions #37

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 3.1.25 to 387
Actions #38

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 387 to 4.1.10
Actions #39

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 4.1.10 to 4.1.11
  • Priority changed from 43 to 44
Actions #40

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.11 to 4.1.12
  • Priority changed from 44 to 45
Actions #41

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.12 to 4.1.13
Actions #42

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.13 to 4.1.14
  • Priority changed from 45 to 46
Actions #43

Updated by Benoît PECCATTE over 6 years ago

  • Target version changed from 4.1.14 to 4.1.15
Actions #44

Updated by Nicolas CHARLES about 6 years ago

  • Effort required set to Very Small
  • Priority changed from 46 to 73

I'm quite sure it can recover, given that rudder agent health monitors the state of policies.
Setting to Very Small to try to reproduce and check that it does indeed recover.

Actions #45

Updated by Vincent MEMBRÉ about 6 years ago

  • Target version changed from 4.1.15 to 4.1.16
Actions #46

Updated by Vincent MEMBRÉ about 6 years ago

  • Target version changed from 4.1.16 to 4.1.17
  • Priority changed from 73 to 74
Actions #47

Updated by François ARMAND about 6 years ago

  • Assignee set to Nicolas CHARLES
Actions #48

Updated by Nicolas CHARLES about 6 years ago

  • Status changed from New to Rejected
  • Priority changed from 74 to 0

This has been fixed via #12265: if policies are corrupted, rudder agent check will fix them.

Actions #49

Updated by Nicolas CHARLES about 6 years ago

  • Related to Bug #12265: rudder agent check should trigger failsafe run when promises are broken added