Bug #5650

promises can become invalid if copies fail rendering the agent unusable

Added by Nicolas CHARLES over 5 years ago. Updated over 3 years ago.

Status: Released
Priority: 1
Category: System integration

Description

When there is no space left on the device, the inputs may get purged. They may also get purged if somebody manually deletes the files in the inputs folder on the node.
In this case, the node is left in a broken state, and nothing can revive it short of manual intervention.

We should check the syntax of failsafe.cf and, if it is invalid, use the initial promises to fetch repaired promises from the server.
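
A minimal sketch of that check, for illustration only and not the change that was eventually merged. It assumes that `cf-promises -f <file>` exits non-zero on a syntax error; the function names and the two directory parameters are hypothetical, and the real paths depend on the agent installation.

```python
import shutil
import subprocess
from pathlib import Path


def syntax_ok(policy_file, checker="cf-promises"):
    """Return True if the checker exits 0 on policy_file.

    Assumption: the checker accepts '-f <file>' and signals a syntax
    error through a non-zero exit status.
    """
    result = subprocess.run(
        [checker, "-f", str(policy_file)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restore_if_invalid(inputs_dir, initial_dir, is_valid=syntax_ok):
    """Replace inputs_dir with a copy of initial_dir when failsafe.cf
    is missing or fails validation. Returns True if a restore happened.
    """
    inputs_dir = Path(inputs_dir)
    failsafe = inputs_dir / "failsafe.cf"
    if failsafe.is_file() and is_valid(failsafe):
        return False  # failsafe.cf is usable, nothing to do

    # Drop whatever is left of the broken inputs, then copy the
    # initial promises in their place (hypothetical recovery step).
    if inputs_dir.exists():
        shutil.rmtree(inputs_dir)
    shutil.copytree(initial_dir, inputs_dir)
    return True
```

After such a restore, the next agent run would start from the initial promises and could fetch the current promises from the server again. Note that the copy itself can still fail if the device remains full.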


Related issues

Related to Rudder - User story #5641: Make the agent policies update a state machine with integrity check (New)

Associated revisions

Revision c83c708c
Added by Jonathan CLARKE almost 4 years ago

Fixes #5650: promises can become invalid if copies fail rendering the agent unusable

Revision 93eb034b
Added by Jonathan CLARKE almost 4 years ago

Merge pull request #943 from jooooooon/bug_5650/promises_can_become_invalid_if_copies_fail_rendering_the_agent_unusable

Fixes #5650: promises can become invalid if copies fail rendering the agent unusable

History

#1

Updated by François ARMAND over 5 years ago

This looks like it could be one of the checks in step 3 of #5641.

#2

Updated by Jonathan CLARKE about 5 years ago

  • Target version set to 3.1.0~beta1

This will be addressed by #5641.

#3

Updated by Florian Heigl almost 5 years ago

You could additionally consider adding a copy of the most basic initial promises (failsafe.cf?) as part of /opt/rudder/bin/rudder-check.

#4

Updated by Vincent MEMBRÉ almost 5 years ago

  • Target version changed from 3.1.0~beta1 to 3.1.0~rc1

#5

Updated by Vincent MEMBRÉ over 4 years ago

  • Target version changed from 3.1.0~rc1 to 3.1.0

#6

Updated by Vincent MEMBRÉ over 4 years ago

  • Target version changed from 3.1.0 to 3.1.1

#7

Updated by Vincent MEMBRÉ over 4 years ago

  • Target version changed from 3.1.1 to 3.1.2

#8

Updated by Vincent MEMBRÉ over 4 years ago

  • Target version changed from 3.1.2 to 3.1.3

#9

Updated by Vincent MEMBRÉ over 4 years ago

  • Target version changed from 3.1.3 to 3.1.4

#10

Updated by Vincent MEMBRÉ over 4 years ago

  • Target version changed from 3.1.4 to 3.1.5

#11

Updated by Vincent MEMBRÉ about 4 years ago

  • Target version changed from 3.1.5 to 3.1.6

#12

Updated by Vincent MEMBRÉ about 4 years ago

  • Target version changed from 3.1.6 to 3.1.7

#13

Updated by Vincent MEMBRÉ almost 4 years ago

  • Target version changed from 3.1.7 to 3.1.8

#14

Updated by Vincent MEMBRÉ almost 4 years ago

  • Target version changed from 3.1.8 to 3.1.9

#15

Updated by Vincent MEMBRÉ almost 4 years ago

  • Target version changed from 3.1.9 to 3.1.10

#16

Updated by Jonathan CLARKE almost 4 years ago

  • Tags set to Sponsored, Next minor release, Quick and important
  • Subject changed from "When there is no space left on device, promises files can get deleted, rendering the agent unusable" to "promises can become invalid if copies fail rendering the agent unusable"
  • Assignee set to Jonathan CLARKE
  • Priority changed from N/A to 1
  • Target version changed from 3.1.10 to 2.11.21

This can happen in two distinct cases:

  1. The main promises (promises.cf + includes) get broken. An aborted copy can cause this: a new promises.cf gets copied, including a reference to fileX.cf, but the copy stops before fileX.cf is copied, breaking that set of promises.
  2. The backup failsafe.cf can be broken in much rarer cases, like no space left on device, a really bad error in failsafe.cf or update.cf, or possibly neutrino rain (see #5641)

A good, thorough workaround is to implement #5641. A quick and easy workaround is to check for these conditions in check-rudder-agent and fix them.
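
To illustrate case 1, here is a rough sketch of how a missing include could be detected. This is not the check that was added to check-rudder-agent; it is a crude heuristic that assumes included files appear in promises.cf as quoted names ending in `.cf`, relative to the inputs directory. The function name and the regular expression are hypothetical.

```python
import re
from pathlib import Path

# Quoted strings ending in .cf; names containing '$' are skipped because
# they rely on variable expansion that this sketch cannot resolve.
CF_REF = re.compile(r'"([^"$]+\.cf)"')


def missing_includes(inputs_dir, entry="promises.cf"):
    """List the .cf files referenced by the entry file that do not exist.

    If the entry file itself is missing, it is reported as the only
    missing file.
    """
    inputs_dir = Path(inputs_dir)
    entry_path = inputs_dir / entry
    if not entry_path.is_file():
        return [entry]

    text = entry_path.read_text(errors="replace")
    referenced = sorted(set(CF_REF.findall(text)))
    return [name for name in referenced if not (inputs_dir / name).is_file()]
```

A non-empty result would indicate an interrupted copy like the one described in case 1, at which point the promises could be fetched again or restored from the initial promises.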

#17

Updated by Jonathan CLARKE almost 4 years ago

  • Status changed from New to In progress

#18

Updated by Jonathan CLARKE almost 4 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Jonathan CLARKE to Benoît PECCATTE
  • Pull Request set to https://github.com/Normation/rudder-packages/pull/943

#19

Updated by Jonathan CLARKE almost 4 years ago

  • Status changed from Pending technical review to Pending release
  • % Done changed from 0 to 100

#20

Updated by Vincent MEMBRÉ over 3 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 2.11.21, 3.0.16, 3.1.10 and 3.2.3 which were released on 2016-06-01, but not announced.
