Bug #5650
closed
promises can become invalid if copies fail rendering the agent unusable
Added by Nicolas CHARLES about 10 years ago.
Updated over 2 years ago.
Category:
System integration
Description
When there is no space left on the device, the inputs may get purged. They may also get purged if somebody manually deletes the files in the inputs folder on the node.
In this case, the node is left in a broken state, and nothing can revive it short of manual intervention.
We should check the syntax of failsafe.cf and, if it is invalid, use the initial promises to fetch repaired promises from the server.
Related issues
1 (1 open — 0 closed)
This looks like it could be one of the checks in step 3 of #5641.
- Target version set to 3.1.0~beta1
This will be addressed by #5641.
You could additionally consider adding a copy of the most basic initial promises (failsafe.cf?) as part of /opt/rudder/bin/rudder-check.
- Target version changed from 3.1.0~beta1 to 3.1.0~rc1
- Target version changed from 3.1.0~rc1 to 3.1.0
- Target version changed from 3.1.0 to 3.1.1
- Target version changed from 3.1.1 to 3.1.2
- Target version changed from 3.1.2 to 3.1.3
- Target version changed from 3.1.3 to 3.1.4
- Target version changed from 3.1.4 to 3.1.5
- Target version changed from 3.1.5 to 3.1.6
- Target version changed from 3.1.6 to 3.1.7
- Target version changed from 3.1.7 to 3.1.8
- Target version changed from 3.1.8 to 3.1.9
- Target version changed from 3.1.9 to 3.1.10
- Tags set to Sponsored, Next minor release, Quick and important
- Subject changed from When there is no space left on device, promises files can get deleted, rendering the agent unusable to promises can become invalid if copies fail rendering the agent unusable
- Assignee set to Jonathan CLARKE
- Priority changed from N/A to 1 (highest)
- Target version changed from 3.1.10 to 2.11.21
This can happen in two distinct cases:
- The main promises (promises.cf + includes) get broken. An aborted copy can cause this: a new promises.cf gets copied, including a reference to fileX.cf, but the copy stops before fileX.cf is copied, breaking that set of promises.
- The backup failsafe.cf can be broken in much rarer cases, like no space left on device, a really bad error in failsafe.cf or update.cf, or possibly neutrino rain (see #5641)
A good, thorough workaround is to implement #5641. A quick and easy workaround is to check for these conditions in check-rudder-agent and fix them.
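As an illustration only, the quick check described above could be sketched roughly as follows. This is not the code from the pull request: the function names, the directory arguments, and the heuristic that treats any double-quoted token ending in .cf as an included file are all assumptions made for the example. A real check would more likely rely on the agent's own syntax validation.

```python
import re
import shutil
from pathlib import Path
from typing import List

# Assumption: included files show up as double-quoted names ending in ".cf".
# Names containing "$" (variable expansions) are skipped, since this sketch
# cannot resolve them.
CF_REF = re.compile(r'"([^"\s$]+\.cf)"')


def missing_inputs(entry: Path) -> List[str]:
    """Return the .cf files named in `entry` that are absent next to it."""
    text = entry.read_text(encoding="utf-8", errors="replace")
    refs = set(CF_REF.findall(text))
    return sorted(ref for ref in refs if not (entry.parent / ref).is_file())


def promises_look_broken(inputs_dir: Path, entry: str = "promises.cf") -> bool:
    """Detect a missing or empty entry file, or a half-finished copy."""
    path = inputs_dir / entry
    if not path.is_file() or path.stat().st_size == 0:
        return True
    return bool(missing_inputs(path))


def restore_initial_promises(inputs_dir: Path, initial_dir: Path) -> bool:
    """Copy the initial promises over broken inputs.

    Returns True if a restore was performed, False if the inputs looked fine.
    Both directories are hypothetical parameters, not Rudder's real paths.
    """
    if not promises_look_broken(inputs_dir):
        return False
    # Overwrite files in place; files not present in initial_dir are left alone.
    shutil.copytree(initial_dir, inputs_dir, dirs_exist_ok=True)
    return True
```

Once the initial promises are back in place, the next agent run can fetch the full promises from the server again. The existence check only catches the aborted-copy case and empty files; a syntactically broken failsafe.cf would need a real parser check.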
- Status changed from New to In progress
- Status changed from In progress to Pending technical review
- Assignee changed from Jonathan CLARKE to Benoît PECCATTE
- Pull Request set to https://github.com/Normation/rudder-packages/pull/943
- Status changed from Pending technical review to Pending release
- % Done changed from 0 to 100
- Status changed from Pending release to Released
This bug has been fixed in Rudder 2.11.21, 3.0.16, 3.1.10 and 3.2.3 which were released on 2016-06-01, but not announced.