Bug #7091
closed
The uuid in the promises and the uuid in /opt/rudder/etc/uuid.hive may be out of sync, and chaos and sadness follow
Description
In the promises, we hardcode the uuid used to fetch the promises, but still use /opt/rudder/etc/uuid.hive for everything else (reports, inventories and so on).
This leads to very "funny" moments, where uuid.hive matches the uuid in the web interface, but the node is not able to fetch its promises, because it looks in a funny place.
We should have one and only one reference, be it the file or the promises, but not random out-of-sync data.
Updated by François ARMAND over 9 years ago
Just to be sure that the bug is correctly understood:
- we have a file, /opt/rudder/etc/uuid.hive, with the node ID.
- when a node is accepted, it starts getting ITS promises. These promises contain the node ID at several points (for the URL where the node should get its future promises, the identification of reports, etc.).
So, if somebody changes the content of /opt/rudder/etc/uuid.hive on an accepted node, the two sets of files will be out of sync. At the next inventory, Rudder will get the new node ID and won't know what to do with it, and the node won't be able to fetch its promises because authorisation will fail.
So, the problem seems to be that changing a node ID is NOT a trivial action. It's an important decision that has consequences. You are changing the ID of the node in Rudder. It's not the same node anymore. Most likely, it will get OTHER promises. It may even be managed by entirely different people, for an entirely different role.
So we should fail immediately if we find that the node ID in /opt/rudder/etc/uuid.hive is not consistent with the node ID for which the promises were produced.
And to make the UUID update easier (for example, when cloning VMs), we should explain how to actually get a new node in Rudder, with new, dedicated promises, once accepted:
- add the script corresponding to "rudder agent reinit" in earlier versions of Rudder,
- add an error message explaining how to change the UUID when an inconsistency is found between the promises and the config file,
- in the /opt/rudder/etc/uuid.hive config file, add comments explaining how to update the UUID (NOT by editing the file directly).
What do you think?
Updated by Janos Mattyasovszky over 9 years ago
+1
We have hit the same issue, and even made a Nagios monitor that checks whether both UUIDs are in sync.
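For readers who want a similar monitor, here is a minimal sketch in Python of what such a check could look like. It is not the monitor mentioned above, which is not shown in this ticket. The function name `check_uuid_sync` is hypothetical, and the sketch assumes both UUIDs have already been read as strings, since how to extract the UUID from the promises is not described here. Only the exit codes (0 for OK, 2 for CRITICAL, 3 for UNKNOWN) follow the usual Nagios plugin convention.

```python
# Usual Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_uuid_sync(local_uuid, promises_uuid):
    """Compare two node IDs and return (exit_code, status_line).

    local_uuid:    content of /opt/rudder/etc/uuid.hive (assumed already read)
    promises_uuid: node ID found in the promises (assumed already extracted)
    """
    local = (local_uuid or "").strip()
    expected = (promises_uuid or "").strip()
    if not local or not expected:
        # One of the two values is missing, so no verdict is possible.
        return UNKNOWN, "UNKNOWN - could not read one of the UUIDs"
    if local == expected:
        return OK, f"OK - UUIDs in sync ({local})"
    return CRITICAL, f"CRITICAL - uuid.hive has {local}, promises have {expected}"
```

A wrapper script would print the status line and exit with the returned code, which is what a Nagios-compatible scheduler expects from a plugin.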
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 2.10.16 to 2.10.17
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 2.10.17 to 2.10.18
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 2.10.18 to 2.10.19
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 2.10.19 to 2.10.20
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 2.10.20 to 277
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 277 to 2.11.18
Updated by Vincent MEMBRÉ almost 9 years ago
- Target version changed from 2.11.18 to 2.11.19
Updated by Vincent MEMBRÉ almost 9 years ago
- Target version changed from 2.11.19 to 2.11.20
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 2.11.20 to 2.11.21
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 2.11.21 to 2.11.22
Updated by Benoît PECCATTE over 8 years ago
- Status changed from New to In progress
- Assignee set to Benoît PECCATTE
Updated by Jonathan CLARKE over 8 years ago
- Always use the UUID from uuid.hive on the node for all operations, logs, reports, etc.
- Keep the UUID from the server in a generated promises file for the sole purpose of running a check that will display an error message explaining what happened and how to work around it, then abort the agent.
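The two points above can be sketched as follows. This is an illustration in Python under stated assumptions, not the code from the pull request (which lives in the techniques and is not reproduced here): `NodeIdMismatch` and `ensure_node_id_matches` are hypothetical names, and the wording of the error message is invented for the example.

```python
from pathlib import Path


class NodeIdMismatch(Exception):
    """Raised when uuid.hive and the generated promises disagree."""


def ensure_node_id_matches(uuid_file, promises_uuid):
    """Return the node ID from uuid_file, or abort if it differs.

    uuid_file:     path to the local ID file (e.g. /opt/rudder/etc/uuid.hive)
    promises_uuid: ID the server wrote into the generated promises; it is
                   used ONLY for this comparison, never for reports or logs.
    """
    local = Path(uuid_file).read_text().strip()
    if local != promises_uuid.strip():
        # Hypothetical message: explain what happened, then stop the run.
        raise NodeIdMismatch(
            f"Node ID in {uuid_file} is {local}, but these promises were "
            f"generated for {promises_uuid.strip()}. Changing a node ID "
            "makes this a new node; do not edit the file by hand."
        )
    # The local file stays the single source of truth for everything else.
    return local
```

The design choice illustrated here is that the value returned to the caller always comes from the local file, so reports, inventories and promise fetching all use the same ID, while the server-side value only serves to detect the inconsistency.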
Updated by Benoît PECCATTE over 8 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from Benoît PECCATTE to Jonathan CLARKE
- Pull Request set to https://github.com/Normation/rudder-techniques/pull/952
Updated by Benoît PECCATTE over 8 years ago
- Has duplicate Bug #8391: The failsafe doesn't abort is there is no uuid added
Updated by Benoît PECCATTE over 8 years ago
- Status changed from Pending technical review to Pending release
- % Done changed from 0 to 100
Applied in changeset rudder-techniques|89f25f981d89d3c252eca57f580d9eb9575bc2f9.
Updated by Vincent MEMBRÉ over 8 years ago
- Status changed from Pending release to Released