Bug #12474
Status: Closed
Root node disappeared while upgrading from 4.1 to 4.3 on Debian 9
Description
WORKAROUND: we are not sure of the cause of this ticket (hard to reproduce, but it DID happen at least twice). In all cases, you can correct the problem by executing the following command on the root server:
rudder agent inventory
Description:
After upgrading, the root server disappeared from the node list.
However, it looks like it is still there in LDAP:
dn: nodeId=root,ou=Nodes,cn=rudder-configuration
objectClass: rudderPolicyServer
objectClass: rudderNode
objectClass: top
cn: root
nodeId: root
description: the policy server
isSystem: TRUE
isBroken: FALSE
structuralObjectClass: rudderPolicyServer
entryUUID: 33359356-d6a4-1037-9ee3-1d253a7788ee
creatorsName: cn=manager,cn=rudder-configuration
createTimestamp: 20180417160031Z
entryCSN: 20180417160031.621579Z#000000#000#000000
modifiersName: cn=manager,cn=rudder-configuration
modifyTimestamp: 20180417160031Z
dn: nodeId=root,ou=Nodes,ou=Accepted Inventories,ou=Inventories,cn=rudder-configuration
objectClass: top
objectClass: node
objectClass: unixNode
objectClass: linuxNode
nodeId: root
osKernelVersion: 1.0-dummy-version
osName: Linux
osVersion: Linux
localAccountName: root
cn: root
localAdministratorAccountName: root
nodeHostname: server.rudder.local
policyServerId: root
inventoryDate: 19700101000000+0200
receiveDate: 19700101000000+0200
ipHostNumber: 127.0.0.1
agentName: Community
rudderServerRole: rudder-web
structuralObjectClass: linuxNode
entryUUID: 3335b732-d6a4-1037-9ee4-1d253a7788ee
creatorsName: cn=manager,cn=rudder-configuration
createTimestamp: 20180417160031Z
entryCSN: 20180417160031.622498Z#000000#000#000000
modifiersName: cn=manager,cn=rudder-configuration
modifyTimestamp: 20180417160031Z
Chaos follows :/
Updated by Nicolas CHARLES over 6 years ago
It seems that I did not have a correct inventory for this machine, so when upgrading, the root node disappeared and took everything down with it.
Putting root back via an inventory solved EVERYTHING.
Updated by Nicolas CHARLES over 6 years ago
- Related to Bug #12475: Technique Inventory is deleted when upgrading from 4.1 to 4.3 added
Updated by Vincent MEMBRÉ over 6 years ago
- Target version changed from 4.3.1 to 4.3.2
Updated by Vincent MEMBRÉ over 6 years ago
- Target version changed from 4.3.2 to 410
Updated by Benoît PECCATTE over 6 years ago
- Target version changed from 410 to 4.3.2
Updated by Vincent MEMBRÉ over 6 years ago
- Target version changed from 4.3.2 to 4.3.3
Updated by Benoît PECCATTE over 6 years ago
- Status changed from New to Rejected
Cannot reproduce, feel free to reopen if needed
Updated by Alexis Mousset over 6 years ago
- Status changed from Rejected to New
Reproduced between two 4.3 nightlies, reopening.
Updated by Alexis Mousset over 6 years ago
Actually, the inventory after the 4.1 -> 4.3 upgrade was not accepted in my case:
[2018-07-12 08:55:25] ERROR com.normation.inventory.provisioning.endpoint.FusionReportEndpoint - Error when trying to check inventory signature <- class configured for Signature (provider: BC) cannot be found.
And after 4.3 -> 4.3 upgrade, I got:
[2018-07-16 05:07:59] ERROR com.normation.rudder.services.nodes.NodeInfoServiceCachedImpl - An error occured while updating node cache: can not unserialize node with id 'root', it will be ignored <- Error when mapping '{"agentType":"Community","version":"4.1.7.release-1.SLES.11"}' to an agent info. We are expecting either an agentType with allowed values in cfengine-nova, cfengine-community, dsc or a json like {'agentType': type, 'version': opt_version, 'securityToken': ...} but we get: Error when parsing JSON information about the agent type. <- Invalid value for security token: no value define for security token, and no public key were stored <- Wrong type of value for the agent '{"agentType":"Community","version":"4.1.7.release-1.SLES.11"}' ...
[2018-07-16 05:07:59] ERROR com.normation.rudder.services.policies.RuleValServiceImpl - Some nodes are in the target of rule 'Rudder system policy: daily inventory' (inventory-all) but are not present in the system. It looks like an inconsistency error. Ignored nodes: root
[2018-07-16 05:07:59] ERROR com.normation.rudder.services.policies.RuleValServiceImpl - Some nodes are in the target of rule 'distributePolicy' (root-DP) but are not present in the system. It looks like an inconsistency error. Ignored nodes: root
[2018-07-16 05:07:59] ERROR com.normation.rudder.services.policies.RuleValServiceImpl - Some nodes are in the target of rule 'Rudder system policy: basic setup (common)' (hasPolicyServer-root) but are not present in the system. It looks like an inconsistency error. Ignored nodes: root
Updated by François ARMAND over 6 years ago
- Related to Bug #12606: Restricted java security policy breaks Rudder (class configured for Cipher(provider: BC)cannot be found) added
Updated by François ARMAND over 6 years ago
- Status changed from New to In progress
- Assignee set to François ARMAND
Updated by François ARMAND over 6 years ago
OK, so the problem is that:
- inventories from 4.1 are stored with:
agentName: {"agentType":"Community","version":"4.1.14~rc1~git201807130352-stretch0"}
publicKey: ....
But we consider that an error. If we parse the inventory (from 4.1), the parsing is done correctly; the problem really lies in the unserialisation of an already stored inventory during a migration.
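The unserialisation path described above can be sketched as follows. This is hypothetical Python for illustration, not Rudder's actual Scala code; the function and field names are assumptions, but the fallback logic mirrors the error message in the log (a JSON agentName without a securityToken, and no stored publicKey to fall back on):

```python
import json

def unserialize_agent_info(agent_name, stored_public_key=None):
    # Hypothetical sketch of the 4.3 unserialisation rule (not Rudder's code).
    # agent_name is the raw LDAP attribute value; it may be a bare agent type
    # ("cfengine-community") or a JSON object like
    # {"agentType": ..., "version": ..., "securityToken": ...}.
    try:
        parsed = json.loads(agent_name)
    except ValueError:
        parsed = None
    if not isinstance(parsed, dict):
        # Bare string form: the value is the agent type itself.
        parsed = {"agentType": agent_name}
    token = parsed.get("securityToken") or stored_public_key
    if token is None:
        # This is the failure seen in the log: JSON without a securityToken,
        # and no separately stored publicKey attribute to fall back on.
        raise ValueError("no value defined for security token, "
                         "and no public key was stored")
    return {"agentType": parsed["agentType"], "securityToken": token}
```

With a 4.1-style entry that still has its separate publicKey attribute, the fallback succeeds; the broken root entry had neither, so unserialisation fails.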
Updated by François ARMAND over 6 years ago
My previous comment was erroneous, cf. the added tests in #12474.
So the problem is that, for some reason, we get an inventory for root without a public key in it. That inventory is accepted (but it should not be: a key or certificate is mandatory in 4.3), and so afterwards the node can't be read back (because it does not meet the requirement of having a key).
So we need to refuse inventories without either a public key or a certificate in 4.3.
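The check proposed here could look like the following minimal sketch, assuming a dict-shaped inventory whose attribute names mirror the LDAP entries above (the function name is an assumption, not Rudder's API):

```python
def validate_inventory_security_token(inventory):
    # Minimal sketch of the proposed 4.3 acceptance rule: an inventory must
    # carry either a public key or a certificate, otherwise it is refused
    # up front instead of being accepted and then failing to load later.
    if not inventory.get("publicKey") and not inventory.get("certificate"):
        raise ValueError("inventory refused: no public key or certificate")
    return inventory
```

Refusing at acceptance time keeps the broken entry out of LDAP entirely, rather than storing a node that can never be read back.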
Updated by François ARMAND over 6 years ago
- Description updated (diff)
I'm really not sure that accepting an inventory without a security token was the root cause. We haven't had such inventories since... 2.8?
I will add the check. If the bug occurs again, we will need to think about other problems that could happen. At least we will know that it can't come from the inventory part, and that more likely something breaks the stored root inventory information in LDAP (replay of init data? Deletion of some attributes?)
Updated by François ARMAND over 6 years ago
So, more data!
Because of some error in cache, we reach a situation where:
- there is an inventory for Rudder 4.3 in LDAP,
- but there was an eviction problem with the node info cache for that node: we removed the faulty cache entry (OK, why not) but never added the fresh info back.
So the cache thinks it is up-to-date, but without the node info.
The next time the node is modified (for ex with a new inventory or by clicking the "clear cache" button in Rudder settings), everything goes back to normal.
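The eviction bug described above can be illustrated with a small sketch (hypothetical Python; Rudder's actual cache is Scala code, and the names and dict shapes here are assumptions for illustration):

```python
def refresh_node_cache(cache, ldap_entries):
    # Sketch of the invariant the fix restores: when a stale entry is
    # evicted, the fresh node info must be re-inserted in the same pass.
    # The bug was evicting without re-inserting, so the cache believed it
    # was up to date while silently missing the node (here: root).
    for node_id, entry in ldap_entries.items():
        stale = cache.get(node_id)
        if stale is not None and stale["modifyTimestamp"] < entry["modifyTimestamp"]:
            del cache[node_id]  # eviction of the outdated record
        # The fix: always put the fresh entry back after eviction.
        cache[node_id] = entry
    return cache
```

Without the final re-insert, any later node modification (or the "clear cache" button) is the only thing that repopulates the entry, which matches the recovery behaviour observed here.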
Updated by François ARMAND over 6 years ago
- Related to Bug #12988: NodeInfoCache is precise to the second but we need it to be precise to the millisecond added
Updated by François ARMAND over 6 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from François ARMAND to Vincent MEMBRÉ
- Pull Request set to https://github.com/Normation/rudder/pull/1988
Updated by François ARMAND over 6 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder|c4a3f6e50a92c80128df4f7195407d8e9a662bf0.
Updated by Vincent MEMBRÉ over 6 years ago
- Status changed from Pending release to Released
This bug has been fixed in Rudder 4.3.3 which was released today.
- 4.3.3: Announce Changelog
- Download: https://www.rudder-project.org/site/get-rudder/downloads/