Project

General

Profile

Actions

Bug #24815

closed

Node with inventories with bad certificate still get into Rudder

Added by François ARMAND 8 months ago. Updated 7 months ago.

Status:
Released
Priority:
N/A
Category:
Web - Nodes & inventories
Target version:
Severity:
UX impact:
User visibility:
Effort required:
Priority:
0
Name check:
To do
Fix check:
To do
Regression:
No

Description

With our RTF plateform, we sometime gets node with inventory whose certificate is:

-----BEGIN RSA PUBLIC KEY-----
not initialized
-----END RSA PUBLIC KEY-----'

I correctly have a /var/rudder/inventories/failed/12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21\:30\:47+00\:00.ocs.reject-2024-04-30T21\:31\:05Z.log which says that the node inventory is refused:

2024-04-30T21:31:05Z
Inventory '12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21:30:47+00:00.ocs' for Node 'unknown' failed to be saved in Rudder. Cause was: Error when trying to process inventory '12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21:30:47+00:00.ocs'
cause was: CryptoEx: Key '-----BEGIN RSA PUBLIC KEY-----
not initialized
-----END RSA PUBLIC KEY-----' cannot be parsed as a public key; root exception was: unable to decode base64 string: String index out of range: 15

But still: the node is accepted into rudder, and the key "not initialized" is certified.

Logs for that node show:

2024-04-30 21:30:44+0000 INFO  inventory-processing - Received new inventory file '12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21:30:36+00:00.ocs' with signature available: process.
2024-04-30 21:30:44+0000 INFO  nodes - New pending node: 'node1.rudder.local' [12fabbe9-fe1d-4663-8194-d7272dc3c4c6]'
2024-04-30 21:30:44+0000 INFO  inventory-processing - Inventory '12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21:30:36+00:00.ocs' for node 'node1.rudder.local' [12fabbe9-fe1d-4663-8194-d7272dc3c4c6] (signature:certified) processed in 200 milliseconds
2024-04-30 21:31:05+0000 INFO  inventory-processing - Received new inventory file '12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21:30:47+00:00.ocs' with signature available: process.
2024-04-30 21:31:05+0000 ERROR inventory-processing - Error when trying to process inventory '12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21:30:47+00:00.ocs'; cause was: CryptoEx: Key '-----BEGIN RSA PUBLIC KEY-----
2024-04-30 21:31:05+0000 ERROR inventory-processing - Inventory '12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21:30:47+00:00.ocs' for Node 'unknown' failed to be saved in Rudder. Cause was: Error when trying to process inventory '12fabbe9-fe1d-4663-8194-d7272dc3c4c6_2024-04-30T21:30:47+00:00.ocs'; cause was: CryptoEx: Key '-----BEGIN RSA PUBLIC KEY-----
2024-04-30 21:31:11+0000 INFO  nodes - New accepted node: 'node1.rudder.local' [12fabbe9-fe1d-4663-8194-d7272dc3c4c6]'
2024-04-30 21:31:12+0000 INFO  nodes - Update in node '12fabbe9-fe1d-4663-8194-d7272dc3c4c6' inventories main information detected: triggering dynamic group update and a policy generation
2024-04-30 21:31:12+0000 INFO  dynamic-group - Dynamic group all-nodes-with-cfengine-agent: added node with id: [ 12fabbe9-fe1d-4663-8194-d7272dc3c4c6 ], removed: nothing
2024-04-30 21:31:12+0000 INFO  dynamic-group - Dynamic group hasPolicyServer-root: added node with id: [ 12fabbe9-fe1d-4663-8194-d7272dc3c4c6 ], removed: nothing
2024-04-30 21:31:12+0000 WARN  explain_compliance.12fabbe9-fe1d-4663-8194-d7272dc3c4c6 - Can not get compliance for node with ID '12fabbe9-fe1d-4663-8194-d7272dc3c4c6' because it has no configuration id initialised nor sent reports (node just added ?)
2024-04-30 21:31:12+0000 ERROR policy.generation - Error when trying to get the CFEngine-MD5 digest of CFEngine public key for node 'node1.rudder.local' (12fabbe9-fe1d-4663-8194-d7272dc3c4c6) <- An error occurred. Cause was: DecoderException: unable to decode base64 string: String index out of range: 15

So Rudder sees that the node should be refused, and still accept it.

It seems to happen only in Rudder 8.1.


Related issues 1 (1 open0 closed)

Related to Rudder - Bug #25563: We can't disable a node with a bad certificate anymoreNewActions
Actions #1

Updated by François ARMAND 7 months ago

So, it's even stranger:

- the node inventory is correct. It has a certificate with what is expected
- the agent is not init, as if NodeFact.defaultRudderAgent was used: in ldap, we have:

{"agentType":"cfengine-community","version":"unknown","securityToken":{"value":"-----BEGIN RSA PUBLIC KEY-----\nnot initialized\n-----END RSA PUBLIC KEY-----","type":"publicKey"},"capabilities":[]}

But all other values are correct: software, env vars, etc.

Actions #2

Updated by François ARMAND 7 months ago

And I'm never able to reproduce it localy, even if I try to accept the same inventory, also by API.

Actions #3

Updated by François ARMAND 7 months ago

It doesn't look like it's a "only one core + race condition" thing either

Actions #4

Updated by François ARMAND 7 months ago

I got an other instance on 8.1.2:

2024-05-16 21:22:10+0000 INFO  inventory-processing - Received new inventory file 'e13386a3-643d-48f5-9c2a-eab1b8662052_2024-05-16T21:22:03+00:00.ocs' with signature available: process.
2024-05-16 21:22:10+0000 WARN  inventory-processing - Error when parsing date for last user loggin. Awaited format is EEE MMM dd HH:mm, found: Thu May 16 21:19:47
2024-05-16 21:22:10+0000 WARN  nodes.security - only certificate are supported for agent security token since Rudder 7.0
2024-05-16 21:22:10+0000 INFO  nodes - New pending node: 'node1.rudder.local' [e13386a3-643d-48f5-9c2a-eab1b8662052]'
2024-05-16 21:22:10+0000 INFO  inventory-processing - Inventory 'e13386a3-643d-48f5-9c2a-eab1b8662052_2024-05-16T21:22:03+00:00.ocs' for node 'node1.rudder.local' [e13386a3-643d-48f5-9c2a-eab1b8662052] (signature:certified) processed in 201 milliseconds
2024-05-16 21:22:26+0000 INFO  inventory-processing - Received new inventory file 'e13386a3-643d-48f5-9c2a-eab1b8662052_2024-05-16T21:22:14+00:00.ocs' with signature available: process.
2024-05-16 21:22:26+0000 WARN  inventory-processing - Error when parsing date for last user loggin. Awaited format is EEE MMM dd HH:mm, found: Thu May 16 21:19:47
2024-05-16 21:22:26+0000 ERROR inventory-processing - Error when trying to process inventory 'e13386a3-643d-48f5-9c2a-eab1b8662052_2024-05-16T21:22:14+00:00.ocs'; cause was: CryptoEx: Key '-----BEGIN RSA PUBLIC KEY-----
not initialized
-----END RSA PUBLIC KEY-----' cannot be parsed as a public key; root exception was: unable to decode base64 string: String index out of range: 15
2024-05-16 21:22:26+0000 ERROR inventory-processing - Inventory 'e13386a3-643d-48f5-9c2a-eab1b8662052_2024-05-16T21:22:14+00:00.ocs' for Node 'unknown' failed to be saved in Rudder. Cause was: Error when trying to process inventory 'e13386a3-643d-48f5-9c2a-eab1b8662052_2024-05-16T21:22:14+00:00.ocs'; cause was: CryptoEx: Key '-----BEGIN RSA PUBLIC KEY-----
not initialized
-----END RSA PUBLIC KEY-----' cannot be parsed as a public key; root exception was: unable to decode base64 string: String index out of range: 15
2024-05-16 21:22:35+0000 INFO  nodes - New accepted node: 'node1.rudder.local' [e13386a3-643d-48f5-9c2a-eab1b8662052]'

What is strange is that there is two lines Received new inventory file 'e13386a3-643d-48f5-9c2a-eab1b8662052_2024-05-16T21:22:03+00:00.ocs' with signature available: process.

The first doesn't have the signature error, while the second has. So perhaps the first inventory is incorrect in some way, perhaps it has both a certificate and signature but they don't match ? I don't have access to that inventory, it is erased by the second one just after.

The lines:

2024-05-16 21:22:10+0000 WARN  nodes.security - only certificate are supported for agent security token since Rudder 7.0
2024-05-16 21:22:10+0000 INFO  nodes - New pending node: 'node1.rudder.local' [e13386a3-643d-48f5-9c2a-eab1b8662052]'
2024-05-16 21:22:10+0000 INFO  inventory-processing - Inventory 'e13386a3-643d-48f5-9c2a-eab1b8662052_2024-05-16T21:22:03+00:00.ocs' for node 'node1.rudder.local' [e13386a3-643d-48f5-9c2a-eab1b8662052] (signature:certified) processed in 201 milliseconds

Are even stranger. Why error and then pending ok ? And why "signature: certified" here ? It's pending.

Actions #5

Updated by François ARMAND 7 months ago

So, it looks like things happen like that:

- we get a first inventory. That inventory does not have a certificate or at least, for some reason, it contains a "publickey" entry that is not valid, or something else BUT something that leads to the message "public key are not supported anymore", but still with a valid signature somehow (HOW?)
- that certificate get to pending, with its (invalid) key getting the certified status.
- a second, correct inventory comes, with everything good.
- rudder looks and find the other inventory in pending, and get it from there. This is a regression in 8.1 where we don't look only on accepted nodes in that case
- so we use the incorrect key with its certified status and everything else are correct values from the second inventory
- the node is accepted with the incorrect key.

I don't get how the first step happens. If the signature is there, there must be a valid cert/public key to use. If it's not there, the inventory is ignored.
I think that maybe both certificate and public key attribute are present, and the certificate is ok but not the public key, and we check with the first but store the second one.
I was not able to see that in the code, but in any case, the problem happens because we still parse the public key part, what we stopped supporting in 7.0, so we really need to get ride of it now. Ignoring the public key at root will make the whole scenario much simpler to understand.

Actions #6

Updated by François ARMAND 7 months ago

  • Status changed from New to In progress
  • Assignee set to François ARMAND
Actions #7

Updated by François ARMAND 7 months ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from François ARMAND to Vincent MEMBRÉ
  • Pull Request set to https://github.com/Normation/rudder/pull/5670
Actions #8

Updated by Anonymous 7 months ago

  • Status changed from Pending technical review to Pending release
Actions #9

Updated by Vincent MEMBRÉ 7 months ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 8.1.3 which was released today.

Actions #10

Updated by François ARMAND 3 months ago

  • Related to Bug #25563: We can't disable a node with a bad certificate anymore added
Actions

Also available in: Atom PDF