Architecture #19920
closedroot inventory is missing and need to be resent after install
Description
Even after correction of #17390 and related tickets, we still have cases where after a fresh installation, rudder root server inventory information is missing.
As far as we know, in that case, we don't have any relevant logs, but we suspect that it can happen when rudder-jetty
restart in the middle of inventory processing, just after the inventory was moved from /var/rudder/inventories/incoming
to /var/rudder/inventories/received
but before it's full saved in LDAP.
Workaround for 6.1/6.2/7.0: on the root server, do: rudder agent inventory
Solution: we need to rework inventory-processing so that all inventories, even the one coming from the inventory API, go to the incoming directory, and then only move inventories to received once they are fully saved so that if a restart occures before that point, they will be reprocessed at next startup.
Updated by Nicolas CHARLES over 3 years ago
I just had the issue on Rudder 7.0
Inventory was put in /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz at 19:29:07.195664832, signature at 19:29:07.150856158
Webapp started at 19:28:29.093, inventory watcher up and running at 19:28:52+0000
[2021-09-07 19:28:52+0000] INFO inventory-processing - Incoming inventory watcher started - process existing inventories
It finally got processed at 2021-09-07 19:38:53+0000
[2021-09-07 19:38:53+0000] INFO inventory-processing - Watch new inventory file 'server-root.ocs' with signature available: process. [2021-09-07 19:38:58+0000] INFO inventory-processing - Inventory 'server-root.ocs' for node 'server.rudder.local' [root] (signature:certified) processed in 3 seconds and 607 milliseconds [2021-09-07 19:39:09+0000] INFO scheduled.job - Update in node inventories main information detected: triggering a policy generation
Updated by François ARMAND over 3 years ago
If you are reliably able to reproduce it, can you set log level of inventory-processing
to trace? I still don't understand what is going on, it's not even inventories.watcher.period.garbage.old
which is set to 5min by default (but it's exactly 2x@inventories.watcher.period.garbage.old@, so perhaps there's something about that, which I don't understand - it would miss the initial inotify, and the first gc old inventories run would also miss them?).
Is there a server time correction happening in between ? IE, can the inventory mod time can be wrong ?
Updated by Nicolas CHARLES over 3 years ago
Here is what I have in trace
[2021-09-08 08:28:36+0000] INFO application - registered property: history.inventories.rootdir="/var/rudder/inventories/historical" [2021-09-08 08:28:36+0000] INFO application - registered property: inventories.root.directory="/var/rudder/inventories" [2021-09-08 08:28:36+0000] INFO application - registered property: inventories.watcher.period.garbage.old="5 minutes" [2021-09-08 08:28:36+0000] INFO application - registered property: inventories.watcher.waitForSignatureDuration="10 seconds" [2021-09-08 08:28:36+0000] INFO application - registered property: inventory.parse.parallelization="0.5x" [2021-09-08 08:28:36+0000] INFO application - registered property: ldif.tracelog.rootdir="/var/rudder/inventories/debug" [2021-09-08 08:28:36+0000] INFO application - registered property: rudder.batch.purge.inventories.delete.TTL="7" [2021-09-08 08:28:36+0000] INFO application - registered property: rudder.batch.purge.inventories.delete.interval="24" [2021-09-08 08:28:36+0000] INFO application - registered property: waiting.inventory.queue.size="5" [2021-09-08 08:28:44+0000] INFO application - INFO Configure inventory processing with parallelism of '1' and queue size of '5' [2021-09-08 08:28:44+0000] DEBUG inventory-processing - Incoming inventories directory [ok]: /var/rudder/inventories/incoming [2021-09-08 08:28:44+0000] DEBUG inventory-processing - Accepted nodes updates inventories directory [ok]: /var/rudder/inventories/accepted-nodes-updates [2021-09-08 08:28:44+0000] DEBUG inventory-processing - Received inventories directory [ok]: /var/rudder/inventories/received [2021-09-08 08:28:44+0000] DEBUG inventory-processing - Failed inventories directory [ok]: /var/rudder/inventories/failed [2021-09-08 08:28:45+0000] INFO inventory-processing - Incoming inventory watcher started - process existing inventories [2021-09-08 08:29:01+0000] DEBUG inventory-processing - Not starting process of 'server-root.ocs.gz': already processing. [2021-09-08 08:29:01+0000] TRACE inventory-processing - Processing new file: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign [2021-09-08 08:29:01+0000] DEBUG inventory-processing - Watch incoming signature file '/var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign' but no corresponding inventory available: waiting [2021-09-08 08:29:06+0000] INFO scheduled.job - Update in node inventories main information detected: triggering a policy generation
[root@server vagrant]# stat /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign File: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign Size: 1151 Blocks: 8 IO Block: 4096 regular file Device: fd00h/64768d Inode: 101784312 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Context: unconfined_u:object_r:public_content_rw_t:s0 Access: 2021-09-08 08:29:00.671305958 +0000 Modify: 2021-09-08 08:29:00.671305958 +0000 Change: 2021-09-08 08:29:00.671305958 +0000 Birth: - [root@server vagrant]# stat /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz File: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz Size: 49649 Blocks: 104 IO Block: 4096 regular file Device: fd00h/64768d Inode: 101794903 Links: 1 Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root) Context: unconfined_u:object_r:public_content_rw_t:s0 Access: 2021-09-08 08:29:00.710920093 +0000 Modify: 2021-09-08 08:29:00.710920093 +0000 Change: 2021-09-08 08:29:00.710920093 +0000 Birth: -
This does not always happen, i needed to respawn a VM thrice to reach that
It somehow thinks it is already processing something
Updated by Nicolas CHARLES over 3 years ago
Finally, it got parsed by "Found old inventories"
[2021-09-08 08:36:02+0000] DEBUG inventory-processing - watcher ignored file .davfs.tmp2db271 (unrecognized extension: 'tmp2db271') [2021-09-08 08:36:02+0000] DEBUG inventory-processing - watcher ignored file .davfs.tmp2db271 (unrecognized extension: 'tmp2db271') [2021-09-08 08:36:02+0000] DEBUG inventory-processing - watcher ignored file uuid.hive (unrecognized extension: 'hive') [2021-09-08 08:38:45+0000] DEBUG inventory-processing - Found old inventories: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign, /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz [2021-09-08 08:38:46+0000] TRACE inventory-processing - Processing new file: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign [2021-09-08 08:38:46+0000] DEBUG inventory-processing - Watch incoming signature file '/var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign' but no corresponding inventory available: waiting [2021-09-08 08:38:46+0000] TRACE inventory-processing - Processing new file: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz [2021-09-08 08:38:46+0000] TRACE inventory-processing - Processing new file: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs [2021-09-08 08:38:46+0000] INFO inventory-processing - Watch new inventory file 'server-root.ocs' with signature available: process. [2021-09-08 08:38:46+0000] TRACE inventory-processing - Query blocking save of 'server-root.ocs' [2021-09-08 08:38:46+0000] DEBUG inventory-processing - Enter pre-processed inventory for server-root.ocs with blocking='true' and current backend queue size=-1 (saturated=false) [2021-09-08 08:38:46+0000] DEBUG inventory-processing - Start parsing inventory 'server-root.ocs' [2021-09-08 08:38:47+0000] TRACE inventory-processing - Parsing done for inventory 'server-root.ocs' [2021-09-08 08:38:47+0000] TRACE inventory-processing - Blocking add 'server-root.ocs' to backend processing queue [2021-09-08 08:38:47+0000] TRACE inventory-processing - Blocking add 'server-root.ocs' to backend processing queue: done [2021-09-08 08:38:47+0000] DEBUG inventory-processing - Inventory 'server-root.ocs' for node 'root' pre-processed in 954 milliseconds ms [2021-09-08 08:38:47+0000] TRACE inventory-processing - Inventory 'server-root.ocs' for Node 'root' added to processing queue. [2021-09-08 08:38:47+0000] TRACE inventory-processing - Query blocking save of 'server-root.ocs': done [2021-09-08 08:38:47+0000] TRACE inventory-processing - Took report 'server-root.ocs' from backend queue and saving it [2021-09-08 08:38:47+0000] TRACE inventory-processing - Start post processing of inventory 'server-root.ocs' for node 'root' [2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:check_os_type_is_known': 3 ms [2021-09-08 08:38:47+0000] DEBUG inventory-processing - Processing node id finder 'use_existing_id' [2021-09-08 08:38:47+0000] DEBUG inventory-processing - Node Id 'root' found in DIT 'accepted' with id finder 'use_existing_id' [2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:merge_uuid': 59 ms [2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:check_machine_cn': 0 ms [2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:set_last_inventory_date': 0 ms [2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:add_ip_values': 1 ms [2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:log_inventory': 1 ms [2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Pre commit report: 70 ms [2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Saving software: 2334 ms [2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Saving machine: 59 ms [2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Saving node: 152 ms [2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Saving vms: 5 ms [2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Commit report: 2555 ms [2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Post commit report: 8 ms [2021-09-08 08:38:50+0000] DEBUG inventory-processing - Inventory saved. [2021-09-08 08:38:50+0000] INFO inventory-processing - Inventory 'server-root.ocs' for node 'server.rudder.local' [root] (signature:certified) processed in 2 seconds and 653 milliseconds [2021-09-08 08:38:51+0000] INFO scheduled.job - Update in node inventories main information detected: triggering a policy generation
Updated by Nicolas CHARLES over 3 years ago
I think it's because the .sign file arrives first
- .sign arrives, it creates a lockName on the file name
- it process inventory, and the .gz is still not there (it goes really fast, si it misses it) and so it `Watch incoming signature file '/var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign' but no corresponding inventory available: waiting`
- meanwhile the .gz arrives, and is ignored, because of lock.
we should have the same mecanism of waiting waitForSig if signature is there but not inventory
Updated by Nicolas CHARLES over 3 years ago
interestingly, I cannot reproduce it by copying manually the files
Updated by Nicolas CHARLES over 3 years ago
- Related to Bug #19585: Sometime inventory processing is not done when inventory is receveived added
Updated by François ARMAND almost 3 years ago
- Related to Bug #15655: After installation, root server OS is listed as "other Linux" in the web interface added
Updated by François ARMAND almost 3 years ago
- Tracker changed from Bug to Architecture
- Priority deleted (
0)
Updated by Vincent MEMBRÉ almost 3 years ago
- Target version changed from 7.1.0~beta1 to 7.1.0~beta2
Updated by Vincent MEMBRÉ almost 3 years ago
- Target version changed from 7.1.0~beta2 to 7.1.0~rc1
Updated by Alexis Mousset almost 3 years ago
- Target version changed from 7.1.0~rc1 to 7.2.0~beta1
Updated by François ARMAND over 2 years ago
- Related to Bug #20994: Some inventories are stuck in the incoming folder added
Updated by François ARMAND over 2 years ago
- Status changed from New to In progress
- Assignee set to François ARMAND
Updated by François ARMAND over 2 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from François ARMAND to Nicolas CHARLES
- Pull Request set to https://github.com/Normation/rudder/pull/4281
Updated by François ARMAND over 2 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder|9dc81bdf0e836bbd6918435ec31a41f25e1c49d1.
Updated by Vincent MEMBRÉ over 2 years ago
- Status changed from Pending release to Released
This bug has been fixed in Rudder 7.2.0~beta1 which was released today.
Updated by François ARMAND about 2 years ago
- Related to Bug #22099: Missing error log for inventory with bad signature added