Project

General

Profile

Actions

Architecture #19920

closed

root inventory is missing and need to be resent after install

Added by François ARMAND over 3 years ago. Updated over 2 years ago.

Status:
Released
Priority:
N/A
Category:
Web - Nodes & inventories
Target version:
Effort required:
Name check:
To do
Fix check:
To do
Regression:

Description

Even after correction of #17390 and related tickets, we still have cases where after a fresh installation, rudder root server inventory information is missing.

As far as we know, in that case, we don't have any relevant logs, but we suspect that it can happen when rudder-jetty restart in the middle of inventory processing, just after the inventory was moved from /var/rudder/inventories/incoming to /var/rudder/inventories/received but before it's full saved in LDAP.

Workaround for 6.1/6.2/7.0: on the root server, do: rudder agent inventory

Solution: we need to rework inventory-processing so that all inventories, even the one coming from the inventory API, go to the incoming directory, and then only move inventories to received once they are fully saved so that if a restart occures before that point, they will be reprocessed at next startup.


Related issues 4 (0 open4 closed)

Related to Rudder - Bug #19585: Sometime inventory processing is not done when inventory is receveivedReleasedNicolas CHARLESActions
Related to Rudder - Bug #15655: After installation, root server OS is listed as "other Linux" in the web interfaceResolvedActions
Related to Rudder - Bug #20994: Some inventories are stuck in the incoming folderResolvedActions
Related to Rudder - Bug #22099: Missing error log for inventory with bad signatureReleasedVincent MEMBRÉActions
Actions #1

Updated by Nicolas CHARLES over 3 years ago

I just had the issue on Rudder 7.0
Inventory was put in /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz at 19:29:07.195664832, signature at 19:29:07.150856158
Webapp started at 19:28:29.093, inventory watcher up and running at 19:28:52+0000

[2021-09-07 19:28:52+0000] INFO  inventory-processing - Incoming inventory watcher started - process existing inventories

It finally got processed at 2021-09-07 19:38:53+0000

[2021-09-07 19:38:53+0000] INFO  inventory-processing - Watch new inventory file 'server-root.ocs' with signature available: process.
[2021-09-07 19:38:58+0000] INFO  inventory-processing - Inventory 'server-root.ocs' for node 'server.rudder.local' [root] (signature:certified) processed in 3 seconds and 607 milliseconds
[2021-09-07 19:39:09+0000] INFO  scheduled.job - Update in node inventories main information detected: triggering a policy generation

Actions #2

Updated by François ARMAND over 3 years ago

If you are reliably able to reproduce it, can you set log level of inventory-processing to trace? I still don't understand what is going on, it's not even inventories.watcher.period.garbage.old which is set to 5min by default (but it's exactly @, so perhaps there's something about that, which I don't understand - it would miss the initial inotify, and the first gc old inventories run would also miss them?).

Is there a server time correction happening in between ? IE, can the inventory mod time can be wrong ?

Actions #3

Updated by Nicolas CHARLES over 3 years ago

Here is what I have in trace

[2021-09-08 08:28:36+0000] INFO  application - registered property: history.inventories.rootdir="/var/rudder/inventories/historical" 
[2021-09-08 08:28:36+0000] INFO  application - registered property: inventories.root.directory="/var/rudder/inventories" 
[2021-09-08 08:28:36+0000] INFO  application - registered property: inventories.watcher.period.garbage.old="5 minutes" 
[2021-09-08 08:28:36+0000] INFO  application - registered property: inventories.watcher.waitForSignatureDuration="10 seconds" 
[2021-09-08 08:28:36+0000] INFO  application - registered property: inventory.parse.parallelization="0.5x" 
[2021-09-08 08:28:36+0000] INFO  application - registered property: ldif.tracelog.rootdir="/var/rudder/inventories/debug" 
[2021-09-08 08:28:36+0000] INFO  application - registered property: rudder.batch.purge.inventories.delete.TTL="7" 
[2021-09-08 08:28:36+0000] INFO  application - registered property: rudder.batch.purge.inventories.delete.interval="24" 
[2021-09-08 08:28:36+0000] INFO  application - registered property: waiting.inventory.queue.size="5" 
[2021-09-08 08:28:44+0000] INFO  application - INFO Configure inventory processing with parallelism of '1' and queue size of '5'
[2021-09-08 08:28:44+0000] DEBUG inventory-processing - Incoming inventories directory [ok]: /var/rudder/inventories/incoming
[2021-09-08 08:28:44+0000] DEBUG inventory-processing - Accepted nodes updates inventories directory [ok]: /var/rudder/inventories/accepted-nodes-updates
[2021-09-08 08:28:44+0000] DEBUG inventory-processing - Received inventories directory [ok]: /var/rudder/inventories/received
[2021-09-08 08:28:44+0000] DEBUG inventory-processing - Failed inventories directory [ok]: /var/rudder/inventories/failed
[2021-09-08 08:28:45+0000] INFO  inventory-processing - Incoming inventory watcher started - process existing inventories
[2021-09-08 08:29:01+0000] DEBUG inventory-processing - Not starting process of 'server-root.ocs.gz': already processing.
[2021-09-08 08:29:01+0000] TRACE inventory-processing - Processing new file: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign
[2021-09-08 08:29:01+0000] DEBUG inventory-processing - Watch incoming signature file '/var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign' but no corresponding inventory available: waiting
[2021-09-08 08:29:06+0000] INFO  scheduled.job - Update in node inventories main information detected: triggering a policy generation


[root@server vagrant]# stat /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign 
  File: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign
  Size: 1151          Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 101784312   Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:public_content_rw_t:s0
Access: 2021-09-08 08:29:00.671305958 +0000
Modify: 2021-09-08 08:29:00.671305958 +0000
Change: 2021-09-08 08:29:00.671305958 +0000
 Birth: -
[root@server vagrant]# stat /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz 
  File: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz
  Size: 49649         Blocks: 104        IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 101794903   Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:public_content_rw_t:s0
Access: 2021-09-08 08:29:00.710920093 +0000
Modify: 2021-09-08 08:29:00.710920093 +0000
Change: 2021-09-08 08:29:00.710920093 +0000
 Birth: -

This does not always happen, i needed to respawn a VM thrice to reach that
It somehow thinks it is already processing something

Actions #4

Updated by Nicolas CHARLES over 3 years ago

Finally, it got parsed by "Found old inventories"

[2021-09-08 08:36:02+0000] DEBUG inventory-processing - watcher ignored file .davfs.tmp2db271 (unrecognized extension: 'tmp2db271')
[2021-09-08 08:36:02+0000] DEBUG inventory-processing - watcher ignored file .davfs.tmp2db271 (unrecognized extension: 'tmp2db271')
[2021-09-08 08:36:02+0000] DEBUG inventory-processing - watcher ignored file uuid.hive (unrecognized extension: 'hive')
[2021-09-08 08:38:45+0000] DEBUG inventory-processing - Found old inventories: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign, /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz
[2021-09-08 08:38:46+0000] TRACE inventory-processing - Processing new file: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign
[2021-09-08 08:38:46+0000] DEBUG inventory-processing - Watch incoming signature file '/var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign' but no corresponding inventory available: waiting
[2021-09-08 08:38:46+0000] TRACE inventory-processing - Processing new file: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs.gz
[2021-09-08 08:38:46+0000] TRACE inventory-processing - Processing new file: /var/rudder/inventories/accepted-nodes-updates/server-root.ocs
[2021-09-08 08:38:46+0000] INFO  inventory-processing - Watch new inventory file 'server-root.ocs' with signature available: process.
[2021-09-08 08:38:46+0000] TRACE inventory-processing - Query blocking save of 'server-root.ocs'
[2021-09-08 08:38:46+0000] DEBUG inventory-processing - Enter pre-processed inventory for server-root.ocs with blocking='true' and current backend queue size=-1 (saturated=false)
[2021-09-08 08:38:46+0000] DEBUG inventory-processing - Start parsing inventory 'server-root.ocs'
[2021-09-08 08:38:47+0000] TRACE inventory-processing - Parsing done for inventory 'server-root.ocs'
[2021-09-08 08:38:47+0000] TRACE inventory-processing - Blocking add 'server-root.ocs' to backend processing queue
[2021-09-08 08:38:47+0000] TRACE inventory-processing - Blocking add 'server-root.ocs' to backend processing queue: done
[2021-09-08 08:38:47+0000] DEBUG inventory-processing - Inventory 'server-root.ocs' for node 'root' pre-processed in 954 milliseconds ms
[2021-09-08 08:38:47+0000] TRACE inventory-processing - Inventory 'server-root.ocs' for Node 'root' added to processing queue.
[2021-09-08 08:38:47+0000] TRACE inventory-processing - Query blocking save of 'server-root.ocs': done
[2021-09-08 08:38:47+0000] TRACE inventory-processing - Took report 'server-root.ocs' from backend queue and saving it
[2021-09-08 08:38:47+0000] TRACE inventory-processing - Start post processing of inventory 'server-root.ocs' for node 'root'
[2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:check_os_type_is_known': 3 ms
[2021-09-08 08:38:47+0000] DEBUG inventory-processing - Processing node id finder 'use_existing_id'
[2021-09-08 08:38:47+0000] DEBUG inventory-processing - Node Id 'root' found in DIT 'accepted' with id finder 'use_existing_id'
[2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:merge_uuid': 59 ms
[2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:check_machine_cn': 0 ms
[2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:set_last_inventory_date': 0 ms
[2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:add_ip_values': 1 ms
[2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Precommit 'pre_commit_inventory:log_inventory': 1 ms
[2021-09-08 08:38:47+0000] TRACE inventory-processing.timing - Pre commit report: 70 ms
[2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Saving software: 2334 ms
[2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Saving machine: 59 ms
[2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Saving node: 152 ms
[2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Saving vms: 5 ms
[2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Commit report: 2555 ms
[2021-09-08 08:38:50+0000] TRACE inventory-processing.timing - Post commit report: 8 ms
[2021-09-08 08:38:50+0000] DEBUG inventory-processing - Inventory saved.
[2021-09-08 08:38:50+0000] INFO  inventory-processing - Inventory 'server-root.ocs' for node 'server.rudder.local' [root] (signature:certified) processed in 2 seconds and 653 milliseconds
[2021-09-08 08:38:51+0000] INFO  scheduled.job - Update in node inventories main information detected: triggering a policy generation

Actions #5

Updated by Nicolas CHARLES over 3 years ago

I think it's because the .sign file arrives first

  1. .sign arrives, it creates a lockName on the file name
  2. it process inventory, and the .gz is still not there (it goes really fast, si it misses it) and so it `Watch incoming signature file '/var/rudder/inventories/accepted-nodes-updates/server-root.ocs.sign' but no corresponding inventory available: waiting`
  3. meanwhile the .gz arrives, and is ignored, because of lock.

we should have the same mecanism of waiting waitForSig if signature is there but not inventory

Actions #6

Updated by Nicolas CHARLES over 3 years ago

interestingly, I cannot reproduce it by copying manually the files

Actions #7

Updated by Nicolas CHARLES over 3 years ago

1 have one core

Actions #8

Updated by Nicolas CHARLES over 3 years ago

  • Related to Bug #19585: Sometime inventory processing is not done when inventory is receveived added
Actions #9

Updated by François ARMAND almost 3 years ago

  • Related to Bug #15655: After installation, root server OS is listed as "other Linux" in the web interface added
Actions #10

Updated by François ARMAND almost 3 years ago

  • Tracker changed from Bug to Architecture
  • Priority deleted (0)
Actions #11

Updated by Vincent MEMBRÉ almost 3 years ago

  • Target version changed from 7.1.0~beta1 to 7.1.0~beta2
Actions #12

Updated by Vincent MEMBRÉ almost 3 years ago

  • Target version changed from 7.1.0~beta2 to 7.1.0~rc1
Actions #13

Updated by Alexis Mousset almost 3 years ago

  • Target version changed from 7.1.0~rc1 to 7.2.0~beta1
Actions #14

Updated by François ARMAND over 2 years ago

  • Related to Bug #20994: Some inventories are stuck in the incoming folder added
Actions #15

Updated by François ARMAND over 2 years ago

  • Status changed from New to In progress
  • Assignee set to François ARMAND
Actions #16

Updated by François ARMAND over 2 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from François ARMAND to Nicolas CHARLES
  • Pull Request set to https://github.com/Normation/rudder/pull/4281
Actions #17

Updated by François ARMAND over 2 years ago

  • Status changed from Pending technical review to Pending release
Actions #18

Updated by Vincent MEMBRÉ over 2 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 7.2.0~beta1 which was released today.

Actions #19

Updated by François ARMAND about 2 years ago

  • Related to Bug #22099: Missing error log for inventory with bad signature added
Actions

Also available in: Atom PDF