Bug #10485

closed

Inventory endpoint accepts inventory even if ldap or postgresql connectivity failed

Added by Nicolas CHARLES almost 8 years ago. Updated over 7 years ago.

Status: Released
Priority: N/A
Category: Server components
Target version:
Severity: Major - prevents use of part of Rudder | no simple workaround
UX impact:
User visibility: Getting started - demo | first install | level 1 Techniques
Effort required:
Priority: 53
Name check:
Fix check:
Regression:

Description

I had a misconfigured web interface that nevertheless accepted the root inventory. The inventory was then deleted, and I ended up with "no machine inventory" for my root server.

Agent run log:

rudder     info: <address>Apache/2.4.18 (Ubuntu) Server at 127.0.0.1 Port 443</address>
rudder     info: </body></html>
rudder     info: Automatically promoting context scope for 'inventory_sent' to namespace visibility, due to persistence
rudder     info: Transformer '/var/rudder/inventories/server-root.ocs.gz' => '/usr/bin/curl -L -k -1 -f -s --proxy '' --user rudder:rudder -T /var/rudder/inventories/server-root.ocs.gz https://127.0.0.1/inventories/' seemed to work ok
rudder     info: Transforming '/usr/bin/curl -L -k -1 -f -s --proxy '' --user rudder:rudder -T /var/rudder/inventories/server-root.ocs.sign https://127.0.0.1/inventories/' 
rudder     info: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
rudder     info: <html><head>
rudder     info: <title>201 Created</title>
rudder     info: </head><body>
rudder     info: <h1>Created</h1>
rudder     info: <p>Resource /inventories/server-root.ocs.sign has been created.</p>
rudder     info: <hr />
rudder     info: <address>Apache/2.4.18 (Ubuntu) Server at 127.0.0.1 Port 443</address>
rudder     info: </body></html>
rudder     info: Automatically promoting context scope for 'inventory_sent' to namespace visibility, due to persistence
rudder     info: Transformer '/var/rudder/inventories/server-root.ocs.sign' => '/usr/bin/curl -L -k -1 -f -s --proxy '' --user rudder:rudder -T /var/rudder/inventories/server-root.ocs.sign https://127.0.0.1/inventories/' seemed to work ok
rudder     info: Transforming '/bin/rm -f /var/rudder/inventories/server-root.ocs.gz' 
rudder     info: Transformer '/var/rudder/inventories/server-root.ocs.gz' => '/bin/rm -f /var/rudder/inventories/server-root.ocs.gz' seemed to work ok
rudder     info: Transforming '/bin/rm -f /var/rudder/inventories/server-root.ocs.sign' 
rudder     info: Transformer '/var/rudder/inventories/server-root.ocs.sign' => '/bin/rm -f /var/rudder/inventories/server-root.ocs.sign' seemed to work ok
rudder     info: Created file '/var/rudder/tmp/inventory_sent', mode 0600
rudder     info: Touched (updated time stamps) for path '/var/rudder/tmp/inventory_sent'
rudder     info: Transforming '/bin/rm -f /var/rudder/tmp/inventory/server-root.ocs' 
rudder     info: Transformer '/var/rudder/tmp/inventory/server-root.ocs' => '/bin/rm -f /var/rudder/tmp/inventory/server-root.ocs' seemed to work ok
E| compliant     Inventory                 inventory                                    The inventory has been successfully sent
rudder     info: Deleted file '/opt/rudder/etc/force_inventory'
   info          Inventory                 inventory                                    An inventory was already sent less than 8 hours ago
rudder     info: Can't stat file '/var/rudder/cfengine-community/inputs/distributePolicy/1.0/nodeslist.json' on 'localhost' in files.copy_from promise
E| compliant     DistributePolicy          Configure ncf                                Configure ncf was correct
   warning       DistributePolicy          Propagate nodeslist                          Cannot copy local nodes list
rudder     info: Transforming '/var/rudder/tools/send-clean.sh http://localhost:8080/endpoint/upload/ /var/rudder/inventories/incoming/server-root.ocs.gz /var/rudder/inventories/received/ /var/rudder/inventories/failed/' 
rudder     info:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
rudder     info:                                  Dload  Upload   Total   Spent    Left  Speed
100  182k  100    62  100  182k     55   163k  0:00:01  0:00:01 --:--:--  163k
rudder     info: Transformer '/var/rudder/inventories/incoming/server-root.ocs.gz' => '/var/rudder/tools/send-clean.sh http://localhost:8080/endpoint/upload/ /var/rudder/inventories/incoming/server-root.ocs.gz /var/rudder/inventories/received/ /var/rudder/inventories/failed/' seemed to work ok
E| compliant     DistributePolicy          Send inventories to CMDB                     Incoming inventories were successfully added to Rudder
E| compliant     server-roles              Check logrotate configur|                    The logrotate configuration is correct
E| compliant     server-roles              Check LDAP in rudder-web|                    The Rudder Webapp configuration files are OK (checked LDAP password)
E| compliant     server-roles              Check LDAP credentials                       The OpenLDAP configuration file is OK (checked rootdn password)
rudder     info: Executing 'no timeout,uid=112' ... '/usr/bin/psql -q -c "ALTER USER rudder WITH PASSWORD '8f12d6dcfb56'"'
rudder     info: Completed execution of '/usr/bin/psql -q -c "ALTER USER rudder WITH PASSWORD '8f12d6dcfb56'"'
E| compliant     server-roles              Check SQL in rudder-weba|                    The Rudder Webapp configuration files are OK (checked SQL password)
E| repaired      server-roles              Check SQL credentials                        The Rudder PostgreSQL user account's password has been changed
E| compliant     server-roles              Check rudder-passwords.c|                    The Rudder passwords file is present and secure
E| compliant     server-roles              Check allowed networks c|                    The Rudder allowed networks configuration is OK
E| compliant     server-roles              Check WebDAV credentials                     The Rudder WebDAV user and password are OK
R: [INFO] Executing is-active-process on apache2 using the systemctl method
E| compliant     server-roles              Check apache process                         Check apache process running was correct
R: [INFO] Executing is-enabled on apache2 using the systemctl method
E| compliant     server-roles              Check apache boot script                     Check apache boot starting parameters was correct
R: [INFO] Executing is-active-process on .*java.*/opt/rudder/jetty7/start.jar using the systemctl method
E| compliant     server-roles              Check jetty process                          Check jetty process running was correct
E| compliant     server-roles              Check configuration-repo|                    The /var/rudder/configuration-repository directory is present
E| compliant     server-roles              Check configuration-repo|                    The /var/rudder/configuration-repository GIT lock file is not present or not older than 5 minutes
rudder     info: Executing 'no timeout' ... '/usr/bin/curl --proxy '' -s http://localhost:8080/rudder/api/techniqueLibrary/reload |/bin/grep -q OK'
   error: Finished command related to promiser '/usr/bin/curl --proxy '' -s http://localhost:8080/rudder/api/techniqueLibrary/reload |/bin/grep -q OK' -- an error occurred, returned 1
rudder     info: Completed execution of '/usr/bin/curl --proxy '' -s http://localhost:8080/rudder/api/techniqueLibrary/reload |/bin/grep -q OK'
   info          server-roles              Check Technique library |                    The /opt/rudder/etc/force_technique_reload file is present. Reloading Technique library...
   warning       server-roles              Check Technique library |                    The Technique library failed to reload. Will try again next time
   error: Method 'root_technique_reload' failed in some repairs
rudder     info: Executing 'no timeout' ... '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/rudder/api/status |/bin/grep -q OK'
   error: Finished command related to promiser '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/rudder/api/status |/bin/grep -q OK' -- an error occurred, returned 1
rudder     info: Completed execution of '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/rudder/api/status |/bin/grep -q OK'
rudder     info: Executing 'no timeout' ... '/bin/systemctl --no-ask-password restart rudder-jetty.service'
rudder     info: Completed execution of '/bin/systemctl --no-ask-password restart rudder-jetty.service'
R: [INFO] Executing restart on rudder-jetty using the systemctl method
R: [INFO] Promise repaired, made a change: Run action restart on service rudder-jetty
R: [INFO] Promise repaired, made a change: Restart service rudder_jetty
E| error         server-roles              Check rudder status                          The http://localhost:8080/rudder/api/status web application failed to respond for the second time. Restarting jetty NOW !
   error: Method 'generic_alive_check' failed in some repairs
rudder     info: Executing 'no timeout' ... '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/endpoint/api/status |/bin/grep -q OK'
rudder     info: Automatically promoting context scope for 'site_ok' to namespace visibility, due to persistence
rudder     info: Completed execution of '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/endpoint/api/status |/bin/grep -q OK'
E| compliant     server-roles              Check endpoint status                        The http://localhost:8080/endpoint/api/status web application is running
R: [INFO] Executing is-active-process on /opt/rudder/libexec/slapd using the systemctl method
E| compliant     server-roles              Check slapd process                          Check slapd process running was correct
E| compliant     server-roles              Check PostgreSQL configu|                    There is no need of specific postgresql configuration on this system
R: [INFO] Executing is-active-process on postgres:.* writer process using the systemctl method
E| compliant     server-roles              Check postgresql process                     Check postgresql process running was correct
R: [INFO] Executing is-enabled on postgresql using the systemctl method

rudder-jetty logs

        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.eclipse.jetty.start.Main.invokeMain(Main.java:473)
        at org.eclipse.jetty.start.Main.start(Main.java:615)
        at org.eclipse.jetty.start.Main.main(Main.java:96)
2017-03-22 17:55:50.709:INFO:oejs.AbstractConnector:Started SelectChannelConnector@127.0.0.1:8080
[2017-03-22 17:59:06] INFO  com.normation.inventory.provisioning.endpoint.FusionReportEndpoint - New input inventory: 'server-root.ocs'
[2017-03-22 17:59:07] INFO  com.normation.inventory.provisioning.endpoint.FusionReportEndpoint - Inventory 'server-root.ocs' parsed in 698 milliseconds ms, now checking signature
[2017-03-22 17:59:07] INFO  com.normation.inventory.provisioning.endpoint.FusionReportEndpoint - Inventory 'server-root.ocs' signature checked in 274 milliseconds ms, now saving
2017-03-22 17:59:09.446:INFO:oejs.Server:Graceful shutdown SelectChannelConnector@127.0.0.1:8080

(Note that the webapp is shut down because it was not working correctly.)

Targeting 4.1, but this may happen in every version.

Actions #1

Updated by François ARMAND almost 8 years ago

  • User visibility changed from First impressions of Rudder to Getting started - demo | first install | level 1 Techniques
Actions #2

Updated by Benoît PECCATTE almost 8 years ago

  • Priority set to 54
Actions #3

Updated by Nicolas CHARLES almost 8 years ago

  • Target version set to 3.1.19

Happens in 3.1

Actions #4

Updated by Vincent MEMBRÉ almost 8 years ago

  • Target version changed from 3.1.19 to 3.1.20
Actions #5

Updated by François ARMAND over 7 years ago

The problem here is that we have two steps:

- 1/ Check that the inventory is well formed and that its signature is OK. If so, we ACK it, telling the sender "it is OK, I will process it", and put it in a queue. Note that even if an error happens here, only the root node will know.
- 2/ Try to save the inventory. If an error happens here, the agent does not know that there was a problem.

Before accepting a node, we could check that LDAP is up and report an error if it is not.
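
The idea of checking the backends before acknowledging an inventory could be sketched roughly as follows. This is only an illustration in Python, not the code from the pull request: the `InventoryEndpoint` class, the `checks` mapping, and the HTTP status codes chosen here (503, 400, 401, 202) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical: a health check returns True when the backend is reachable.
HealthCheck = Callable[[], bool]


@dataclass
class Inventory:
    name: str
    well_formed: bool   # stands in for the parsing result (step 1)
    signature_ok: bool  # stands in for the signature check (step 1)


class InventoryEndpoint:
    """Illustrative endpoint that refuses inventories when a backend is down."""

    def __init__(self, checks: Dict[str, HealthCheck]) -> None:
        self.checks = checks
        self.queue: List[Inventory] = []

    def unavailable_backends(self) -> List[str]:
        """Return the names of backends whose check fails or raises."""
        down = []
        for name, check in self.checks.items():
            try:
                ok = check()
            except Exception:
                ok = False
            if not ok:
                down.append(name)
        return down

    def upload(self, inventory: Inventory) -> Tuple[int, str]:
        # Check backends first, so the sender gets an error instead of
        # an ACK and keeps its inventory file for a later retry.
        down = self.unavailable_backends()
        if down:
            return 503, "backend unavailable: " + ", ".join(down)
        if not inventory.well_formed:
            return 400, "inventory is not well formed"
        if not inventory.signature_ok:
            return 401, "inventory signature is invalid"
        # Only now ACK and queue the inventory for saving (step 2).
        self.queue.append(inventory)
        return 202, "accepted"


if __name__ == "__main__":
    endpoint = InventoryEndpoint({"ldap": lambda: False, "postgresql": lambda: True})
    print(endpoint.upload(Inventory("server-root.ocs", True, True)))
```

With a check like this, a sender that treats HTTP errors as failures (as `curl -f` does in the agent log above) would not consider the inventory sent while LDAP or PostgreSQL is unreachable.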

Actions #6

Updated by François ARMAND over 7 years ago

  • Status changed from New to In progress
  • Assignee set to François ARMAND
Actions #7

Updated by François ARMAND over 7 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from François ARMAND to Nicolas CHARLES
  • Pull Request set to https://github.com/Normation/ldap-inventory/pull/104
Actions #8

Updated by François ARMAND over 7 years ago

  • Status changed from Pending technical review to Pending release
Actions #9

Updated by Vincent MEMBRÉ over 7 years ago

  • Status changed from Pending release to Released
  • Priority changed from 54 to 53

This bug has been fixed in Rudder 3.1.20, 4.0.5 and 4.1.2 which were released today.
