Bug #10785
closedWith initial promises, cf-execd is not restarted
Description
I have a test case on debian 8, on https://app.packet.net where after the installation, before acceptation, cf-execd was not started.
I could see in /var/log/syslog agent runs every 5 minutes like that (node not accepted yet):
May 24 14:55:01 agent1 CRON[7092]: (root) CMD (/opt/rudder/bin/rudder agent check >/dev/null) May 24 14:55:01 agent1 CRON[7093]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1) May 24 14:55:01 agent1 rudder[7160]: CFEngine(agent) rudder Failed to establish TLS connection: underlying network error (Connection reset by peer) May 24 14:55:01 agent1 rudder[7160]: CFEngine(agent) rudder No suitable server found May 24 14:55:01 agent1 rudder[7160]: CFEngine(agent) rudder Failed to establish TLS connection: underlying network error (Connection reset by peer) May 24 14:55:01 agent1 rudder[7160]: CFEngine(agent) rudder No suitable server found May 24 14:55:01 agent1 rudder[7160]: CFEngine(agent) rudder R: *********************************************************************************#012* rudder-agent could not get an updated configuration from the policy server. *#012* This can be caused by: *#012* * an incorrect DNS resolution of this node *#012* * an agent key that has been changed *#012* * if this node is not accepted or deleted node on the Rudder root server *#012* * if this node has changed policy server without sending a new inventory *#012* Any existing configuration policy will continue to be applied without change. *#012********************************************************************************* May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@log_info@@hasPolicyServer-root@@common-root@@00@@common@@StartRun@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Start execution with config [0] May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@result_error@@hasPolicyServer-root@@common-root@@00@@Update@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Cannot update tools (update flag file) May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@result_error@@hasPolicyServer-root@@common-root@@00@@Update@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Cannot update configuration policy May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: *********************************************************************************#012 * rudder-agent could not get an updated configuration from the policy server. *#012 * This can be caused by a network issue, an unavailable server, or if this *#012 * node was deleted from the Rudder root server. *#012 * Any existing configuration policy will continue to be applied without change. *#012 ********************************************************************************* May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder Unable to open template file '/var/rudder/cfengine-community/inputs/rudder_expected_reports.csv' to make '/var/rudder/cfengine-community/state/rudder_expected_reports.7171.csv.tmp' May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@result_success@@hasPolicyServer-root@@common-root@@00@@ncf Initialization@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Configuration library initialization was correct May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@result_success@@hasPolicyServer-root@@common-root@@00@@Security parameters@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#The internal environment security is acceptable May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@result_success@@hasPolicyServer-root@@common-root@@00@@Red Button@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Red Button is not in effect, continuing as normal... May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@result_na@@hasPolicyServer-root@@common-root@@00@@Process checking@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Rudder agent proccesses check is done by the rudder-agent cron job May 24 14:55:03 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@result_success@@hasPolicyServer-root@@common-root@@00@@CRON Daemon@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Cron daemon status was correct May 24 14:55:05 agent1 rudder[7171]: CFEngine(agent) rudder Q: "...-inventory --lo": [info] FusionInventory instance: builtin#012Q: "...-inventory --lo": [info] running task Inventory#012Q: "...-inventory --lo": Use of uninitialized value $FQDN in string eq at /opt/rudder/share/fusioninventory/lib/FusionInventory/Agent/Task/Inventory/Generic.pm line 21.#012Q: "...-inventory --lo": [info] Inventory saved in /var/rudder/tmp/inventory/agent1-8b099c38-5d93-4524-bb3e-bd28600c4c59.ocs#012Q: "...-inventory --lo": [error] Can't locate object method "isEnabled" via package "FusionInventory::Agent::Task::Deploy" at /opt/rudder/share/fusioninventory/lib/FusionInventory/Agent.pm line 473. May 24 14:55:05 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Inventory@@log_debug@@inventory-all@@inventory-all@@00@@inventory@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Running inventory May 24 14:55:05 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Inventory@@log_info@@inventory-all@@inventory-all@@00@@inventory@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Curl installed May 24 14:55:05 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Inventory@@log_debug@@inventory-all@@inventory-all@@00@@inventory@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#User list generation tool is not present yet. Skipping... May 24 14:55:06 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Inventory@@log_debug@@inventory-all@@inventory-all@@00@@inventory@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#CPUID generation tool not available for this platform, or not present yet. Skipping... May 24 14:55:07 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Inventory@@result_error@@inventory-all@@inventory-all@@00@@inventory@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Generated inventory has been detected as invalid May 24 14:55:07 agent1 rudder[7171]: CFEngine(agent) rudder R: ********************************************************************************#012* WARNING Generated inventory is not valid, as it is missing mandatory fields. *#012* Not sending it to the Rudder Server *#012* You can get a full error message by running: /opt/rudder/bin/perl -I /opt/rudder/lib/perl5 /var/rudder/cfengine-community/inputs/inventory/1.0/test-inventory.pl /var/rudder/tmp/inventory/agent1-8b099c38-5d93-4524-bb3e-bd28600c4c59.ocs #012******************************************************************************** May 24 14:55:07 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Inventory@@result_success@@inventory-all@@inventory-all@@00@@inventory@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#Next inventory scheduled between 00:00 and 06:00 May 24 14:55:07 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@result_success@@hasPolicyServer-root@@common-root@@00@@Binaries update@@None@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#The agent binaries in /var/rudder/cfengine-community/bin are up to date May 24 14:55:07 agent1 rudder[7171]: CFEngine(agent) rudder R: @@Common@@log_info@@hasPolicyServer-root@@common-root@@00@@common@@EndRun@@2017-05-24 14:55:02+00:00##8b099c38-5d93-4524-bb3e-bd28600c4c59@#End execution with config [0]
But /var/rudder/cfengine-community/outputs/ remained empty.
As soon as the initial promise were replaced by the updated promises for the node, cf-execd was restarted.
After the first run after acceptation, log started comming to /var/rudder/cfengine-community/outputs/
Hypothesis is that as cf-execd was not started, every 5 minutes rudder check was running cf-agent (so no reports in outputs) and that something in the initial promises was preventing cf-execd to be restarted (something to with the file about run frequency not present?)