Bug #8379
closedinvalid reporting in ncf
Description
I'm having spurious reports generated on the nodes:
2016-05-25T13:45:05+0000 info: /default/restart_services/methods/'restart_rsyslog'/default/service_restart/methods/'wrap'/default/service_restart_if/methods/'restart'/default/service_action/methods/'action'/default/command_execution/commands/'/sbin/service rsyslog restart'[0]: Last 2 quoted lines were generated by promiser '/sbin/service rsyslog restart' 2016-05-25T13:45:05+0000 info: /default/restart_services/methods/'restart_rsyslog'/default/service_restart/methods/'wrap'/default/service_restart_if/methods/'restart'/default/service_action/methods/'action'/default/command_execution/commands/'/sbin/service rsyslog restart'[0]: Completed execution of '/sbin/service rsyslog restart' R: @@my_test_technique@@result_repaired@@32377fd7-02fd-43d0-aab7-28460a91347b@@5caf1d29-8880-4279-b744-956db1bf4b6c@@3@@Command execution@@/bin/true@@2016-05-25 13:45:03+00:00##4293df6e-c63b-4ece-84dd-50abafc092ca@#Execute the command /sbin/service rsyslog restart was repaired R: @@Common@@log_repaired@@hasPolicyServer-3be88199-01d9-4470-918c-73fbff78183d@@common-3be88199-01d9-4470-918c-73fbff78183d@@1@@Log system for reports@@None@@2016-05-25 13:45:03+00:00##4293df6e-c63b-4ece-84dd-50abafc092ca@#Logging system has been restarted
here, rsyslog restart triggers a /bin/true report.
content of rudder_expected_reports.csv.res is
my_test_technique;;directory_create__tmp_bar;;32377fd7-02fd-43d0-aab7-28460a91347b@@5caf1d29-8880-4279-b744-956db1bf4b6c@@3;;Directory create;;/tmp/bar my_test_technique;;directory_create__tmp_plop;;32377fd7-02fd-43d0-aab7-28460a91347b@@5caf1d29-8880-4279-b744-956db1bf4b6c@@3;;Directory create;;/tmp/plop my_test_technique;;command_execution__bin_true;;32377fd7-02fd-43d0-aab7-28460a91347b@@5caf1d29-8880-4279-b744-956db1bf4b6c@@3;;Command execution;;/bin/true my_test_technique;;command_execution__bin_false;;32377fd7-02fd-43d0-aab7-28460a91347b@@5caf1d29-8880-4279-b744-956db1bf4b6c@@3;;Command execution;;/bin/false my_test_technique;;directory_create__tmp_fop;;32377fd7-02fd-43d0-aab7-28460a91347b@@5caf1d29-8880-4279-b744-956db1bf4b6c@@3;;Directory create;;/tmp/fop
Happened on 3.0, most likely on 2.11 also, on centos 6.6
Files
Updated by Nicolas CHARLES over 8 years ago
in this case:
class prefix is command_execution__sbin_service_rsyslog_restart
canonified_class_prefix is command_execution__sbin_service_rsyslog_restart
and detected technique is my_test_technique
Updated by Nicolas CHARLES over 8 years ago
and it is getting field in the file R: ^my_test_technique;;service_restart_rsyslog;;.*
which should not match anything
/var/rudder/cfengine-community/bin/cf-agent -V
CFEngine Core 3.6.0
Updated by Nicolas CHARLES over 8 years ago
Upgrading agent to 3.0 changes the behaviour:
reports is bogus
R: @@${report_data[1]}@@result_repaired@@${report_data[3]}@@${report_data[4]}@@${report_data[5]}@@2016-05-25 14:11:12+00:00##4293df6e-c63b-4ece-84dd-50abafc092ca@#Execute the command /sbin/service rsyslog restart was repaired
class report_data_read, even if number_lines is equal0
Updated by Nicolas CHARLES over 8 years ago
- File debug.out.gz debug.out.gz added
attached, debug output of agent 3.0
Updated by Nicolas CHARLES over 8 years ago
- File debug-2.11.out.gz debug-2.11.out.gz added
and the debug log on 2.11 agent
Updated by Alexis Mousset over 8 years ago
- Target version set to 2.11.22
This happens since #8343. I don't understand why for now.
Updated by Jonathan CLARKE over 8 years ago
- Project changed from Rudder to 41
- Target version changed from 2.11.22 to 0.x
Updated by Jonathan CLARKE over 8 years ago
- Status changed from New to In progress
- Assignee set to Jonathan CLARKE
Updated by Jonathan CLARKE over 8 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from Jonathan CLARKE to Nicolas CHARLES
- Pull Request set to https://github.com/Normation/ncf/pull/369
Updated by Jonathan CLARKE over 8 years ago
I have found a way to fix this. Looking at this part of the code:
vars: ... (logger_rudder_final_resfile_repaired|logger_rudder_temp_resfile_kept.!logger_rudder_temp_resfile_repaired.final_resfile_exists):: ... # 4/ Once the final expected reports file has been expanded, read in our array "number_lines" int => getfields("^${current_technique_report_info.technique_name};;${c_class_prefix};;.*", "${expected_reports_file}", ";;", "report_data"); classes: "report_data_read" expression => isgreaterthan("${number_lines}","0"); ... methods: report_data_read:: "any" usebundle => _rudder_common_reports_generic("${report_data[1]}", "${c_class_prefix}", "${report_data[3]}", "${report_data[4]}", "${report_data[5]}", "${message}"), classes => classes_generic("logger_rudder_${c_class_prefix}");
And analysing the verbose output from the agent, it became clear to me that the report_data_read class is being set on the first pass through this bundle, despite the fact the number_lines has not yet been populated. I can only assume that ${number_lines} resolves to some random number that happens to be > 0.
A simple workaround is to use the same class expression on the classes promise that examines number_lines as on the vars promise that populates number_lines. I have successfully tested this workaround, and ncf reporting behaves as expected, and no bogus ncf report is generated (I previously successfully reproduced the problem on a Rudder 3.2/ncf 1.x setup).
Updated by Jonathan CLARKE over 8 years ago
- Status changed from Pending technical review to Pending release
- % Done changed from 0 to 100
Applied in changeset commit:f0d6ea22d85e7a1f2960c33bbe6042887597ee3d.
Updated by Nicolas CHARLES over 8 years ago
I really don't understand this fix as all these classes are defined at the first bundle run, and then remain for the rest of the execution (so, when we get the extra report, the classes are already all defined
(tinkering with the promises)
Ok ! it does make sense, in a weird way. If the files .res and .tmp were already correct, logger_rudder_final_resfile_repaired is not defined, so it relies on the second part of the class condition (logger_rudder_temp_resfile_kept.!logger_rudder_temp_resfile_repaired.final_resfile_exists) which contains only global classes + final_resfile_exists that is a local class, defined after report_data_read
So, the value used for ${number_line} is the value that was defined at the previous bundle execution, hence the bogus report
Good catch Jon
Updated by Nicolas CHARLES over 8 years ago
- Related to Bug #8430: logger_rudder can re-read expected_reports_temp even without change, impacting performance significantly added
Updated by Vincent MEMBRÉ over 8 years ago
- Status changed from Pending release to Released
Updated by Alexis Mousset over 2 years ago
- Target version changed from 0.x to ncf-0.x
- Priority set to 0
Updated by Alexis Mousset over 2 years ago
- Project changed from 41 to Rudder
- Category set to Generic methods