Bug #11682
closed
Rudder agent 4.1.8 fails to run with promises generated by 4.1.3 server
Description
From 4.1.3 through 4.1.7, all agents applied promises successfully.
Today I got stuck on:
    # rudder agent run -i
    Rudder agent 4.1.8-trusty0 (CFEngine Core 3.10.2)
    Node uuid: 791c561f-baa5-4030-a325-20c5d65cde10
    /var/rudder/ncf/common/30_generic_methods/file_ensure_key_value_option.cf:60:0: error: Undefined bundle ncf_maintain_keys_values_option with type edit_line
    error: Policy failed validation with command '"/var/rudder/cfengine-community/bin/cf-promises" -c "/var/rudder/cfengine-community/inputs/promises.cf"'
    error: Failsafe condition triggered. Interactive session detected, skipping failsafe.cf execution.
    error: Error reading CFEngine policy. Exiting...
So I suppose something is broken in the initial promises of the new build.
The current short-term workaround is to revert the agent to version 4.1.7.
However, this is not straightforward with the Rudder repo (at least for Ubuntu 14.04), because the index lists only the latest version. (I reverted by means of my local repo: I placed the agent package there and disabled the Rudder repo.)
Updated by Benoît PECCATTE about 7 years ago
- Severity set to Major - prevents use of part of Rudder | no simple workaround
- User visibility set to Getting started - demo | first install | level 1 Techniques
- Effort required set to Small
- Priority changed from 0 to 85
It seems that ncf has not been upgraded along with your Rudder package.
Could you try upgrading it?
On our side, we should make the package dependency on ncf an exact version.
Updated by Dmitry Svyatogorov about 7 years ago
As far as I understand, NCF is server-side. On Ubuntu clients, Rudder installations look like this:
    # dpkg -l |grep ncf
    # dpkg -l |grep rudder
    ii rsyslog 7.4.4-1ubuntu2.6rudder1 amd64 reliable system and kernel logging daemon
    ii rudder-agent 4.1.5-trusty0 amd64 Configuration management and audit tool - agent
The problematic promises were generated on a 4.1.3 server, which was installed on a fresh distro and had not yet been upgraded. However, the directives were imported from the old 3.2.10 server through the API, and NCF was rsynced because it is not yet covered by the API.
So, do I have to do something with NCF on the server side? Maybe I missed some nuance during the server "resettlement"?
Updated by Benoît PECCATTE about 7 years ago
The question was about the ncf version on the server side.
You said that you rsynced ncf from 3.2. Which folder did you synchronize?
Rudder cannot work on 4.1 with ncf from 3.2.
You should make sure that you have the latest version of ncf on your server.
Updated by Dmitry Svyatogorov almost 7 years ago
Finally, I found the cause. It's not NCF.
The stand-alone host was executing a prepackaged bundle built for an older agent.
We currently use this trick to execute a run-once recipe on hosts that are not managed by Rudder.
The mistake was to install the current agent instead of the version for which the bundle was prepared. In this case the error looks like "agent is crashing while working with older server", but in fact it is not.
- Maybe it would make sense to put a marker inside the pulled promises so that a more informative "wtf version mismatch" message can be produced?
- Maybe it would be helpful to document this trick? It is useful when migrating a large infrastructure to Rudder.
Sorry for the false alarm.
Updated by François ARMAND almost 7 years ago
- Status changed from New to Rejected
No problem, we are glad that you found the root cause!
We are thinking of adding a version sanity check to prevent that kind of problem (or, more precisely, to give meaningful information to the user in that case). We are not there yet.
I'm closing this one :)
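For illustration only, the version sanity check discussed in this thread could be sketched as below. This is not how Rudder implements anything: the `# generated-for-agent:` marker line, the function names, and the "versions must match exactly" rule are all assumptions made up for this sketch. It only shows how a marker inside the promises could turn an obscure "Undefined bundle" error into a clear version-mismatch message.

```python
import re
from typing import Optional, Tuple

# HYPOTHETICAL marker format: a comment line such as
#   # generated-for-agent: 4.1.3
# Rudder promises are not known to contain such a line; it is assumed here.
MARKER_RE = re.compile(
    r"^#\s*generated-for-agent:\s*(\d+)\.(\d+)\.(\d+)\s*$", re.MULTILINE
)

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Parse '4.1.8-trusty0' or '4.1.8' into (4, 1, 8)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def promises_version(policy_text: str) -> Optional[Version]:
    """Return the version found in the hypothetical marker, or None."""
    match = MARKER_RE.search(policy_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def check_versions(agent_version: str, policy_text: str) -> Optional[str]:
    """Return a human-readable warning, or None when the versions agree.

    Assumed rule for this sketch: the agent version must equal the
    version the promises were prepared for.
    """
    agent = parse_version(agent_version)
    prepared_for = promises_version(policy_text)
    if prepared_for is None:
        return "no version marker found in promises; cannot check compatibility"
    if agent != prepared_for:
        return "version mismatch: agent is {}, but promises were prepared for {}".format(
            ".".join(map(str, agent)), ".".join(map(str, prepared_for))
        )
    return None


if __name__ == "__main__":
    sample = "# generated-for-agent: 4.1.3\nbundle agent main {}\n"
    print(check_versions("4.1.8-trusty0", sample))
```

A check of this kind would run before policy validation, so the user sees the mismatch message rather than the downstream parsing error.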