Bug #17939
Closed
After upgrade from 6.0 to 6.1, deleted technique in editor still present in techniques repos breaks generation
Description
When it was in 6.0, the server had a custom directory I created in /var/rudder/configuration-repository/shared-files called sit-rudder.
This sit-rudder directory is itself a clone from a custom git repository from our company. It is changing very often.
Some files from this directory are used in builtin directives "File download (Rudder server)" and are happily pushed towards some nodes.
When we change the files in this sit-rudder directory and wait long enough, we can see them changing smoothly on our nodes.
This was before the 6.0 to 6.1 upgrade.
For the upgrade, as I was a bit lost on what I had to do, I ran :
echo "deb http://repository.rudder.io/apt/6.1/ $(lsb_release -cs) main" > /etc/apt/sources.list.d/rudder.list
apt-get update
apt-get install rudder-server-root
rudder server upgrade-techniques -o
rudder server upgrade-techniques -i
rudder server upgrade-techniques -u
After that, things seemed to work well, until I had to change the content of my sit-rudder custom dir.
At this point, Rudder began to :
- become unable to re-generate its policies : [1]
- show strange warnings in the Techniques editor, showing things not saved though I didn't change anything : [2]
I tried to :
- touch /opt/rudder/etc/force_ncf_technique_update; systemctl restart rudder-jetty
but to no avail
- cd /var/rudder/configuration-repository; git add shared-files/sit-rudder # and commit
This seemed to work at first, but now I know that after the next change in sit-rudder, the issue will arise again.
- I also tried to add shared-files/sit-rudder in /var/rudder/configuration-repository/.gitignore, but Rudder wasn't able to save any change in the web gui.
At present, our server is stuck and no policies are generated.
In the Techniques editor, I never use the "Resource" tab in any technique.
On the server, the log file /var/log/rudder/core/rudder-webapp.log shows [3].
I'm saying it again, but: before the 6.1 upgrade, this whole workflow was working fine.
At present, I don't know what else I can check, or what further information I can provide, to help with this.
Nicolas
[1] :
Policy update process was stopped due to an error: ⇨ Policy update error for process '3853' at 2020-07-10 14:24:54 ⇨ Cannot write nodes configuration ⇨ Accumulated: Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___timezone___centos_new, 1.0, resources, shared-files),sit-rudder)'. Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects. ; Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___timezone___centos_new, 1.0, resources, shared-files),sit-rudder)'. Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects. ; Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___timezone___centos_new, 1.0, resources, shared-files),sit-rudder)'. Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects. ; Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, [...]
[2] :
Unsaved changes Some changes made on Technique 'sit - package vim - centos' were not saved. If you switch before saving, all your changes will be lost. Field Stored value New value Resources [] [{"name":"shared-files/sit-rudder","state":"untouched"}]
[3] :
Jul 10 15:12:19 sit-conf-prd01 rudder[policy.generation.manager]: [ERROR] Error when updating policy, reason was: Cannot write nodes configuration <- Accumulated: Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___timezone___centos_new, 1.0, resources, shared-files),sit-rudder)'. Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects. ; Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___timezone___centos_new, 1.0, resources, shared-files),sit-rudder)'. Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects. ; Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___timezone___centos_new, 1.0, resources, shared-files),sit-rudder)'. Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects. ; Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___remote_clock_sync_enabled, 1.0, resources, shared-files),sit-rudder)'. Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects. ; Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___remote_clock_sync_enabled, 1.0, resources, shared-files),sit-rudder)'. Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects. ; Unexpected: Error when trying to open template 'TechniqueResourceIdByPath(List(techniques, ncf_techniques, sit___remote_clock_sync_enabled, 1.0, resources, shared-files),sit-rudder)'. 
Check that the file exists with a .st extension and is correctly commited in Git, or that the metadata for the technique are corrects.
Jul 10 15:12:19 sit-conf-prd01 rudder[policy.generation.manager]: [INFO] Flag file '/opt/rudder/etc/policy-update-running' successfully removed
Jul 10 15:12:19 sit-conf-prd01 rudder[policy.generation.manager]: [ERROR] Policy update error for process '3854' at 2020-07-10 15:12:19: Cannot write nodes configuration
Updated by Nicolas Ecarnot over 4 years ago
Update: Carefully reading the error logs, I see that not all my custom techniques are involved.
Comparing the good ones (those which don't generate an issue) with the bad ones, I see that /var/rudder/configuration-repository/techniques/ncf_techniques contains directories named after old techniques I created but have since deleted.
I don't know if I can just delete them and commit?
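To see which techniques the generation errors actually name, the technique identifiers can be pulled out of the log text. This is only a minimal sketch, assuming the error format shown in [3]; the function name failing_techniques is hypothetical and not part of Rudder.

```shell
#!/bin/sh
# Hypothetical helper (not a Rudder command): print each technique name that
# appears in errors of the form
#   TechniqueResourceIdByPath(List(techniques, ncf_techniques, NAME, ...
failing_techniques() {
    # Reads log text on stdin, keeps the path element that follows
    # "ncf_techniques, ", and removes duplicates.
    grep -o 'ncf_techniques, [^,]*' | sed 's/^ncf_techniques, //' | sort -u
}

# Example: failing_techniques < /var/log/rudder/core/rudder-webapp.log
```

The resulting names can then be compared with the directories present under /var/rudder/configuration-repository/techniques/ncf_techniques.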
Updated by François ARMAND over 4 years ago
You can go to "settings > active tree", click on techniques, and delete them from there. It may happen that a technique was deleted from the editor but remains known to Rudder (it should not happen and it's a bug, but it can).
Updated by François ARMAND over 4 years ago
I'm unable to reproduce.
What I did:
- install rudder server and a node with Rudder 6.0.6 on debian 10
- clone a git repository on the server in /var/rudder/configuration-repository/sit-rudder
- create 2 user techniques with the execute command generic method (plus directives, bound to a rule)
- create a directive "Download file (from server)" (bound to a rule)
(note that shared-files is not added in /var/rudder/configuration-repository in my case - is it the same for you?)
I checked that the node downloads the file, and that if I change the sit-rudder repository and git pull in the clone, the node gets the updated content.
Then I upgraded to Rudder 6.1.0 with your procedure (putting "6.1.0" in the repository line). I checked that everything was working OK and deleted one of the two user techniques (but I didn't leave anything in /var/rudder/configuration-repository/techniques/ncf_techniques, so that's a difference).
I modified the repository, pulled in the clone, regenerated, and created a new technique => everything is fine.
Some random ideas:
- your message about "something changed in resources" is very strange. Rudder believes that you added "shared-files/sit-rudder" as a resource, but that it is not saved for the technique in the Rudder filesystem (under /var/rudder/configuration-repository/techniques/ncf_techniques/xxxx/1.0/resources). I don't even know how you could do that. Perhaps the browser local storage used by Rudder got corrupted for some reason. Can you try to clean it? In Firefox: Shift+F9 to open the storage debug panel; at the bottom there is "session storage"; click on the URL of the Rudder server; there should then be two keys, storeOriginalTechnique and storeSelectedTechnique => delete them (Delete key). (Session storage is deleted when you close Firefox or the tab with Rudder, not when you log out of Rudder.)
- do you use symlinks? I'm not sure why they would create chaos like what you are experiencing, but I feel they could.
- the error message is incorrect for Error when trying to open template 'TechniqueResourceIdByPath(...'. It should not say "template" but "resource". Can you provide us with a tar.gz of the technique directory of one of the techniques raising that error (for example: /var/rudder/configuration-repository/techniques/ncf_techniques/sit___timezone___centos_new)?
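The archive requested above could be produced along these lines. This is a sketch only: archive_technique is a hypothetical helper name, and it simply wraps standard tar options.

```shell
#!/bin/sh
# Hypothetical helper: archive one technique directory as <name>.tar.gz
# in the current directory.
#   $1 = directory that holds the technique directories
#   $2 = name of the technique directory to archive
archive_technique() {
    # -C changes into $1 first so the archive contains relative paths only.
    tar -czf "$2.tar.gz" -C "$1" "$2"
}

# Example, using the path and technique name quoted above:
# archive_technique /var/rudder/configuration-repository/techniques/ncf_techniques sit___timezone___centos_new
```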
Updated by Nicolas Ecarnot over 4 years ago
François ARMAND wrote in #note-2:
You can go to "settings > active tree", click on techniques, and delete them from there. It may happen that a technique was deleted from the editor but remains known to Rudder (it should not happen and it's a bug, but it can).
François,
This morning, I used your hint about cleaning things up with this active tree, while keeping an eye on /var/log/rudder/core/rudder-webapp.log.
It may be too soon to claim victory, but so far so good: no errors appear now and the policies could be built.
The next step for me is to change the content of shared-files/sit-rudder and see how it goes.
I'm sorry to say it, but I think you're right when you write: "my best guess for now is that something went horribly wrong during update, technique deleted from editor but not from fs/git had their metadata.xml corrupted, Ragnarök ensues.".
And the worst part: I'm quite sure I won't be able to reproduce it, and honestly, I'm not really willing to, as it's a production setup.
Updated by François ARMAND over 4 years ago
- Subject changed from After upgrade from 6.0 to 6.1, policies cannot be generated to After upgrade from 6.0 to 6.1, deleted technique in editor still present in techniques repos breaks generation
- Severity changed from Critical - prevents main use of Rudder | no workaround | data loss | security to Major - prevents use of part of Rudder | no simple workaround
- Priority changed from 76 to 52
No problem regarding reproduction, etc. We did pin down the cause, even if not the root cause of that first problem.
So we know that if a technique deleted from the editor is still present in /var/rudder/configuration-repository/techniques/ncf_techniques/, migration is broken. We need to prevent that at least.
I'm updating the title accordingly and lowering the severity a bit since we seem to have a workaround now.
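A check for that situation might look like the sketch below. It assumes a plain text file listing, one per line, the technique names still present in the editor; that file, its format, and the function name leftover_techniques are all hypothetical (the list would have to be written by hand), and none of this is an existing Rudder feature.

```shell
#!/bin/sh
# Hypothetical helper: print the directories found in $1 whose names are
# NOT listed in the file $2 (one technique name per line).
leftover_techniques() {
    for dir in "$1"/*/; do
        # Skip the unexpanded pattern when $1 contains no directory.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # -x: match the whole line, -F: fixed string, -q: no output.
        grep -qxF -- "$name" "$2" || printf '%s\n' "$name"
    done
}

# Example (known-techniques.txt is an assumed, hand-written file):
# leftover_techniques /var/rudder/configuration-repository/techniques/ncf_techniques known-techniques.txt
```

Any name printed would be a candidate leftover to review before attempting the upgrade.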
Updated by François ARMAND over 4 years ago
- Related to Bug #17977: Resource automatically added on newly created technique - since 6.1 upgrade added
Updated by Vincent MEMBRÉ over 4 years ago
- Target version changed from 6.1.2 to 6.1.3
Updated by François ARMAND over 4 years ago
- Target version changed from 6.1.3 to 6.1.4
Updated by Vincent MEMBRÉ about 4 years ago
- Target version changed from 6.1.4 to 6.1.5
- Priority changed from 52 to 51
Updated by Vincent MEMBRÉ about 4 years ago
- Target version changed from 6.1.5 to 6.1.6
- Priority changed from 51 to 50
Updated by Vincent MEMBRÉ about 4 years ago
- Target version changed from 6.1.6 to 6.1.7
Updated by Vincent MEMBRÉ almost 4 years ago
- Target version changed from 6.1.7 to 6.1.8
- Priority changed from 50 to 48
Updated by Vincent MEMBRÉ almost 4 years ago
- Target version changed from 6.1.8 to 6.1.9
- Priority changed from 48 to 47
Updated by Vincent MEMBRÉ almost 4 years ago
- Target version changed from 6.1.9 to 6.1.10
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.1.10 to 6.1.11
- Priority changed from 47 to 46
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.1.11 to 6.1.12
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.1.12 to 6.1.13
- Priority changed from 46 to 45
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.1.13 to 6.1.14
- Priority changed from 45 to 44
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.1.14 to 6.1.15
- Priority changed from 44 to 43
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.1.15 to 6.1.16
Updated by Vincent MEMBRÉ about 3 years ago
- Target version changed from 6.1.16 to 6.1.17
Updated by Vincent MEMBRÉ about 3 years ago
- Target version changed from 6.1.17 to 6.1.18
Updated by Vincent MEMBRÉ almost 3 years ago
- Target version changed from 6.1.18 to 6.1.19
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.1.19 to 6.1.20
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.1.20 to 6.1.21
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.1.21 to old 6.1 issues to relocate
Updated by François ARMAND about 1 year ago
- Status changed from New to Rejected
- Priority changed from 43 to 0
- Regression set to No
I'm closing this one since it's very, very old and (hopefully) not relevant anymore.