Bug #25738
open
Race condition when the agent runs during a server/relay upgrade
Pull Request:
Severity:
UX impact:
User visibility:
Effort required:
Priority:
0
Name check:
To do
Fix check:
To do
Regression:
No
Description
When upgrading a Rudder relay or server, if an agent is already running, it can end up trying to restart jetty or apache2 (at least) while they are being stopped by the package scripts.
Given how systemd works, this causes the stop operation to fail, which interrupts the packaging script and makes the upgrade fail.
For example, the server install output:
Job for apache2.service canceled.
**************************************************************************************
ERROR: rudder-webapp postinstall script failed !
Trying to recover the problem, you should check that your instance is properly working
You may need to execute: apt-get install -f
You should also try to manually execute: /opt/rudder/bin/rudder-upgrade
Such errors should not happen, please open an issue for this problem on https://issues.rudder.io/projects/rudder/issues/new
**************************************************************************************
As an illustration, if you have a stop action that takes a long time, then calling start while the stop job is running will, with the default job mode (replace), cancel the installed/running stop job and install a start job in the unit's job slot.
And indeed in the agent logs:
2024-10-25T10:05:25+00:00 R: @@rudder-service-apache@@result_repaired@@policy-server-root@@rudder-service-apache-root@@apache_started@@Apache service@@Started@@2024-10-25 10:05:09+00:00##root@#Ensure that service apache2 is running was repaired
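To make the job-slot behaviour described above concrete, here is a toy Python model of it. It is only a sketch for illustration, not systemd's implementation and not a proposed fix: the `Unit` class and its `enqueue` method are hypothetical names, and the only assumption taken from systemd is that a unit holds one pending job and that the job mode decides what happens on a conflict (`replace` cancels the pending job, `fail` refuses the new one, as with `systemctl --job-mode=fail`).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Unit:
    """Toy stand-in for a systemd unit with a single job slot (hypothetical)."""
    name: str
    pending: Optional[str] = None  # job currently installed in the slot
    log: List[str] = field(default_factory=list)

    def enqueue(self, job: str, mode: str = "replace") -> bool:
        """Try to install `job` ("start" or "stop"); return True if installed."""
        if self.pending is not None and self.pending != job:
            if mode == "fail":
                # The conflicting pending job is kept, the new request is refused.
                self.log.append(f"{job} refused: {self.pending} job pending")
                return False
            # Default "replace" behaviour: the pending job is canceled. This is
            # what the package script observes as
            # "Job for apache2.service canceled."
            self.log.append(f"{self.pending} job canceled")
        self.pending = job
        self.log.append(f"{job} job installed")
        return True


if __name__ == "__main__":
    apache = Unit("apache2.service")
    apache.enqueue("stop")   # package script stops the service
    apache.enqueue("start")  # agent run starts it again mid-upgrade
    print(apache.log)
    # ['stop job installed', 'stop job canceled', 'start job installed']
```

In this model, the sequence seen in this bug (stop from the postinstall script, then start from the agent) ends with the stop job canceled, which matches the "Job for apache2.service canceled." line in the install output. With `mode="fail"`, the start request would be refused and the stop job would stay in place.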
Updated by Alexis Mousset 28 days ago
- Related to Bug #24559: When upgrading from 8.0 to 8.1 on Ubuntu 22, the webapp doesn't start added
Updated by Vincent MEMBRÉ 15 days ago
- Target version changed from 8.1.8 to 8.1.9