Bug #6448
Closed
Error 404 after rebooting system
Added by Jérémy HOCDÉ over 9 years ago. Updated about 8 years ago.
Description
After rebooting the system, Rudder does not work properly:
HTTP ERROR: 404 Problem accessing /rudder. Reason: Not Found Powered by Jetty:/
Fix: service rudder-jetty restart
Tested on Ubuntu 14.04 server amd64 with rudder-server-root 3.0.3
Files
2015_03_31.stderrout.log (80.2 KB) | Jérémy HOCDÉ, 2015-03-31 16:51
Updated by Vincent MEMBRÉ over 9 years ago
- Category set to System integration
- Assignee set to Matthieu CERDA
- Priority changed from N/A to 2
- Target version set to 3.0.4
- How to reproduce updated (diff)
This is strange. I'm assigning this to Matthieu, since I think he will be the best person to look at it.
Updated by Matthieu CERDA over 9 years ago
- Description updated (diff)
- Status changed from New to Discussion
- Assignee changed from Matthieu CERDA to Jérémy HOCDÉ
Jeremy, dennis:
Can you please send me the log in /var/log/rudder/webapp/ corresponding to the day when it happened?
I am especially looking for a stack trace near the end of the file: if Jetty outputs a 404, it means Rudder did not initialize properly.
Updated by Florian Heigl over 9 years ago
At least on SLES this was caused by a race:
If slapd isn't done starting when Jetty tries to connect, the webapp will never initialize successfully.
Let's see if this is the same thing.
Updated by Jérémy HOCDÉ over 9 years ago
- File 2015_03_31.stderrout.log added
Thanks, Matthieu, for helping :o)
Updated by Matthieu CERDA over 9 years ago
- Related to Bug #6263: Startup links for rudder-server-root on Ubuntu are not correct - before 3.1 added
Updated by Matthieu CERDA over 9 years ago
- Assignee changed from Jérémy HOCDÉ to François ARMAND
- Priority changed from 2 to 3
- SysV init script dependency handling is flaky at best. It doesn't always work, even with correct dependency definitions: j(etty) comes before s(lapd), so Jetty starts first, hence the problem.
- Newer init systems make the problem worse: they handle dependencies natively, but by default they try to run init scripts concurrently, so Jetty will almost certainly start before slapd again unless you create a native job/unit/whatever specifying "slapd must be started first".
Also, on split installations, slapd will not be on the same machine, so we cannot simply add a hard dependency on rudder-slapd.
The solution would be either to:
- Make Jetty able to start without a ready database (since the database gets started right after, the error page would not be there for long :) )
- Have an intelligent init script that tests database connections before starting (but it would be difficult to do properly, it would further bloat an already bloated script, and it would make it harder to create native init jobs/units...)
The best approach for me would be the first one, but I remember you told me it would be hard to do. Can you explain here how and why it would be difficult to have a webapp able to start without a database connection? (Even if it were to display "unable to connect, please try again later" or something like that in the meantime.) Actually, anything other than a 404 plus the webapp being unloaded by Jetty would be cool.
Also, an important note: this 404 will not last more than 5 to 10 minutes, as the Rudder agent will notice that the webapp is not running properly and restart it automatically after 2 consecutive failed verifications.
This bug report is thus not critical, so I am setting the priority to 3.
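For illustration only, the second option above (testing the directory connection before starting Jetty) could look roughly like the following Python sketch. It is not taken from Rudder's init scripts: the function name `wait_for_port`, its parameters, and the default timings are assumptions made for this example.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: an init script could call it before launching Jetty.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An init script might call `wait_for_port("localhost", 389)` before starting Jetty (389 being the standard LDAP port, assumed here), and the host could just as well be a remote machine on a split installation. Note that an open port does not guarantee that slapd is ready to answer queries, so this narrows the race without fully closing it.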
Updated by François ARMAND over 9 years ago
Matthieu,
The problem is that we do a lot of consistency checks at boot time (on stored data, migrations, etc.). The way it is done, it's quite hard to delay them until "when connection is available", or at least it needs a non-trivial architecture change. (And of course, because of that architecture choice, a lot of code assumes that the data are coherent, up to date, etc.)
It could be easier to have a check on connection availability and display something like "you must restart the app when the connection is ready" (but looking at the code, I'm not even sure of that).
Updated by Nicolas CHARLES over 9 years ago
On this point, I think we could try adding a check at startup, in the code base of the web interface, that tests whether the database and LDAP are reachable. If not, wait for 10 seconds and try again. If it fails again, abort or display an error page.
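A minimal sketch of that startup check, written in Python purely for illustration (it is not the Rudder webapp's code). The names `wait_until_ready` and `_passes` are hypothetical, and each check is assumed to be a zero-argument callable that returns True when its service is reachable.

```python
import time


def wait_until_ready(checks, attempts=2, delay=10.0, sleep=time.sleep):
    """Run every check; if any fails, wait `delay` seconds and retry.

    `checks` maps a service name to a zero-argument callable.
    Returns the names still failing after the last round (empty list = ready).
    """
    failing = list(checks)
    for attempt in range(attempts):
        # Only re-run the checks that have not passed yet.
        failing = [name for name in failing if not _passes(checks[name])]
        if not failing:
            return []
        if attempt < attempts - 1:
            sleep(delay)
    return failing


def _passes(check):
    # Treat any exception raised by a check as "not reachable".
    try:
        return bool(check())
    except Exception:
        return False
```

With the defaults this tries once, waits 10 seconds, and tries once more. A non-empty result would be the point at which the application aborts or shows an error page instead of letting Jetty unload the webapp.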
Updated by François ARMAND over 9 years ago
That could be a good idea, and it would cover a big part of the use cases, at least the simple race conditions on service init.
It requires that we don't test the connection on class instantiation, but that should be a detail. Thanks for the idea!
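To illustrate what "not testing the connection on class instantiation" could mean, here is a small Python sketch of a lazily opened connection. It is only an assumption about the general pattern, not Rudder code. `LazyConnection` and the `connect` factory are hypothetical names.

```python
class LazyConnection:
    """Defer opening a connection until first use rather than at construction."""

    def __init__(self, connect):
        # `connect` is a hypothetical zero-argument factory; nothing is opened here,
        # so building the object cannot fail because the server is still starting.
        self._connect = connect
        self._conn = None

    def get(self):
        # Open on the first call, then reuse the same connection object.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn
```

If `connect` raises, nothing is cached, so a later call to `get()` simply tries again, which is what lets a startup check decide when to retry.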
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 3.0.4 to 3.0.5
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 3.0.5 to 3.0.6
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 3.0.6 to 3.0.7
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 3.0.7 to 3.0.8
Updated by Vincent MEMBRÉ over 9 years ago
- Target version changed from 3.0.8 to 3.0.9
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 3.0.9 to 3.0.10
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 3.0.10 to 3.0.11
Updated by Vincent MEMBRÉ about 9 years ago
- Target version changed from 3.0.11 to 3.0.12
Updated by Vincent MEMBRÉ almost 9 years ago
- Target version changed from 3.0.12 to 3.0.13
Updated by Vincent MEMBRÉ almost 9 years ago
- Target version changed from 3.0.13 to 3.0.14
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 3.0.14 to 3.0.15
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 3.0.15 to 3.0.16
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 3.0.16 to 3.0.17
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 3.0.17 to 302
Updated by Alexis Mousset over 8 years ago
- Target version changed from 302 to 3.1.12
Updated by Vincent MEMBRÉ over 8 years ago
- Target version changed from 3.1.12 to 3.1.13
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 3.1.13 to 3.1.14
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 3.1.14 to 3.1.15
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 3.1.15 to 3.1.16
Updated by Vincent MEMBRÉ about 8 years ago
- Target version changed from 3.1.16 to 3.1.17
Updated by François ARMAND about 8 years ago
- Status changed from Discussion to Rejected
I'm closing this one because the problem was solved.
The proposed enhancement is a major, risky, almost-whole-application refactoring that needs a dedicated user story and a couple of months of work ;)