Actions
Bug #15324
closedWebapp fails to boot (several time)
Status:
Released
Priority:
N/A
Assignee:
Category:
System integration
Target version:
Pull Request:
Severity:
Critical - prevents main use of Rudder | no workaround | data loss | security
UX impact:
User visibility:
Getting started - demo | first install | Technique editor and level 1 Techniques
Effort required:
Priority:
90
Name check:
To do
Fix check:
To do
Regression:
Description
When we install a new rudder server, it need to restart jetty 7 times before being successfull.
Here are the log:
Boot 1: suicide¶
First boot fails because LDAP is not ready and rudder kill itself. This seems legit.
[2019-07-26 09:49:23] ERROR scheduledJob - report logger could not fetch latest 100 non compliant reports, retry on next run: Could not get node information from database [2019-07-26 09:49:23] INFO bootchecks - LDAP and PostgreSQL connection are OK [2019-07-26 09:49:23] INFO bootchecks - Check mandatory DIT entries [2019-07-26 09:49:23] ERROR bootchecks - There is some required entries missing in the LDAP directory: ou=Pending Inventories, ou=Inventories, cn=rudder-configuration | ou=Inventories, cn=rudder-configuration | ou=Accepted Inventories, ou=Inventories, cn=rudder-configuration | ou=Inventories, cn=rudder-configuration | ou=Removed Inventories, ou=Inventories, cn=rudder-configuration | ou=Inventories, cn=rudder-configuration | ou=Rudder, cn=rudder-configuration [2019-07-26 09:49:23] ERROR application - Fatal error during boot, Rudder will stop now javax.servlet.UnavailableException: There is some required entries missing in the LDAP directory: ou=Pending Inventories, ou=Inventories, cn=rudder-configuration | ou=Inventories, cn=rudder-configuration | ou=Accepted Inventories, ou=Inventories, cn=rudder-configuration | ou=Inventories, cn=rudder-configuration | ou=Removed Inventories, ou=Inventories, cn=rudder-configuration | ou=Inventories, cn=rudder-configuration | ou=Rudder, cn=rudder-configuration at bootstrap.liftweb.checks.CheckDIT.FAIL$1(CheckDIT.scala:77) at bootstrap.liftweb.checks.CheckDIT.checks(CheckDIT.scala:106) at bootstrap.liftweb.SequentialImmediateBootStrapChecks.$anonfun$checks$1(BootstrapChecks.scala:99) at bootstrap.liftweb.SequentialImmediateBootStrapChecks.$anonfun$checks$1$adapted(BootstrapChecks.scala:91) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at bootstrap.liftweb.SequentialImmediateBootStrapChecks.checks(BootstrapChecks.scala:91) at bootstrap.liftweb.RudderConfig$.init(RudderConfig.scala:860) at bootstrap.liftweb.LiftInitContextListener.contextInitialized(LiftInitContextListener.scala:109) at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:953) at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:558) at org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:918) at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:370) at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1497) at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1459)
Boot 2: stall¶
In second boot, rudder fails to start after 1 minute and systemd kills it:
Jul 26 09:49:33 server systemd[1]: Starting Jetty Web Application Server... ..... Jul 26 09:50:34 server rudder-jetty.sh[12515]: Starting Jetty: . . . . . . . . . . . . . . . FAILED Fri Jul 26 09:50:34 UTC 2019 Jul 26 09:50:34 server systemd[1]: rudder-jetty.service: Control process exited, code=exited status=1 Jul 26 09:50:34 server systemd[1]: Failed to start Jetty Web Application Server. Jul 26 09:50:34 server systemd[1]: rudder-jetty.service: Unit entered failed state. Jul 26 09:50:34 server systemd[1]: rudder-jetty.service: Failed with result 'exit-code'.
In webapp logs, the last lines are:
[2019-07-26 09:49:51] WARN scheduledJob - There is no reports to log [2019-07-26 09:49:51] WARN com.normation.rudder.repository.jdbc.RudderPropertiesRepositoryImpl - last id not present in database, create it with value 0
Boot 3 - stall¶
Same than previsous, but elsewhere. Rudder fails to start in 1 minutes:
Jul 26 09:50:35 server systemd[1]: Starting Jetty Web Application Server... ... Jul 26 09:51:35 server rudder-jetty.sh[12961]: Starting Jetty: . . . . . . . . . . . . . . . FAILED Fri Jul 26 09:51:35 UTC 2019 Jul 26 09:51:35 server systemd[1]: Failed to start Jetty Web Application Server.
But app seems to block elsewhere (we get at least one generation, and errors on bad inventories):
[2019-07-26 09:50:57] ERROR com.normation.rudder.services.policies.WriteNodeCertificatesPemImpl - SystemError: An error occured; cause was: /var/rudder/lib/ssl/allnodecerts.pem.new [2019-07-26 09:50:58] ERROR com.normation.rudder.domain.nodes.NodeInfo - Error when trying to get the CFEngine-MD5 digest of CFEngine public key for node 'server.rudder.local' (root) <- unable to decode base64 string: invalid characters encountered in base64 data [2019-07-26 09:50:58] ERROR com.normation.rudder.domain.nodes.NodeInfo - Error when trying to get the sha-256 digest of CFEngine public key for node 'server.rudder.local' (root) <- unable to decode base64 string: invalid characters encountered in base64 data ... [2019-07-26 09:51:01] INFO policy.generation - Successful policy update '1' [started 2019-07-26 09:50:56 - ended 2019-07-26 09:51:01]
Boot 4 - stall¶
Again, app killed after 1min
Jul 26 09:51:36 server systemd[1]: Starting Jetty Web Application Server... ... Jul 26 09:52:36 server rudder-jetty.sh[14178]: Starting Jetty: . . . . . . . . . . . . . . . FAILED Fri Jul 26 09:52:36 UTC 2019 Jul 26 09:52:36 server systemd[1]: rudder-jetty.service: Control process exited, code=exited status=1 Jul 26 09:52:37 server systemd[1]: Failed to start Jetty Web Application Server. Jul 26 09:52:37 server systemd[1]: rudder-jetty.service: Unit entered failed state.
The last log are:
[2019-07-26 09:51:53] INFO bootchecks - No full archive of configuration-repository items seems to have been done, initialising the system with one [2019-07-26 09:51:54] INFO migration - No migration detected by migration script (table 'migrationeventlog' is empty or does not exist) [2019-07-26 09:51:55] INFO com.normation.rudder.rule.category.GitRuleCategoryArchiverImpl - Create a new archive: GitPath(archives/configurations-rules/2019-07-26_09-51-55.058)
Boot 5,6,7 - stalling again, again, and finally works¶
Here, we have 3 minutes before the next restart:
Jul 26 09:55:41 server systemd[1]: Starting Jetty Web Application Server... Jul 26 09:56:41 server rudder-jetty.sh[15847]: Starting Jetty: . . . . . . . . . . . . . . . FAILED Fri Jul 26 09:56:41 UTC 2019 Jul 26 09:56:41 server systemd[1]: rudder-jetty.service: Control process exited, code=exited status=1 Jul 26 09:56:42 server systemd[1]: Failed to start Jetty Web Application Server. Jul 26 09:56:42 server systemd[1]: rudder-jetty.service: Unit entered failed state. Jul 26 09:56:42 server systemd[1]: rudder-jetty.service: Failed with result 'exit-code'.
Things happen, but the last bootbecks is again about archive:
[2019-07-26 09:55:58] INFO bootchecks - No full archive of configuration-repository items seems to have been done, initialising the system with one
The next one is exactly the same.
And the last one finally succeed with the archive.
[2019-07-26 10:00:58] INFO bootchecks - No full archive of configuration-repository items seems to have been done, initialising the system with one [2019-07-26 10:00:58] DEBUG com.normation.rudder.rule.category.GitRuleCategoryArchiverImpl - Archived Rule category: /var/rudder/configuration-repository/ruleCategories/category.xml [2019-07-26 10:00:59] INFO com.normation.rudder.rule.category.GitRuleCategoryArchiverImpl - Create a new archive: GitPath(archives/configurations-rules/2019-07-26_10-00-59.360) [2019-07-26 10:00:59] DEBUG com.normation.rudder.repository.xml.GitRuleArchiverImpl - Archived rule: /var/rudder/configuration-repository/rules/32377fd7-02fd-43d0-aab7-28460a91347b.xml [2019-07-26 10:00:59] INFO com.normation.rudder.repository.xml.GitRuleArchiverImpl - Create a new archive: GitPath(archives/configurations-rules/2019-07-26_10-00-59.768) [2019-07-26 10:01:01] INFO com.normation.rudder.repository.xml.GitActiveTechniqueCategoryArchiverImpl - Create a new archive: GitPath(archives/directives/2019-07-26_10-01-01.400) ..... [2019-07-26 10:01:01] DEBUG com.normation.rudder.repository.xml.GitNodeGroupArchiverImpl - Archived node group category: /var/rudder/configuration-repository/groups/category.xml [2019-07-26 10:01:01] INFO com.normation.rudder.repository.xml.GitNodeGroupArchiverImpl - Create a new archive: GitPath(archives/groups/2019-07-26_10-01-01.660) [2019-07-26 10:01:01] DEBUG com.normation.rudder.repository.xml.GitParameterArchiverImpl - Archived parameter: /var/rudder/configuration-repository/parameters/rudder_file_edit_header.xml [2019-07-26 10:01:01] INFO com.normation.rudder.repository.xml.GitParameterArchiverImpl - Create a new archive: GitPath(archives/parameters/2019-07-26_10-01-01.794) [2019-07-26 10:01:02] INFO com.normation.rudder.repository.xml.ItemArchiveManagerImpl - Create a new archive: GitPath(archives/full/2019-07-26_10-01-02.108) [2019-07-26 10:01:02] INFO bootchecks - First full archive of configuration-repository items done ...
Actions