User story #5617

closed

Detecting and restarting Rudder on OOM (Out Of Memory Exception)

Added by Lionel Le Folgoc over 9 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: N/A
Assignee: -
Category: System integration

Description

Hi,

I have a lot of nodes (5000 "test" nodes). The policy generation that followed their acceptance has been running since yesterday 21:28 (12 hours now). This is not normal; it should not have taken longer than 4 hours.

In the webapp log, I can see the following exception:

[2014-10-07 21:26:35] INFO  com.normation.rudder.services.policies.DeploymentServiceImpl - Start policy generation, checking updated rules
[2014-10-07 21:28:28] WARN  application - [Store Agent Run Times] Task frequency is set too low! Last task took 74577 ms but tasks are scheduled every 5000 ms. Adjust rudder.batch.storeAgentRunTimes.updateInterval if this problem persists.
[2014-10-07 21:28:29] ERROR net.liftweb.actor.ActorLogger - Actor threw an exception
java.lang.OutOfMemoryError: Java heap space
        at com.unboundid.util.StaticUtils.toLowerCase(StaticUtils.java:440) ~[unboundid-ldapsdk-2.3.4.jar:2.3.4]
Exception in thread "pool-3-thread-8" java.lang.OutOfMemoryError: Java heap space
        at com.unboundid.util.StaticUtils.toLowerCase(StaticUtils.java:440)
        at com.unboundid.ldap.sdk.Entry.<init>(Entry.java:309)
        at com.unboundid.ldap.sdk.Entry.<init>(Entry.java:284)
        at com.normation.ldap.sdk.LDAPEntry$.apply(LDAPEntry.scala:291)
        at com.normation.ldap.sdk.LDAPEntry$.apply(LDAPEntry.scala:293)
        at com.normation.ldap.sdk.RoLDAPConnection$$anonfun$search$1.apply(LDAPConnection.scala:303)
        at com.normation.ldap.sdk.RoLDAPConnection$$anonfun$search$1.apply(LDAPConnection.scala:303)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.AbstractTraversable.map(Traversable.scala:105)
        at com.normation.ldap.sdk.RoLDAPConnection.search(LDAPConnection.scala:303)
        at com.normation.ldap.sdk.ReadOnlyEntryLDAPConnection$class.search(LDAPConnection.scala:82)
        at com.normation.ldap.sdk.RoLDAPConnection.search(LDAPConnection.scala:283)
        at com.normation.ldap.sdk.ReadOnlyEntryLDAPConnection$class.searchOne(LDAPConnection.scala:142)
        at com.normation.ldap.sdk.RoLDAPConnection.searchOne(LDAPConnection.scala:283)
        at com.normation.rudder.services.nodes.NodeInfoServiceImpl$$anonfun$getAll$1.apply(NodeInfoService.scala:193)
        at com.normation.rudder.services.nodes.NodeInfoServiceImpl$$anonfun$getAll$1.apply(NodeInfoService.scala:189)
        at com.normation.ldap.sdk.LDAPConnectionProvider$$anonfun$map$1.apply(LDAPConnectionProvider.scala:94)
        at com.normation.ldap.sdk.LDAPConnectionProvider$$anonfun$map$1.apply(LDAPConnectionProvider.scala:93)
        at com.normation.ldap.sdk.LDAPConnectionProvider$class.withCon(LDAPConnectionProvider.scala:154)
        at com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider.withCon(LDAPConnectionProvider.scala:369)
        at com.normation.ldap.sdk.LDAPConnectionProvider$class.map(LDAPConnectionProvider.scala:93)
        at com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider.map(LDAPConnectionProvider.scala:369)
        at com.normation.rudder.services.nodes.NodeInfoServiceImpl.getAll(NodeInfoService.scala:189)
        at com.normation.rudder.services.policies.DeploymentService_findDependantRules_bruteForce$class.getAllNodeInfos(DeploymentService.scala:322)
        at com.normation.rudder.services.policies.DeploymentServiceImpl.getAllNodeInfos(DeploymentService.scala:276)
        at com.normation.rudder.services.policies.DeploymentService$$anonfun$2.apply(DeploymentService.scala:90)
[2014-10-07 21:28:30] INFO  com.normation.rudder.batch.AsyncDeploymentAgent - One automatic policy update process is already pending, ignoring new policy update request

It looks like the generation actually failed two minutes after it started, but Rudder is stuck thinking it is still ongoing:

Updating policies (started at 2014-10-07 21:26). Another update is pending since 2014-10-07 21:26
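
To illustrate the stuck state described above, here is a minimal sketch of a guard that clears its "in progress" flag even when the task dies with an `OutOfMemoryError`. It is not Rudder's actual code: the class `GenerationGuard` and its methods are hypothetical names chosen for this example. The point it shows is that `OutOfMemoryError` is a `java.lang.Error`, so a handler that only catches `Exception` never sees it and any cleanup placed after the task is skipped.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard around a long-running task; not taken from Rudder. */
public class GenerationGuard {
    // True while a task is running; this is the flag that must not stay stuck.
    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Runs the task unless one is already running, and reports the outcome. */
    public String run(Runnable task) {
        if (!running.compareAndSet(false, true)) {
            return "already running";
        }
        try {
            task.run();
            return "success";
        } catch (Throwable t) {
            // Throwable also covers Error subclasses such as OutOfMemoryError,
            // which a catch (Exception e) block would let through.
            return "failed: " + t;
        } finally {
            // Always reset the flag, whatever happened in the task.
            running.set(false);
        }
    }

    public boolean isRunning() {
        return running.get();
    }
}
```

With such a guard, a task that throws `new OutOfMemoryError("Java heap space")` leaves `isRunning()` at `false`, so a later request is not ignored as "already pending". After a real heap exhaustion the JVM may still be in a bad state, so restarting the process can remain the safer choice; the HotSpot option `-XX:OnOutOfMemoryError` can run a command when the error is first thrown.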

Thanks.


Related issues: 5 (0 open, 5 closed)

Related to Rudder - Bug #2843: Rudder can fail to generate promises when Java is lacking memory (Rejected)
Related to Rudder - Bug #7524: Java OOM during Java's log migration (Released, Nicolas CHARLES, 2015-12-01)
Related to Rudder - Bug #7735: OOM in Rudder when there are too many repaired reports (Released, François ARMAND)
Related to Rudder - Architecture #8923: Requires Java8 (jdk8) for Rudder 4.0 (Rejected, Jonathan CLARKE, 2016-09-07)
Related to Rudder - Bug #8165: rudder-init fails to report memory errors from jetty start (Released, Benoît PECCATTE)