Project

General

Profile

Actions

Bug #7524

closed

Java OOM during Java's log migration

Added by Janos Mattyasovszky almost 9 years ago. Updated over 8 years ago.

Status:
Released
Priority:
1 (highest)
Category:
Performance and scalability
Target version:
Severity:
UX impact:
User visibility:
Effort required:
Priority:
Name check:
Fix check:
Regression:

Description

Please check the memory usage of "XML FileFormat migration from 5 to 6", as 32GB of Heap Size is apparently not enough:

[2015-12-01 22:30:42] ERROR migration - Error when migrating XML FileFormat' datas from format 5 to 6 in database <- Could not correctly finish the migration for EventLog fileFormat from '5' to '6'. Check logs for errors. The process can be trigger by restarting the application. <- Java heap space
[2015-12-01 22:30:42] ERROR migration - Exception was:
java.lang.OutOfMemoryError: Java heap space

On the Console where the rudder-jetty was originally started:

JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait.
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait.
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait.
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait.
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait.
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait.
JVMDUMP032I JVM requested Heap dump using '/opt/rudder/jetty7/heapdump.20151201.222707.10834.0001.phd' in response to an event
JVMDUMP010I Heap dump written to /opt/rudder/jetty7/heapdump.20151201.222707.10834.0001.phd
JVMDUMP032I JVM requested Java dump using '/opt/rudder/jetty7/javacore.20151201.222707.10834.0002.txt' in response to an event
JVMDUMP010I Java dump written to /opt/rudder/jetty7/javacore.20151201.222707.10834.0002.txt
JVMDUMP032I JVM requested Snap dump using '/opt/rudder/jetty7/Snap.20151201.222707.10834.0008.trc' in response to an event
JVMDUMP010I Snap dump written to /opt/rudder/jetty7/Snap.20151201.222707.10834.0008.trc
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
JVMDUMP032I JVM requested Heap dump using '/opt/rudder/jetty7/heapdump.20151201.222707.10834.0003.phd' in response to an event
JVMDUMP010I Heap dump written to /opt/rudder/jetty7/heapdump.20151201.222707.10834.0003.phd
JVMDUMP032I JVM requested System dump using '/opt/rudder/jetty7/core.20151201.222707.10834.0004.dmp' in response to an event
JVMDUMP012E Error in System dump: Unable to derive system core filename from pattern in /proc/sys/kernel/core_pattern: "/var/local/dumps/core.%e.%p.%h.%t". "%t" specifier is not supported. Look for core file generated by child process with pid = 13982
JVMDUMP032I JVM requested Heap dump using '/opt/rudder/jetty7/heapdump.20151201.222707.10834.0005.phd' in response to an event
JVMDUMP010I Heap dump written to /opt/rudder/jetty7/heapdump.20151201.222707.10834.0005.phd
JVMDUMP032I JVM requested Snap dump using '/opt/rudder/jetty7/Snap.20151201.222707.10834.0011.trc' in response to an event
JVMDUMP010I Snap dump written to {nothing to snap}
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
JVMDUMP032I JVM requested Heap dump using '/opt/rudder/jetty7/heapdump.20151201.222707.10834.0006.phd' in response to an event
JVMDUMP010I Heap dump written to /opt/rudder/jetty7/heapdump.20151201.222707.10834.0006.phd
JVMDUMP032I JVM requested Snap dump using '/opt/rudder/jetty7/Snap.20151201.222707.10834.0012.trc' in response to an event
JVMDUMP010I Snap dump written to {nothing to snap}
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
JVMDUMP032I JVM requested Java dump using '/opt/rudder/jetty7/javacore.20151201.222707.10834.0007.txt' in response to an event
JVMDUMP010I Java dump written to /opt/rudder/jetty7/javacore.20151201.222707.10834.0007.txt
JVMDUMP032I JVM requested Snap dump using '/opt/rudder/jetty7/Snap.20151201.222707.10834.0013.trc' in response to an event
JVMDUMP010I Snap dump written to {nothing to snap}
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
JVMDUMP032I JVM requested Java dump using '/opt/rudder/jetty7/javacore.20151201.222707.10834.0009.txt' in response to an event
JVMDUMP010I Java dump written to /opt/rudder/jetty7/javacore.20151201.222707.10834.0009.txt
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
JVMDUMP032I JVM requested Java dump using '/opt/rudder/jetty7/javacore.20151201.222707.10834.0010.txt' in response to an event
JVMDUMP010I Java dump written to /opt/rudder/jetty7/javacore.20151201.222707.10834.0010.txt
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".

We have a core dump: core.HashSessionScav.13982.

gdb backtrace:

Core was generated by `/usr/bin/java -server -Xms32768m -Xmx32768m -XX:PermSize=128m -XX:MaxPermSize=2'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007ff31b7deca2 in j9dump_create () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9prt26.so
(gdb) bt
#0  0x00007ff31b7deca2 in j9dump_create () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9prt26.so
#1  0x00007ff31b18aaa7 in doSystemDump () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so
#2  0x00007ff31b7ece3f in j9sig_protect () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9prt26.so
#3  0x00007ff31b1895a8 in runDumpFunction () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so
#4  0x00007ff31b18dd08 in runDumpAgent () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so
#5  0x00007ff31b19f892 in triggerDumpAgents () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so
#6  0x00007ff31b19e6a5 in rasDumpHookExceptionSysthrow () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so
#7  0x00007ff31b91f13f in J9HookDispatch () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9hookable26.so
#8  0x00007ff31bb4e56c in internalSetCurrentExceptionWithCause () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9vm26.so
#9  0x00007ff31bb4e633 in setCurrentExceptionWithCause () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9vm26.so
#10 0x00007ff31bb588aa in throwHeapOutOfMemoryError () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9vm26.so
#11 0x00007feaed2cfb60 in ?? ()
#12 0x00007feaed2cfce0 in ?? ()
#13 0x00007ff31ce05ed0 in main_arena () from /lib64/libc.so.6
#14 0x00007ff31ce05e80 in next_to_use.12783 () from /lib64/libc.so.6
#15 0x0000000000001000 in ?? ()
#16 0x0000000000001040 in ?? ()
#17 0x00007ff31bbba125 in ?? () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9vm26.so
#18 0x00007ff31beff5c0 in ?? () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libjvm.so
#19 0x0000000000000001 in ?? ()
#20 0x00007ff31cb09a5e in _int_malloc () from /lib64/libc.so.6
#21 0x000000000165c4e0 in ?? ()
#22 0x00007feae9d8ea00 in ?? ()
#23 0x0000000000000000 in ?? ()

Full trace:

# tail -f /var/log/rudder/webapp/2015_12_01.stderrout.log
[2015-12-01 22:11:26] INFO  application - JVM property -Drudder.plugin.externalNodeInformation.config is not defined, use configuration file at '/opt/rudder/etc/plugins/external-node-information.properties'
[2015-12-01 22:11:26] INFO  bootstrap.rudder.plugin.NodeExternalReportConf$$EnhancerBySpringCGLIB$$7e80cd - Extension initialized
[2015-12-01 22:11:26] INFO  application - Use configuration file defined by JVM property -Drudder.authFile : /opt/rudder/etc/rudder-users.xml
[2015-12-01 22:11:26] INFO  application - Use configuration file defined by JVM property -Drudder.authFile : /opt/rudder/etc/rudder-users.xml
2015-12-01 22:11:27.054:INFO:oejsh.ContextHandler:started o.e.j.w.WebAppContext{/rudder,file:/tmp/jetty-127.0.0.1-8080-rudder.war-_rudder-any-/webapp/},/opt/rudder/share/webapps/rudder.war
[2015-12-01 22:11:27] INFO  bootchecks - Check mandatory DIT entries
[2015-12-01 22:11:27] INFO  application - All the required DIT entries are present in the LDAP directory
[2015-12-01 22:11:27] INFO  bootchecks - Check initialization of User Technique Library
[2015-12-01 22:11:27] INFO  bootchecks - Check event log migration format 5 -> 6
[2015-12-01 22:11:27] INFO  migration - Start migration of EventLog from format '5' to '6'
[2015-12-01 22:25:52] WARN  application - [Store Agent Run Times] Task frequency is set too low! Last task took 11094 ms but tasks are scheduled every 10000 ms. Adjust rudder.batch.storeAgentRunTimes.updateInterval if this problem persists.
[2015-12-01 22:26:14] WARN  application - [Store Agent Run Times] Task frequency is set too low! Last task took 10862 ms but tasks are scheduled every 10000 ms. Adjust rudder.batch.storeAgentRunTimes.updateInterval if this problem persists.
[2015-12-01 22:26:36] WARN  application - [Store Agent Run Times] Task frequency is set too low! Last task took 12189 ms but tasks are scheduled every 10000 ms. Adjust rudder.batch.storeAgentRunTimes.updateInterval if this problem persists.
Exception in thread "Connection reader for connection 1 to localhost:389" Exception in thread "Health Check Thread for LDAPConnectionPool(serverSet=SingleServerSet(server=localhost:389), maxConnections=2)" Exception in thread "Connection reader for connection 1 to localhost:389" Exception in thread "Connection reader for connection 5 to localhost:389" Exception in thread "Health Check Thread for LDAPConnectionPool(serverSet=SingleServerSet(server=localhost:389), maxConnections=2)" java.lang.OutOfMemoryErrorjava.lang.OutOfMemoryErrorjava.lang.OutOfMemoryError: Java heap spacejava.lang.OutOfMemoryError: Exception in thread "Health Check Thread for LDAPConnectionPool(serverSet=SingleServerSet(server=localhost:389), maxConnections=2)" java.lang.OutOfMemoryError
Exception in thread "qtp1715228818-35" : java.lang.OutOfMemoryError: java.lang.OutOfMemoryError: Java heap spacejava.lang.OutOfMemoryErrorjava.lang.OutOfMemoryError: java.lang.OutOfMemoryError
Exception in thread "Health Check Thread for LDAPConnectionPool(serverSet=SingleServerSet(server=localhost:389), maxConnections=2)" Java heap space: Exception in thread "qtp1715228818-34" java.lang.OutOfMemoryErrorjava.lang.OutOfMemoryError
java.lang.OutOfMemoryErrorJava heap spaceJava heap space

Java heap space
Exception in thread "Connection reader for connection 0 to localhost:389" Exception in thread "Connection reader for connection 6 to localhost:389" java.lang.OutOfMemoryErrorException in thread "pool-2-thread-16" java.lang.OutOfMemoryErrorJava heap space: Java heap spaceException in thread "qtp1715228818-36" :
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: :
Java heap space
Java heap space
: Java heap space
        at java.net.Inet4Address.getAddress(Inet4Address.java:325)
        at java.net.Inet4Address.getHostAddress(Inet4Address.java:34    at      at Java heap space      at java.util.concurrent.locks.AbstractQueuedSynchronizerjava.net.Inet4Address:  at .    at
java.lang.OutOfMemoryErrorjava.net.Inet4AddressJava heap space
        at com.unboundid.ldap.sdk.LDAPConnectionPool    at 1addWaitercom.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classException in thread "HashSessionScavenger-1"    at : java.lang.OutOfMemoryErrorException in thread "HashSessionScavenger-0" Exception in thread "Scanner-0"  at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classjava.lang.OutOfMemoryError: Java heap space.java.lang.OutOfMemoryError: Java heap space: (com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classJava heap spacejava.lang.OutOfMemoryError
java.net.Inet4Address..com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$class
invalidate: .: .Java heap spaceAbstractQueuedSynchronizer.java  at getAddress(Inet4Address.java)java.lang.OutOfMemoryError
        at .invalidate:Java heap space
        at      at :invalidate: Java heap space: (java.util.concurrent.ConcurrentHashMap$Values

doHealthCheck(com.unboundid.ldap.sdk.LDAPConnectionPoolException in thread "pool-2-thread-7" invalidate(Java heap space
java.net.SocketInputStreamjava.lang.OutOfMemoryError    at Exception in thread "pool-2-thread-15"       at java.net.Inet4Address.(.     at ReportingServiceImpl.scalaReportingServiceImpl.scala.ReportingServiceImpl.scala(.getAddress       at iteratorcom.unboundid.ldap.sdk.LDAPConnectionPoolLDAPConnectionPool.javagetAddress:. at (Java heap space3ConcurrentHashMap.java6com.unboundid.ldap.sdk.LDAPConnectionPool(
1socketRead0Inet4Address.java
        at :1.15:2getAddressjava.io.UnixFileSystem..(com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classdoHealthCheck:823.)canonicalize:2(5invalidate2doHealthCheck: 1(Java heap space(LDAPConnectionPool.java(Native Method)com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classLDAPConnectionPool.java)3
.:1525doHealthCheck5ReportingServiceImpl.scala)java.lang.OutOfMemoryError4ReportingServiceImpl.scala    at (.: 1:
java.util.concurrent.ConcurrentHashMap$Values2Inet4Address.java2(:):: Java heap space:java.util.concurrent.locks.AbstractQueuedSynchronizer.
        at 322)
1.5
4:invalidate5
(2com.normation.rudder.services.reports.CachedReportingServiceImpl14Inet4Address.java
doAcquireInterruptibly:UnixFileSystem.java:     at 32Java heap space)((ReportingServiceImpl.scala       at      at

)2:     at 1
iteratorjava.util.concurrent.locks.AbstractQueuedSynchronizer   at      at ).4org.eclipse.jetty.server.session.HashSessionManagerjava.util.concurrent.locks.AbstractQueuedSynchronizer  at java.net.Inet4Address4(LDAPConnectionPool.java)net.liftweb.actor.SpecializedLiftActor$class       at      at .java.net.SocketInputStream.java.net.Inet4Address4   at .
81net.liftweb.actor.SpecializedLiftActor$class. at .getHostAddress
        at addWaitercom.normation.rudder.services.reports.CachedReportingServiceImpl:.  at com.normation.rudder.services.reports.CachedReportingServiceImplConcurrentHashMap.java5(com.normation.rudder.services.reports.CachedReportingServiceImpljava.net.Inet4Address.25(read)5.) at 5addWaiter5AbstractQueuedSynchronizer.java.
net$liftweb$actor$SpecializedLiftActor$$processMailboxgetHostAddress...
4invalidate():4scavenge5com.unboundid.ldap.sdk.SingleServerSet)getHostAddress
        at (invalidate(LiftActor.scalajava.net.Inet4AddressSocketInputStream.javainvalidate:
:.      at 2getHostAddress21net$liftweb$actor$SpecializedLiftActor$$processMailbox:     at (8(5)Inet4Address.javaReportingServiceImpl.scalaInet4Address.java:
1((4invalidate1((:      at ()HashSessionManager.java0Inet4Address.javaReportingServiceImpl.scala.       at com.unboundid.ldap.sdk.LDAPConnectionPoolHealthCheckThread3AbstractQueuedSynchronizer.java:9:3.
:Inet4Address.javaLiftActor.scalacom.unboundid.ldap.sdk.LDAPConnectionPoolHealthCheckThreadcom.normation.rudder.services.reports.CachedReportingServiceImpl3)6  at :
72AbstractQueuedSynchronizer.java(      at com.unboundid.ldap.sdk.LDAPConnectionPoolHealthCheckThread53)com.unboundid.ldap.sdk.LDAPConnectionPoolHealthCheckThread1.getConnection1)run.8ReportingServiceImpl.scalarun446
4)java.io.File:
2:213   at 7((SingleServerSet.java:.1:(6run1.)getCanonicalPath7
2)1     at 2java.net.SocketInputStream)LDAPConnectionPoolHealthCheckThread.java(1)
:net.liftweb.actor.SpecializedLiftActor$$anonfun$2$$anonfun$apply$mcV$sp$1LDAPConnectionPoolHealthCheckThread.java.)40))

(:)ReportingServiceImpl.scala   at invalidaterun        at
net.liftweb.actor.SpecializedLiftActor$class(:LDAPConnectionPoolHealthCheckThread.java(9).8     at com.normation.rudder.services.reports.CachedReportingServiceImpl(24com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1File.java:.3        at )..1
3net$liftweb$actor$SpecializedLiftActor$$processMailbox)        at :
socketRead0
((Native Method)
):ReportingServiceImpl.scalainvalidate
9apply$mcV$sp9LDAPConnectionPoolHealthCheckThread.java44(:      at )ReportingServiceImpl.scala(
        at ReportsExecutionService.scalaLiftActor.scala:org.eclipse.jetty.server.session.HashSessionManagerjava.util.concurrent.locks.AbstractQueuedSynchronizer        at 7    at net.liftweb.actor.SpecializedLiftActor$$anonfun$2$$anonfun$apply$mcV$sp$1..):java.net.SocketInputStream.
:read   at      at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$17java.net.SocketInputStream)92.(
:4
        at java.net.SocketInputStream.doAcquireInterruptiblyapply$mcV$sp(.SocketInputStream.java:1apply$mcV$sp.2:2      at .)AbstractQueuedSynchronizer.java((78socketRead0com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1java.util.concurrent.locks.AbstractQueuedSynchronizer
LiftActor.scala).(Native Method)12
9:8acquireInterruptibly6java.net.SocketInputStreamscavenge4296(:)apply$mcV$sp1LiftActor.scala:  at java.util.concurrent.locks.AbstractQueuedSynchronizer)
1.13org.eclipse.jetty.server.session.HashSessionManager$2(27)
.doAcquireInterruptibly1
((0.AbstractQueuedSynchronizer.javarun(:HashSessionManager.java764:     at ))

ReportsExecutionService.scalaAbstractQueuedSynchronizer.java.HashSessionManager.java0   at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1:socketRead0      at 3:
        at (Native Method)3     at java.net.SocketInputStream).1:)apply$mcV$sp
apply$mcV$sp
3
2
8((     at 8)9ReportsExecutionService.scala     at 5)12ReportsExecutionService.scalaread3.socketRead0com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1
com.unboundid.ldap.sdk.ServerSet
        at :1)3java.util.concurrent.locks.AbstractQueuedSynchronizer)
java.net.SocketInputStream)5((Native Method)18
.6.:
apply$mcV$sp8java.net.SocketInputStream6(

        at 1)SocketInputStream.java4    at 8    at
        at      at org.eclipse.jetty.server.session.HashSessionManager$2        at )java.net.SocketInputStream.
)acquireInterruptibly.  at :1com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1ReportsExecutionService.scalanet.liftweb.actor.SpecializedLiftActor$$anonfun$2$$anonfun$apply$mcV$sp$1net.liftweb.actor.LAScheduler$$anonfun$9$$anon$2$$anon$3.
6.java.net.SocketInputStream.   at :.readreadrunjava.util.concurrent.locks.AbstractQueuedSynchronizer(.3net.liftweb.actor.LAScheduler$$anonfun$9$$anon$2$$anon$3.
(.(acquireInterruptibly at HashSessionManager.javaruncom.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1read(com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1.[2015-12-01 22:30:41] ERROR migration - Failure(Could not correctly finish the migration for EventLog fileFormat from '5' to '6'. Check logs for errors. The process can be trigger by restarting the application.,Empty,Full(Failure(Java heap space,Full(java.lang.OutOfMemoryError: Java heap space),Empty)))
read)(LiftActor.scala:
.apply(apply.1AbstractQueuedSynchronizer.java8: at .AbstractQueuedSynchronizer.java3java.io.File        at SocketInputStream.javarun.):((       at 12(:getConnectionReportsExecutionService.scala[2015-12-01 22:30:41] INFO  migration - Start migration of ChangeRequest from format '5' to '6'
com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1SocketInputStream.java4.getCanonicalFileapply$mcV$sp:
ReportsExecutionService.scala6apply$mcV$sp:)1(ServerSet.java(()(11ReportsExecutionService.scalaLiftActor.scala86:
:(
SocketInputStream.javaapplyFile.java:SocketInputStream.java:    at 1(1java.io.BufferedInputStreamcom.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1(9474)      at ):LiftActor.scala:86.8:
java.util.concurrent.locks.ReentrantLock)6.1
2
236     at lockInterruptibly4.ReportsExecutionService.scala:)
46:     at 6:(apply(3   at fill1)4java.net.SocketInputStream)113java.util.TimerThread3)36ReportsExecutionService.scalaReentrantLock.java)):3com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1
.mainLoop.1     at 5java.util.TimerThreadapply54        at 1:8)

(
        at )    at )
com.unboundid.ldap.sdk.LDAPConnectionPool.(3java.util.concurrent.ThreadPoolExecutormainLoop
86582)java.net.SocketInputStream        at      at
..
        at java.util.concurrent.ThreadPoolExecutor.)6java.util.concurrent.locks.ReentrantLock4)
(net.liftweb.actor.LAScheduler$$anonfun$9$$anon$2$$anon$3.86)ReportsExecutionService.scala:Timer.java)runWorker at (6(readThreadPoolExecutor.java.      at

.       at runWorkerorg.eclipse.jetty.util.BlockingArrayQueue:  at )    at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1  at
(ThreadPoolExecutor.java        at 1.:
runBufferedInputStream.java(.1com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1       at Timer.java(.java.net.SocketInputStream.apply)lockInterruptiblyorg.eclipse.jetty.util.Scanner
SocketInputStream.java1.com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1:java.util.concurrent.locks.ReentrantLockscanFiles((read1ReportsExecutionService.scala   at :read.java.io.BufferedInputStreamReentrantLock.java.:.        at (:21SocketInputStream.java(apply5Scanner.java51654(:.7ReportsExecutionService.scalacreateConnectionlockInterruptibly:157(LiftActor.scalacom.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1()6(ReentrantLock.java8
SocketInputStream.java7 at :77applyLDAPConnectionPool.java::))
.1:     at 6:()3)481
poll6java.util.TimerThreadapply6).:1filljava.util.concurrent.ThreadPoolExecutor$Worker4):33.((com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1BlockingArrayQueue.java843
runBufferedInputStream.java1(.6ReportsExecutionService.scalaTimer.java3
        at 3)))4(4      at
05

8)      at      at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$15::       at org.eclipse.jetty.util.Scanner2      at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1ReportsExecutionService.scalascala.concurrent.impl.Future$PromiseCompletingRunnableapplyjava.io.BufferedInputStream...liftedTree1$154java.util.concurrent.ThreadPoolExecutor$Worker(
)81     at .fill.:

)       at )
        at
        at :java.io.BufferedInputStreamrun37.)fill(scan:.((1org.eclipse.jetty.util.BlockingArrayQueue8
ReportsExecutionService.scala
2java.io.BufferedInputStream1.2.ThreadPoolExecutor.java(java.io.BufferedInputStream)BufferedInputStream.java4:Scanner.java3     at :apply6).6

poll81:2read(2)(fill
        at 63   at (java.util.concurrent.ThreadPoolExecutorFuture.scalacom.unboundid.ldap.sdk.LDAPConnectionPool.8BlockingArrayQueue.java)org.eclipse.jetty.util.BlockingArrayQueue:apply4      at 7BufferedInputStream.java:run)7BufferedInputStream.java6:(.():)4.
(getConnection: at runWorker0ReportsExecutionService.scala272
66ThreadPoolExecutor.javascala.concurrent.impl.Future$PromiseCompletingRunnable.32java.util.TimerThread4)
()2liftedTree1$1
ReportsExecutionService.scala(:3
)poll   at      at 3:   at      at (1)org.eclipse.jetty.util.thread.QueuedThreadPoolFuture.scalaLDAPConnectionPool.java
java.lang.Thread
:.      at )6)84        at
:java.io.BufferedInputStream
(7      at .2)2scala.concurrent.impl.Future$PromiseCompletingRunnableidleJobPoll        at
org.eclipse.jetty.util.Scanner$1.java.io.BufferedInputStream.BlockingArrayQueue.java(   at com.unboundid.asn1.ASN1StreamReader1:4liftedTree1$1. at run6(org.eclipse.jetty.util.thread.QueuedThreadPoolrunread(:.
Future.scalaBufferedInputStream.java3(QueuedThreadPool.java.(60:Thread.javarun(readscala.concurrent.impl.Future$PromiseCompletingRunnable3:(    at java.io.BufferedInputStream2(4)Timer.java3scala.concurrent.impl.Future$PromiseCompletingRunnable..ASN1StreamReader.java:read77:Scanner.java.
(liftedTree1$1:5)BufferedInputStream.java9.:1BufferedInputStream.java:9(0
8       at )idleJobPollread:(QueuedThreadPool.java(Future.scala8))
73)
27:5:262        at 26
):      at 46)BufferedInputStream.java))85
2org.eclipse.jetty.util.thread.QueuedThreadPool5ThreadPoolExecutor.java at 2java.io.BufferedInputStream:6com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider.
)       at .24java.lang.Thread)
run     at      at 3scala.concurrent.impl.Future$PromiseCompletingRunnable(
        at
com.unboundid.asn1.ASN1StreamReader16)Future.scala.idleJobPoll

))scala.concurrent.impl.Future$PromiseCompletingRunnable
com.unboundid.asn1.ASN1StreamReader:
..      at      at      at 6run6newConnection.(readrunorg.eclipse.jetty.util.thread.QueuedThreadPool(   at
        at      at QueuedThreadPool.javascala.concurrent.impl.Future$PromiseCompletingRunnable7((.java.util.TimerThread62.6):)
).readTypescala.concurrent.impl.Future$PromiseCompletingRunnablerunliftedTree1$15.4liftedTree1$12:
(LDAPConnectionProvider.scala)6scala.concurrent.impl.Future$PromiseCompletingRunnable(  at .ASN1StreamReader.java
(org.eclipse.jetty.util.thread.QueuedThreadPool.Future.scala(Future.scalacom.unboundid.asn1.ASN1StreamReader.
ASN1StreamReader.javaaccess$600run.:(   at Thread.java.).:(::Future.scala3221access$600
QueuedThreadPool.java(::723(9Future.scala::scala.concurrent.impl.ExecutionContextImpl$$anon$3mainLoop4read477QueuedThreadPool.java(     at read4com.unboundid.asn1.ASN1StreamReader84):).()(2Timer.java4
:)7
        at
4:54)2ASN1StreamReader.java     at
BufferedInputStream.java9scala.concurrent.impl.Future$PromiseCompletingRunnable at )    at 8:com.unboundid.asn1.ASN1StreamReader.4

))
6       at Future.scalaread7    at ::9(readType at 2org.eclipse.jetty.util.thread.QueuedThreadPool$3scala.concurrent.impl.Future$PromiseCompletingRunnable2.).ASN1StreamReader.java478.)6)exec8
)(
scala.concurrent.impl.ExecutionContextImpl$$anon$3      at      at org.eclipse.jetty.util.thread.QueuedThreadPool$3org.eclipse.jetty.util.thread.QueuedThreadPool..6:

runcom.unboundid.asn1.ASN1StreamReader  at )
.)
ASN1StreamReader.javaexec.(QueuedThreadPool.java(com.normation.ldap.sdk.PooledConnectionProvider$class
java.util.TimerThreadExecutionContextImpl.scalarun:.978getInternalConnection(:(5        at .ExecutionContextImpl.scalaLDAPConnectionProvider.scalaaccess$600runrunrun((:1.beginSequence310(     at QueuedThreadPool.java(:scala.concurrent.impl.ExecutionContextImpl$$anon$37.2      at (:() at scala.concurrent.impl.ExecutionContextImpl$$anon$3
        at com.unboundid.asn1.ASN1StreamReaderASN1StreamReader.javaexec27(..exec:Timer.java2Future.scalaExecutionContextImpl.scala6Future.scalareadType5(ExecutionContextImpl.scala7:)7::2::1(1)ASN1StreamReader.java:7)2com.unboundid.asn1.ASN1StreamReader903)027com.unboundid.asn1.ASN1StreamReader7:
4       at )).
)
java.lang.Thread.run)read(.     at 1ASN1StreamReader.javacom.unboundid.asn1.ASN1StreamReaderQueuedThreadPool.java75:2:

.
1       at beginSequence((      at
scala.concurrent.forkjoin.ForkJoinTaskASN1StreamReader.java     at 7    at 4scala.concurrent.forkjoin.ForkJoinTask0readTypeThread.java.0097com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider       at .:))7com.unboundid.asn1.ASN1StreamReader.
7()9
)       at 8:9  at 8)
scala.concurrent.impl.ExecutionContextImpl$$anon$30ASN1StreamReader.javagetInternalConnectiondoExec5.exec       at com.unboundid.ldap.protocol.LDAPMessage.scala.concurrent.forkjoin.ForkJoinTask0java.lang.Thread.:)()
4
3(beginSequencerun(24ExecutionContextImpl.scala.)       at )
Thread.java(
:com.unboundid.asn1.ASN1StreamReaderASN1StreamReader.java77.doExec
        at 7(.readType()org.eclipse.jetty.util.thread.QueuedThreadPool$3.:1(    at com.unboundid.ldap.protocol.LDAPMessageForkJoinTask.javarun  at ASN1StreamReader.javadoExec09(8:.LDAPConnectionProvider.scala:scala.concurrent.forkjoin.ForkJoinTask.doExecreadLDAPResponseFrom(ForkJoinTask.java:(7()2QueuedThreadPool.javaForkJoinTask.javaLDAPMessage.java6:::)51
27
9:2     at ForkJoinTask.java2   at scala.concurrent.impl.ExecutionContextImpl$$anon$3:0readLDAPResponseFrom6
com.unboundid.asn1.ASN1StreamReader0)(: at java.util.concurrent.ThreadPoolExecutor$Worker020LDAPMessage.java)6363
:)
.1)20)

exec61
1       at 7))  at .4(469)scala.concurrent.forkjoin.ForkJoinTask
0)6
.)
        at )

.       at
        at      at      at com.normation.ldap.sdk.LDAPConnectionProvider$class  at doExeccom.unboundid.ldap.sdk.LDAPConnectionReader    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueueExecutionContextImpl.scala:.(     at .beginSequence1      at runscala.concurrent.forkjoin.ForkJoinPool$WorkQueuecom.unboundid.asn1.ASN1StreamReader.scala.concurrent.forkjoin.ForkJoinPool$WorkQueue(.java.lang.Threadscala.concurrent.forkjoin.ForkJoinPool$WorkQueuerunTaskForkJoinTask.java0.withConcom.unboundid.ldap.protocol.LDAPMessage.(beginSequence..(runrunTaskcom.unboundid.ldap.sdk.LDAPConnectionReader.((LDAPConnectionProvider.scala7((ASN1StreamReader.javareadLDAPResponseFrom:ForkJoinPool.java.ASN1StreamReader.javaThreadPoolExecutor.java:LDAPConnectionReader.java2:ForkJoinPool.java::69:2runTaskrunTask0:1)
0(01run:15run64((       at 5scala.concurrent.forkjoin.ForkJoinTask((3))ForkJoinPool.javaThread.javaForkJoinPool.java
3
39)3::  at      at LDAPConnectionReader.java9scala.concurrent.forkjoin.ForkJoinPool$WorkQueue):7).9:LDAPMessage.java9com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider:doExec813
1.withCon.      at 1251337)scala.concurrent.forkjoin.ForkJoinPool4)23
        at (
9runTaskcom.unboundid.ldap.protocol.LDAPMessage.
9readLDAPResponseFromLDAPConnectionProvider.scala((
(:6)
ForkJoinPool.java7      at com.unboundid.ldap.sdk.LDAPConnectionReaderForkJoinTask.java369      at :1))LDAPMessage.java:1
        at
):scala.concurrent.forkjoin.ForkJoinPool.       at runWorker.runWorkercom.normation.ldap.sdk.LDAPConnectionProvider$class002.scala.concurrent.forkjoin.ForkJoinPool.runWorker.map(
()(ForkJoinPool.javaForkJoinPool.java3
        at 6    at :01scala.concurrent.forkjoin.ForkJoinPool3run7(ForkJoinPool.javajava.lang.Thread)9:.runWorker((4)1.LDAPConnectionProvider.scala6)
)
run(
LDAPConnectionReader.java9Thread.java)19:ForkJoinPool.java::    at 197

9:799193        at com.unboundid.ldap.protocol.LDAPMessage      at com.unboundid.ldap.sdk.LDAPConnectionReader)
        at 2)
        at )scala.concurrent.forkjoin.ForkJoinPool.75scala.concurrent.forkjoin.ForkJoinWorkerThread.
runWorkerscala.concurrent.forkjoin.ForkJoinWorkerThreadrun.     at com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider.readLDAPResponseFrom(97  at .)
LDAPMessage.java.run(run((scala.concurrent.forkjoin.ForkJoinPool$WorkQueue7:ForkJoinWorkerThread.java:ForkJoinPool.java(LDAPConnectionReader.java)10ForkJoinWorkerThread.java9
Exception in thread "Connection reader for connection 4 to localhost:389" java.lang.OutOfMemoryError:: :::)     at 171map(
4Java heap spaceLDAPConnectionProvider.scala17)21
90      at scala.concurrent.forkjoin.ForkJoinWorkerThread:7)
        at java.util.LinkedHashMap6).79)
357scala.concurrent.forkjoin.ForkJoinWorkerThread96..
9runTaskrun     at scala.concurrent.forkjoin.ForkJoinWorkerThread.)
run(ForkJoinWorkerThread.java:1run(ForkJoinPool.java.createEntry
(ForkJoinWorkerThread.java()
        at (8)
        at com.unboundid.ldap.sdk.LDAPConnectionReader:ForkJoinWorkerThread.java07)com.normation.rudder.services.nodes.NodeInfoServiceCachedImpl.isUpToDate$1.107):

(1LinkedHashMap.java::3391run(LDAPConnectionReader.java)
4:2     at 545NodeInfoService.scala:1)7)6
        at
scala.concurrent.forkjoin.ForkJoinPool.07)java.util.HashMap.addEntry(runWorker(9)HashMap.java:8
        at ForkJoinPool.java:1com.normation.rudder.services.nodes.NodeInfoServiceCachedImpl.withUpToDateCache(NodeInfoService.scala:
979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.276run(ForkJoinWorkerThread.java)
        at 9:107)
6com.normation.rudder.services.nodes.NodeInfoServiceCachedImpl.getAll(NodeInfoService.scala:311)
        at com.normation.rudder.reports.AgentRunIntervalServiceImpl$$anonfun$getNodeReportingConfigurations$1$$anonfun$apply$5.apply(ReportingConfiguration.scala:)
        at java.util.LinkedHashMap117)
        at com.normation.rudder.reports.AgentRunIntervalServiceImpl$$anonfun$getNodeReportingConfigurations$1$$anonfun$apply$5.apply.addEntry(LinkedHashMap.java:439)
        at java.util.HashMap.(ReportingConfiguration.scala:11put(HashMap.java:517)
        at com.unboundid.ldap.sdk.Entry.<init>(Entry.java:313)
        at com.unboundid.ldap.sdk.ReadOnlyEntry.<init>(ReadOnlyEntry.java:1436)
        at net.liftweb.common.Full)
        at com.unboundid.ldap.sdk.SearchResultEntry.flatMap(Box.scala:.<init>(SearchResultEntry.java:555)
        at com.normation.rudder.reports.AgentRunIntervalServiceImpl$$anonfun$getNodeReportingConfigurations$1.apply(ReportingConfiguration.scala:116)196)

        at com.normation.rudder.reports.AgentRunIntervalServiceImpl$$anonfun$getNodeReportingConfigurations$1.apply(ReportingConfiguration.scala:115)
        at      at com.unboundid.ldap.sdk.SearchResultEntry.readSearchEntryFrom(SearchResultEntry.java:295)
        at com.unboundid.ldap.sdk.InternalSDKHelper.readSearchResultEntryFrom(InternalSDKHelper.java:296)
        at com.unboundid.ldap.protocol.LDAPMessage.readLDAPResponseFrom(LDAPMessage.java:1199)
net.liftweb.common.Full.flatMap(Box.scala:555)
        at com.normation.rudder.reports.AgentRunIntervalServiceImpl     at com.unboundid.ldap.sdk.LDAPConnectionReader.run(LDAPConnectionReader.java:257)
.getNodeReportingConfigurations(ReportingConfiguration.scala:115)
        at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1$$anonfun$apply$17$$anonfun$apply$18.apply(ReportingServiceImpl.scala:327)
        at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1$$anonfun$apply$17$$anonfun$apply$18.apply(ReportingServiceImpl.scala:326)
        at net.liftweb.common.Full.flatMap(Box.scala:555)
        at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1$$anonfun$apply$17.apply(ReportingServiceImpl.scala:326)
        at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1$$anonfun$apply$17.apply(ReportingServiceImpl.scala:325)
        at net.liftweb.common.Full.flatMap(Box.scala:555)
        at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1.apply(ReportingServiceImpl.scala:325)
        at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1.apply(ReportingServiceImpl.scala:324)
        at net.liftweb.common.Full.flatMap(Box.scala:555)
        at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$class.com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos(ReportingServiceImpl.scala:324)
        at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$class.findRuleNodeStatusReports(ReportingServiceImpl.scala:277)
        at com.normation.rudder.services.reports.ReportingServiceImpl.findRuleNodeStatusReports(ReportingServiceImpl.scala:62)
        at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$class.com$normation$rudder$services$reports$CachedFindRuleNodeStatusReports$$checkAndUpdateCache(ReportingServiceImpl.scala:189)
        at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$$anonfun$invalidate$2.apply(ReportingServiceImpl.scala:156)
        at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$$anonfun$invalidate$2.apply(ReportingServiceImpl.scala:156)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[2015-12-01 22:30:42] INFO  migration - Migration of ChangeRequest fileFormat from '5' to '6' done, 0 EventLogs migrated
[2015-12-01 22:30:42] ERROR migration - Error when migrating XML FileFormat' datas from format 5 to 6 in database <- Could not correctly finish the migration for EventLog fileFormat from '5' to '6'. Check logs for errors. The process can be trigger by restarting the application. <- Java heap space
[2015-12-01 22:30:42] ERROR migration - Exception was:
java.lang.OutOfMemoryError: Java heap space
[2015-12-01 22:30:42] INFO  bootchecks - Check existence of at least one archive of the configuration
[2015-12-01 22:30:42] INFO  bootchecks - First full archive of configuration-repository items done
[2015-12-01 22:30:42] INFO  bootchecks - Check rules archive directory in configuration-repository
[2015-12-01 22:30:42] INFO  bootchecks - Creating directory '/var/rudder/configuration-repository/ruleCategories' exists, done
[2015-12-01 22:30:42] INFO  bootchecks - Check that directive variables use the namespace 'rudder'
[2015-12-01 22:30:49] INFO  migration - Migration of inline variables in Directives to 'rudder' namespace already done, skipping
[2015-12-01 22:30:49] INFO  bootchecks - Check for force reload of Techniques library
[2015-12-01 22:30:49] INFO  bootchecks - Flag file '/opt/rudder/etc/force_technique_reload' does not exists, do not Technique library will not be reloaded
[2015-12-01 22:30:49] INFO  bootchecks - Check that system groups 'hasPolicyServer-*' are dynamics and correct them
[2015-12-01 22:30:49] INFO  application - Initializing plugin 'node external report' [com.normation.plugins.nodeexternalreport.NodeExternalReportPluginDef]
[2015-12-01 22:30:49] INFO  com.normation.plugins.nodeexternalreport.NodeExternalReportPluginDef - Report plugin correctly loaded
[2015-12-01 22:30:49] INFO  application - Application Rudder started

Related issues 3 (0 open3 closed)

Related to Rudder - Bug #8001: create an index for detection of eventlog versionReleasedFrançois ARMAND2016-02-29Actions
Related to Rudder - User story #5617: Detecting and restarting Rudder on OOM (Out Of Memory Exception)ResolvedActions
Has duplicate Rudder - Bug #7988: Migrate eventlog crashes jetty with OOMRejected2016-02-26Actions
Actions #1

Updated by Florian Heigl almost 9 years ago

The culprit is the log conversion, I disabled it using fanf's suggestion of setitng a migration END date.

Actions #2

Updated by Florian Heigl almost 9 years ago

A note:

Background processing of this migration will not suffice to fix the issue.
It will enable startup of the API and Frontend and is important as such.

But it would still run out of memory. We bumped the server up from 8GB to 32GB without(!) success.
I haven't checked but I think that sum is more than the data to be processed.

You'll need to have a look at the way it is processed, it needs to be chunked somehow.

Actions #3

Updated by Florian Heigl almost 9 years ago

It is also uncertain if the agent should be allowed to kill rudder (because API down) while it is in startup phase.
on the other hand startup can get stuck if LDAP or pgsql are down, so I kind of like this feature.

Actions #4

Updated by François ARMAND over 8 years ago

  • Category changed from Server components to Performance and scalability
  • Status changed from New to In progress
  • Assignee set to François ARMAND
  • Priority changed from N/A to 1 (highest)
  • Target version set to 2.11.20

The correct solution is to batch update, perhaps starting from most recent logs toward older one. If we do migration by log of 1000 for ex, ram usage should still be quite low, and we just need to recurse after each batch. When we don't find any more log of one kind, skip to the next.

Actions #5

Updated by Janos Mattyasovszky over 8 years ago

  • Related to Bug #7988: Migrate eventlog crashes jetty with OOM added
Actions #6

Updated by Janos Mattyasovszky over 8 years ago

  • Related to deleted (Bug #7988: Migrate eventlog crashes jetty with OOM)
Actions #7

Updated by Janos Mattyasovszky over 8 years ago

  • Has duplicate Bug #7988: Migrate eventlog crashes jetty with OOM added
Actions #8

Updated by François ARMAND over 8 years ago

  • Related to Bug #8001: create an index for detection of eventlog version added
Actions #9

Updated by François ARMAND over 8 years ago

I have a WIP solution in https://github.com/fanf/rudder/tree/bug_7524/java_oom_during_java_s_log_migration. For now, I took care of the high memory consumption by improving events selected, and batching the selection by group of 1000 (more test needed to see if 5000 or 10 000 are better default choices).
I'm also wondering if it makes sense to let the user be able to customize that batch size.

Next: make the processing asynchrone so that it does not block boot.

The correction won't be in the next bug releases (2.11.19/3.0.14/3.1.8), but in the next one.

Actions #10

Updated by François ARMAND over 8 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from François ARMAND to Nicolas CHARLES
  • Pull Request set to https://github.com/Normation/rudder/pull/1068
Actions #11

Updated by François ARMAND over 8 years ago

There is two main point of that change:

  • make each migration be only by step of "max batch", so that we can't explose the memory consumption. I set up the batch size to 1000, because eventlog object use ~10kb, but can use to up to ~50kb for very big events (say, group update with thousand of nodes). So with that, we should keep the memory consumption below 100Mo for the migration
  • make the process async so that the event log migration, which is not necessary for Rudder to work, is done in background. It can be stop and restart as many time as needed, so no problem if Rudder stop before the end of migration
Actions #12

Updated by François ARMAND over 8 years ago

  • Status changed from Pending technical review to Pending release
  • % Done changed from 0 to 100
Actions #13

Updated by François ARMAND over 8 years ago

  • Related to User story #5617: Detecting and restarting Rudder on OOM (Out Of Memory Exception) added
Actions #14

Updated by Vincent MEMBRÉ over 8 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 2.11.20, 3.0.15, 3.1.9 and 3.2.2 which were released today.

Actions

Also available in: Atom PDF