Bug #7524
closedJava OOM during Java's log migration
Description
Please check the memory usage of "XML FileFormat migration from 5 to 6", as 32GB of Heap Size is apparently not enough:
[2015-12-01 22:30:42] ERROR migration - Error when migrating XML FileFormat' datas from format 5 to 6 in database <- Could not correctly finish the migration for EventLog fileFormat from '5' to '6'. Check logs for errors. The process can be trigger by restarting the application. <- Java heap space [2015-12-01 22:30:42] ERROR migration - Exception was: java.lang.OutOfMemoryError: Java heap space
On the Console where the rudder-jetty was originally started:
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait. JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait. JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait. JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait. JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait. JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/12/01 22:27:07 - please wait. JVMDUMP032I JVM requested Heap dump using '/opt/rudder/jetty7/heapdump.20151201.222707.10834.0001.phd' in response to an event JVMDUMP010I Heap dump written to /opt/rudder/jetty7/heapdump.20151201.222707.10834.0001.phd JVMDUMP032I JVM requested Java dump using '/opt/rudder/jetty7/javacore.20151201.222707.10834.0002.txt' in response to an event JVMDUMP010I Java dump written to /opt/rudder/jetty7/javacore.20151201.222707.10834.0002.txt JVMDUMP032I JVM requested Snap dump using '/opt/rudder/jetty7/Snap.20151201.222707.10834.0008.trc' in response to an event JVMDUMP010I Snap dump written to /opt/rudder/jetty7/Snap.20151201.222707.10834.0008.trc JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError". JVMDUMP032I JVM requested Heap dump using '/opt/rudder/jetty7/heapdump.20151201.222707.10834.0003.phd' in response to an event JVMDUMP010I Heap dump written to /opt/rudder/jetty7/heapdump.20151201.222707.10834.0003.phd JVMDUMP032I JVM requested System dump using '/opt/rudder/jetty7/core.20151201.222707.10834.0004.dmp' in response to an event JVMDUMP012E Error in System dump: Unable to derive system core filename from pattern in /proc/sys/kernel/core_pattern: "/var/local/dumps/core.%e.%p.%h.%t". "%t" specifier is not supported. Look for core file generated by child process with pid = 13982 JVMDUMP032I JVM requested Heap dump using '/opt/rudder/jetty7/heapdump.20151201.222707.10834.0005.phd' in response to an event JVMDUMP010I Heap dump written to /opt/rudder/jetty7/heapdump.20151201.222707.10834.0005.phd JVMDUMP032I JVM requested Snap dump using '/opt/rudder/jetty7/Snap.20151201.222707.10834.0011.trc' in response to an event JVMDUMP010I Snap dump written to {nothing to snap} JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError". JVMDUMP032I JVM requested Heap dump using '/opt/rudder/jetty7/heapdump.20151201.222707.10834.0006.phd' in response to an event JVMDUMP010I Heap dump written to /opt/rudder/jetty7/heapdump.20151201.222707.10834.0006.phd JVMDUMP032I JVM requested Snap dump using '/opt/rudder/jetty7/Snap.20151201.222707.10834.0012.trc' in response to an event JVMDUMP010I Snap dump written to {nothing to snap} JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError". JVMDUMP032I JVM requested Java dump using '/opt/rudder/jetty7/javacore.20151201.222707.10834.0007.txt' in response to an event JVMDUMP010I Java dump written to /opt/rudder/jetty7/javacore.20151201.222707.10834.0007.txt JVMDUMP032I JVM requested Snap dump using '/opt/rudder/jetty7/Snap.20151201.222707.10834.0013.trc' in response to an event JVMDUMP010I Snap dump written to {nothing to snap} JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError". JVMDUMP032I JVM requested Java dump using '/opt/rudder/jetty7/javacore.20151201.222707.10834.0009.txt' in response to an event JVMDUMP010I Java dump written to /opt/rudder/jetty7/javacore.20151201.222707.10834.0009.txt JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError". JVMDUMP032I JVM requested Java dump using '/opt/rudder/jetty7/javacore.20151201.222707.10834.0010.txt' in response to an event JVMDUMP010I Java dump written to /opt/rudder/jetty7/javacore.20151201.222707.10834.0010.txt JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
We have a core dump: core.HashSessionScav.13982.
gdb backtrace:
Core was generated by `/usr/bin/java -server -Xms32768m -Xmx32768m -XX:PermSize=128m -XX:MaxPermSize=2'. Program terminated with signal 11, Segmentation fault. #0 0x00007ff31b7deca2 in j9dump_create () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9prt26.so (gdb) bt #0 0x00007ff31b7deca2 in j9dump_create () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9prt26.so #1 0x00007ff31b18aaa7 in doSystemDump () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so #2 0x00007ff31b7ece3f in j9sig_protect () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9prt26.so #3 0x00007ff31b1895a8 in runDumpFunction () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so #4 0x00007ff31b18dd08 in runDumpAgent () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so #5 0x00007ff31b19f892 in triggerDumpAgents () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so #6 0x00007ff31b19e6a5 in rasDumpHookExceptionSysthrow () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9dmp26.so #7 0x00007ff31b91f13f in J9HookDispatch () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9hookable26.so #8 0x00007ff31bb4e56c in internalSetCurrentExceptionWithCause () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9vm26.so #9 0x00007ff31bb4e633 in setCurrentExceptionWithCause () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9vm26.so #10 0x00007ff31bb588aa in throwHeapOutOfMemoryError () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9vm26.so #11 0x00007feaed2cfb60 in ?? () #12 0x00007feaed2cfce0 in ?? () #13 0x00007ff31ce05ed0 in main_arena () from /lib64/libc.so.6 #14 0x00007ff31ce05e80 in next_to_use.12783 () from /lib64/libc.so.6 #15 0x0000000000001000 in ?? () #16 0x0000000000001040 in ?? () #17 0x00007ff31bbba125 in ?? () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libj9vm26.so #18 0x00007ff31beff5c0 in ?? () from /usr/lib64/jvm/java-1.7.0-ibm-1.7.0/jre/lib/amd64/default/libjvm.so #19 0x0000000000000001 in ?? () #20 0x00007ff31cb09a5e in _int_malloc () from /lib64/libc.so.6 #21 0x000000000165c4e0 in ?? () #22 0x00007feae9d8ea00 in ?? () #23 0x0000000000000000 in ?? ()
Full trace:
# tail -f /var/log/rudder/webapp/2015_12_01.stderrout.log [2015-12-01 22:11:26] INFO application - JVM property -Drudder.plugin.externalNodeInformation.config is not defined, use configuration file at '/opt/rudder/etc/plugins/external-node-information.properties' [2015-12-01 22:11:26] INFO bootstrap.rudder.plugin.NodeExternalReportConf$$EnhancerBySpringCGLIB$$7e80cd - Extension initialized [2015-12-01 22:11:26] INFO application - Use configuration file defined by JVM property -Drudder.authFile : /opt/rudder/etc/rudder-users.xml [2015-12-01 22:11:26] INFO application - Use configuration file defined by JVM property -Drudder.authFile : /opt/rudder/etc/rudder-users.xml 2015-12-01 22:11:27.054:INFO:oejsh.ContextHandler:started o.e.j.w.WebAppContext{/rudder,file:/tmp/jetty-127.0.0.1-8080-rudder.war-_rudder-any-/webapp/},/opt/rudder/share/webapps/rudder.war [2015-12-01 22:11:27] INFO bootchecks - Check mandatory DIT entries [2015-12-01 22:11:27] INFO application - All the required DIT entries are present in the LDAP directory [2015-12-01 22:11:27] INFO bootchecks - Check initialization of User Technique Library [2015-12-01 22:11:27] INFO bootchecks - Check event log migration format 5 -> 6 [2015-12-01 22:11:27] INFO migration - Start migration of EventLog from format '5' to '6' [2015-12-01 22:25:52] WARN application - [Store Agent Run Times] Task frequency is set too low! Last task took 11094 ms but tasks are scheduled every 10000 ms. Adjust rudder.batch.storeAgentRunTimes.updateInterval if this problem persists. [2015-12-01 22:26:14] WARN application - [Store Agent Run Times] Task frequency is set too low! Last task took 10862 ms but tasks are scheduled every 10000 ms. Adjust rudder.batch.storeAgentRunTimes.updateInterval if this problem persists. [2015-12-01 22:26:36] WARN application - [Store Agent Run Times] Task frequency is set too low! Last task took 12189 ms but tasks are scheduled every 10000 ms. Adjust rudder.batch.storeAgentRunTimes.updateInterval if this problem persists. Exception in thread "Connection reader for connection 1 to localhost:389" Exception in thread "Health Check Thread for LDAPConnectionPool(serverSet=SingleServerSet(server=localhost:389), maxConnections=2)" Exception in thread "Connection reader for connection 1 to localhost:389" Exception in thread "Connection reader for connection 5 to localhost:389" Exception in thread "Health Check Thread for LDAPConnectionPool(serverSet=SingleServerSet(server=localhost:389), maxConnections=2)" java.lang.OutOfMemoryErrorjava.lang.OutOfMemoryErrorjava.lang.OutOfMemoryError: Java heap spacejava.lang.OutOfMemoryError: Exception in thread "Health Check Thread for LDAPConnectionPool(serverSet=SingleServerSet(server=localhost:389), maxConnections=2)" java.lang.OutOfMemoryError Exception in thread "qtp1715228818-35" : java.lang.OutOfMemoryError: java.lang.OutOfMemoryError: Java heap spacejava.lang.OutOfMemoryErrorjava.lang.OutOfMemoryError: java.lang.OutOfMemoryError Exception in thread "Health Check Thread for LDAPConnectionPool(serverSet=SingleServerSet(server=localhost:389), maxConnections=2)" Java heap space: Exception in thread "qtp1715228818-34" java.lang.OutOfMemoryErrorjava.lang.OutOfMemoryError java.lang.OutOfMemoryErrorJava heap spaceJava heap space Java heap space Exception in thread "Connection reader for connection 0 to localhost:389" Exception in thread "Connection reader for connection 6 to localhost:389" java.lang.OutOfMemoryErrorException in thread "pool-2-thread-16" java.lang.OutOfMemoryErrorJava heap space: Java heap spaceException in thread "qtp1715228818-36" : java.lang.OutOfMemoryError: Java heap space java.lang.OutOfMemoryError: : Java heap space Java heap space : Java heap space at java.net.Inet4Address.getAddress(Inet4Address.java:325) at java.net.Inet4Address.getHostAddress(Inet4Address.java:34 at at Java heap space at java.util.concurrent.locks.AbstractQueuedSynchronizerjava.net.Inet4Address: at . at java.lang.OutOfMemoryErrorjava.net.Inet4AddressJava heap space at com.unboundid.ldap.sdk.LDAPConnectionPool at 1addWaitercom.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classException in thread "HashSessionScavenger-1" at : java.lang.OutOfMemoryErrorException in thread "HashSessionScavenger-0" Exception in thread "Scanner-0" at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classjava.lang.OutOfMemoryError: Java heap space.java.lang.OutOfMemoryError: Java heap space: (com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classJava heap spacejava.lang.OutOfMemoryError java.net.Inet4Address..com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$class invalidate: .: .Java heap spaceAbstractQueuedSynchronizer.java at getAddress(Inet4Address.java)java.lang.OutOfMemoryError at .invalidate:Java heap space at at :invalidate: Java heap space: (java.util.concurrent.ConcurrentHashMap$Values doHealthCheck(com.unboundid.ldap.sdk.LDAPConnectionPoolException in thread "pool-2-thread-7" invalidate(Java heap space java.net.SocketInputStreamjava.lang.OutOfMemoryError at Exception in thread "pool-2-thread-15" at java.net.Inet4Address.(. at ReportingServiceImpl.scalaReportingServiceImpl.scala.ReportingServiceImpl.scala(.getAddress at iteratorcom.unboundid.ldap.sdk.LDAPConnectionPoolLDAPConnectionPool.javagetAddress:. at (Java heap space3ConcurrentHashMap.java6com.unboundid.ldap.sdk.LDAPConnectionPool( 1socketRead0Inet4Address.java at :1.15:2getAddressjava.io.UnixFileSystem..(com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classdoHealthCheck:823.)canonicalize:2(5invalidate2doHealthCheck: 1(Java heap space(LDAPConnectionPool.java(Native Method)com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$classLDAPConnectionPool.java)3 .:1525doHealthCheck5ReportingServiceImpl.scala)java.lang.OutOfMemoryError4ReportingServiceImpl.scala at (.: 1: java.util.concurrent.ConcurrentHashMap$Values2Inet4Address.java2(:):: Java heap space:java.util.concurrent.locks.AbstractQueuedSynchronizer. at 322) 1.5 4:invalidate5 (2com.normation.rudder.services.reports.CachedReportingServiceImpl14Inet4Address.java doAcquireInterruptibly:UnixFileSystem.java: at 32Java heap space)((ReportingServiceImpl.scala at at )2: at 1 iteratorjava.util.concurrent.locks.AbstractQueuedSynchronizer at at ).4org.eclipse.jetty.server.session.HashSessionManagerjava.util.concurrent.locks.AbstractQueuedSynchronizer at java.net.Inet4Address4(LDAPConnectionPool.java)net.liftweb.actor.SpecializedLiftActor$class at at .java.net.SocketInputStream.java.net.Inet4Address4 at . 81net.liftweb.actor.SpecializedLiftActor$class. at .getHostAddress at addWaitercom.normation.rudder.services.reports.CachedReportingServiceImpl:. at com.normation.rudder.services.reports.CachedReportingServiceImplConcurrentHashMap.java5(com.normation.rudder.services.reports.CachedReportingServiceImpljava.net.Inet4Address.25(read)5.) at 5addWaiter5AbstractQueuedSynchronizer.java. net$liftweb$actor$SpecializedLiftActor$$processMailboxgetHostAddress... 4invalidate():4scavenge5com.unboundid.ldap.sdk.SingleServerSet)getHostAddress at (invalidate(LiftActor.scalajava.net.Inet4AddressSocketInputStream.javainvalidate: :. at 2getHostAddress21net$liftweb$actor$SpecializedLiftActor$$processMailbox: at (8(5)Inet4Address.javaReportingServiceImpl.scalaInet4Address.java: 1((4invalidate1((: at ()HashSessionManager.java0Inet4Address.javaReportingServiceImpl.scala. at com.unboundid.ldap.sdk.LDAPConnectionPoolHealthCheckThread3AbstractQueuedSynchronizer.java:9:3. :Inet4Address.javaLiftActor.scalacom.unboundid.ldap.sdk.LDAPConnectionPoolHealthCheckThreadcom.normation.rudder.services.reports.CachedReportingServiceImpl3)6 at : 72AbstractQueuedSynchronizer.java( at com.unboundid.ldap.sdk.LDAPConnectionPoolHealthCheckThread53)com.unboundid.ldap.sdk.LDAPConnectionPoolHealthCheckThread1.getConnection1)run.8ReportingServiceImpl.scalarun446 4)java.io.File: 2:213 at 7((SingleServerSet.java:.1:(6run1.)getCanonicalPath7 2)1 at 2java.net.SocketInputStream)LDAPConnectionPoolHealthCheckThread.java(1) :net.liftweb.actor.SpecializedLiftActor$$anonfun$2$$anonfun$apply$mcV$sp$1LDAPConnectionPoolHealthCheckThread.java.)40)) (:)ReportingServiceImpl.scala at invalidaterun at net.liftweb.actor.SpecializedLiftActor$class(:LDAPConnectionPoolHealthCheckThread.java(9).8 at com.normation.rudder.services.reports.CachedReportingServiceImpl(24com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1File.java:.3 at )..1 3net$liftweb$actor$SpecializedLiftActor$$processMailbox) at : socketRead0 ((Native Method) ):ReportingServiceImpl.scalainvalidate 9apply$mcV$sp9LDAPConnectionPoolHealthCheckThread.java44(: at )ReportingServiceImpl.scala( at ReportsExecutionService.scalaLiftActor.scala:org.eclipse.jetty.server.session.HashSessionManagerjava.util.concurrent.locks.AbstractQueuedSynchronizer at 7 at net.liftweb.actor.SpecializedLiftActor$$anonfun$2$$anonfun$apply$mcV$sp$1..):java.net.SocketInputStream. :read at at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$17java.net.SocketInputStream)92.( :4 at java.net.SocketInputStream.doAcquireInterruptiblyapply$mcV$sp(.SocketInputStream.java:1apply$mcV$sp.2:2 at .)AbstractQueuedSynchronizer.java((78socketRead0com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1java.util.concurrent.locks.AbstractQueuedSynchronizer LiftActor.scala).(Native Method)12 9:8acquireInterruptibly6java.net.SocketInputStreamscavenge4296(:)apply$mcV$sp1LiftActor.scala: at java.util.concurrent.locks.AbstractQueuedSynchronizer) 1.13org.eclipse.jetty.server.session.HashSessionManager$2(27) .doAcquireInterruptibly1 ((0.AbstractQueuedSynchronizer.javarun(:HashSessionManager.java764: at )) ReportsExecutionService.scalaAbstractQueuedSynchronizer.java.HashSessionManager.java0 at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1:socketRead0 at 3: at (Native Method)3 at java.net.SocketInputStream).1:)apply$mcV$sp apply$mcV$sp 3 2 8(( at 8)9ReportsExecutionService.scala at 5)12ReportsExecutionService.scalaread3.socketRead0com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1 com.unboundid.ldap.sdk.ServerSet at :1)3java.util.concurrent.locks.AbstractQueuedSynchronizer) java.net.SocketInputStream)5((Native Method)18 .6.: apply$mcV$sp8java.net.SocketInputStream6( at 1)SocketInputStream.java4 at 8 at at at org.eclipse.jetty.server.session.HashSessionManager$2 at )java.net.SocketInputStream. )acquireInterruptibly. at :1com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1ReportsExecutionService.scalanet.liftweb.actor.SpecializedLiftActor$$anonfun$2$$anonfun$apply$mcV$sp$1net.liftweb.actor.LAScheduler$$anonfun$9$$anon$2$$anon$3. 6.java.net.SocketInputStream. at :.readreadrunjava.util.concurrent.locks.AbstractQueuedSynchronizer(.3net.liftweb.actor.LAScheduler$$anonfun$9$$anon$2$$anon$3. (.(acquireInterruptibly at HashSessionManager.javaruncom.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1read(com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1.[2015-12-01 22:30:41] ERROR migration - Failure(Could not correctly finish the migration for EventLog fileFormat from '5' to '6'. Check logs for errors. The process can be trigger by restarting the application.,Empty,Full(Failure(Java heap space,Full(java.lang.OutOfMemoryError: Java heap space),Empty))) read)(LiftActor.scala: .apply(apply.1AbstractQueuedSynchronizer.java8: at .AbstractQueuedSynchronizer.java3java.io.File at SocketInputStream.javarun.):(( at 12(:getConnectionReportsExecutionService.scala[2015-12-01 22:30:41] INFO migration - Start migration of ChangeRequest from format '5' to '6' com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1SocketInputStream.java4.getCanonicalFileapply$mcV$sp: ReportsExecutionService.scala6apply$mcV$sp:)1(ServerSet.java(()(11ReportsExecutionService.scalaLiftActor.scala86: :( SocketInputStream.javaapplyFile.java:SocketInputStream.java: at 1(1java.io.BufferedInputStreamcom.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1(9474) at ):LiftActor.scala:86.8: java.util.concurrent.locks.ReentrantLock)6.1 2 236 at lockInterruptibly4.ReportsExecutionService.scala:) 46: at 6:(apply(3 at fill1)4java.net.SocketInputStream)113java.util.TimerThread3)36ReportsExecutionService.scalaReentrantLock.java)):3com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1 .mainLoop.1 at 5java.util.TimerThreadapply54 at 1:8) ( at ) at ) com.unboundid.ldap.sdk.LDAPConnectionPool.(3java.util.concurrent.ThreadPoolExecutormainLoop 86582)java.net.SocketInputStream at at .. at java.util.concurrent.ThreadPoolExecutor.)6java.util.concurrent.locks.ReentrantLock4) (net.liftweb.actor.LAScheduler$$anonfun$9$$anon$2$$anon$3.86)ReportsExecutionService.scala:Timer.java)runWorker at (6(readThreadPoolExecutor.java. at . at runWorkerorg.eclipse.jetty.util.BlockingArrayQueue: at ) at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1 at (ThreadPoolExecutor.java at 1.: runBufferedInputStream.java(.1com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1 at Timer.java(.java.net.SocketInputStream.apply)lockInterruptiblyorg.eclipse.jetty.util.Scanner SocketInputStream.java1.com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1:java.util.concurrent.locks.ReentrantLockscanFiles((read1ReportsExecutionService.scala at :read.java.io.BufferedInputStreamReentrantLock.java.:. at (:21SocketInputStream.java(apply5Scanner.java51654(:.7ReportsExecutionService.scalacreateConnectionlockInterruptibly:157(LiftActor.scalacom.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1()6(ReentrantLock.java8 SocketInputStream.java7 at :77applyLDAPConnectionPool.java::)) .1: at 6:()3)481 poll6java.util.TimerThreadapply6).:1filljava.util.concurrent.ThreadPoolExecutor$Worker4):33.((com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1BlockingArrayQueue.java843 runBufferedInputStream.java1(.6ReportsExecutionService.scalaTimer.java3 at 3)))4(4 at 05 8) at at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$15:: at org.eclipse.jetty.util.Scanner2 at com.normation.rudder.reports.execution.ReportsExecutionService$$anonfun$asyncHook$1ReportsExecutionService.scalascala.concurrent.impl.Future$PromiseCompletingRunnableapplyjava.io.BufferedInputStream...liftedTree1$154java.util.concurrent.ThreadPoolExecutor$Worker( )81 at .fill.: ) at ) at at :java.io.BufferedInputStreamrun37.)fill(scan:.((1org.eclipse.jetty.util.BlockingArrayQueue8 ReportsExecutionService.scala 2java.io.BufferedInputStream1.2.ThreadPoolExecutor.java(java.io.BufferedInputStream)BufferedInputStream.java4:Scanner.java3 at :apply6).6 poll81:2read(2)(fill at 63 at (java.util.concurrent.ThreadPoolExecutorFuture.scalacom.unboundid.ldap.sdk.LDAPConnectionPool.8BlockingArrayQueue.java)org.eclipse.jetty.util.BlockingArrayQueue:apply4 at 7BufferedInputStream.java:run)7BufferedInputStream.java6:(.():)4. (getConnection: at runWorker0ReportsExecutionService.scala272 66ThreadPoolExecutor.javascala.concurrent.impl.Future$PromiseCompletingRunnable.32java.util.TimerThread4) ()2liftedTree1$1 ReportsExecutionService.scala(:3 )poll at at 3: at at (1)org.eclipse.jetty.util.thread.QueuedThreadPoolFuture.scalaLDAPConnectionPool.java java.lang.Thread :. at )6)84 at :java.io.BufferedInputStream (7 at .2)2scala.concurrent.impl.Future$PromiseCompletingRunnableidleJobPoll at org.eclipse.jetty.util.Scanner$1.java.io.BufferedInputStream.BlockingArrayQueue.java( at com.unboundid.asn1.ASN1StreamReader1:4liftedTree1$1. at run6(org.eclipse.jetty.util.thread.QueuedThreadPoolrunread(:. Future.scalaBufferedInputStream.java3(QueuedThreadPool.java.(60:Thread.javarun(readscala.concurrent.impl.Future$PromiseCompletingRunnable3:( at java.io.BufferedInputStream2(4)Timer.java3scala.concurrent.impl.Future$PromiseCompletingRunnable..ASN1StreamReader.java:read77:Scanner.java. (liftedTree1$1:5)BufferedInputStream.java9.:1BufferedInputStream.java:9(0 8 at )idleJobPollread:(QueuedThreadPool.java(Future.scala8)) 73) 27:5:262 at 26 ): at 46)BufferedInputStream.java))85 2org.eclipse.jetty.util.thread.QueuedThreadPool5ThreadPoolExecutor.java at 2java.io.BufferedInputStream:6com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider. ) at .24java.lang.Thread) run at at 3scala.concurrent.impl.Future$PromiseCompletingRunnable( at com.unboundid.asn1.ASN1StreamReader16)Future.scala.idleJobPoll ))scala.concurrent.impl.Future$PromiseCompletingRunnable com.unboundid.asn1.ASN1StreamReader: .. at at at 6run6newConnection.(readrunorg.eclipse.jetty.util.thread.QueuedThreadPool( at at at QueuedThreadPool.javascala.concurrent.impl.Future$PromiseCompletingRunnable7((.java.util.TimerThread62.6):) ).readTypescala.concurrent.impl.Future$PromiseCompletingRunnablerunliftedTree1$15.4liftedTree1$12: (LDAPConnectionProvider.scala)6scala.concurrent.impl.Future$PromiseCompletingRunnable( at .ASN1StreamReader.java (org.eclipse.jetty.util.thread.QueuedThreadPool.Future.scala(Future.scalacom.unboundid.asn1.ASN1StreamReader. ASN1StreamReader.javaaccess$600run.:( at Thread.java.).:(::Future.scala3221access$600 QueuedThreadPool.java(::723(9Future.scala::scala.concurrent.impl.ExecutionContextImpl$$anon$3mainLoop4read477QueuedThreadPool.java( at read4com.unboundid.asn1.ASN1StreamReader84):).()(2Timer.java4 :)7 at 4:54)2ASN1StreamReader.java at BufferedInputStream.java9scala.concurrent.impl.Future$PromiseCompletingRunnable at ) at 8:com.unboundid.asn1.ASN1StreamReader.4 )) 6 at Future.scalaread7 at ::9(readType at 2org.eclipse.jetty.util.thread.QueuedThreadPool$3scala.concurrent.impl.Future$PromiseCompletingRunnable2.).ASN1StreamReader.java478.)6)exec8 )( scala.concurrent.impl.ExecutionContextImpl$$anon$3 at at org.eclipse.jetty.util.thread.QueuedThreadPool$3org.eclipse.jetty.util.thread.QueuedThreadPool..6: runcom.unboundid.asn1.ASN1StreamReader at ) .) ASN1StreamReader.javaexec.(QueuedThreadPool.java(com.normation.ldap.sdk.PooledConnectionProvider$class java.util.TimerThreadExecutionContextImpl.scalarun:.978getInternalConnection(:(5 at .ExecutionContextImpl.scalaLDAPConnectionProvider.scalaaccess$600runrunrun((:1.beginSequence310( at QueuedThreadPool.java(:scala.concurrent.impl.ExecutionContextImpl$$anon$37.2 at (:() at scala.concurrent.impl.ExecutionContextImpl$$anon$3 at com.unboundid.asn1.ASN1StreamReaderASN1StreamReader.javaexec27(..exec:Timer.java2Future.scalaExecutionContextImpl.scala6Future.scalareadType5(ExecutionContextImpl.scala7:)7::2::1(1)ASN1StreamReader.java:7)2com.unboundid.asn1.ASN1StreamReader903)027com.unboundid.asn1.ASN1StreamReader7: 4 at )). ) java.lang.Thread.run)read(. at 1ASN1StreamReader.javacom.unboundid.asn1.ASN1StreamReaderQueuedThreadPool.java75:2: . 1 at beginSequence(( at scala.concurrent.forkjoin.ForkJoinTaskASN1StreamReader.java at 7 at 4scala.concurrent.forkjoin.ForkJoinTask0readTypeThread.java.0097com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider at .:))7com.unboundid.asn1.ASN1StreamReader. 7()9 ) at 8:9 at 8) scala.concurrent.impl.ExecutionContextImpl$$anon$30ASN1StreamReader.javagetInternalConnectiondoExec5.exec at com.unboundid.ldap.protocol.LDAPMessage.scala.concurrent.forkjoin.ForkJoinTask0java.lang.Thread.:)() 4 3(beginSequencerun(24ExecutionContextImpl.scala.) at ) Thread.java( :com.unboundid.asn1.ASN1StreamReaderASN1StreamReader.java77.doExec at 7(.readType()org.eclipse.jetty.util.thread.QueuedThreadPool$3.:1( at com.unboundid.ldap.protocol.LDAPMessageForkJoinTask.javarun at ASN1StreamReader.javadoExec09(8:.LDAPConnectionProvider.scala:scala.concurrent.forkjoin.ForkJoinTask.doExecreadLDAPResponseFrom(ForkJoinTask.java:(7()2QueuedThreadPool.javaForkJoinTask.javaLDAPMessage.java6:::)51 27 9:2 at ForkJoinTask.java2 at scala.concurrent.impl.ExecutionContextImpl$$anon$3:0readLDAPResponseFrom6 com.unboundid.asn1.ASN1StreamReader0)(: at java.util.concurrent.ThreadPoolExecutor$Worker020LDAPMessage.java)6363 :) .1)20) exec61 1 at 7)) at .4(469)scala.concurrent.forkjoin.ForkJoinTask 0)6 .) at ) . at at at at com.normation.ldap.sdk.LDAPConnectionProvider$class at doExeccom.unboundid.ldap.sdk.LDAPConnectionReader at scala.concurrent.forkjoin.ForkJoinPool$WorkQueueExecutionContextImpl.scala:.( at .beginSequence1 at runscala.concurrent.forkjoin.ForkJoinPool$WorkQueuecom.unboundid.asn1.ASN1StreamReader.scala.concurrent.forkjoin.ForkJoinPool$WorkQueue(.java.lang.Threadscala.concurrent.forkjoin.ForkJoinPool$WorkQueuerunTaskForkJoinTask.java0.withConcom.unboundid.ldap.protocol.LDAPMessage.(beginSequence..(runrunTaskcom.unboundid.ldap.sdk.LDAPConnectionReader.((LDAPConnectionProvider.scala7((ASN1StreamReader.javareadLDAPResponseFrom:ForkJoinPool.java.ASN1StreamReader.javaThreadPoolExecutor.java:LDAPConnectionReader.java2:ForkJoinPool.java::69:2runTaskrunTask0:1) 0(01run:15run64(( at 5scala.concurrent.forkjoin.ForkJoinTask((3))ForkJoinPool.javaThread.javaForkJoinPool.java 3 39)3:: at at LDAPConnectionReader.java9scala.concurrent.forkjoin.ForkJoinPool$WorkQueue):7).9:LDAPMessage.java9com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider:doExec813 1.withCon. at 1251337)scala.concurrent.forkjoin.ForkJoinPool4)23 at ( 9runTaskcom.unboundid.ldap.protocol.LDAPMessage. 9readLDAPResponseFromLDAPConnectionProvider.scala(( (:6) ForkJoinPool.java7 at com.unboundid.ldap.sdk.LDAPConnectionReaderForkJoinTask.java369 at :1))LDAPMessage.java:1 at ):scala.concurrent.forkjoin.ForkJoinPool. at runWorker.runWorkercom.normation.ldap.sdk.LDAPConnectionProvider$class002.scala.concurrent.forkjoin.ForkJoinPool.runWorker.map( ()(ForkJoinPool.javaForkJoinPool.java3 at 6 at :01scala.concurrent.forkjoin.ForkJoinPool3run7(ForkJoinPool.javajava.lang.Thread)9:.runWorker((4)1.LDAPConnectionProvider.scala6) ) run( LDAPConnectionReader.java9Thread.java)19:ForkJoinPool.java:: at 197 9:799193 at com.unboundid.ldap.protocol.LDAPMessage at com.unboundid.ldap.sdk.LDAPConnectionReader) at 2) at )scala.concurrent.forkjoin.ForkJoinPool.75scala.concurrent.forkjoin.ForkJoinWorkerThread. runWorkerscala.concurrent.forkjoin.ForkJoinWorkerThreadrun. at com.normation.ldap.sdk.ROPooledSimpleAuthConnectionProvider.readLDAPResponseFrom(97 at .) LDAPMessage.java.run(run((scala.concurrent.forkjoin.ForkJoinPool$WorkQueue7:ForkJoinWorkerThread.java:ForkJoinPool.java(LDAPConnectionReader.java)10ForkJoinWorkerThread.java9 Exception in thread "Connection reader for connection 4 to localhost:389" java.lang.OutOfMemoryError:: :::) at 171map( 4Java heap spaceLDAPConnectionProvider.scala17)21 90 at scala.concurrent.forkjoin.ForkJoinWorkerThread:7) at java.util.LinkedHashMap6).79) 357scala.concurrent.forkjoin.ForkJoinWorkerThread96.. 9runTaskrun at scala.concurrent.forkjoin.ForkJoinWorkerThread.) run(ForkJoinWorkerThread.java:1run(ForkJoinPool.java.createEntry (ForkJoinWorkerThread.java() at (8) at com.unboundid.ldap.sdk.LDAPConnectionReader:ForkJoinWorkerThread.java07)com.normation.rudder.services.nodes.NodeInfoServiceCachedImpl.isUpToDate$1.107): (1LinkedHashMap.java::3391run(LDAPConnectionReader.java) 4:2 at 545NodeInfoService.scala:1)7)6 at scala.concurrent.forkjoin.ForkJoinPool.07)java.util.HashMap.addEntry(runWorker(9)HashMap.java:8 at ForkJoinPool.java:1com.normation.rudder.services.nodes.NodeInfoServiceCachedImpl.withUpToDateCache(NodeInfoService.scala: 979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.276run(ForkJoinWorkerThread.java) at 9:107) 6com.normation.rudder.services.nodes.NodeInfoServiceCachedImpl.getAll(NodeInfoService.scala:311) at com.normation.rudder.reports.AgentRunIntervalServiceImpl$$anonfun$getNodeReportingConfigurations$1$$anonfun$apply$5.apply(ReportingConfiguration.scala:) at java.util.LinkedHashMap117) at com.normation.rudder.reports.AgentRunIntervalServiceImpl$$anonfun$getNodeReportingConfigurations$1$$anonfun$apply$5.apply.addEntry(LinkedHashMap.java:439) at java.util.HashMap.(ReportingConfiguration.scala:11put(HashMap.java:517) at com.unboundid.ldap.sdk.Entry.<init>(Entry.java:313) at com.unboundid.ldap.sdk.ReadOnlyEntry.<init>(ReadOnlyEntry.java:1436) at net.liftweb.common.Full) at com.unboundid.ldap.sdk.SearchResultEntry.flatMap(Box.scala:.<init>(SearchResultEntry.java:555) at com.normation.rudder.reports.AgentRunIntervalServiceImpl$$anonfun$getNodeReportingConfigurations$1.apply(ReportingConfiguration.scala:116)196) at com.normation.rudder.reports.AgentRunIntervalServiceImpl$$anonfun$getNodeReportingConfigurations$1.apply(ReportingConfiguration.scala:115) at at com.unboundid.ldap.sdk.SearchResultEntry.readSearchEntryFrom(SearchResultEntry.java:295) at com.unboundid.ldap.sdk.InternalSDKHelper.readSearchResultEntryFrom(InternalSDKHelper.java:296) at com.unboundid.ldap.protocol.LDAPMessage.readLDAPResponseFrom(LDAPMessage.java:1199) net.liftweb.common.Full.flatMap(Box.scala:555) at com.normation.rudder.reports.AgentRunIntervalServiceImpl at com.unboundid.ldap.sdk.LDAPConnectionReader.run(LDAPConnectionReader.java:257) .getNodeReportingConfigurations(ReportingConfiguration.scala:115) at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1$$anonfun$apply$17$$anonfun$apply$18.apply(ReportingServiceImpl.scala:327) at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1$$anonfun$apply$17$$anonfun$apply$18.apply(ReportingServiceImpl.scala:326) at net.liftweb.common.Full.flatMap(Box.scala:555) at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1$$anonfun$apply$17.apply(ReportingServiceImpl.scala:326) at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1$$anonfun$apply$17.apply(ReportingServiceImpl.scala:325) at net.liftweb.common.Full.flatMap(Box.scala:555) at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1.apply(ReportingServiceImpl.scala:325) at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$$anonfun$com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos$1.apply(ReportingServiceImpl.scala:324) at net.liftweb.common.Full.flatMap(Box.scala:555) at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$class.com$normation$rudder$services$reports$DefaultFindRuleNodeStatusReports$$getNodeRunInfos(ReportingServiceImpl.scala:324) at com.normation.rudder.services.reports.DefaultFindRuleNodeStatusReports$class.findRuleNodeStatusReports(ReportingServiceImpl.scala:277) at com.normation.rudder.services.reports.ReportingServiceImpl.findRuleNodeStatusReports(ReportingServiceImpl.scala:62) at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$class.com$normation$rudder$services$reports$CachedFindRuleNodeStatusReports$$checkAndUpdateCache(ReportingServiceImpl.scala:189) at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$$anonfun$invalidate$2.apply(ReportingServiceImpl.scala:156) at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$$anonfun$invalidate$2.apply(ReportingServiceImpl.scala:156) at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [2015-12-01 22:30:42] INFO migration - Migration of ChangeRequest fileFormat from '5' to '6' done, 0 EventLogs migrated [2015-12-01 22:30:42] ERROR migration - Error when migrating XML FileFormat' datas from format 5 to 6 in database <- Could not correctly finish the migration for EventLog fileFormat from '5' to '6'. Check logs for errors. The process can be trigger by restarting the application. <- Java heap space [2015-12-01 22:30:42] ERROR migration - Exception was: java.lang.OutOfMemoryError: Java heap space [2015-12-01 22:30:42] INFO bootchecks - Check existence of at least one archive of the configuration [2015-12-01 22:30:42] INFO bootchecks - First full archive of configuration-repository items done [2015-12-01 22:30:42] INFO bootchecks - Check rules archive directory in configuration-repository [2015-12-01 22:30:42] INFO bootchecks - Creating directory '/var/rudder/configuration-repository/ruleCategories' exists, done [2015-12-01 22:30:42] INFO bootchecks - Check that directive variables use the namespace 'rudder' [2015-12-01 22:30:49] INFO migration - Migration of inline variables in Directives to 'rudder' namespace already done, skipping [2015-12-01 22:30:49] INFO bootchecks - Check for force reload of Techniques library [2015-12-01 22:30:49] INFO bootchecks - Flag file '/opt/rudder/etc/force_technique_reload' does not exists, do not Technique library will not be reloaded [2015-12-01 22:30:49] INFO bootchecks - Check that system groups 'hasPolicyServer-*' are dynamics and correct them [2015-12-01 22:30:49] INFO application - Initializing plugin 'node external report' [com.normation.plugins.nodeexternalreport.NodeExternalReportPluginDef] [2015-12-01 22:30:49] INFO com.normation.plugins.nodeexternalreport.NodeExternalReportPluginDef - Report plugin correctly loaded [2015-12-01 22:30:49] INFO application - Application Rudder started
Updated by Florian Heigl about 9 years ago
The culprit is the log conversion, I disabled it using fanf's suggestion of setitng a migration END date.
Updated by Florian Heigl about 9 years ago
A note:
Background processing of this migration will not suffice to fix the issue.
It will enable startup of the API and Frontend and is important as such.
But it would still run out of memory. We bumped the server up from 8GB to 32GB without(!) success.
I haven't checked but I think that sum is more than the data to be processed.
You'll need to have a look at the way it is processed, it needs to be chunked somehow.
Updated by Florian Heigl about 9 years ago
It is also uncertain if the agent should be allowed to kill rudder (because API down) while it is in startup phase.
on the other hand startup can get stuck if LDAP or pgsql are down, so I kind of like this feature.
Updated by François ARMAND almost 9 years ago
- Category changed from Server components to Performance and scalability
- Status changed from New to In progress
- Assignee set to François ARMAND
- Priority changed from N/A to 1 (highest)
- Target version set to 2.11.20
The correct solution is to batch update, perhaps starting from most recent logs toward older one. If we do migration by log of 1000 for ex, ram usage should still be quite low, and we just need to recurse after each batch. When we don't find any more log of one kind, skip to the next.
Updated by Janos Mattyasovszky almost 9 years ago
- Related to Bug #7988: Migrate eventlog crashes jetty with OOM added
Updated by Janos Mattyasovszky almost 9 years ago
- Related to deleted (Bug #7988: Migrate eventlog crashes jetty with OOM)
Updated by Janos Mattyasovszky almost 9 years ago
- Has duplicate Bug #7988: Migrate eventlog crashes jetty with OOM added
Updated by François ARMAND almost 9 years ago
- Related to Bug #8001: create an index for detection of eventlog version added
Updated by François ARMAND almost 9 years ago
I have a WIP solution in https://github.com/fanf/rudder/tree/bug_7524/java_oom_during_java_s_log_migration. For now, I took care of the high memory consumption by improving events selected, and batching the selection by group of 1000 (more test needed to see if 5000 or 10 000 are better default choices).
I'm also wondering if it makes sense to let the user be able to customize that batch size.
Next: make the processing asynchrone so that it does not block boot.
The correction won't be in the next bug releases (2.11.19/3.0.14/3.1.8), but in the next one.
Updated by François ARMAND over 8 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from François ARMAND to Nicolas CHARLES
- Pull Request set to https://github.com/Normation/rudder/pull/1068
Updated by François ARMAND over 8 years ago
There is two main point of that change:
- make each migration be only by step of "max batch", so that we can't explose the memory consumption. I set up the batch size to 1000, because eventlog object use ~10kb, but can use to up to ~50kb for very big events (say, group update with thousand of nodes). So with that, we should keep the memory consumption below 100Mo for the migration
- make the process async so that the event log migration, which is not necessary for Rudder to work, is done in background. It can be stop and restart as many time as needed, so no problem if Rudder stop before the end of migration
Updated by François ARMAND over 8 years ago
- Status changed from Pending technical review to Pending release
- % Done changed from 0 to 100
Applied in changeset rudder|e0c4b34ee8c4d9b633ce116bd0848fbd97f0fc72.
Updated by François ARMAND over 8 years ago
- Related to User story #5617: Detecting and restarting Rudder on OOM (Out Of Memory Exception) added
Updated by Vincent MEMBRÉ over 8 years ago
- Status changed from Pending release to Released