Bug #16557
CachedFindRuleNodeStatusReports is a huge source of contention
Description
As soon as we have a loaded Rudder, with regular policy generation, incoming runs, and users clicking in the UI, the compliance cache becomes a HUGE source of contention. This is because we have a lock on both read and write.
There is no reason to do so, because we don't mind showing slightly outdated compliance for nodes: we have a timestamp on it and are able to tell the user. But we really do care about displaying something, and about not blocking the whole app because of that.
So we can have a logic similar to:
- we have an unlocked cache, for both writes and reads
- all writes pass through an invalidation queue (a set of nodes to update). It asynchronously computes compliance for the updated nodes and updates the cache values. I'm not even sure we need a lock here.
- all reads are free from locking.
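The logic above could be sketched roughly as follows. This is only an illustration under assumptions, not the code from the actual fix: `NodeId`, `Compliance`, `ComplianceCache`, `compute` and `processPending` are hypothetical names, and the sketch assumes a single background worker drains the queue.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.atomic.AtomicReference

// Hypothetical stand-ins for the real Rudder types.
final case class NodeId(value: String)
final case class Compliance(percent: Double, computedAt: Long)

// `compute` is an assumed function that recomputes compliance for one node.
final class ComplianceCache(compute: NodeId => Compliance) {
  // Immutable snapshot swapped atomically: readers never take a lock.
  private val cache =
    new AtomicReference[Map[NodeId, Compliance]](Map.empty)

  // Invalidation queue: writers only enqueue node ids and return at once.
  private val pending = new ConcurrentLinkedQueue[NodeId]()

  // Lock-free read. The value may be slightly outdated; `computedAt`
  // lets the caller tell the user how old it is.
  def get(id: NodeId): Option[Compliance] = cache.get().get(id)

  // Lock-free write path: just record which nodes need a refresh.
  def invalidate(ids: Iterable[NodeId]): Unit =
    ids.foreach(id => pending.add(id))

  // Drains the queue, deduplicates it into a set, recomputes compliance
  // and publishes a new snapshot. Intended to be called from a single
  // background worker. Returns the set of nodes that were refreshed.
  def processPending(): Set[NodeId] = {
    val batch =
      Iterator.continually(pending.poll()).takeWhile(_ != null).toSet
    if (batch.nonEmpty) {
      val updated = batch.map(id => id -> compute(id)).toMap
      cache.updateAndGet(current => current ++ updated)
    }
    batch
  }
}
```

With this shape, a UI request calling `get` and an incoming run calling `invalidate` never wait on each other; only the worker calling `processPending` pays for the compliance computation, and readers keep seeing the previous value until the new snapshot is published.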
Typical contention:
"zio-rudder-mix-10" - Thread t@71
   java.lang.Thread.State: BLOCKED
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports.$anonfun$invalidate$1(ReportingServiceImpl.scala:215)
    - waiting to lock <2cf07dc8> (a com.normation.rudder.services.reports.CachedReportingServiceImpl) owned by "pool-5-thread-12" t@483
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$$Lambda$2089/243809471.apply(Unknown Source)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:57)
    at scala.concurrent.package$.blocking(package.scala:146)
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports.invalidate(ReportingServiceImpl.scala:214)
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports.invalidate$(ReportingServiceImpl.scala:214)
    at com.normation.rudder.services.reports.CachedReportingServiceImpl.invalidate(ReportingServiceImpl.scala:94)
    at com.normation.rudder.reports.execution.ReportsExecutionService.$anonfun$new$2(ReportsExecutionService.scala:87)
    at com.normation.rudder.reports.execution.ReportsExecutionService$$Lambda$2086/719053554.apply$mcV$sp(Unknown Source)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at zio.internal.FiberContext.evaluateNow(FiberContext.scala:404)
    at zio.internal.FiberContext.$anonfun$evaluateLater$1(FiberContext.scala:602)
    at zio.internal.FiberContext$$Lambda$230/219812012.run(Unknown Source)
    at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)

"scala-execution-context-global-597" - Thread t@597
   java.lang.Thread.State: BLOCKED
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports.$anonfun$checkAndUpdateCache$1(ReportingServiceImpl.scala:229)
    - waiting to lock <2cf07dc8> (a com.normation.rudder.services.reports.CachedReportingServiceImpl) owned by "pool-5-thread-12" t@483
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports$$Lambda$2096/1223677606.apply(Unknown Source)
    at scala.concurrent.impl.ExecutionContextImpl$DefaultThreadFactory$$anon$1$$anon$2.block(ExecutionContextImpl.scala:75)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3313)
    at scala.concurrent.impl.ExecutionContextImpl$DefaultThreadFactory$$anon$1.blockOn(ExecutionContextImpl.scala:87)
    at scala.concurrent.package$.blocking(package.scala:146)
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports.checkAndUpdateCache(ReportingServiceImpl.scala:228)
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports.findRuleNodeStatusReports(ReportingServiceImpl.scala:283)
    at com.normation.rudder.services.reports.CachedFindRuleNodeStatusReports.findRuleNodeStatusReports$(ReportingServiceImpl.scala:280)
    at com.normation.rudder.services.reports.CachedReportingServiceImpl.findRuleNodeStatusReports(ReportingServiceImpl.scala:94)
    at com.normation.rudder.web.services.AsyncComplianceService$NodeCompliance.computeCompliance(AsyncComplianceService.scala:122)
    at com.normation.rudder.web.services.AsyncComplianceService$ComplianceBy.$anonfun$futureCompliance$1(AsyncComplianceService.scala:100)
    at com.normation.rudder.web.services.AsyncComplianceService$ComplianceBy$$Lambda$7099/308232004.apply(Unknown Source)
    at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
    at scala.concurrent.Future$$$Lambda$1629/561133045.apply(Unknown Source)
    at scala.util.Success.$anonfun$map$1(Try.scala:255)
    at scala.util.Success.map(Try.scala:213)
    at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
    at scala.concurrent.Future$$Lambda$1631/416579056.apply(Unknown Source)
    at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
    at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
    at scala.concurrent.impl.Promise$$Lambda$1635/643434827.apply(Unknown Source)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
    at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Updated by François ARMAND almost 5 years ago
- Status changed from New to In progress
Updated by François ARMAND almost 5 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from François ARMAND to Nicolas CHARLES
- Pull Request set to https://github.com/Normation/rudder/pull/2717
Updated by François ARMAND almost 5 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder|cd450bb78764fac7fa438976e08181a2aedefcea.
Updated by François ARMAND almost 5 years ago
- Related to Bug #16382: Improve performance of policy generation writer added
Updated by François ARMAND almost 5 years ago
- Fix check changed from To do to Checked
Updated by Vincent MEMBRÉ over 4 years ago
- Status changed from Pending release to Released
This bug has been fixed in Rudder 6.0.3 which was released today.
Updated by Vincent MEMBRÉ over 4 years ago
- Related to Bug #17341: Compliance data for reporting plugin are not generated anymore added