Bug #26464

Stackoverflow in NodeStatusReports event computing

Added by François ARMAND 6 days ago. Updated 2 days ago.

Status: Pending release
Priority: 1 (highest)
Category: Web - Compliance & node report
Target version:
Severity:
UX impact:
User visibility:
Effort required:
Priority: 0
Name check: To do
Fix check: To do
Regression: No

Description

Under load, we get a StackOverflowError in ComputeNodeStatusReportServiceImpl:

2025-03-04 06:40:42+0100 INFO  policy.generation.timing - Policy generation succeeded in: 1 min 49 s
2025-03-04 06:40:42+0100 INFO  policy.generation.manager - Successful policy update '740168' [started 2025-03-04 06:38:53 - ended 2025-03-04 06:40:42]
java.lang.StackOverflowError
        at java.base/java.lang.invoke.DirectMethodHandle.allocateInstance(DirectMethodHandle.java:520)
        at com.normation.rudder.services.reports.ComputeNodeStatusReportServiceImpl.$anonfun$groupQueueActionByType$1(ComputeNodeStatusReportService.scala:373)
        at scala.Option.map(Option.scala:242)
        at com.normation.rudder.services.reports.ComputeNodeStatusReportServiceImpl.groupQueueActionByType(ComputeNodeStatusReportService.scala:372)
        at com.normation.rudder.services.reports.ComputeNodeStatusReportServiceImpl.$anonfun$groupQueueActionByType$1(ComputeNodeStatusReportService.scala:373)
        at scala.Option.map(Option.scala:242)
        at com.normation.rudder.services.reports.ComputeNodeStatusReportServiceImpl.groupQueueActionByType(ComputeNodeStatusReportService.scala:372)
        at com.normation.rudder.services.reports.ComputeNodeStatusReportServiceImpl.$anonfun$groupQueueActionByType$1(ComputeNodeStatusReportService.scala:373)
        at scala.Option.map(Option.scala:242)
        at com.normation.rudder.services.reports.ComputeNodeStatusReportServiceImpl.groupQueueActionByType(ComputeNodeStatusReportService.scala:372)
        at com.normation.rudder.services.reports.ComputeNodeStatusReportServiceImpl.$anonfun$groupQueueActionByType$1(ComputeNodeStatusReportService.scala:373)
        at scala.Option.map(Option.scala:242)
(the last 3 lines then repeat)

WORKAROUND
This can be worked around by increasing the stack size, which also points to real system contention rather than a logic bug:
=> add -Xss64m to the GC parameters in @/etc/default/rudder-jetty@
Jetty may then refuse to start, because systemd kills it before it has finished processing the pending backlog.
You may need to force-stop jetty, then possibly wait for the agent to repair things, and wait for a couple of generation/report processing cycles before compliance converges back to green.

It looks like a real stack overflow because the groupBy (groupQueueActionByType) is not stack safe, but we need to investigate, understand the root cause, and correct it.
The issue was observed on Rudder 8.2.4, but this code has not changed in more recent versions.
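
To illustrate what "not stack safe" could mean here, the sketch below is based only on what the stack trace shows: a function that calls itself from inside Option.map, so the recursive call is not in tail position. It is not the actual Rudder code. Action, groupUnsafe and groupSafe are hypothetical names, and grouping consecutive actions of the same kind is an assumption about the intent. The second function shows one possible stack-safe shape, using an accumulator and @tailrec.

```scala
import scala.annotation.tailrec

object StackSafeGrouping {
  // Hypothetical stand-in for the real queue action type.
  final case class Action(kind: String, payload: Int)

  // Shape resembling the stack trace: the recursive call sits inside
  // Option.map, so it is not a tail call. Stack depth grows with the
  // number of groups found in the queue.
  def groupUnsafe(queue: List[Action]): List[(String, List[Action])] =
    queue.headOption.map { head =>
      val (same, rest) = queue.span(_.kind == head.kind)
      (head.kind, same) :: groupUnsafe(rest)
    }.getOrElse(Nil)

  // Possible stack-safe variant: results are accumulated in `acc` and the
  // recursive call is in tail position, which @tailrec asks the compiler
  // to verify and compile into a loop.
  def groupSafe(queue: List[Action]): List[(String, List[Action])] = {
    @tailrec
    def loop(
        remaining: List[Action],
        acc: List[(String, List[Action])]
    ): List[(String, List[Action])] =
      remaining match {
        case Nil => acc.reverse
        case head :: _ =>
          val (same, rest) = remaining.span(_.kind == head.kind)
          loop(rest, (head.kind, same) :: acc)
      }
    loop(queue, Nil)
  }
}
```

With a shape like groupUnsafe, a long queue whose actions keep changing kind needs one set of stack frames per group. That would be consistent with the overflow appearing under load and with a larger -Xss only pushing the limit further out.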
