Bug #24652
closed
Rudder 8.1 slows down over time
Added by Nicolas CHARLES 8 months ago.
Updated 5 months ago.
Category:
Performance and scalability
Description
2024_03_16.stderrout.log.gz:2024-03-16 06:28:42+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 3893943 ms
2024_03_16.stderrout.log.gz:2024-03-16 07:32:03+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 3601104 ms
2024_03_16.stderrout.log.gz:2024-03-16 08:02:09+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 720472 ms
2024_03_16.stderrout.log.gz:2024-03-16 10:00:48+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 5868962 ms
2024_03_16.stderrout.log.gz:2024-03-16 12:01:45+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 6048786 ms
2024_03_16.stderrout.log.gz:2024-03-16 14:02:46+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 6057161 ms
2024_03_16.stderrout.log.gz:2024-03-16 17:49:57+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 7617380 ms
2024_03_16.stderrout.log.gz:2024-03-16 20:03:33+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 7807798 ms
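For readability, the durations above can be converted from milliseconds to minutes. The snippet below is a minimal sketch, assuming the grep output layout quoted in this ticket; the names `TIMING_RE` and `parse_timings` are illustrative and are not part of Rudder.

```python
import re

# Matches the "dynamic-group.timing" lines quoted above. Assumes the
# "<date> <time><zone> DEBUG ... finished in <N> ms" layout shown in this ticket.
TIMING_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\S* DEBUG dynamic-group\.timing - "
    r"Computing dynamic groups without dependencies finished in (\d+) ms"
)

def parse_timings(lines):
    """Return (timestamp, duration in minutes) for each matching log line."""
    results = []
    for line in lines:
        match = TIMING_RE.search(line)
        if match:
            results.append((match.group(1), int(match.group(2)) / 60000))
    return results

# First and last lines from the description above.
sample = [
    "2024_03_16.stderrout.log.gz:2024-03-16 06:28:42+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 3893943 ms",
    "2024_03_16.stderrout.log.gz:2024-03-16 20:03:33+0000 DEBUG dynamic-group.timing - Computing dynamic groups without dependencies finished in 7807798 ms",
]

for timestamp, minutes in parse_timings(sample):
    print(f"{timestamp}  {minutes:.1f} min")
```

On these two lines, the computation goes from about 65 minutes in the morning to about 130 minutes in the evening.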
There are objects (`com.normation.rudder.services.reports.CacheComplianceQueueAction$ExpiredCompliance`) piling up in the heap dump, which might be related.
- Assignee set to Nicolas CHARLES
There are roughly 1 million `com.normation.rudder.services.reports.CacheComplianceQueueAction$ExpiredCompliance` objects piling up in 30 minutes.
- Target version changed from 8.1.0~rc1 to 8.1.0
- Target version changed from 8.1.0 to 8.1.1
Creating another ticket for the ExpiredCompliance objects that are piling up. It might not be the root cause, but it is something else to investigate.
- Related to Bug #24712: ExpiredCompliance events are pilling up added
- Related to Bug #24713: Dynamic groups are slow to compute in Rudder 8.1 added
We will need to see if the two linked tickets are enough to remove the slowdown over time.
After 24h on our test machine, it seems to be OK, with memory correctly reclaimed when needed.
There are a lot of CPU spikes; they correlate with generations due to system-update campaigns starting/ending.
- Target version changed from 8.1.1 to 8.1.2
- Target version changed from 8.1.2 to 8.1.3
- Target version changed from 8.1.3 to 8.1.4
Perhaps we have a lead to explore: the node score `handleEvent`, which does a database insert, is called in the `performAction` linked to the `invalidateComplianceReport` queue.
I'm going to try to move it into `fetchRunAndCompliance`, near the place where `ComplianceRepository.saveRunCompliance` is called, which is known to be slow.
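To illustrate why a slow operation in a queue consumer could make events pile up, here is a toy model. It is not Rudder code: the function name `backlog_after` and all rates are hypothetical, chosen only to land in the same order of magnitude as the roughly 1 million objects in 30 minutes reported above. It simply shows that once events arrive faster than a single consumer can handle them, the backlog grows linearly with time.

```python
def backlog_after(seconds, arrivals_per_s, handled_per_s):
    """Events still queued after `seconds`, for a single consumer.

    Toy model: events arrive at a constant rate, and the consumer handles
    a constant number of events per second (e.g. limited by one database
    insert per event).
    """
    return max(0, (arrivals_per_s - handled_per_s) * seconds)

# Hypothetical numbers: 600 events/s arriving for 30 minutes (1800 s).
# Slow consumer handling 50 events/s: the queue keeps growing.
print(backlog_after(1800, 600, 50))

# Faster consumer handling 1000 events/s: the queue stays drained.
print(backlog_after(1800, 600, 1000))
```

In this model, taking the slow insert out of the consumer only helps if the remaining work per event is faster than the arrival rate, which would have to be measured on a real instance.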
- Status changed from New to In progress
- Assignee changed from Nicolas CHARLES to François ARMAND
- Status changed from In progress to Pending technical review
- Assignee changed from François ARMAND to Clark ANDRIANASOLO
- Pull Request set to https://github.com/Normation/rudder/pull/5737
- Target version changed from 8.1.4 to 8.1.5
- Assignee changed from Clark ANDRIANASOLO to Nicolas CHARLES
- Status changed from Pending technical review to Pending release
- Fix check changed from To do to Checked
- Status changed from Pending release to Released
This bug has been fixed in Rudder 8.1.5 which was released today.