Bug #16612

closed

Bug #16557: CachedFindRuleNodeStatusReports is a huge source of contention

Bug #16565: New cache doesn't return all compliance

New compliance cache must return expired data

Added by François ARMAND almost 5 years ago. Updated almost 5 years ago.

Status: Released
Priority: N/A
Category: Performance and scalability
Target version:
Severity:
UX impact:
User visibility:
Effort required:
Priority: 0
Name check: Reviewed
Fix check: Checked
Regression:

Description

Today, we chose to remove expired compliance from the values returned by the cache. But the resulting behavior is strange and surprising when Rudder is under load and compliance takes time to compute.

Imagine that reports arrive regularly and compliance needs continual updates. It may happen (especially after a full cache invalidation following a full generation) that not all cache entries have been updated after 5 minutes. In that case, every node not yet updated disappears from compliance queries, while in fact it is most likely just being processed.

This leads to huge variations in the number of nodes reported, which is scary and unworkable for users.

We propose to add the same latency that exists in other parts of Rudder before declaring expiration (i.e. expiration + 2 runs). During that period, cached data is returned. After that, we return a NoReportInInterval status (i.e. we keep the knowledge that we do have data in cache, but that it is no longer relevant).
Once the compliance is finally computed for the node, if there really was no run in the interval, we will have that information in cache and will stop asking for a compliance update from the cache (the next update will be triggered by a run or a policy generation).
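
To make the proposed rule concrete, here is a minimal sketch in Python (not Rudder's actual Scala code). It assumes a 5-minute run interval and a hypothetical `CachedCompliance` record; only the "expiration + 2 runs" rule and the `NoReportInInterval` status name come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

RUN_INTERVAL = timedelta(minutes=5)  # assumption: agent run interval
GRACE_RUNS = 2                       # "expiration + 2 runs" from the proposal


@dataclass
class CachedCompliance:
    # Hypothetical cache entry, for illustration only.
    node_id: str
    status: str
    expires_at: datetime


def lookup(entry: CachedCompliance, now: datetime) -> CachedCompliance:
    """Return what a compliance query would see for one cached entry."""
    grace_end = entry.expires_at + GRACE_RUNS * RUN_INTERVAL
    if now <= grace_end:
        # Still within the grace period: serve the cached data as-is.
        return entry
    # Past the grace period: keep the node visible, but flag the data as stale.
    return CachedCompliance(entry.node_id, "NoReportInInterval", entry.expires_at)
```

With this rule a node never disappears from the query result: it is either reported with its cached compliance or with the `NoReportInInterval` status.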

Actions #1

Updated by François ARMAND almost 5 years ago

  • Subject changed from New cache must return expired data to New compliance cache must return expired data
Actions #2

Updated by François ARMAND almost 5 years ago

  • Status changed from New to In progress
Actions #3

Updated by François ARMAND almost 5 years ago

In fact, the expiration date already has a grace period.
But we need to sort nodes by expiration date (oldest first) when we compute compliance.
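
The ordering described here could look like the following Python sketch (again, not the actual Rudder code). The `update_order` function and the shape of its input, a mapping from node ID to expiration date, are assumptions made for illustration.

```python
from datetime import datetime
from typing import Dict, List


def update_order(expirations: Dict[str, datetime]) -> List[str]:
    """Return node IDs sorted by expiration date, oldest first."""
    # Nodes whose compliance expires (or expired) earliest are recomputed first,
    # so the stalest entries are the least likely to run past their grace period.
    return sorted(expirations, key=lambda node_id: expirations[node_id])
```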

Actions #4

Updated by François ARMAND almost 5 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from François ARMAND to Nicolas CHARLES
  • Pull Request set to https://github.com/Normation/rudder/pull/2737
Actions #5

Updated by François ARMAND almost 5 years ago

  • Status changed from Pending technical review to Pending release
Actions #6

Updated by François ARMAND almost 5 years ago

  • Fix check changed from To do to Checked
Actions #7

Updated by Alexis Mousset almost 5 years ago

  • Name check changed from To do to Reviewed
Actions #8

Updated by Vincent MEMBRÉ almost 5 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 6.0.3 which was released today.
