Architecture #14923

open

Dynamic groups with regex on software are slow to build, delaying generation

Added by François ARMAND over 5 years ago. Updated almost 3 years ago.

Status: New
Priority: N/A
Category: Performance and scalability
Target version: -
Effort required:
Name check:
Fix check:
Regression:

Description

When a dynamic group is built from a regex on software (name or version), the group can take a long time to resolve (several seconds).
If you have a hundred such groups, the time needed to rebuild all dynamic groups explodes.
This is especially problematic because since #

The reason is that we can't handle regexes on the server side, so the logic is:

- 1/ get all software (which means: LDAP request, get the results, translate them to entries, filter entries; so we have a lot of data on the network, GC churn, etc.)
- 2/ filter by regex in Rudder.

Software is problematic because we commonly have several tens of thousands of entries.

This is even worse because we never trim software, so we may have irrelevant entries.

We already turn trivial regexes like .*something.* into "match substring" requests.
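As an illustration of that rewrite, here is a minimal sketch in Python (not the actual Rudder code). It assumes the regex must match the whole value, and the names `as_substring` and `matches` are hypothetical. A pattern only counts as trivial when the text between the leading and trailing `.*` contains no regex metacharacter.

```python
import re

# Characters with a special meaning in a regex. A "trivial" pattern has none
# of them between its leading and trailing ".*".
_META = set(".^$*+?{}[]\\|()")


def as_substring(pattern: str):
    """Return the literal if pattern is exactly '.*literal.*', else None."""
    if len(pattern) < 4:
        return None
    if not (pattern.startswith(".*") and pattern.endswith(".*")):
        return None
    literal = pattern[2:-2]
    if not literal or any(ch in _META for ch in literal):
        return None
    return literal


def matches(pattern: str, value: str) -> bool:
    """Use a cheap substring test when possible, a full regex match otherwise."""
    literal = as_substring(pattern)
    if literal is not None:
        return literal in value
    # Assumption: the regex has to match the whole value.
    return re.fullmatch(pattern, value) is not None
```

In this sketch, anything that is not provably a plain substring search falls back to the regex path, so the shortcut can only be taken when it gives the same answer on single-line values.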

I don't see any simple way to improve the performance of a single request, but two things could help for a sequence of requests:

- trim software. Less software means better performance, and not doing it is pure technical debt.
- add a local cache of software for the duration of the dynamic group update, populated as soon as at least one dynamic group needs it (i.e. a lazily evaluated parameter). It does not seem harder to do it for all search requests. It is a big change.
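The cache idea could look roughly like the following Python sketch. Everything here is hypothetical (the `SoftwareCache` class, the `loader` callback standing in for the LDAP request, the `(node_id, name, version)` entry shape) and it only shows the lazy, per-update lifetime of the cache, not how Rudder queries are really structured.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical entry shape: (node_id, software_name, software_version)
Software = Tuple[str, str, str]


class SoftwareCache:
    """Loads the full software list at most once, on first use."""

    def __init__(self, loader: Callable[[], Iterable[Software]]):
        self._loader = loader  # stands in for the expensive LDAP request
        self._entries: Optional[List[Software]] = None

    def entries(self) -> List[Software]:
        if self._entries is None:  # lazy: nothing is fetched until needed
            self._entries = list(self._loader())
        return self._entries


def nodes_matching(cache: SoftwareCache, name_regex: str) -> Set[str]:
    """Filter the cached software by regex on the name (whole-value match assumed)."""
    compiled = re.compile(name_regex)
    return {node for node, name, _ in cache.entries() if compiled.fullmatch(name)}


def update_dyn_groups(
    group_regexes: Dict[str, str],
    loader: Callable[[], Iterable[Software]],
) -> Dict[str, Set[str]]:
    """Resolve every group against one cache that lives only for this run."""
    cache = SoftwareCache(loader)
    return {gid: nodes_matching(cache, rx) for gid, rx in group_regexes.items()}
```

With a hundred regex groups, the loader runs once instead of a hundred times, and not at all when no group needs software. The cache is dropped at the end of the update, so it never has to be invalidated.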

Also, we should be able to know exactly which groups need to be regenerated at any point in time.
Here, our strategy is "rebuild everything", which is correct but extremely inefficient (first level).
The second level is "only rebuild dynamic groups if there is an indication that they need it" (e.g. if there has been no modification in any part of node inventories / node properties since the last group update, we can just skip it).
The third level is to have the real dependency graph, and only rebuild the groups that are actually affected by a given modification.

Level 3 is a big change. It will go with the commitid logic.
Level 2 is really not that hard: we already have the information about changes in nodes, so it's just a matter of using it correctly.
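A minimal sketch of the level 2 check, in Python and under stated assumptions: the `DynGroup` record, the `last_node_change` timestamp and the `rebuild` callback are all hypothetical names, and changes to a group's own query are out of scope here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DynGroup:
    group_id: str
    last_update: int  # timestamp of the last rebuild of this group


def update_changed_groups(
    groups: List[DynGroup],
    last_node_change: int,  # timestamp of the latest inventory / property change
    now: int,
    rebuild: Callable[[DynGroup], None],
) -> List[str]:
    """Rebuild only the groups that may be stale; return the ids that were rebuilt."""
    rebuilt = []
    for group in groups:
        # Skip only when the group was rebuilt strictly after the last change.
        # On equal timestamps we rebuild, to stay on the safe side.
        if last_node_change < group.last_update:
            continue
        rebuild(group)
        group.last_update = now
        rebuilt.append(group.group_id)
    return rebuilt
```

This keeps the "rebuild everything" behaviour whenever something changed, and turns the update into a no-op when nothing did.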

This ticket is a header ticket; I will open child tickets for the items we want to address independently.


Subtasks 2 (0 open, 2 closed)

Bug #14924: Cleanup unreferenced software from past inventories (Released, Nicolas CHARLES)
Bug #15069: Merge error in parent broke the build (Released, François ARMAND)

Related issues 2 (1 open, 1 closed)

Related to Rudder - Bug #9609: Deleted node should be periodically fully erased in LDAP (after some ttl) (Released, Nicolas CHARLES)
Related to Rudder - Architecture #14939: Inefficient storage of software in LDAP directory (New)