Bug #16773
closed
Batch of new nodes can overflow Rudder server with inventories
Description
We decided to make non-accepted nodes send their inventories more often: https://issues.rudder.io/issues/9676 "An agent run with initial promises should send its inventory more often"
The unforeseen effect of that decision is that if you add a bunch of nodes at the same time (in the hundreds), they start spamming the Rudder server with inventories. Inventories are then rejected quite often because the processing queue is full.
If you are unlucky, it is always the same node's inventory that gets processed, and the others never get through.
We should add safeguards on the server side to reject inventories for new nodes that are already in the processing queue (and only new nodes, I believe).
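The safeguard proposed above can be sketched as a bounded queue that holds at most one pending inventory per node. This is only an illustration in Python, not Rudder's actual code: the class name, the method names and the returned status strings are hypothetical, and it assumes the node ID is known before the inventory is parsed (for example from the file name).

```python
from collections import OrderedDict


class PendingInventoryQueue:
    """Bounded FIFO queue holding at most one pending inventory per node."""

    def __init__(self, capacity):
        self.capacity = capacity
        # node_id -> inventory path, kept in arrival order
        self._pending = OrderedDict()

    def offer(self, node_id, inventory_path):
        """Try to queue an inventory; return 'queued', 'duplicate' or 'full'."""
        if node_id in self._pending:
            # The node already has an inventory waiting: reject the new one.
            return "duplicate"
        if len(self._pending) >= self.capacity:
            return "full"
        self._pending[node_id] = inventory_path
        return "queued"

    def take(self):
        """Remove and return the oldest (node_id, path) pair, or None if empty."""
        if not self._pending:
            return None
        return self._pending.popitem(last=False)
```

With this kind of check, a node that keeps resending cannot occupy more than one slot, so the remaining slots stay available for other nodes.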
We should also make nodes send their inventory more often only for one or two hours. The problems described in ticket #9676 don't match the case of a node still not accepted after, say, 3 days.
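The time limit suggested above could look like the following sketch. It is only an illustration in Python, not agent code: the function name, its parameters and the two-hour window are assumptions (the ticket only says "one or two hours").

```python
from datetime import datetime, timedelta

# Assumed window, taken from the "one or two hours" suggestion.
FREQUENT_WINDOW = timedelta(hours=2)


def should_send_frequently(first_run, now, accepted):
    """Return True while a non-accepted node is still inside the window."""
    if accepted:
        # Accepted nodes follow the normal inventory schedule.
        return False
    return (now - first_run) <= FREQUENT_WINDOW
```

A node that is still not accepted after 3 days would then fall back to the normal schedule instead of resending indefinitely.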
Updated by François ARMAND over 4 years ago
- Related to User story #9676: An agent run with initial promises should send its inventory more often added
Updated by Vincent MEMBRÉ over 4 years ago
- Target version changed from 5.0.17 to 5.0.18
Updated by François ARMAND over 4 years ago
- Category set to Performance and scalability
- Target version changed from 5.0.18 to 6.2.0~beta1
- Severity set to Major - prevents use of part of Rudder | no simple workaround
- User visibility set to Operational - other Techniques | Rudder settings | Plugins
- Priority changed from 0 to 24
This change is perhaps too big for a patch version.
Updated by Vincent MEMBRÉ about 4 years ago
- Target version changed from 6.2.0~beta1 to 6.2.0~rc1
- Priority changed from 24 to 46
Updated by François ARMAND about 4 years ago
- Assignee set to François ARMAND
- Priority changed from 46 to 45
Updated by François ARMAND about 4 years ago
- File clipboard-202011132228-mpb2z.png added
- Target version changed from 6.2.0~rc1 to 6.1.7
Actually, it seems like a bug, and something that can be corrected in 6.1. It is just missing a buffer.
(the AddToQL part can deduplicate inventories)
Updated by François ARMAND about 4 years ago
OK, so it's not so simple, because we return an "inventory status" to callers, and that status requires a signature check. That means we need to parse both the signature file (OK, it's small) and the inventory (not OK). Parsing the inventory is needed to get:
- nodeId (used to check certificate subject),
- certificate (for public key).
So, we can make all of that MUCH simpler, but it will be an API change:
- the REST API only copies the inventory / signature files to /var/inventories/received (no special treatment for it),
- same logic for inotify and periodic catch-up,
- only one big queue of the small InventoryFileInfo structure (it can likely hold 10 000 elements for the cost of one parsed inventory),
- dequeue does the XML parsing, signature check, etc. (but it returns Unit, so callers no longer know when processing will happen).
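The single-queue design in the list above can be sketched as follows. This is a minimal illustration in Python, not the Rudder implementation. Only the InventoryFileInfo name and the 10 000 figure come from this ticket. The two fields, the queue class, and the deduplication by file path are assumptions. The expensive work (XML parsing, signature check) is passed in as a handler and runs only at dequeue time.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Set


@dataclass(frozen=True)
class InventoryFileInfo:
    # Hypothetical fields: only the two file paths, no parsed content.
    inventory_path: str
    signature_path: str


class InventoryFileQueue:
    """One queue of small records; parsing is deferred until dequeue."""

    def __init__(self, max_size: int = 10_000) -> None:
        self.max_size = max_size
        self._items: Deque[InventoryFileInfo] = deque()
        self._queued_paths: Set[str] = set()

    def enqueue(self, info: InventoryFileInfo) -> bool:
        """Queue a file once; return False if already queued or queue is full."""
        if info.inventory_path in self._queued_paths:
            return False
        if len(self._items) >= self.max_size:
            return False
        self._items.append(info)
        self._queued_paths.add(info.inventory_path)
        return True

    def process_next(self, handler: Callable[[InventoryFileInfo], None]) -> bool:
        """Run the expensive work on the oldest entry; return False when empty."""
        if not self._items:
            return False
        info = self._items.popleft()
        self._queued_paths.discard(info.inventory_path)
        handler(info)  # e.g. XML parsing and signature check
        return True
```

Because every entry point (REST API, inotify, periodic catch-up) would call the same enqueue, a file discovered twice is queued only once. The trade-off noted in the list remains: the caller gets no result from the handler, so it cannot report an inventory status synchronously.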
For 6.1, we can try to add a buffer on the standard path (i.e. not the REST API), and only there.
Updated by François ARMAND about 4 years ago
- Status changed from New to In progress
Updated by François ARMAND about 4 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from François ARMAND to Nicolas CHARLES
- Pull Request set to https://github.com/Normation/rudder/pull/3367
Updated by François ARMAND about 4 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder|282148b4c899c7a4ed621c9798481aebc794e2bb.
Updated by François ARMAND almost 4 years ago
- Fix check changed from To do to Checked
Updated by Vincent MEMBRÉ almost 4 years ago
- Status changed from Pending release to Released
This bug has been fixed in Rudder 6.1.7 which was released today.