Bug #4497

closed

Rudder web UI freezes when too many inventories are received at the same time

Added by François ARMAND about 10 years ago. Updated about 10 years ago.

Status:
Released
Priority:
N/A
Category:
Performance and scalability

Description

When testing the scalability of Rudder along the number-of-nodes axis, we demonstrated that the endpoint (the war in charge of parsing and saving inventories) may consume all the memory allocated to the Jetty web server, deeply impacting the usability of the Rudder UI (the second war) and leading to monstrous response times, or even a complete stall of the web UI.

The reason is that the endpoint application's processing of an incoming inventory is split into 3 parts:

- 1/ handling of the HTTP request. That part is responsible for validating that we have a correct HTTP request (POST, correct URL, posted document available, etc.);
- 2/ checking that the posted document is actually an XML file that we can parse as a FusionInventory report, and that it contains the tags required by Rudder (UUID, etc.);
- 3/ actually saving the report with the correct status in our LDAP database (checking if it is already present, updating what needs to be, etc.).

Part 1/ is handled by Jetty; nothing much to say about it.
Parts 2/ and 3/ are asynchronous, so that at the end of part 2/ we are already able to answer the HTTP request ("ok, I'm processing your inventory", "failed precondition", or another error status). A queue is therefore used to communicate between steps 2/ and 3/.
The problem is that 2/ is much quicker than 3/, so parsed documents accumulate in the queue, and a parsed XML document may take quite a lot of memory (from a few MB to tens of MB for big inventories).
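
The producer/consumer mismatch can be sketched as follows. This is a hypothetical simulation with assumed rates (10 inventories parsed per second, 1 saved per second), not Rudder's actual code; it only shows how an unbounded queue accumulates backlog:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class QueueBacklog {
    // Simulate `seconds` of operation: each simulated second, step 2/ enqueues
    // `parsedPerSec` parsed inventories while step 3/ dequeues `savedPerSec`.
    // Returns the number of parsed documents left waiting in memory.
    static int backlogAfter(int parsedPerSec, int savedPerSec, int seconds) {
        Queue<String> queue = new ArrayDeque<>();
        for (int s = 0; s < seconds; s++) {
            for (int i = 0; i < parsedPerSec; i++) {
                queue.add("parsed-inventory-xml"); // several MB each in reality
            }
            for (int i = 0; i < savedPerSec && !queue.isEmpty(); i++) {
                queue.poll(); // LDAP save completes, entry is freed
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        // After one minute, (10 - 1) * 60 = 540 parsed documents are waiting.
        System.out.println(backlogAfter(10, 1, 60)); // prints 540
    }
}
```

With each entry weighing a few MB, a backlog of a few hundred documents is enough to exhaust a modest heap.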

At that point, we reach a classical JVM memory exhaustion, where the GC can't free enough memory for the next action, and so it spends more and more time trying to free memory while less and less is available.

When we sent inventories less frequently (one every ten seconds in our tests), so that step 3/ could complete before a new inventory arrived, we were able to sustain the reception of hundreds of inventories without any impact on performance.

We suspect that this problem may be the root cause of several reported stalls of the Rudder web application, like #4425.

Solution:
The first (easy) step is to have two separate Java web servers (Jetty or other), so that neither impacts the other (for reference, an endpoint needs less than 256 MB of heap space to work correctly when the queue is bounded).
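
As an illustration only (the file and variable names below are assumptions, not Rudder's actual packaging), the split could look like two JVMs with independent heap limits:

```shell
# Hypothetical per-instance settings: one JVM per war, so a full heap in the
# endpoint can no longer starve the UI. Names are illustrative.
JETTY_WEB_OPTS="-Xmx1024m"      # rudder-web UI instance
JETTY_ENDPOINT_OPTS="-Xmx256m"  # inventory endpoint instance; <256 MB suffices
                                # once the inventory queue is bounded
```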

A second possibility (or step) is to bound the maximum number of queued inventories allowed in the endpoint.
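
A minimal sketch of such a bound, assuming a blocking queue whose rejection is translated into an HTTP 503 so the agent can retry later (class and method names are illustrative, not Rudder's API):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedInventoryQueue {
    private final BlockingQueue<String> queue;

    BoundedInventoryQueue(int maxSize) {
        // Hard cap on queued parsed inventories: memory use is bounded by design.
        this.queue = new ArrayBlockingQueue<>(maxSize);
    }

    // Returns an HTTP-style status: 202 if the inventory was queued for step 3/,
    // 503 if the queue is full and the sender should retry later.
    int submit(String parsedInventory) {
        return queue.offer(parsedInventory) ? 202 : 503;
    }

    public static void main(String[] args) {
        BoundedInventoryQueue q = new BoundedInventoryQueue(2);
        System.out.println(q.submit("node-1")); // 202: accepted
        System.out.println(q.submit("node-2")); // 202: accepted
        System.out.println(q.submit("node-3")); // 503: queue full, retry later
    }
}
```

Answering 503 rather than blocking keeps the HTTP threads free; the sending script then only needs a retry loop on that status.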

A third possibility (evolution) is to transform what is today the "endpoint" web application into a daemon in charge of reading inventory files from the incoming directory and processing them (limiting the number of inventories processed concurrently to one, or another maximum chosen according to the concurrency capabilities of the machine).
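
Such a daemon could be sketched as below. This is a hypothetical illustration (directory layout, file extension, and processing step are all assumptions): by iterating over files sequentially, at most one parsed inventory is ever held in memory, and the filesystem itself plays the role of the queue:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class InventoryDaemon {
    // Process every file found in `incoming`, strictly one at a time, and
    // delete each file once handled. Returns the processed file names.
    static List<String> processIncoming(Path incoming) throws IOException {
        List<String> processed = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(incoming)) {
            for (Path file : files) {
                // Parsing + LDAP save would happen here; only this one
                // inventory is in memory at any given moment.
                processed.add(file.getFileName().toString());
                Files.delete(file); // done: remove it from the on-disk queue
            }
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("incoming");
        Files.writeString(dir.resolve("node-1.ocs"), "<inventory/>");
        Files.writeString(dir.resolve("node-2.ocs"), "<inventory/>");
        System.out.println(processIncoming(dir).size()); // prints 2
    }
}
```

A pool of N worker threads instead of the single loop would give the "other max number" variant, with memory bounded by N inventories.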


Subtasks 4 (0 open, 4 closed)

Bug #4522: Adapt send-clean.sh script to retry inventory sending when endpoint returns 503 code (Released, Jonathan CLARKE, 2014-02-25)
Bug #4523: Migration script to add waiting.inventory.queue.size property to configuration (Released, Jonathan CLARKE, 2014-02-25)
Bug #4646: 'waiting.inventory.queue.size' property not added in correct property file (Released, Jonathan CLARKE, 2014-03-18)
Bug #4656: missing directory in spec, cannot build on rpm (Released, Vincent MEMBRÉ, 2014-03-19)

Related issues 2 (0 open, 2 closed)

Related to Rudder - Bug #4425: Rudder should not query all nodes when checking login (Rejected, François ARMAND, 2014-01-31)
Related to Rudder - Bug #4349: Switching tabs in the webapp is extremely slow (Rejected, 2014-01-13)