Architecture #12621


Explore alternative format for compliance table

Added by François ARMAND almost 5 years ago. Updated almost 5 years ago.

Performance and scalability


Today, we use a `nodecompliance` table where the compliance details column is of type `text` and actually contains JSON. The idea was to store as much information as possible in a format that is very amenable to evolution.

Now that we actually need to use that data, we see three problems:

- 1/ the least serious one is that we store intermediate compliance as a `CompliancePercent` plus the total number of reports. The idea was that we would still be able to reconstruct the source data, because we store everything down to component values and messages, so it seemed better to have a preprocessed format in the middle, usable as is and human readable. In fact, it is better to have precise data at each level: otherwise, to aggregate several compliances, you either need to process the whole data structure (inefficient, and it rules out optimisation / server-side aggregation) or you need to approximate from the percentages and totals.

- 2/ the second one is that compliance details take a lot of space, in the range of 100-500 kB per entry. A node generates around 300 entries (288) per day, so for one week on 100 nodes (quite a small setup) at 150 kB per compliance, we have to deal with about 30 GB of data (288 × 7 × 100 × 150 kB). That takes a lot of space, and moving that much data between PostgreSQL and Rudder for processing takes a lot of time (even if it were only bounded by the network).

- 3/ we are stuck with Postgres 9.2 by default for quite some time, since it is what comes with CentOS/RHEL 7, and there is no simple way to upgrade to a more recent Postgres (simple as in "no need to hand-migrate data, config files, etc., and everything still works in Rudder").
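To illustrate the first point, here is a minimal Python sketch of why aggregating from a percent plus a total is only approximate, while aggregating exact per-level counts is not. The class names, the two compliance levels, and the rounding to a whole percent are all assumptions made for the example; they are not Rudder's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelCounts:
    """Hypothetical exact form: number of reports per compliance level."""
    success: int
    error: int

    @property
    def total(self) -> int:
        return self.success + self.error


@dataclass(frozen=True)
class PercentSummary:
    """Hypothetical preprocessed form: a rounded percent plus a total."""
    success_percent: float
    total_reports: int


def to_percent_summary(counts: LevelCounts, digits: int = 0) -> PercentSummary:
    # Assumption: the percent is rounded before being stored.
    percent = round(100 * counts.success / counts.total, digits)
    return PercentSummary(percent, counts.total)


def merge_counts(items: list[LevelCounts]) -> LevelCounts:
    # Exact aggregation: counts simply add up.
    return LevelCounts(
        success=sum(i.success for i in items),
        error=sum(i.error for i in items),
    )


def estimate_success(items: list[PercentSummary]) -> float:
    # Approximate aggregation: rebuild counts from percent * total.
    return sum(s.success_percent * s.total_reports / 100 for s in items)


# Three nodes, each with 1 success out of 3 reports.
nodes = [LevelCounts(success=1, error=2) for _ in range(3)]

exact = merge_counts(nodes)
approx = estimate_success([to_percent_summary(n) for n in nodes])

print("exact successes:", exact.success, "out of", exact.total)
print("estimated successes from percents:", approx)
```

With exact counts the aggregate is a plain sum that a database can compute server side; with percents the rounding error of every entry accumulates in the result.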

All these considerations lead us to think that we may need to change the data format of node compliance. Perhaps we should have two tables: one for "archiving" (the current one, with all the data) and one for querying, in a different format. Each would have its own separately configurable clean-up period.
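The two-table idea could look roughly like the following Python sketch. It uses an in-memory SQLite database only as a stand-in for PostgreSQL, and the table names, columns, day-number timestamps, and retention values are all hypothetical, chosen only to show one archive table and one compact query table being cleaned up on different schedules.

```python
import json
import sqlite3

# Hypothetical retention settings, in days.
ARCHIVE_TTL_DAYS = 7
QUERY_TTL_DAYS = 30


def create_tables(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE compliance_archive (   -- full JSON details, as today
            node_id TEXT NOT NULL,
            run_day INTEGER NOT NULL,       -- day number, to keep the sketch simple
            details TEXT NOT NULL
        );
        CREATE TABLE compliance_levels (    -- compact exact counts, for querying
            node_id TEXT NOT NULL,
            run_day INTEGER NOT NULL,
            success INTEGER NOT NULL,
            error INTEGER NOT NULL
        );
        """
    )


def record_run(conn, node_id, run_day, details, success, error) -> None:
    # Every run is written to both tables: full details and compact counts.
    conn.execute(
        "INSERT INTO compliance_archive VALUES (?, ?, ?)",
        (node_id, run_day, json.dumps(details)),
    )
    conn.execute(
        "INSERT INTO compliance_levels VALUES (?, ?, ?, ?)",
        (node_id, run_day, success, error),
    )


def cleanup(conn, today: int) -> None:
    # Each table has its own retention period.
    conn.execute(
        "DELETE FROM compliance_archive WHERE run_day < ?",
        (today - ARCHIVE_TTL_DAYS,),
    )
    conn.execute(
        "DELETE FROM compliance_levels WHERE run_day < ?",
        (today - QUERY_TTL_DAYS,),
    )


def row_count(conn, table: str) -> int:
    # The table name comes from this script only, never from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
create_tables(conn)
record_run(conn, "node-1", 0, {"rules": []}, success=10, error=2)
record_run(conn, "node-1", 20, {"rules": []}, success=11, error=1)
cleanup(conn, today=25)

archive_rows = row_count(conn, "compliance_archive")
level_rows = row_count(conn, "compliance_levels")
print("archive rows:", archive_rows, "query rows:", level_rows)
```

The point of the split is that the bulky details can be dropped early while the small count rows stay available for reporting queries over a longer window.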

These problems are related to performance bottlenecks discovered in the "reporting" plugin (for example #12620 and related tickets).

Subtasks 2 (0 open, 2 closed)

Architecture #12648: Add migration script for table nodecompliancelevels (Released, François ARMAND)
Bug #12661: Rudder fails to boot because cleaning TTL property is added after rudder reboot by migration script (Released, Vincent MEMBRÉ)

Related issues 1 (0 open, 1 closed)

Related to Rudder - Bug #14007: Indexes on nodecompliancelevels table are not valid and table ArchivedReportsExecution is never used (Released, François ARMAND)
