Question #15436

open

How to clean the tables 'nodes' and 'nodes_info' of seemingly out-of-date information?

Added by Kévin Mèche over 4 years ago. Updated over 4 years ago.

Status: New
Priority: N/A
Assignee: -
Category: Web - Maintenance
Target version: -
Regression:

Description

Hello,

After running the database maintenance operations described in the FAQ (1), I explored the database a bit to see which tables take the most space.
(For information, my Rudder machine has ~60 GB of storage and manages exactly 113 nodes.)

rudder=# SELECT pg_size_pretty( pg_database_size('rudder') );
 pg_size_pretty
----------------
 40 GB

So I checked for the heaviest tables, and the winners are:

rudder=# SELECT relname AS "Table",
rudder-#        pg_size_pretty(pg_total_relation_size(relid)) AS "Size",
rudder-#        pg_size_pretty(pg_relation_size(relid)) AS "Relation size",
rudder-#        pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS "External Size"
rudder-# FROM pg_catalog.pg_statio_user_tables
rudder-# ORDER BY pg_total_relation_size(relid) DESC
rudder-# LIMIT 5;
          Table          |  Size   | Relation size | External Size
-------------------------+---------+---------------+---------------
 nodes_info              | 16 GB   | 4392 kB       | 16 GB
 nodes                   | 14 GB   | 3405 MB       | 10 GB
 archivedruddersysevents | 3237 MB | 2706 MB       | 530 MB
 ruddersysevents         | 2994 MB | 1429 MB       | 1565 MB
 archivednodecompliance  | 842 MB  | 699 MB        | 143 MB
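
The "External Size" column above lumps indexes and TOAST storage together. A minimal sketch of a query that separates the two for the largest tables is given here, using the standard PostgreSQL size functions `pg_indexes_size` and `pg_table_size`; the column aliases are only illustrative.

```sql
-- Split the total size of each table into heap, indexes, and
-- everything else stored with the table (TOAST, free space map,
-- visibility map).
SELECT relname AS "Table",
       pg_size_pretty(pg_relation_size(relid)) AS "Heap",
       pg_size_pretty(pg_indexes_size(relid))  AS "Indexes",
       pg_size_pretty(pg_table_size(relid) - pg_relation_size(relid)) AS "TOAST and maps"
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 5;
```

A table with a tiny heap and a very large "TOAST and maps" value, as `nodes_info` appears to be, would suggest a few rows holding very large field values rather than a large number of rows.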

So two tables account for about three quarters of the database size (roughly 30 GB out of 40 GB).
I did a bit of exploration (thanks to ticket #9518) and found that some nodes have a long history stored in the database, even for deleted reports or old configuration checks.

For the table `nodes`, it seems I can safely remove historical entries with a query like DELETE FROM nodes WHERE endtime < (now() - interval '7 days'), but I fear it might break something, so I have not run it.

rudder=# select count(*) from nodes where endtime < (now() - interval '7 days');
  count
----------
 29083989
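
Before choosing a cutoff, it may help to see how those closed entries are spread over time. The following is a sketch only; it assumes `endtime` is a timestamp column (as the comparison with `now()` above implies) and that entries still in use have no `endtime` set.

```sql
-- Count closed entries per month to see where the bulk of the
-- history sits before deciding on a retention period.
SELECT date_trunc('month', endtime) AS month,
       count(*)                     AS closed_entries
FROM nodes
WHERE endtime IS NOT NULL   -- assumption: NULL means "still current"
GROUP BY 1
ORDER BY 1;
```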

So, my questions are:
- Is the query written above (or a cleaner one) viable?
- Is there a way to clean the `nodes_info` table of what seem to be out-of-date values?

Warm regards,

---
1. https://faq.rudder.io/knowledge-bases/2/articles/14-database-is-using-too-much-space


Related issues: 1 (0 open, 1 closed)

Related to Rudder - Bug #17778: table nodes contains on entry per node per generation, which is too much - Released - François ARMAND
Actions #1

Updated by Kévin Mèche over 4 years ago

  • Tracker changed from Bug to Question
  • Priority deleted (0)
  • Name check deleted (To do)
  • Fix check deleted (To do)
Actions #2

Updated by Kévin Mèche over 4 years ago

  • Description updated (diff)
Actions #3

Updated by Kévin Mèche over 4 years ago

  • Description updated (diff)
Actions #4

Updated by Nicolas CHARLES over 4 years ago

Hi Kévin,

Thank you for your ticket.
I've never seen such large disk usage for the table nodes, so there is definitely something odd here.
The goal of the table "nodes" is to store historical data for nodes, such as the name a node had at a given time, or the fact that a node with a given name existed in the past. It is used mainly for the event logs.
So you should be able to delete old entries, but to be safe I wouldn't delete any entries younger than 1 or 2 months.

As for nodes_info, it stores the config_ids that are still relevant (that is, all those with a current value), and they are purged when a new config id appears.
There's a catch here: it seems that nodes_info entries for deleted nodes are never purged (at least on my test system). Could it be that you deleted a lot of nodes?

If you run
select count(*) from nodes_info;
you should get a count of 113 (one row per node). How many do you have?

Actions #5

Updated by Kévin Mèche over 4 years ago

Hello Nicolas,

In fact, we have added and deleted some nodes, up to 10 in total (if my memory serves me right).

Does the fact that some nodes are switched off and on (for maintenance operations, like adding RAM/ROM to the VM) have any impact?

To answer your question:

rudder=# select count(*) from nodes_info;
 count
-------
   113
(1 row)

As for the deletion, since this is a production environment, I would like a review of the statement before running it:

DELETE FROM nodes WHERE endtime < (now() - interval '2 months');
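
One cautious way to run such a deletion on a production database is sketched below. It wraps the statement in a transaction so that the number of affected rows can be checked before committing. The 2-month cutoff comes from this thread; taking a backup first and running with the Rudder services stopped are assumptions about good practice, not Rudder requirements stated here.

```sql
BEGIN;

-- now() is fixed at transaction start, so the count and the delete
-- below use exactly the same cutoff.
SELECT count(*) FROM nodes
WHERE endtime < (now() - interval '2 months');

-- Rows whose endtime is NULL do not match the comparison and are kept.
DELETE FROM nodes
WHERE endtime < (now() - interval '2 months');

-- psql prints "DELETE n"; if n does not match the count above,
-- issue ROLLBACK instead of COMMIT.
COMMIT;

-- A plain VACUUM only makes the freed space reusable by PostgreSQL.
VACUUM (VERBOSE, ANALYZE) nodes;

-- To return the space to the operating system, the table has to be
-- rewritten, which takes an exclusive lock for the duration:
-- VACUUM FULL nodes;
```

Note that `VACUUM` cannot run inside a transaction block, which is why it comes after the `COMMIT`.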

But for `nodes_info`, how can I clean this table without breaking the history?

In the worst case, since I have stopped the service, I thought of deleting all information related to the nodes (and nothing else: I want to keep the directives/rules/... setup), then re-accepting the nodes afterwards.
For this worst-case scenario, is there a way to do this cleanly?


A bit of data exploration in the `nodes_info` table showed me that my oldest nodes all have a tremendous amount of data in their config_ids field.
I thought of keeping only the first and last entries of that array. Is that a good approach?
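
That kind of exploration can be sketched with `pg_column_size`, which reports the stored (possibly compressed) size of a value. The `config_ids` column is named in this thread; `node_id` is an assumed name for the key column and should be checked with `\d nodes_info` first.

```sql
-- List the rows of nodes_info whose config_ids value is largest.
SELECT node_id,  -- assumption: name of the column identifying the node
       pg_size_pretty(pg_column_size(config_ids)::bigint) AS config_ids_size
FROM nodes_info
ORDER BY pg_column_size(config_ids) DESC
LIMIT 10;
```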

Warm regards,

Actions #6

Updated by Nicolas CHARLES over 4 years ago

Thank you for your answer; I'm still struggling to understand why nodes_info would be so large.
Which version of Rudder are you using?

Actions #7

Updated by Kévin Mèche over 4 years ago

We are using version 5.0.4 at the moment.
I plan to update once the database has been cleaned.

Actions #8

Updated by Nicolas CHARLES over 4 years ago

For nodes_info, you can safely purge all old data (old data == older than one agent run, except the latest entry).
If you can accept compliance being unknown for the length of an agent run, you can truncate the table nodes_info and trigger a full generation, which would be much easier.

Deleting the nodes wouldn't help you here.
Once you update to 5.0.12 or later, you'll also have better cleaning of old data (especially in the LDAP directory).
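
The truncate option mentioned above could look like the following sketch. The SQL part uses only standard PostgreSQL statements; triggering the full generation is done from Rudder itself and is not an SQL step, so it appears only as a comment.

```sql
BEGIN;

-- Sanity check before emptying the table: one row per node is
-- expected (113 in this thread).
SELECT count(*) FROM nodes_info;

-- TRUNCATE is transactional in PostgreSQL, so ROLLBACK is still
-- possible at this point. Unlike DELETE, it releases the disk space
-- once committed, without a separate VACUUM FULL.
TRUNCATE TABLE nodes_info;

COMMIT;

-- Next: trigger a full policy generation from Rudder (not shown here).
-- Compliance is expected to be unknown until the next agent run.

-- Afterwards, confirm that the space was released.
SELECT pg_size_pretty(pg_total_relation_size('nodes_info'));
```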

Actions #9

Updated by Kévin Mèche over 4 years ago

Hello,

Sorry for the late answer.
I will run the truncate on the `nodes_info` table and trigger the full regeneration (clearing the policy cache) this afternoon.

Regards,

Actions #10

Updated by Nicolas CHARLES almost 4 years ago

  • Related to Bug #17778: table nodes contains on entry per node per generation, which is too much added