Bug #4419

closed

Missing Categories / Groups after a rudder-server-root migration

Added by Cedric JARDIN almost 11 years ago. Updated over 9 years ago.

Status: Rejected
Priority: 3
Assignee: -
Category: Web - Nodes & inventories

Description

Hello,

Some categories / groups are missing after a "migration".
Clearing the caches doesn't help.

Documentation used for the migration:

http://www.rudder-project.org/rudder-doc-2.6/rudder-doc.html#_server_migration

Test case :

Server A (source): CentOS release 6.3 + rudder-server-root 2.6.10 [this server previously ran 2.5.4, then 2.5.5, and was then upgraded to 2.6] => all categories/groups/rules are present.
Server B (new one): CentOS release 6.5 + rudder-server-root 2.6.10 [fresh install]

Example of categories / groups (category and sub-category labels include "-"; group labels include "_"):

- toto
-- Dev-toto
----------dev_toto : static
-- Prod-toto
----------prod_toto : static

Example of error :

[2014-01-27 10:12:02] WARN com.normation.rudder.web.components.RuleGrid - Disabling rule '00_PROD - gauntlt' (ID: '230d9328-6276-4686-9447-ac3d7d66e4fd') because it refers missing objects. Go to rule's details and save, then enable it back to correct the problem.
[2014-01-27 10:12:02] WARN com.normation.rudder.web.components.RuleGrid - Rule '00_PROD - gauntlt' (ID: '230d9328-6276-4686-9447-ac3d7d66e4fd' target problem: Error when retrieving the entry for NodeGroup 'NodeGroupId(53fae266-c6b8-49a2-8820-0334e9a4cf73)'

Enabling and saving the rule doesn't change anything (although we do get the "success, your changes have been saved" popup).

I found that some categories and some groups have not been re-created.

Here is the error message in the stderrout.log :

[2014-01-27 10:08:04] INFO com.normation.rudder.repository.xml.ItemArchiveManagerImpl - Importing groups archive with id 'HEAD'
[2014-01-27 10:08:14] ERROR com.normation.rudder.batch.AsyncDeploymentAgent$DeployerAgent - Error when doing deployment, reason Cannot build target configuration node <- Error when retrieving the entry for NodeGroup 'NodeGroupId(9216614d-e060-410a-94b9-ffa13fb5274b)'
[2014-01-27 10:08:14] ERROR com.normation.rudder.batch.AsyncDeploymentAgent - Deployment error for process '13' at 2014/01/27 10:08:14: Cannot build target configuration node
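
To list every group the server complains about, the NodeGroup IDs can be pulled out of log lines like the ones above. This is a minimal Python sketch, assuming the messages keep the `NodeGroupId(<uuid>)` shape seen in these excerpts; the function name `missing_group_ids` is hypothetical and not part of Rudder.

```python
import re

# Matches the "NodeGroupId(<uuid>)" fragment seen in the log excerpts.
NODE_GROUP_RE = re.compile(r"NodeGroupId\(([0-9a-fA-F-]{36})\)")


def missing_group_ids(log_lines):
    """Return the distinct NodeGroup IDs found in log_lines, in first-seen order."""
    seen = []
    for line in log_lines:
        for group_id in NODE_GROUP_RE.findall(line):
            if group_id not in seen:
                seen.append(group_id)
    return seen


# Shortened copies of two messages from this report.
sample = [
    "WARN RuleGrid - Error when retrieving the entry for NodeGroup "
    "'NodeGroupId(53fae266-c6b8-49a2-8820-0334e9a4cf73)'",
    "ERROR DeployerAgent - Error when retrieving the entry for NodeGroup "
    "'NodeGroupId(9216614d-e060-410a-94b9-ffa13fb5274b)'",
]
for group_id in missing_group_ids(sample):
    print(group_id)
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines.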

Workaround (too long and boring if, like us, you have a lot of categories / groups):
- manually recreate the "Prod-toto" category with the "prod_toto" group
- add the group to the list of groups targeted by the rule
- save the rule and re-enable it

BR

Cedric

Actions #1

Updated by Matthieu CERDA almost 11 years ago

  • Category set to Techniques
  • Status changed from New to 8
  • Priority changed from N/A to 1 (highest)
  • Target version set to 2.6.11

Hi Cédric, I acknowledge this bug!

Passing to François, as he seems to know a bit more than me about this :)

Actions #2

Updated by Cedric JARDIN almost 11 years ago

Caches have been cleared on both rudder-server-root servers.

Some categories / groups are also missing on Server A (the same ones as on Server B).

I think the issue arose during the "upgrade" of Server A [from 2.5 to 2.6.10] and that it is due to a DB issue.

As you know, Server A is a clone of our prod server and Server B is a freshly installed server.

I found a difference between these 3 servers (the DB size):

Server prod [2.5.4] =>
[root@servprod base]# pwd
/var/lib/pgsql/data/base
[root@servprod base]# du -sh *
5,5M 1
5,5M 11563
5,5M 11564
81G 16385
4,0K pgsql_tmp

Server A [2.5.4 upgraded in 2.5.5 and finally in 2.6.10] =>
[root@servA base]# pwd
/var/lib/pgsql/data/base
[root@serv1 base]# du -sh *
5,4M 1
5,4M 11563
5,5M 11564
54G 16385
4,0K pgsql_tmp

Server B [2.6.10] =>
[root@servB base]# pwd
/var/lib/pgsql/data/base
[root@servB base]# du -sh *
5,3M 1
5,3M 11563
5,4M 11564
26M 16385

I hope this information will help to find the solution! :)

Actions #3

Updated by François ARMAND almost 11 years ago

  • Description updated (diff)
Actions #4

Updated by Cedric JARDIN almost 11 years ago

As requested, here is more information:

[root@servprod report]# ./rudder_metrics_reporting.sh
Number of expected reports (components*directives*nodes): 32643
Number of rules: 119 => 129 defined
Number of directives: 261
Number of nodes: 119
Number of reports for one day: 8026788
Report database size: 76 GB
Number of lines in reports table: 6.3087e+07
Full database size: 80 GB
Archiving reports:
archive.TTL=30
delete.TTL=90
frequency=daily

[root@servA report]# ./rudder_metrics_reporting.sh
Number of expected reports (components*directives*nodes): 32184
Number of rules: 116 => 125 defined
Number of directives: 256
Number of nodes: 118
Number of reports for one day: 10174
Report database size: 49 GB
Number of lines in reports table: 33611
Full database size: 54 GB
Archiving reports:
archive.TTL=30
delete.TTL=90
frequency=daily

[root@servB report]# ./rudder_metrics_reporting.sh
Number of expected reports (components*directives*nodes): 31
Number of rules: 3 => 125 defined
Number of directives: 3
Number of nodes: 0
Number of reports for one day: 9887
Report database size: 17 MB
Number of lines in reports table: 32730
Full database size: 25 MB
Archiving reports:
archive.TTL=30
delete.TTL=90
frequency=daily

The values in bold are the ones displayed in the web interface...

I'm available to run new tests and give more information. I can also clone the prod server (again) to relaunch the upgrade from 2.5.4 to 2.6.10 and give you more logs.

I hope there is a solution, because for the moment we can't move from our 2.5.4 to 2.6.10.

Actions #5

Updated by Cedric JARDIN almost 11 years ago

I cleaned the reports before leaving the office on 31/01.
The TTL has been changed in each /opt/rudder/etc/rudder-web.properties file.
The DB size has not been reduced.

Another point: we still don't know why not all categories / groups were recreated (our first issue)...

FYI (new metrics reports) =>

[root@servprod report]# ./rudder_metrics_reporting.sh
Number of expected reports (components*directives*nodes): 32639
Number of rules: 119
Number of directives: 266
Number of nodes: 119
Number of reports for one day: 8000424
Report database size: 76 GB
Number of lines in reports table: 2.24742e+07
Full database size: 80 GB
Archiving reports:
archive.TTL=6
delete.TTL=7

[root@servA report]# ./rudder_metrics_reporting.sh
Number of expected reports (components*directives*nodes): 32184
Number of rules: 116
Number of directives: 256
Number of nodes: 118
Number of reports for one day: 10244
Report database size: 49 GB
Number of lines in reports table: 33399
Full database size: 54 GB
Archiving reports:
archive.TTL=6
delete.TTL=7
frequency=daily

[root@servB report]# ./rudder_metrics_reporting.sh
Number of expected reports (components*directives*nodes): 31
Number of rules: 3
Number of directives: 3
Number of nodes: 0
Number of reports for one day: 9852
Report database size: 16 MB
Number of lines in reports table: 31822
Full database size: 24 MB
Archiving reports:
archive.TTL=6
delete.TTL=7
frequency=daily

Actions #6

Updated by François ARMAND almost 11 years ago

Hello,

For the categories, I haven't succeeded in reproducing the problem, so I'm wondering if there is something special about your dataset that I haven't thought of. If possible, could you send us the archive that produces the bug (privately if you want or need to, and if it's big we can use a protected sharing site like dl.free.fr), so that we can try to import it?

For the database size, there seems to be an inconsistency between the first two reports:

- first base: 32639 components, 8 000 000 reports per day, 2e7 lines (so it seems almost consistent for 6 or 7 days)
- second base: 32184 components, 10 000 reports per day, 33399 lines

Has the second base been cleaned up recently?
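
One way to put a number on the mismatch between the bases is to divide the report database size by the number of lines in the reports table. The Python sketch below uses the figures from the latest `rudder_metrics_reporting.sh` outputs in this thread; treating the script's "GB" and "MB" as binary units is an assumption, and the helper name `bytes_per_line` is hypothetical.

```python
# Figures copied from the rudder_metrics_reporting.sh outputs posted above.
# Assumption: "GB" and "MB" are treated as binary units (GiB / MiB).
GIB = 1024 ** 3
MIB = 1024 ** 2

REPORTS = {
    "servprod": {"db_bytes": 76 * GIB, "lines": 22_474_200},  # 2.24742e+07 lines
    "servA": {"db_bytes": 49 * GIB, "lines": 33_399},
    "servB": {"db_bytes": 16 * MIB, "lines": 31_822},
}


def bytes_per_line(db_bytes, lines):
    """Average on-disk size per line of the reports table."""
    return db_bytes / lines


for name, data in REPORTS.items():
    size = bytes_per_line(data["db_bytes"], data["lines"])
    print(f"{name}: {size:,.0f} bytes per line")
```

With these figures, servA uses several hundred times more space per line than servprod, which would be consistent with lines having been deleted without the disk space being reclaimed.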

Also, something to know about PostgreSQL: it does not give back the disk space it has used; you have to reclaim it (otherwise PostgreSQL keeps the space for itself and just marks it as "clean, can be used to store other things"). But as the clean-up process locks the base for the duration of the clean-up, we don't do it automatically. If you want to do it, please see this part of the manual: http://www.rudder-project.org/rudder-doc-2.9/rudder-doc.html#_database_maintenance
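
As an illustration of such a manual clean-up, here is a minimal Python sketch that builds a `VACUUM FULL` call through `psql` and only runs it on request. The database name `rudder`, the user `rudder`, and the table `ruddersysevents` are assumptions that are not stated in this thread; check them against the manual section linked above before running anything.

```python
import shlex
import subprocess


def vacuum_full_command(database="rudder", user="rudder", table="ruddersysevents"):
    """Build the psql command line. All three defaults are assumptions."""
    sql = f"VACUUM FULL VERBOSE {table};"
    return ["psql", "-U", user, "-d", database, "-c", sql]


def run_vacuum_full(dry_run=True, **kwargs):
    """Print the command, and execute it only when dry_run is False."""
    command = vacuum_full_command(**kwargs)
    print(shlex.join(command))
    if not dry_run:
        # VACUUM FULL takes an exclusive lock on the table while it runs.
        subprocess.run(command, check=True)
    return command


run_vacuum_full()  # dry run: only prints the command
```

Keeping `dry_run=True` as the default is deliberate: because of the lock, the actual run is better done in a maintenance window.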

Hope it helps,

Actions #7

Updated by Vincent MEMBRÉ over 10 years ago

  • Target version changed from 2.6.11 to 2.6.12
Actions #8

Updated by Vincent MEMBRÉ over 10 years ago

  • Target version changed from 2.6.12 to 2.6.13
Actions #9

Updated by Vincent MEMBRÉ over 10 years ago

  • Target version changed from 2.6.13 to 2.6.14
Actions #10

Updated by Jonathan CLARKE over 10 years ago

  • Target version changed from 2.6.14 to 2.6.16
Actions #11

Updated by Jonathan CLARKE over 10 years ago

  • Target version changed from 2.6.16 to 2.6.17
Actions #12

Updated by Nicolas PERRON over 10 years ago

  • Target version changed from 2.6.17 to 2.6.18
Actions #13

Updated by Matthieu CERDA about 10 years ago

  • Target version changed from 2.6.18 to 2.6.19
Actions #14

Updated by Vincent MEMBRÉ about 10 years ago

  • Target version changed from 2.6.19 to 2.6.20
Actions #15

Updated by François ARMAND almost 10 years ago

  • Project changed from 24 to Rudder
  • Category deleted (Techniques)
  • Assignee deleted (François ARMAND)
  • Priority changed from 1 (highest) to 3
  • Target version changed from 2.6.20 to 2.10.10

Waiting for more information; can't be reproduced for now.

Actions #16

Updated by Vincent MEMBRÉ almost 10 years ago

  • Target version changed from 2.10.10 to 2.10.11
Actions #17

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.11 to 2.10.12
Actions #18

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.12 to 2.10.13
Actions #19

Updated by Benoît PECCATTE over 9 years ago

  • Status changed from 8 to New
Actions #20

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.13 to 2.10.14
Actions #21

Updated by Benoît PECCATTE over 9 years ago

  • Category set to Web - Nodes & inventories

Hello Cedric, do you still have this bug?

Actions #22

Updated by Benoît PECCATTE over 9 years ago

  • Status changed from New to Rejected

This doesn't seem to be a bug anymore. Closing.
Feel free to reopen if needed.
