Question #8176: All nodes compliance report unexpected/missing except root server. - Rudder - Issue Tracker

Question #8176

closed

All nodes compliance report unexpected/missing except root server.

Added by siemen Meijssen over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: N/A
Assignee:
Category: Web - Compliance & node report
Target version:
Regression:

Description

All the nodes we have connected are reporting as 50% missing / 50% unexpected for all compliance reports.
Example:

The CFengine binaries in /var/rudder/cfengine-community/bin are up to date
Unexpected

Files

Rudder.PNG (35.4 KB), siemen Meijssen, 2016-04-13 15:58
Rudder.PNG (109 KB), siemen Meijssen, 2016-05-09 12:56
rudder2.PNG (59.2 KB), siemen Meijssen, 2016-05-09 14:20
log.txt (554 KB), siemen Meijssen, 2016-05-11 10:43
log2.txt (279 KB), siemen Meijssen, 2016-05-11 11:06

Related issues 2 (0 open — 2 closed)

Updated by Vincent MEMBRÉ over 9 years ago

Hello Siemen, thanks for reporting your issue!

Is there a time difference between the node and the root server?

Can you share a screenshot of the entries in the "Technical logs" tab?

Is the reporting on the root server OK?

Updated by siemen Meijssen over 9 years ago

Thanks for the quick reply.

What exactly do you mean by time delay? If you mean network-wise, ping times are less than 1 ms. The time on all servers is configured correctly.

How would I be able to get this screenshot?

The clients report to the root server without a problem (the "last seen" value is updated every 5 minutes).

Updated by Vincent MEMBRÉ over 9 years ago

Sorry about the late reply, I missed your answer.

For reporting to work, the date and time must be synchronized between the server and the nodes. Run the 'date' command on both; if the outputs differ, you have a problem.
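
The clock comparison described here can be sketched in a few lines of Python. This is only an illustration and not part of Rudder: the function names, the sample timestamps, and the 60-second tolerance are all assumptions made for the example. It takes the epoch seconds printed by `date +%s` on the server and on a node and reports how far apart the two clocks are.

```python
def clock_skew_seconds(server_epoch: int, node_epoch: int) -> int:
    """Return the absolute difference, in seconds, between two clocks."""
    return abs(server_epoch - node_epoch)


def clocks_in_sync(server_epoch: int, node_epoch: int, tolerance: int = 60) -> bool:
    """True when the two clocks differ by at most `tolerance` seconds.

    The 60-second default is an arbitrary value chosen for illustration;
    it is not a threshold documented by Rudder.
    """
    return clock_skew_seconds(server_epoch, node_epoch) <= tolerance


if __name__ == "__main__":
    # Hypothetical values, as `date +%s` would print them on each machine.
    server = 1462791360
    node = 1462791960  # ten minutes ahead of the server
    print("skew in seconds:", clock_skew_seconds(server, node))
    print("in sync:", clocks_in_sync(server, node))
```

If the skew is large, fixing time synchronization on the node (for example with NTP) is the usual remedy.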

About the "Technical logs" tab: on a node's details page, click the "Technical logs" tab (one of the rightmost tabs) and take a screenshot of the table displayed.

Was it working before, or is it a new install?

Updated by siemen Meijssen over 9 years ago

File Rudder.PNG added

There were indeed some problems with the date settings. These have been corrected, but it is still not working (after 40 minutes, with reporting every 5 minutes).
See the attached file.
This is an entirely new install on Debian.

After I run 'rudder agent reset', it reports correctly once. Afterwards it reports as unexpected/missing again.

My apologies for the late reply.

Updated by siemen Meijssen over 9 years ago

File rudder2.PNG added

I noticed that the server which is not running correctly displays the following error when the agent is run manually (see attached).
I also noticed that the other server keeps showing the same error but otherwise runs successfully, and it is now reporting as it should (for at least 25 minutes).

Updated by Vincent MEMBRÉ over 9 years ago

Thanks for your screenshots!

From your two screenshots, I can see that the node could not update its policies, so it is still using an old reporting format.

Two questions:

When running 'rudder agent update' on the faulty node, do you get an error?

If there is an error, we have a tool on the Rudder server: run 'rudder server debug <ip-faulty-node>', then run 'rudder agent run' on the node. Can you share the output?

A common update error is that the Rudder server does not correctly resolve the node hostname, which may be the case here (to check, run 'getent hostname-of-your-node' on the server).

Updated by siemen Meijssen over 9 years ago

File log.txt added

I get the error:
error: Method 'update_action failed in some repairs

See attached.

When I run getent, I get the error:
Unknown database: <name of node>

I have also noticed that the servers switch around: whenever one is working, the other one isn't.

Updated by Vincent MEMBRÉ over 9 years ago

Oops, I gave you the wrong commands, sorry! :(

It's 'rudder agent update' that you need to run after running rudder-server-debug, not 'rudder agent run'!

And the getent command is "getent hosts <hostname-of-your-node>".

Updated by Vincent MEMBRÉ over 9 years ago

From the logs, I can see that your node is identified as debian-test. Is that correct, or should it be the other one?

#10

Updated by siemen Meijssen over 9 years ago

Now it is my turn to apologize. I uploaded the wrong log file. I'll update you ASAP.

#11

Updated by siemen Meijssen over 9 years ago

File log2.txt added

The getent command returns no output.

The server which is not reporting is Stream-Server.

The Debian-Test server also didn't report for some time, but after running an apt-get dist-upgrade and rudder agent reset it is working now.

#12

Updated by siemen Meijssen over 9 years ago

I did another reset/reinit on the Stream-Server.

It has been reporting again for at least 20 minutes now. I'll let you know if it stays that way.

#13

Updated by Vincent MEMBRÉ over 9 years ago

Your server cannot determine that your stream-server's IP address belongs to stream-server; you need to help it find out.

The easiest way is to add a line for stream-server to the /etc/hosts file of your Rudder server.
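
As a rough illustration of what such an /etc/hosts line looks like and how to check for it, here is a small Python sketch. The helper names `hosts_line` and `resolves_in_hosts` are made up for this example, and 192.0.2.10 is a placeholder address from the documentation range, not the real address of stream-server. The sketch works on the text of a hosts file rather than editing /etc/hosts itself.

```python
def hosts_line(ip: str, hostname: str) -> str:
    """Format one hosts-file entry: address, whitespace, then the name."""
    return f"{ip}\t{hostname}"


def resolves_in_hosts(hosts_text: str, hostname: str) -> bool:
    """True if `hostname` appears as a name or alias on a non-comment line."""
    wanted = hostname.lower()
    for raw in hosts_text.splitlines():
        # Drop anything after '#', which starts a comment in a hosts file.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # fields[0] is the address; the rest are the canonical name and aliases.
        if wanted in (name.lower() for name in fields[1:]):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical file content; the commented-out entry must not count.
    current = "127.0.0.1\tlocalhost\n# 192.0.2.10 stream-server\n"
    print(resolves_in_hosts(current, "stream-server"))
    updated = current + hosts_line("192.0.2.10", "stream-server") + "\n"
    print(resolves_in_hosts(updated, "stream-server"))
```

After adding the real entry on the Rudder server, 'getent hosts <hostname-of-your-node>' should print the address instead of returning nothing.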

#14

Updated by siemen Meijssen over 9 years ago

That is weird, because it is reporting correctly now.
If this were the issue, it would fail all the time, right?

Why would the Rudder master need to have the names of the client hosts? I thought this was all done over IP.

#15

Updated by siemen Meijssen over 9 years ago

I think this issue has been resolved. I have no clue what caused it to report that way, but apparently it is fixed in the latest release.

#16

Updated by Vincent MEMBRÉ over 9 years ago

Tracker changed from Bug to Question
Status changed from New to Resolved

Great that it's working now... but you're right, it's weird that you had to do all those things to make it work.

We rely on name resolution, and Rudder needs to know each node's hostname to authorize it correctly. (We are thinking about changing that behavior, but we are not there yet...)

You can disable these DNS lookups by unticking 'Use reverse DNS lookups on nodes to reinforce authentication to policy server' on the Administration/Settings page. Rudder will then authenticate nodes using their IP only (so any node with the same IP will have access to its promises, which can be a security issue).
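
The trade-off behind that setting can be illustrated with a conceptual Python sketch. It is not how Rudder or CFEngine implement the check: the function names, the `allowed_ips` set, and the fake resolvers are all assumptions for the example. Only `socket.gethostbyaddr` is a real standard-library call, and it is passed in as a parameter so the logic can be tried without touching real DNS.

```python
import socket
from typing import Callable, List, Set, Tuple

# Same shape as socket.gethostbyaddr: (hostname, aliases, addresses).
ReverseLookup = Callable[[str], Tuple[str, List[str], List[str]]]


def reverse_names(ip: str, lookup: ReverseLookup = socket.gethostbyaddr) -> List[str]:
    """Names the resolver returns for `ip`, or [] when the lookup fails."""
    try:
        primary, aliases, _addresses = lookup(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return []
    return [primary, *aliases]


def node_allowed(ip: str, expected_hostname: str, allowed_ips: Set[str],
                 use_reverse_dns: bool,
                 lookup: ReverseLookup = socket.gethostbyaddr) -> bool:
    """Hypothetical access decision for a node connecting from `ip`."""
    if ip not in allowed_ips:
        return False
    if not use_reverse_dns:
        # IP-only mode: whoever holds this address is accepted.
        return True
    names = [name.lower() for name in reverse_names(ip, lookup)]
    return expected_hostname.lower() in names


if __name__ == "__main__":
    def unresolved(ip: str):
        # Fake resolver: the server cannot map the address back to a name.
        raise socket.herror(1, "Unknown host")

    allowed = {"192.0.2.10"}  # placeholder address, not a real node
    print(node_allowed("192.0.2.10", "stream-server", allowed, True, unresolved))
    print(node_allowed("192.0.2.10", "stream-server", allowed, False, unresolved))
```

With the lookup enabled, a node whose address cannot be resolved back to its hostname is refused; with it disabled, the address alone is enough, which is the security concern mentioned above.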

If you still have problems in the future, feel free to reopen this issue.

Thanks :)

#17

Updated by François ARMAND over 9 years ago

I'm wondering if it could be linked to #8051, too? The symptoms are quite alike.

#18

Updated by François ARMAND over 9 years ago

Related to Bug #8051: Compliance is not correctly computed if we receive run agent right after generation added

#19

Updated by siemen Meijssen over 9 years ago

I did indeed run that command, so it might be related.

#20

Updated by François ARMAND almost 9 years ago

Related to Bug #7336: Node stuck in "Applying" status added

Related to Rudder - Bug #8051: Compliance is not correctly computed if we receive run agent right after generation (Released, Nicolas CHARLES, 2016-05-19)
Related to Rudder - Bug #7336: Node stuck in "Applying" status (Rejected, François ARMAND)
