Bug #15492

closed

rudder agent on Virtuozzo/openvz hypervisors uses broken vzps

Added by Victor Héry over 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: N/A
Assignee: -
Category: Agent
Target version:
Severity: Major - prevents use of part of Rudder | no simple workaround
UX impact:
User visibility:
Effort required:
Priority: 0
Name check: To do
Fix check: To do
Regression:

Description

Hello!

This is a reopening of bug #4498 (https://issues.rudder.io/issues/4498); I could not find how to reopen it, sorry.

I use rudder agent 5.0.12, and part of the problem is still present.

More precisely, the remaining part is that vzps is called with the option "thcount" instead of "nlwp" (see this comment: https://issues.rudder.io/issues/4498#note-4).

I have three openvz hypervisors. They are bare-metal openvz, not Proxmox; as Alexis noticed, Proxmox has used LXC since v4.

On these hypervisors, all reports are systematically missing, and rudder agent check always returns:

WARNING: No disable file detected and no agent executor process either. Restarting agent service...ok: stop service rudder-agent succeeded
ok: start service rudder-agent succeeded
Done
ok: Rudder agent check ran without errors.

Notice the "WARNING" part: it is always present. On the rudder server, the node is seen at the exact time the rudder agent check ran, but without any report (100% missing reports).
It seems the rudder service starts but does nothing after that :-/

After reading bug 4498, I tried launching cf-agent -Kv directly.
The output is very verbose, but searching it for vzps did turn up lines like:

rudder verbose: P: END methods promise (any)
rudder verbose: Using the default body: processes_action
rudder verbose: Observe process table with /bin/vzps -E 0 -o user,pid,ppid,pgid,pcpu,pmem,vsz,ni,rss,thcount,stime,time,args
rudder verbose: A: ...................................................
rudder verbose: A: Bundle Accounting Summary for 'check_cron_daemon' in namespace default

There is no error around the vzps line, but when I tried to launch it from the command line:

/bin/vzps -E 0 -o user,pid,ppid,pgid,pcpu,pmem,vsz,ni,rss,thcount,stime,time,args
error: unknown user-defined format specifier "thcount" 

Usage:
 vzps [options]

 Try 'vzps --help <simple|list|output|threads|misc|all>'
  or 'vzps --help <s|l|o|t|m|a>'
 for additional help text.

For more details see ps(1).

So it seems the thcount option is still the problem and, despite the error, cf-agent reports that everything is ok.

If I replace thcount with nlwp, the vzps command line works and returns all processes.
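
The difference between the two column names can be checked with a small probe. The following is a minimal sketch, not part of Rudder or CFEngine: pick_thread_column is a hypothetical helper, and it assumes the ps-like command exits with a non-zero status when it is given an unknown format specifier, as the vzps error above suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rudder or CFEngine): print the first
# thread-count column name accepted by the given ps-like command.
# Assumption: the command exits non-zero on an unknown format specifier.
pick_thread_column() {
    ps_cmd="$1"
    for col in thcount nlwp; do
        # Ask for a single column and discard the output; only the
        # exit status matters here.
        if "$ps_cmd" -o "$col" >/dev/null 2>&1; then
            printf '%s\n' "$col"
            return 0
        fi
    done
    # Neither column name was accepted.
    return 1
}
```

On an affected hypervisor it could be called as pick_thread_column /bin/vzps, and the printed name used in place of thcount in the -o list.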

I did not find any other logs, but it seems that vzps fails when cf-agent runs and the failure is treated as a success.
So somehow no reports are generated, but the agent thinks everything is ok :(

If I run rudder agent run manually, it works: reports are generated and sent to the server!

I do not know why rudder agent check triggers the problem and rudder agent run does not.

For the moment I have a (dirty) workaround: I deploy a custom cron job that launches rudder agent run (thanks, rudder, for this deployment ^^).
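
A workaround of this kind could look like the cron.d entry below. It is only a sketch: the file name, the 5-minute interval, and the /opt/rudder/bin/rudder path are assumptions, not the entry actually deployed on these hypervisors.

```shell
# /etc/cron.d/rudder-agent-run-workaround (hypothetical file name)
# Assumptions: 5-minute interval, rudder CLI installed at /opt/rudder/bin/rudder.
*/5 * * * * root /opt/rudder/bin/rudder agent run >/dev/null 2>&1
```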

If you have any ideas on how to patch this, it would be very much appreciated, as an Openvz hypervisor is not manageable by rudder at the moment because of it.

Thank you!


Related issues 2 (0 open, 2 closed)

Related to Rudder - Bug #15488: Virtuozzo Virtual machine reported as "Unknown type" (Released, Vincent MEMBRÉ)
Related to Rudder - Bug #15487: Openvz/Virtuozzo virtual machine detected as Physical (Resolved)