Bug #15492

Updated by Alexis Mousset over 4 years ago

Hello! 

 This reopens bug #4498 (https://issues.rudder.io/issues/4498); sorry, I could not find a way to reopen the original.

 I use Rudder agent 5.0.12, and part of the problem is still present.

 More precisely, it is the part about calling @vzps@ with the option @"thcount"@ instead of @"nlwp"@ (see this comment: https://issues.rudder.io/issues/4498#note-4).

 I have three OpenVZ hypervisors. They are bare-metal OpenVZ, not Proxmox; as Alexis noticed, Proxmox has used LXC since v4.

 On these hypervisors, all reports are systematically missing, and @rudder agent check@ *always* returns:

 <pre>
 WARNING: No disable file detected and no agent executor process either. Restarting agent service...ok: stop service rudder-agent succeeded
 ok: start service rudder-agent succeeded
  Done
 ok: Rudder agent check ran without errors.
 </pre>

 Notice the "WARNING" part: it is always present. On the Rudder server, the node is seen at the exact time @rudder agent check@ ran, but without any report (100% missing reports).
 It seems the rudder service starts but does nothing after that :-/

 After reading bug #4498, I tried running @cf-agent -Kv@ directly.
 It is very verbose, but by searching for @vzps@ in the output, I did find lines like:

 <pre>
 rudder    verbose: P: END methods promise (any)
 rudder    verbose: Using the default body:                                               processes_action
 rudder    verbose: Observe process table with /bin/vzps -E 0 -o user,pid,ppid,pgid,pcpu,pmem,vsz,ni,rss,thcount,stime,time,args
 rudder    verbose: A: ...................................................
 rudder    verbose: A: Bundle Accounting Summary for 'check_cron_daemon' in namespace default
 </pre>
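 To find the exact process-listing command without reading the whole verbose output, a small filter like the one below can help. It is only a sketch: the function name @extract_ps_command@ is hypothetical, and it assumes the verbose line keeps the wording "Observe process table with" shown above.

```shell
# Hypothetical helper: read cf-agent verbose output on stdin and print only
# the command that cf-agent says it uses to observe the process table.
extract_ps_command() {
    # Keep lines containing the marker text and strip everything up to it.
    sed -n 's/.*Observe process table with //p'
}
```

 Possible usage on the hypervisor: @cf-agent -Kv 2>&1 | extract_ps_command@. The printed command can then be run by hand to see whether it fails.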

 There is no error around the vzps line, but when I tried to run it from the command line:

 <pre>/bin/vzps -E 0 -o user,pid,ppid,pgid,pcpu,pmem,vsz,ni,rss,thcount,stime,time,args 
 error: unknown user-defined format specifier "thcount" 

 Usage: 
  vzps [options] 

  Try 'vzps --help <simple|list|output|threads|misc|all>' 
   or 'vzps --help <s|l|o|t|m|a>' 
  for additional help text. 

 For more details see ps(1). 
 </pre>

 So it seems the @thcount@ option is still the problem and, despite the error, cf-agent reports that everything is ok.

 If I replace @thcount@ with @nlwp@, the vzps command line works and returns all processes.
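 As an illustration of a possible fix direction, the sketch below probes which thread-count column name a ps-like command accepts and falls back from @thcount@ to @nlwp@. The function name @pick_thread_column@ is hypothetical and this is not how Rudder or CFEngine actually builds the command; it also assumes the command exits with a non-zero status when it rejects a format specifier, which I have not verified for vzps.

```shell
# Hypothetical helper: print the first thread-count column name that the
# given ps-like command (for example /bin/vzps) accepts with -o.
pick_thread_column() {
    ps_cmd="$1"
    for col in thcount nlwp; do
        # Assumption: an unknown format specifier makes the command fail.
        if "$ps_cmd" -o "$col" >/dev/null 2>&1; then
            printf '%s\n' "$col"
            return 0
        fi
    done
    # Neither column name was accepted.
    return 1
}
```

 Possible usage: @col=$(pick_thread_column /bin/vzps)@, then build the @-o@ list with @$col@ in place of the hard-coded @thcount@.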

 I did not find any other logs, but it seems that vzps failed when cf-agent ran and that this was still treated as a success.
 So somehow no reports are generated *but* the agent thinks everything is ok :(

 If I run @rudder agent run@ manually, it works: reports are generated and sent to the server!

 I do not know why @rudder agent check@ triggers the problem and @rudder agent run@ does not.

 For the moment I have a (dirty) workaround: deploying a custom cron job that launches @rudder agent run@ (thanks Rudder for this deployment ^^).
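 For anyone needing the same workaround, a cron entry along these lines should be enough. This is a sketch only: the file location (@/etc/cron.d/@), the 5-minute schedule, the @PATH@ value and the output redirection are assumptions for illustration; only the @rudder agent run@ command comes from the description above.

```
# Hypothetical /etc/cron.d/ entry (system crontab format, with a user field).
# Assumption: the rudder command is reachable through this PATH.
PATH=/usr/sbin:/usr/bin:/sbin:/bin

# Assumption: run every 5 minutes; adjust to the node's configured run interval.
*/5 * * * * root rudder agent run >/dev/null 2>&1
```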

 Please, if you have any idea how to patch this, it would be very appreciated, as an OpenVZ hypervisor is not manageable by Rudder at the moment because of it.

 Thank you!
