Bug #4408
closed
Sometimes there are too many cf-agent processes running
Description
[root@node ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          3824       3798         25          0          0         13
-/+ buffers/cache:       3784         39
Swap:         2047       2047          0
[root@node ~]# ps wwwuax | grep cf-|wc -l
5988
[root@node ~]# kill -9 `ps wwwuax | grep cf- | awk '{ print $2 }'`
-bash: kill: (8484) - No such process
[root@node ~]# ps wwwuax | grep cf-|wc -l
5
[root@node ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          3824        344       3479          0          2        188
-/+ buffers/cache:        153       3670
Swap:         2047         61       1986
Updated by Vincent MEMBRÉ almost 11 years ago
- Assignee set to Nicolas CHARLES
- Target version set to 2.8.3
That is a very large number of processes...
Is it happening only on the root server?
A fix for #3928 was merged this weekend that could solve this problem (it may be linked to the Tokyo Cabinet locks).
Nicolas, what do you think about that?
As I suspect tcdb, I am targeting branch 2.8.
Updated by Dennis Cabooter almost 11 years ago
Sometimes it's on the server as well, sometimes on the nodes. This time it was on a node. As you can understand, ~6000 cf-agent processes absorb almost all the memory and make the node unworkable. The nodes (including the server) report the NoAnswer state when this happens.
Updated by Jonathan CLARKE almost 11 years ago
- Status changed from New to Rejected
This looks very much like a duplicate of #3928, and even if it is not, the fix for #3928 will fix this: the cron script that is run every 5 minutes checks if there are more than 8 CFEngine processes running, and kills them if so. It also cleans up the TokyoCabinet cf_lock.tcdb database, which is the cause of this.
Thanks for the report Dennis. I'm closing this ticket as a duplicate of #3928, which will be released in 2.8.3 and 2.9.3 very shortly.
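For readers who want to see the shape of the check described above, here is a minimal sketch. It is not the actual cron script shipped with the fix for #3928. The names MAX_CF_PROCESSES, count_cf_processes and too_many_cf_processes are made up for illustration; only the threshold of 8 comes from this ticket. The cleanup of cf_lock.tcdb is left out because its path is not given here.

```shell
#!/bin/sh
# Sketch only: NOT the real Rudder cron script.
# All names below are hypothetical; the threshold of 8 is taken from the ticket.
MAX_CF_PROCESSES=8

# Count the lines on stdin that start with "cf-" (one command name per line).
count_cf_processes() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c '^cf-' || true
}

# Succeed (exit status 0) when the given count exceeds the threshold.
too_many_cf_processes() {
    [ "$1" -gt "$MAX_CF_PROCESSES" ]
}

# Example wiring, left commented out because it kills processes:
# count=$(ps -e -o comm= | count_cf_processes)
# if too_many_cf_processes "$count"; then
#     pkill -9 '^cf-'   # matches on process names, so grep itself is never listed
# fi
```

Counting command names rather than grepping full command lines likely avoids the "No such process" message seen in the description, which was probably caused by the short-lived grep process matching its own pattern.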