User story #7220

closed

Document nofiles dependency for syslog/tcp on master and relays

Added by Florian Heigl over 5 years ago. Updated over 5 years ago.

Status: Released
Priority: N/A
Category: Documentation
Target version:
Suggestion strength:
User visibility:
Effort required:

Description

If syslog over TCP is used, you need to raise the open files limit (nofile), because it caps the number of sockets that can be open at once.

This affects relays and, if none are used, the master as well.
Can you add that to the Rudder relay setup docs?

Just put in a recommendation to raise the open files limits to something like 10000.

(No: You don't want to see what happens if it's too low.)
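
To see which limit a running rsyslogd actually has before raising it, one option is to read the "Max open files" row of /proc/<pid>/limits. This is a minimal Python sketch and assumes Linux; the helper names `parse_nofile_limits` and `read_nofile_limits` are made up for illustration and are not part of Rudder or rsyslog.

```python
def parse_nofile_limits(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Each value is an int, or None when the kernel reports 'unlimited'.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def read_nofile_limits(pid):
    """Read the limits of a live process (Linux only, hypothetical helper)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_nofile_limits(handle.read())
```

Calling `read_nofile_limits` with the rsyslogd PID shows whether a change to the limit really reached the daemon, since limits set for a login session do not necessarily apply to a service started at boot.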


Subtasks: 1 (0 open, 1 closed)

User story #7728: Document how to switch to UDP for reporting when upgrading to >= 3.1 (Released, Jonathan CLARKE, 2016-01-06)
Actions #1

Updated by Florian Heigl over 5 years ago

Ah, and turning off SYN cookies may also be needed under load :)

Actions #2

Updated by Janos Mattyasovszky over 5 years ago

Well, on SUSE's rsyslogd (v 5.8.7-0.5.5) you'd get messages like:

Sep 22 16:39:38 relayserver rsyslogd-2163: last message repeated 401230 times
Sep 22 16:39:38 relayserver rsyslogd-2163: epoll_ctl failed on fd 30, id 0/0x7f2c1000e650, op 1 with File exists
: File exists [try http://www.rsyslog.com/e/2163 ]
Sep 22 16:39:42 relayserver rsyslogd-2163: last message repeated 143065 times
Sep 22 16:39:42 relayserver rsyslogd-2163: epoll_ctl failed on fd 14, id 0/0x7f2c100021a0, op 1 with File exists
: File exists [try http://www.rsyslog.com/e/2163 ]
Sep 22 16:39:43 relayserver rsyslogd-2163: last message repeated 9846 times
Sep 22 16:39:42 relayserver rsyslogd-2163: epoll_ctl failed on fd 20, id 0/0x7f2c10008d30, op 1 with File exists
: File exists [try http://www.rsyslog.com/e/2163 ]
Sep 22 16:39:42 relayserver rsyslogd-2163: epoll_ctl failed on fd 14, id 0/0x7f2c100021a0, op 1 with File exists
: File exists [try http://www.rsyslog.com/e/2163 ]

Setting it to 10,000 was not enough for us; I had to set the soft limit to 30k and the hard limit to 100k. The right value depends on the relay: how many clients it serves and how much CPU and memory it has. (This example relay serves approximately 2700 clients.)
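
Since the right value depends on the relay, it can help to measure how close rsyslogd is to its soft limit rather than guess. The following Python sketch counts the entries in /proc/<pid>/fd; it assumes Linux and enough privileges to read that directory. The function name `fd_usage` and the `proc_root` parameter are illustrative only.

```python
import os


def fd_usage(pid, soft_limit, proc_root="/proc"):
    """Return (open_fds, fraction of soft_limit in use) for one process.

    Counts the entries in <proc_root>/<pid>/fd, so it is Linux-only and
    needs permission to read that directory (typically root for rsyslogd).
    """
    open_fds = len(os.listdir(os.path.join(proc_root, str(pid), "fd")))
    return open_fds, open_fds / soft_limit
```

For example, `fd_usage(rsyslogd_pid, 30000)` (with `rsyslogd_pid` being whatever PID your rsyslogd has) returns the current descriptor count and how much of a 30k soft limit is used.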

And as Flo already mentioned, if you (by mistake) switch a lot of systems from UDP to TCP and they start sending their Rudder messages via TCP, you might also see kernel messages like this:

Sep 22 17:59:02 relayserver kernel: [7695000.930982] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931004] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931061] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931075] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931151] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931165] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931398] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931439] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931446] TCP: Possible SYN flooding on port 514. Sending cookies.
Sep 22 17:59:02 relayserver kernel: [7695000.931452] TCP: Possible SYN flooding on port 514. Sending cookies.
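
To check quickly whether a relay is hitting this, the warnings can be counted per port from the kernel log. This is a rough Python sketch; the name `syn_flood_counts` is hypothetical, and because kernels may rate-limit this message, the count is an indication only, not an exact number of events.

```python
import re
from collections import Counter

# Matches the kernel warning shown above and captures the port number.
SYN_FLOOD = re.compile(r"Possible SYN flooding on port (\d+)")


def syn_flood_counts(lines):
    """Count kernel SYN-flooding warnings per destination port."""
    counts = Counter()
    for line in lines:
        match = SYN_FLOOD.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Feeding it the lines of the system log (wherever your distribution writes kernel messages) gives a mapping such as port 514 to the number of warnings seen.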

Also pay attention to the conntrack table. Its limit of 65k looks fairly high, but it is shared among all processes, so if the host also handles other workloads with many connections, it is worth monitoring too (basically /proc/sys/net/netfilter/nf_conntrack_count).
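
A minimal way to monitor this is to compare nf_conntrack_count with nf_conntrack_max from the same directory. The Python sketch below assumes Linux with the conntrack module loaded (otherwise the files do not exist); the function name `conntrack_usage` and its `base` parameter are illustrative.

```python
from pathlib import Path


def conntrack_usage(base="/proc/sys/net/netfilter"):
    """Return (count, limit, fraction used) for the conntrack table.

    Reads nf_conntrack_count and nf_conntrack_max under `base`.
    """
    base = Path(base)
    count = int((base / "nf_conntrack_count").read_text())
    limit = int((base / "nf_conntrack_max").read_text())
    return count, limit, count / limit
```

A monitoring check could alert when the returned fraction stays above a threshold of your choosing, for instance 0.8.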

Actions #3

Updated by Vincent MEMBRÉ over 5 years ago

  • Assignee set to François ARMAND
  • Target version set to 2.11.15

François, who do you think would be the best person to document this?

Actions #4

Updated by Vincent MEMBRÉ over 5 years ago

  • Target version changed from 2.11.15 to 2.11.16
Actions #5

Updated by Vincent MEMBRÉ over 5 years ago

  • Target version changed from 2.11.16 to 2.11.17
Actions #6

Updated by Vincent MEMBRÉ over 5 years ago

  • Target version changed from 2.11.17 to 2.11.18
Actions #7

Updated by Jonathan CLARKE over 5 years ago

  • Assignee changed from François ARMAND to Alexis MOUSSET
Actions #8

Updated by Alexis MOUSSET over 5 years ago

  • Status changed from New to In progress
Actions #9

Updated by Alexis MOUSSET over 5 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Alexis MOUSSET to Benoît PECCATTE
  • Pull Request set to https://github.com/Normation/rudder-doc/pull/148
Actions #10

Updated by Alexis MOUSSET over 5 years ago

  • Assignee changed from Benoît PECCATTE to François ARMAND
Actions #11

Updated by Alexis MOUSSET over 5 years ago

  • Assignee changed from François ARMAND to Benoît PECCATTE
Actions #12

Updated by Alexis MOUSSET over 5 years ago

  • Assignee changed from Benoît PECCATTE to Jonathan CLARKE
Actions #13

Updated by Alexis MOUSSET over 5 years ago

  • Status changed from Pending technical review to Pending release
  • % Done changed from 0 to 100
Actions #14

Updated by Vincent MEMBRÉ over 5 years ago

  • Tracker changed from Bug to User story
Actions #15

Updated by Vincent MEMBRÉ over 5 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 2.11.18, 3.0.13, 3.1.6 and 3.2.0, which were released today.
