Bug #4717

closed

Document how to solve huge IO wait problem leading to random "NoAnswer"

Added by Dennis Cabooter over 10 years ago. Updated almost 9 years ago.

Status: Rejected
Priority: N/A
Assignee: -
Category: Documentation

Description

There was a huge problem on Ubuntu nodes. At random, one or more nodes were in the NoAnswer state. Removing all tcdb files and manually running cf-agent -KI solved the problem temporarily. We also had huge IO wait on our storage, which affected the storage and all our VMs. Eventually I found out that CFEngine (Tokyo Cabinet) was the cause of all our IO problems (everywhere in our network).

There were machines doing practically nothing (yet) that still had huge IO wait. The iotop command showed that cf-agent was the only process writing to the file system. After Kegeruneku pointed me to http://blog.normation.com/en/2013/09/09/speed-up-your-cfengine-by-using-a-ram-disk/ and I implemented that through Rudder, all problems seem to be gone.

Please advise everyone to add the following to fstab, especially if they use Ubuntu (12.04 LTS - Precise). You should add this to the Rudder documentation in bold. However, it only applies to CFEngine versions that use Tokyo Cabinet.

# Tmpfs for the CFEngine state backend storage directory
tmpfs /var/rudder/cfengine-community/state tmpfs size=128M,nr_inodes=2k,mode=0755,noexec,nosuid,noatime,nodiratime 0 0
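
Once the entry is in fstab and the directory has been mounted, one way to confirm that the state directory really sits on tmpfs is to look it up in the mount table. The sketch below is illustrative only: fs_type_of is a hypothetical helper name (not part of Rudder or CFEngine), and it assumes a Linux system where /proc/mounts lists the mount point in the second column and the filesystem type in the third.

```shell
#!/bin/sh
# Hypothetical helper (not a Rudder or CFEngine command): print the filesystem
# type mounted at the exact mount point $1. The mount table is read from $2,
# which defaults to /proc/mounts (assumed Linux layout: device, mount point, type, ...).
fs_type_of() {
    awk -v dir="$1" '$2 == dir { type = $3 } END { if (type != "") print type }' "${2:-/proc/mounts}"
}

# State directory taken from the fstab line above.
STATE_DIR=/var/rudder/cfengine-community/state

if [ "$(fs_type_of "$STATE_DIR")" = "tmpfs" ]; then
    echo "$STATE_DIR is on tmpfs"
else
    echo "$STATE_DIR is not on tmpfs"
fi
```

If several filesystems are stacked on the same mount point, the helper reports the last one listed, which is the one currently visible.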
Actions #1

Updated by François ARMAND over 10 years ago

  • Tracker changed from User story to Bug
  • Subject changed from Huge IO wait problem solved to Document how to solve huge IO wait problem leading to random "NoAnswer"
  • Status changed from New to 8
  • Target version set to 2.6.13

Thanks for the feedback.

Actions #2

Updated by Jonathan CLARKE over 10 years ago

Really looking forward to CFEngine 3.6 and using LMDB to avoid all this nonsense!

Actions #3

Updated by Vincent MEMBRÉ over 10 years ago

  • Target version changed from 2.6.13 to 2.6.14
Actions #4

Updated by Jonathan CLARKE over 10 years ago

  • Target version changed from 2.6.14 to 2.6.16
Actions #5

Updated by Jonathan CLARKE over 10 years ago

  • Target version changed from 2.6.16 to 2.6.17
Actions #6

Updated by Nicolas PERRON over 10 years ago

  • Target version changed from 2.6.17 to 2.6.18
Actions #7

Updated by Matthieu CERDA about 10 years ago

  • Target version changed from 2.6.18 to 2.6.19
Actions #8

Updated by Vincent MEMBRÉ about 10 years ago

  • Target version changed from 2.6.19 to 2.6.20
Actions #9

Updated by François ARMAND almost 10 years ago

  • Category set to Documentation
  • Target version changed from 2.6.20 to 2.10.10

This could be a section in the documentation.

Actions #10

Updated by Nicolas CHARLES almost 10 years ago

This would indeed apply to nodes where the write cache is disabled (it also happens on 2.11 when there is no write cache).

Actions #11

Updated by Vincent MEMBRÉ almost 10 years ago

  • Target version changed from 2.10.10 to 2.10.11
Actions #12

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.11 to 2.10.12
Actions #13

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.12 to 2.10.13
Actions #14

Updated by Benoît PECCATTE over 9 years ago

  • Status changed from 8 to New
Actions #15

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.13 to 2.10.14
Actions #16

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.14 to 2.10.15
Actions #17

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.15 to 2.10.16
Actions #18

Updated by Vincent MEMBRÉ over 9 years ago

  • Target version changed from 2.10.16 to 2.10.17
Actions #19

Updated by Vincent MEMBRÉ about 9 years ago

  • Target version changed from 2.10.17 to 2.10.18
Actions #20

Updated by Vincent MEMBRÉ about 9 years ago

  • Target version changed from 2.10.18 to 2.10.19
Actions #21

Updated by Vincent MEMBRÉ about 9 years ago

  • Target version changed from 2.10.19 to 2.10.20
Actions #22

Updated by Vincent MEMBRÉ almost 9 years ago

  • Target version changed from 2.10.20 to 277
Actions #23

Updated by Vincent MEMBRÉ almost 9 years ago

  • Target version changed from 277 to 2.11.18
Actions #24

Updated by Alexis Mousset almost 9 years ago

  • Status changed from New to Rejected

Ramdisk use for CFEngine state is now documented (http://www.rudder-project.org/rudder-doc-2.11/rudder-doc.html#_performance_tuning), closing.
