Bug #19161 (open)
Rudder agent consumes 100% CPU when copying a file to a file system with no space left
Description
While testing with Rudder version 6.2.4, I noticed the following problem.
I used File download (Rudder server) to copy a large file (~1 GB) to a node, modifying it slightly each time by appending a single character.
The files in /var/rudder/modified-files/ eventually filled up the file system, as expected.
When I then tried to copy the file again with Rudder to the node's now full file system, rudder agent run took very long (~1000 seconds in my case) and used 100% CPU.
rudder agent run terminates at some point, but leftover cf-agent processes continue to use 100% CPU.
2021-04-18T15:58:10+00:00 error: Failed to write to destination file (write: No space left on device)
2021-04-18T15:58:10+00:00 error: Local disk write failed copying '<SERVER_IP>:/var/rudder/configuration-repository/shared-files/file2.txt' to '/root/file2.txt.cfnew'
2021-04-18T16:13:09+00:00 error: Was not able to copy '/var/rudder/configuration-repository/shared-files/file2.txt' to '/root/file2.txt'
907.43s E| error copyFile Copy file /root/file2.txt The content or permissions of the file(s) could not have been repaired (file file2.txt not found?)
0.02s E| n/a copyFile Post-modification hook /root/file2.txt No post-hook command for copy of file2.txt to /root/file2.txt was defined, not executing
2021-04-18T16:13:10+00:00 error: SSL read after retries: underlying network error ()
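In the log above, the write failure is reported at 15:58:10, but the copy is only given up at 16:13:09, about 15 minutes later. For illustration only, here is a minimal Python sketch of the behavior one would expect instead: a chunked copy that aborts at the first "No space left on device" error rather than continuing. This is not the cf-agent or Rudder implementation; the names copy_stream and DiskFullError are hypothetical.

```python
import errno


class DiskFullError(Exception):
    """Hypothetical error type: the destination reported ENOSPC."""


def copy_stream(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in chunks and return the number of bytes copied.

    src and dst are binary file-like objects. The copy stops at the first
    ENOSPC error instead of retrying, so a full disk fails fast.
    """
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            # End of source reached: the copy is complete.
            return copied
        try:
            dst.write(chunk)
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # Assumption: no point in reading more data once the disk is full.
                raise DiskFullError(
                    f"destination full after {copied} bytes"
                ) from exc
            raise
        copied += len(chunk)
```

A caller would typically remove the partial destination file (here, the .cfnew temporary file) after catching DiskFullError and report the failure once.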
Updated by Lars Koenen over 3 years ago
- Subject changed from Rudder agent consumes 100% CPU when copying a file to a full file system to Rudder agent consumes 100% CPU when copying a file to a file system with no space left
Updated by Alexis Mousset over 3 years ago
- Severity set to Critical - prevents main use of Rudder | no workaround | data loss | security
- User visibility set to Operational - other Techniques | Rudder settings | Plugins
- Priority changed from 0 to 76
Thanks for the detailed report, we'll try to reproduce the problem.
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.2.6 to 6.2.7
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.2.7 to 6.2.8
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.2.8 to 6.2.9
- Priority changed from 76 to 74
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.2.9 to 6.2.10
- Priority changed from 74 to 73
Updated by Vincent MEMBRÉ over 3 years ago
- Target version changed from 6.2.10 to 6.2.11
- Priority changed from 73 to 72
Updated by Vincent MEMBRÉ about 3 years ago
- Target version changed from 6.2.11 to 6.2.12
- Priority changed from 72 to 69
Updated by Vincent MEMBRÉ about 3 years ago
- Target version changed from 6.2.12 to 6.2.13
- Priority changed from 69 to 68
Updated by Alexis Mousset almost 3 years ago
- Priority changed from 68 to 66
I did not manage to reproduce it when I tried, but it looks like a protocol error on the agent side.
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.2.13 to 6.2.14
- Priority changed from 65 to 63
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.2.14 to 6.2.15
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.2.15 to 6.2.16
Updated by Alexis Mousset over 2 years ago
- Target version changed from 6.2.16 to 6.2.17
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.2.17 to 997
- Priority changed from 63 to 0
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 997 to 6.2.18
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.2.18 to 6.2.19
Updated by Vincent MEMBRÉ over 2 years ago
- Target version changed from 6.2.19 to 6.2.20
Updated by Vincent MEMBRÉ about 2 years ago
- Target version changed from 6.2.20 to old 6.2 issues to relocate
Updated by Alexis Mousset over 1 year ago
- Target version changed from old 6.2 issues to relocate to 7.2.10
Updated by Alexis Mousset over 1 year ago
- Target version changed from 7.2.10 to 7.2.11
Updated by Vincent MEMBRÉ over 1 year ago
- Target version changed from 7.2.11 to 1046
Updated by Benoît PECCATTE over 1 year ago
- Regression set to No
Tried on Rudder 7.3 on CentOS 8. I was unable to reproduce it; I just get an error on copy and a "write: No space left on device" message.
Updated by Alexis Mousset about 1 year ago
- Target version changed from 1046 to 7.3.8
Updated by Vincent MEMBRÉ about 1 year ago
- Target version changed from 7.3.8 to 7.3.9
Updated by Vincent MEMBRÉ about 1 year ago
- Target version changed from 7.3.9 to 7.3.10
Updated by Vincent MEMBRÉ about 1 year ago
- Target version changed from 7.3.10 to 7.3.11
Updated by Vincent MEMBRÉ 11 months ago
- Target version changed from 7.3.11 to 7.3.12
Updated by Vincent MEMBRÉ 10 months ago
- Target version changed from 7.3.12 to 7.3.13
Updated by Vincent MEMBRÉ 10 months ago
- Target version changed from 7.3.13 to 7.3.14
Updated by Vincent MEMBRÉ 8 months ago
- Target version changed from 7.3.14 to 7.3.15
Updated by Vincent MEMBRÉ 7 months ago
- Target version changed from 7.3.15 to 7.3.16
Updated by Vincent MEMBRÉ 6 months ago
- Target version changed from 7.3.16 to 7.3.17