Bug #19161
Status: open
Rudder agent consumes 100% CPU when copying a file to a file system with no space left
Pull Request:
Severity:
Critical - prevents main use of Rudder | no workaround | data loss | security
UX impact:
User visibility:
Operational - other Techniques | Rudder settings | Plugins
Effort required:
Priority:
0
Name check:
To do
Fix check:
To do
Regression:
No
Description
While testing with Rudder version 6.2.4, I noticed the following problem.
I used File download (Rudder server) to copy a large file (~1 GB) to a node, modifying it slightly each time by appending a single character.
The files in /var/rudder/modified-files/ eventually filled up the filesystem (as expected).
When I then try again to copy the file with Rudder to the node's full filesystem, rudder agent run takes a very long time (in my case ~1000 seconds) and uses 100% CPU.
The rudder agent run command terminates at some point, but cf-agent processes are left over and continue to use 100% CPU.
2021-04-18T15:58:10+00:00 error: Failed to write to destination file (write: No space left on device)
2021-04-18T15:58:10+00:00 error: Local disk write failed copying '<SERVER_IP>:/var/rudder/configuration-repository/shared-files/file2.txt' to '/root/file2.txt.cfnew'
2021-04-18T16:13:09+00:00 error: Was not able to copy '/var/rudder/configuration-repository/shared-files/file2.txt' to '/root/file2.txt'
907.43s E| error copyFile Copy file /root/file2.txt The content or permissions of the file(s) could not have been repaired (file file2.txt not found?)
0.02s E| n/a copyFile Post-modification hook /root/file2.txt No post-hook command for copy of file2.txt to /root/file2.txt was defined, not executing
2021-04-18T16:13:10+00:00 error: SSL read after retries: underlying network error ()
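
To check whether cf-agent processes are still spinning after rudder agent run has returned, something like the following sketch may help. It is not part of Rudder: the helper names parse_ps and busy_agents and the 90% threshold are assumptions made for illustration, and it relies on the ps -eo pid,pcpu,comm output format found on Linux. Note that the pcpu value reported by ps is averaged over the lifetime of the process, so a short-lived spike would not show up here.

```python
import subprocess


def parse_ps(output, name="cf-agent", min_cpu=90.0):
    """Return (pid, cpu) pairs for processes named `name` at or above `min_cpu`.

    `output` is the text printed by `ps -eo pid,pcpu,comm`.
    The 90% default threshold is an arbitrary assumption.
    """
    hits = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, cpu, comm = parts
        try:
            pid_n, cpu_n = int(pid), float(cpu)
        except ValueError:
            continue  # header line or unparsable row
        if comm.strip() == name and cpu_n >= min_cpu:
            hits.append((pid_n, cpu_n))
    return hits


def busy_agents():
    """Run ps and return the cf-agent processes that look stuck."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout)
```

Calling busy_agents() on the affected node after the run has finished returns a list of (pid, cpu) tuples; an empty list means no cf-agent process is above the threshold.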