Actions
Bug #19547
closedRelayd deadlocks when trying to forward a shared-file
Status:
Released
Priority:
N/A
Assignee:
Category:
Relay server or API
Target version:
Pull Request:
Severity:
UX impact:
User visibility:
Effort required:
Priority:
0
Name check:
To do
Fix check:
Checked
Regression:
Description
Relayd stops working when the api is called with a shared-files request requiring to forward the file upstream.
After the request all API requests timeout and the service won't stop with a sigterm.
This happens in 6.1.14 (and probably older releases).
Updated by Alexis Mousset over 3 years ago
- Status changed from New to In progress
- Assignee set to Alexis Mousset
Updated by Alexis Mousset over 3 years ago
on the relay receiving the request:
TRACE hyper::proto::h1::conn: Conn::read_head DEBUG hyper::proto::h1::io: read 442 bytes TRACE hyper::proto::h1::role: Request.parse([Header; 100], [u8; 442]) TRACE hyper::proto::h1::role: Request.parse Complete(442) DEBUG hyper::proto::h1::io: parsed 7 headers DEBUG hyper::proto::h1::conn: incoming body is empty TRACE warp::filters::path: "rudder"?: "rudder" TRACE warp::filters::path: "relay-api"?: "relay-api" TRACE warp::filters::path: "1"?: "1" TRACE warp::filters::path: "system"?: "shared-files" TRACE warp::filters::path: "remote-run"?: "shared-files" TRACE warp::filters::path: "shared-files"?: "shared-files" TRACE warp::filters::method: method::PUT?: HEAD TRACE warp::filters::method: method::HEAD?: HEAD TRACE warp::filters::path: param?: "4756c0ec-b7f5-4046-8e6c-7e185bbba256" TRACE warp::filters::path: param?: "4db80a96-a2cf-43f1-b4da-c75deaa8dc01" TRACE warp::filters::path: param?: "toto" TRACE shared_files_head{target_id=4756c0ec-b7f5-4046-8e6c-7e185bbba256 source_id=4db80a96-a2cf-43f1-b4da-c75deaa8dc01 file_id=toto}: hyper::client::pool: checkout waiting for idle connection: "https://server" TRACE shared_files_head{target_id=4756c0ec-b7f5-4046-8e6c-7e185bbba256 source_id=4db80a96-a2cf-43f1-b4da-c75deaa8dc01 file_id=toto}: hyper::client::connect::http: Http::connect; scheme=https, host=server, port=None DEBUG hyper::client::connect::dns: resolving host="server" DEBUG shared_files_head{target_id=4756c0ec-b7f5-4046-8e6c-7e185bbba256 source_id=4db80a96-a2cf-43f1-b4da-c75deaa8dc01 file_id=toto}: hyper::client::connect::http: connecting to 192.168.5.2:443 DEBUG shared_files_head{target_id=4756c0ec-b7f5-4046-8e6c-7e185bbba256 source_id=4db80a96-a2cf-43f1-b4da-c75deaa8dc01 file_id=toto}: tokio_reactor: adding I/O source: 33554440 TRACE shared_files_head{target_id=4756c0ec-b7f5-4046-8e6c-7e185bbba256 source_id=4db80a96-a2cf-43f1-b4da-c75deaa8dc01 file_id=toto}: mio::poll: registering with poller DEBUG shared_files_head{target_id=4756c0ec-b7f5-4046-8e6c-7e185bbba256 source_id=4db80a96-a2cf-43f1-b4da-c75deaa8dc01 file_id=toto}: tokio_reactor::registration: scheduling Write for: 8
on the relay's policy server:
juil. 21 13:10:44 server rudder-relayd[1043]: INFO relayd::relay-api: 127.0.0.1:58642 "HEAD /rudder/relay-api/1/shared-files/4756c0ec-b7f5-4046-8e6c-7e185bbba256/root/toto HTTP/1.1" 200 "-" "curl/7.61.0" 843.172µs
on the agent:
# /opt/rudder/bin/curl https://relay/rudder/relay-api/shared-files/4756c0ec-b7f5-4046-8e6c-7e185bbba256/4db80a96-a2cf-43f1-b4da-c75deaa8dc01/toto?hash=28d3ee069a6e69a679e08f1505bc4bc0dcc612c4e2292dccfa8afb9ccc6f45039060ce698f980954a82c436fd481d413c44fe9735472db19ce4d3f22283e4f1d --insecure --tlsv1 --location --head HTTP/1.1 502 Proxy Error Date: Wed, 21 Jul 2021 13:10:50 GMT Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips Content-Type: text/html; charset=iso-8859-1
Updated by Alexis Mousset over 3 years ago
Relayd:
# strace -fp 17920 strace: Process 17920 attached with 7 threads [pid 17926] futex(0x56182030cb64, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...> [pid 17925] futex(0x7fb0bc00fc84, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...> [pid 17924] futex(0x56182035af90, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 17923] futex(0x56182035af90, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 17922] futex(0x56182035af90, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 17921] futex(0x56182035b064, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...> [pid 17920] futex(0x5618203650a4, FUTEX_WAIT_PRIVATE, 1, NULL
all threads are stuck on FUTEX_WAIT.
Updated by Alexis Mousset over 3 years ago
Also happens with up-to-date system and up-to-date relayd dependencies.
Updated by Alexis Mousset over 3 years ago
- Status changed from In progress to Pending technical review
- Assignee changed from Alexis Mousset to François ARMAND
- Pull Request set to https://github.com/Normation/rudder/pull/3735
Updated by Alexis Mousset over 3 years ago
- Status changed from Pending technical review to Pending release
Applied in changeset rudder|c9161c3f8c00371892cc8d2ffca663f5faad482d.
Updated by François ARMAND about 3 years ago
- Fix check changed from To do to Checked
Updated by Vincent MEMBRÉ about 3 years ago
- Status changed from Pending release to Released
This bug has been fixed in Rudder 6.1.16 and 6.2.10 which were released today.
Actions