Bug #10185

closed

Remote-run exec for root fail with "rudder agent was interrupted"

Added by François ARMAND about 7 years ago. Updated about 7 years ago.

Status: Released
Priority: N/A
Category: Relay server or API

Description

The message is:

== [fanf@luhman16] 
% curl -k -H "X-API-Token: 5qsPVwcoa99sZfnSn3A6ive9Q7PMUzRx"  -X POST 'https://192.168.44.2/rudder/api/latest/nodes/root/applyPolicy' -d "classes=inventory" 
error    Rudder agent was interrupted during execution by a fatal error
         Run with -i to see log messages.

## Summary #####################################################################
0 components verified in 0 directives
execution time: 0.31s
################################################################################

== [fanf@luhman16] 
% curl -k -H "X-API-Token: 5qsPVwcoa99sZfnSn3A6ive9Q7PMUzRx"  -X POST 'https://192.168.44.2/rudder/api/latest/nodes/6866e5db-bb41-4110-958b-c1f1c90dbcbe/applyPolicy'
error    Rudder agent was interrupted during execution by a fatal error
         Run with -i to see log messages.

## Summary #####################################################################
0 components verified in 0 directives
execution time: 0.29s
################################################################################

OK, after trying to start it a few more times on the remote node, it started to work. I have no idea what was wrong, and there is no log.

We need to at least add logs so that we can do some forensics when things are not working as expected.
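As an illustration of the kind of logging asked for here, the following is a minimal sketch, not the actual Relay API code. It assumes a hypothetical helper `run_and_log` and a hypothetical logger name; the idea is only that the command's combined output is written to a log whenever the exit code is non-zero, so a failed run leaves a trace.

```python
import logging
import subprocess
import sys

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("remote_run_sketch")


def run_and_log(command):
    """Run a command, log its output if it fails, return (exit_code, output)."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error lines stay in order
        universal_newlines=True,
    )
    if completed.returncode != 0:
        logger.error(
            "command %s exited with code %d; output:\n%s",
            command,
            completed.returncode,
            completed.stdout,
        )
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Stand-in for the agent command: a child process that prints and fails.
    run_and_log([sys.executable, "-c", "import sys; print('boom'); sys.exit(3)"])
```

The stand-in child process is used so the sketch runs anywhere; in a real deployment the command would be whatever the Relay API launches.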


Subtasks 1 (0 open, 1 closed)

Bug #10300: NumberFormatException on remote-api call for root (Rejected, François ARMAND)

Related issues 1 (0 open, 1 closed)

Related to Rudder - User story #10314: Document remote-run exec compatibility (Rejected)
Actions #1

Updated by François ARMAND about 7 years ago

After editing the file /opt/rudder/share/relay-api/relay_api/remote_run.py on the root server to add "-i" to REMOTE_RUN_COMMAND and restarting Apache, I'm now getting:

rudder     info: ........................................................................
rudder     info: Hailing server.rudder.local : 5309
rudder     info: ........................................................................
   error: TRUST FAILED, server presented untrusted key: MD5=3275d8e38205fada95e6236901099527
   error: Failed to connect to host: server.rudder.local
error    Rudder agent was interrupted during execution by a fatal error

## Summary #####################################################################
0 components verified in 0 directives
execution time: 0.30s
################################################################################

We should have an API option that allows getting that output.

Actions #2

Updated by François ARMAND about 7 years ago

And sometimes, I don't get anything at all:

== [fanf@luhman16] ==
% curl -k -H "X-API-Token: 5qsPVwcoa99sZfnSn3A6ive9Q7PMUzRx"  -X POST 'https://192.168.44.2/rudder/api/latest/nodes/c867b070-0721-43d3-8825-d78c51c2c632/applyPolicy'
Actions #3

Updated by François ARMAND about 7 years ago

  • Assignee set to Benoît PECCATTE
Actions #4

Updated by François ARMAND about 7 years ago

  • Assignee changed from Benoît PECCATTE to Nicolas CHARLES
Actions #5

Updated by François ARMAND about 7 years ago

  • Tags set to Blocking 4.1
Actions #6

Updated by Nicolas CHARLES about 7 years ago

The webapp log shows the following error:

java.lang.NumberFormatException: For input string: "Error when trying to contact internal remote-run API: null" 
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Byte.parseByte(Byte.java:149)
        at java.lang.Byte.parseByte(Byte.java:175)
        at scala.collection.immutable.StringLike.toByte(StringLike.scala:297)
        at scala.collection.immutable.StringLike.toByte$(StringLike.scala:297)
        at scala.collection.immutable.StringOps.toByte(StringOps.scala:29)
        at com.normation.rudder.web.rest.node.NodeApiService8.runResponse(NodeAPIService8.scala:119)
        at com.normation.rudder.web.rest.node.NodeApiService8.$anonfun$runNode$4(NodeAPIService8.scala:155)
        at com.normation.rudder.web.rest.node.NodeApiService8.$anonfun$runNode$4$adapted(NodeAPIService8.scala:155)
        at net.liftweb.http.LiftServlet.sendResponse(LiftServlet.scala:1040)
        at net.liftweb.http.LiftServlet.doService(LiftServlet.scala:451)
        at net.liftweb.http.LiftServlet.$anonfun$service$2(LiftServlet.scala:157)
        at net.liftweb.util.TimeHelpers.calcTime(TimeHelpers.scala:427)
...
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:369)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:464)
        at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:913)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:975)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:641)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:231)
        at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:745)
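
The trace shows an error message string being converted to a byte by `toByte` in `runResponse`. The webapp code is Scala and is not shown in this ticket, so the following is only a Python analogue of the defensive idea, with a hypothetical function name: treat a body that is not a small integer as "no status" instead of letting the conversion throw.

```python
def parse_status_byte(body):
    """Return the body as an int in the signed-byte range, or None if it is not one."""
    text = body.strip()
    try:
        value = int(text)
    except ValueError:
        # The body is free text (for example an error message), not a number.
        return None
    # Mirror the signed-byte range that Byte.parseByte would accept.
    if -128 <= value <= 127:
        return value
    return None


if __name__ == "__main__":
    print(parse_status_byte("0"))
    print(parse_status_byte("Error when trying to contact internal remote-run API: null"))
```

A caller can then report the raw body as an error when `None` comes back, rather than failing with an exception.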

Actions #7

Updated by Nicolas CHARLES about 7 years ago

Ha, this message is probably more relevant:

Feb 28 12:02:32 server cf-serverd[1583]: CFEngine(server) rudder 127.0.0.1> Connection was hung up while receiving line: 
Feb 28 12:02:32 server cf-serverd[1583]: CFEngine(server) rudder 127.0.0.1> Client closed connection early! He probably does not trust our key...

Actions #8

Updated by Nicolas CHARLES about 7 years ago

The full verbose output is:

rudder  verbose: Obtained IP address of '127.0.0.1' on socket 7 from accept
rudder  verbose: New connection (from 127.0.0.1, sd 7), spawning new thread...
rudder     info: 127.0.0.1> Accepting connection
rudder  verbose: 127.0.0.1> Setting socket timeout to 600 seconds.
rudder  verbose: 127.0.0.1> Peeked nothing important in TCP stream, considering the protocol as TLS
rudder  verbose: 127.0.0.1> TLS version negotiated:  TLSv1.2; Cipher: AES256-GCM-SHA384,TLSv1/SSLv3
rudder  verbose: 127.0.0.1> TLS session established, checking trust...
rudder  verbose: 127.0.0.1> Remote peer terminated TLS session (SSL_read)
   error: 127.0.0.1> Connection was hung up while receiving line: 
  notice: 127.0.0.1> Client closed connection early! He probably does not trust our key..

Actions #9

Updated by Nicolas CHARLES about 7 years ago

Agent side:

rudder  verbose: Connected to host 192.168.41.2 address 192.168.41.2 port 5309 (socket descriptor 4)
rudder  verbose: TLS version negotiated:  TLSv1.2; Cipher: AES256-GCM-SHA384,TLSv1/SSLv3
rudder  verbose: TLS session established, checking trust...
rudder  verbose: Did not find new key format '/var/rudder/cfengine-community/ppkeys/root-MD5=57ccba22df018012132877618ff655f9.pub'
rudder  verbose: Trying old style '/var/rudder/cfengine-community/ppkeys/root-192.168.41.2.pub'
rudder  verbose: Received key 'MD5=57ccba22df018012132877618ff655f9' not found in ppkeys
   error: TRUST FAILED, server presented untrusted key: MD5=57ccba22df018012132877618ff655f9
rudder  verbose: Connection to 192.168.41.2 is closed
   error: Failed to connect to host: 192.168.41.2
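
The agent log above tries two file names in ppkeys before giving up: first the digest-based name, then the old address-based name. A small sketch of that lookup order, assuming hypothetical helper names and checking only whether a file exists (it does not verify that the stored key matches the one presented), could look like this:

```python
from pathlib import Path


def candidate_key_paths(ppkeys_dir, user, digest, address):
    """Return the key file names in the order the log shows them being tried."""
    base = Path(ppkeys_dir)
    return [
        base / "{}-{}.pub".format(user, digest),   # new format, e.g. root-MD5=....pub
        base / "{}-{}.pub".format(user, address),  # old style, e.g. root-192.168.41.2.pub
    ]


def find_key_file(ppkeys_dir, user, digest, address):
    """Return the first candidate that exists on disk, or None if neither does."""
    for path in candidate_key_paths(ppkeys_dir, user, digest, address):
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_key_file(
        "/var/rudder/cfengine-community/ppkeys",
        "root",
        "MD5=57ccba22df018012132877618ff655f9",
        "192.168.41.2",
    )
    print(found if found else "no key file found for this peer")
```

Running it on a node that shows the TRUST FAILED error is a quick way to check whether either file is present.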

Actions #10

Updated by Nicolas CHARLES about 7 years ago

Ok, so after some tests:
  1. we cannot remote-run a server on itself; cf-runagent doesn't seem to support it
  2. remote-running a 4.1 node is OK
  3. remote-running a 4.0 node fails, as the command is not valid
    root@agent1:/home/vagrant# /opt/rudder/bin/rudder agent run -uR -I -Dcfruncommand
    Rudder agent 4.0.4.rc1.git201702280322 (CFEngine Core 3.7.4)
    Node uuid: e04cdc24-2180-4d2e-b334-0445a13a3a45
    ok: Rudder agent promises were updated.
       error: Remote execution cannot ignore locks
    
  4. remote-running a 3.1 node fails, as the command /opt/rudder/bin/rudder agent run -uR -I -Dcfruncommand --inform is not valid
    root@agent2:/home/vagrant# /opt/rudder/bin/rudder agent run -uR -Dcfruncommand --inform
    /opt/rudder/share/commands/agent-run : option non permise -- u
    /opt/rudder/share/commands/agent-run : option non permise -- -
    /opt/rudder/share/commands/agent-run : option non permise -- n
    /opt/rudder/share/commands/agent-run : option non permise -- o
    Rudder agent 3.1.19.rc1.git201702210714 (CFEngine Core 3.6.5)
    Node uuid: 791a6ebe-cfb1-4f54-b9a2-48ca162f64b6
    2017-02-28T12:38:05+0000    error: Remote execution cannot ignore locks
    

So the remote API should not try to remote-run on the local system, and we need a fix for 4.0 and 3.1 compatibility.
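
The first conclusion can be sketched as a simple dispatch: when the target is the local server, build a local agent command instead of a remote-run command. This is a sketch under stated assumptions, not the fix that was merged. The function name, the `node_host` parameter, and both command lists are placeholders, and the version-specific flags needed for 4.0 and 3.1 nodes are deliberately left out because the ticket does not state them.

```python
# The ticket addresses the policy server by the node id "root".
LOCAL_NODE_ID = "root"

# Placeholder command lines (assumptions for this sketch, not the real values).
LOCAL_RUN_COMMAND = ["/opt/rudder/bin/rudder", "agent", "run"]
REMOTE_RUN_COMMAND = ["/opt/rudder/bin/rudder", "remote", "run"]


def build_run_command(node_id, node_host):
    """Return the command to launch: local agent run for root, remote run otherwise."""
    if node_id == LOCAL_NODE_ID:
        # Never go through cf-runagent to reach the machine we are running on.
        return list(LOCAL_RUN_COMMAND)
    return REMOTE_RUN_COMMAND + [node_host]


if __name__ == "__main__":
    print(build_run_command("root", "server.rudder.local"))
    print(build_run_command("6866e5db-bb41-4110-958b-c1f1c90dbcbe", "192.168.44.3"))
```

The host `192.168.44.3` in the usage lines is made up for the example.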

Actions #11

Updated by François ARMAND about 7 years ago

  • Tags deleted (Blocking 4.1)
  • Category set to Relay server or API
  • Assignee changed from Nicolas CHARLES to Benoît PECCATTE
Actions #12

Updated by François ARMAND about 7 years ago

I'm leaving this ticket open to change the Relay API so that it makes the correct call to rudder agent. I'm opening a subticket to fix the null pointer exception on the Rudder side, which should not happen.

Actions #13

Updated by Alexis Mousset about 7 years ago

  • Status changed from New to In progress
  • Assignee changed from Benoît PECCATTE to Alexis Mousset
Actions #14

Updated by Alexis Mousset about 7 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Alexis Mousset to Benoît PECCATTE
  • Pull Request set to https://github.com/Normation/rudder-packages/pull/1270
Actions #15

Updated by Alexis Mousset about 7 years ago

  • Status changed from Pending technical review to Pending release
Actions #16

Updated by Nicolas CHARLES about 7 years ago

Actions #17

Updated by François ARMAND about 7 years ago

  • Subject changed from Remote-run exec for root and nodes behind relays fail with "rudder agent was interrupted" to Remote-run exec for root fail with "rudder agent was interrupted"
Actions #18

Updated by Vincent MEMBRÉ about 7 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 4.1.0~rc1 which was released today.
