Bug #10185

closed

Remote-run exec for root fail with "rudder agent was interrupted"

Added by François ARMAND almost 9 years ago. Updated almost 9 years ago.

Status:
Released
Priority:
N/A
Category:
Relay server or API

Description

The message is:

== [fanf@luhman16] 
% curl -k -H "X-API-Token: 5qsPVwcoa99sZfnSn3A6ive9Q7PMUzRx"  -X POST 'https://192.168.44.2/rudder/api/latest/nodes/root/applyPolicy' -d "classes=inventory" 
error    Rudder agent was interrupted during execution by a fatal error
         Run with -i to see log messages.

## Summary #####################################################################
0 components verified in 0 directives
execution time: 0.31s
################################################################################

== [fanf@luhman16] 
% curl -k -H "X-API-Token: 5qsPVwcoa99sZfnSn3A6ive9Q7PMUzRx"  -X POST 'https://192.168.44.2/rudder/api/latest/nodes/6866e5db-bb41-4110-958b-c1f1c90dbcbe/applyPolicy'
error    Rudder agent was interrupted during execution by a fatal error
         Run with -i to see log messages.

## Summary #####################################################################
0 components verified in 0 directives
execution time: 0.29s
################################################################################

OK, after trying to start it a few more times on the remote node, it started to work. I have no idea what was wrong, and there is no log.

At a minimum, we need to add logs so that we can do some forensics when things do not work as expected.
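For anyone scripting the same call, here is a minimal Python sketch of the request that the curl commands above send. It only builds the request and does not send it. The function name `build_apply_policy_request` and its parameters are assumptions made for illustration; the URL path, the `X-API-Token` header and the `classes` form field are taken from the commands above.

```python
import urllib.parse
import urllib.request


def build_apply_policy_request(base_url, node_id, token, classes=None):
    """Build (but do not send) the POST request for applyPolicy.

    Hypothetical helper: only the URL layout, the X-API-Token header and
    the optional "classes" form field come from the curl calls.
    """
    url = "{}/rudder/api/latest/nodes/{}/applyPolicy".format(
        base_url.rstrip("/"), urllib.parse.quote(node_id, safe="")
    )
    # curl -d sends a form-encoded body; without -d the body is empty.
    body = urllib.parse.urlencode({"classes": classes}) if classes else ""
    return urllib.request.Request(
        url,
        data=body.encode("ascii"),
        headers={"X-API-Token": token},
        method="POST",
    )


req = build_apply_policy_request(
    "https://192.168.44.2", "root", "<api-token>", classes="inventory"
)
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, with an SSL context that skips certificate verification if the behaviour of `curl -k` is wanted.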


Subtasks 1 (0 open, 1 closed)

Bug #10300: NumberFormatException on remote-api call for root (Rejected, François ARMAND)

Related issues 1 (0 open, 1 closed)

Related to Rudder - User story #10314: Document remote-run exec compatibility (Rejected)

Updated by François ARMAND almost 9 years ago #1

After editing /opt/rudder/share/relay-api/relay_api/remote_run.py on the root server to add "-i" to REMOTE_RUN_COMMAND and restarting Apache, I now get:

rudder     info: ........................................................................
rudder     info: Hailing server.rudder.local : 5309
rudder     info: ........................................................................
   error: TRUST FAILED, server presented untrusted key: MD5=3275d8e38205fada95e6236901099527
   error: Failed to connect to host: server.rudder.local
error    Rudder agent was interrupted during execution by a fatal error

## Summary #####################################################################
0 components verified in 0 directives
execution time: 0.30s
################################################################################

We should have an API option that makes this output available to the caller.
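One way to keep a trace for forensics is to capture everything the agent command prints and write it to a log before returning it. The sketch below is not the code in remote_run.py: `run_and_log`, the logger name and the returned tuple are hypothetical, and the command line is supplied by the caller.

```python
import logging
import subprocess
import sys

logger = logging.getLogger("remote_run_sketch")  # hypothetical logger name


def run_and_log(command, timeout=60):
    """Run a command, log its combined output, return (exit_code, output)."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error lines in order with the rest
        universal_newlines=True,
        timeout=timeout,
    )
    level = logging.INFO if result.returncode == 0 else logging.ERROR
    logger.log(level, "%r exited with %d:\n%s",
               command, result.returncode, result.stdout)
    return result.returncode, result.stdout


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Stand-in command so the sketch runs anywhere; a real caller would
    # pass the agent command line instead.
    code, output = run_and_log(
        [sys.executable, "-c", "print('TRUST FAILED'); raise SystemExit(1)"]
    )
    print(code, output.strip())
```

Merging stderr into stdout keeps lines such as "TRUST FAILED" next to the informational lines around them, which is what made the "-i" output above useful.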

Updated by François ARMAND almost 9 years ago #2

And sometimes, I don't get anything at all:

== [fanf@luhman16] ==
% curl -k -H "X-API-Token: 5qsPVwcoa99sZfnSn3A6ive9Q7PMUzRx"  -X POST 'https://192.168.44.2/rudder/api/latest/nodes/c867b070-0721-43d3-8825-d78c51c2c632/applyPolicy'

Updated by François ARMAND almost 9 years ago #3

  • Assignee set to Benoît PECCATTE

Updated by François ARMAND almost 9 years ago #4

  • Assignee changed from Benoît PECCATTE to Nicolas CHARLES

Updated by François ARMAND almost 9 years ago #5

  • Tags set to Blocking 4.1

Updated by Nicolas CHARLES almost 9 years ago #6

The webapp log shows the following error:

java.lang.NumberFormatException: For input string: "Error when trying to contact internal remote-run API: null" 
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Byte.parseByte(Byte.java:149)
        at java.lang.Byte.parseByte(Byte.java:175)
        at scala.collection.immutable.StringLike.toByte(StringLike.scala:297)
        at scala.collection.immutable.StringLike.toByte$(StringLike.scala:297)
        at scala.collection.immutable.StringOps.toByte(StringOps.scala:29)
        at com.normation.rudder.web.rest.node.NodeApiService8.runResponse(NodeAPIService8.scala:119)
        at com.normation.rudder.web.rest.node.NodeApiService8.$anonfun$runNode$4(NodeAPIService8.scala:155)
        at com.normation.rudder.web.rest.node.NodeApiService8.$anonfun$runNode$4$adapted(NodeAPIService8.scala:155)
        at net.liftweb.http.LiftServlet.sendResponse(LiftServlet.scala:1040)
        at net.liftweb.http.LiftServlet.doService(LiftServlet.scala:451)
        at net.liftweb.http.LiftServlet.$anonfun$service$2(LiftServlet.scala:157)
        at net.liftweb.util.TimeHelpers.calcTime(TimeHelpers.scala:427)
...
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:369)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:464)
        at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:913)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:975)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:641)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:231)
        at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:745)
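The trace shows the webapp converting the response body to a number (`toByte`) when the body was actually an error message. The sketch below is a Python analogue of that failure mode, not the Scala code in NodeAPIService8.scala: it parses the body defensively and hands back the text as an error instead of raising. The function name and the shape of the return value are assumptions.

```python
def parse_run_response(body):
    """Return (code, None) for a numeric body, else (None, error_text).

    Hypothetical helper: any non-numeric body is treated as an error
    message to surface, rather than letting the conversion raise.
    """
    text = body.strip()
    try:
        return int(text), None
    except ValueError:
        return None, text


print(parse_run_response("0"))
print(parse_run_response(
    "Error when trying to contact internal remote-run API: null"
))
```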

Updated by Nicolas CHARLES almost 9 years ago #7

Ah, this message is probably more relevant:

Feb 28 12:02:32 server cf-serverd[1583]: CFEngine(server) rudder 127.0.0.1> Connection was hung up while receiving line: 
Feb 28 12:02:32 server cf-serverd[1583]: CFEngine(server) rudder 127.0.0.1> Client closed connection early! He probably does not trust our key...

Updated by Nicolas CHARLES almost 9 years ago #8

The full verbose output is:

rudder  verbose: Obtained IP address of '127.0.0.1' on socket 7 from accept
rudder  verbose: New connection (from 127.0.0.1, sd 7), spawning new thread...
rudder     info: 127.0.0.1> Accepting connection
rudder  verbose: 127.0.0.1> Setting socket timeout to 600 seconds.
rudder  verbose: 127.0.0.1> Peeked nothing important in TCP stream, considering the protocol as TLS
rudder  verbose: 127.0.0.1> TLS version negotiated:  TLSv1.2; Cipher: AES256-GCM-SHA384,TLSv1/SSLv3
rudder  verbose: 127.0.0.1> TLS session established, checking trust...
rudder  verbose: 127.0.0.1> Remote peer terminated TLS session (SSL_read)
   error: 127.0.0.1> Connection was hung up while receiving line: 
  notice: 127.0.0.1> Client closed connection early! He probably does not trust our key..

Updated by Nicolas CHARLES almost 9 years ago #9

Agent side:

rudder  verbose: Connected to host 192.168.41.2 address 192.168.41.2 port 5309 (socket descriptor 4)
rudder  verbose: TLS version negotiated:  TLSv1.2; Cipher: AES256-GCM-SHA384,TLSv1/SSLv3
rudder  verbose: TLS session established, checking trust...
rudder  verbose: Did not find new key format '/var/rudder/cfengine-community/ppkeys/root-MD5=57ccba22df018012132877618ff655f9.pub'
rudder  verbose: Trying old style '/var/rudder/cfengine-community/ppkeys/root-192.168.41.2.pub'
rudder  verbose: Received key 'MD5=57ccba22df018012132877618ff655f9' not found in ppkeys
   error: TRUST FAILED, server presented untrusted key: MD5=57ccba22df018012132877618ff655f9
rudder  verbose: Connection to 192.168.41.2 is closed
   error: Failed to connect to host: 192.168.41.2
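The agent log above shows two lookups in ppkeys: first a file named after the key digest, then an old-style file named after the address. Here is a minimal sketch of that lookup order, with the file-name patterns copied from the log lines. The function names are hypothetical, and the sketch only checks that a file exists, which is a simplification of whatever the agent really does with the key.

```python
import os
import tempfile


def candidate_key_files(ppkeys_dir, user, digest, address):
    """Key files in the order the log shows them being tried."""
    return [
        os.path.join(ppkeys_dir, "{}-{}.pub".format(user, digest)),   # new format
        os.path.join(ppkeys_dir, "{}-{}.pub".format(user, address)),  # old style
    ]


def find_key_file(ppkeys_dir, user, digest, address):
    """Return the first existing candidate, or None (the TRUST FAILED case)."""
    for path in candidate_key_files(ppkeys_dir, user, digest, address):
        if os.path.isfile(path):
            return path
    return None


with tempfile.TemporaryDirectory() as ppkeys:
    digest = "MD5=57ccba22df018012132877618ff655f9"
    # No key file present yet: mirrors the "not found in ppkeys" log line.
    print(find_key_file(ppkeys, "root", digest, "192.168.41.2"))
    open(os.path.join(ppkeys, "root-192.168.41.2.pub"), "w").close()
    print(find_key_file(ppkeys, "root", digest, "192.168.41.2"))
```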

Updated by Nicolas CHARLES almost 9 years ago #10

OK, so after some tests:
  1. we cannot remote-run the server on itself; cf-runagent doesn't seem to support it
  2. remote-running a 4.1 node works
  3. remote-running a 4.0 node fails, because the command is not valid
    root@agent1:/home/vagrant# /opt/rudder/bin/rudder agent run -uR -I -Dcfruncommand
    Rudder agent 4.0.4.rc1.git201702280322 (CFEngine Core 3.7.4)
    Node uuid: e04cdc24-2180-4d2e-b334-0445a13a3a45
    ok: Rudder agent promises were updated.
       error: Remote execution cannot ignore locks
    
  4. remote-running a 3.1 node fails, because the command /opt/rudder/bin/rudder agent run -uR -I -Dcfruncommand --inform is not valid ("option non permise" below is French for "option not permitted")
    root@agent2:/home/vagrant# /opt/rudder/bin/rudder agent run -uR -Dcfruncommand --inform
    /opt/rudder/share/commands/agent-run : option non permise -- u
    /opt/rudder/share/commands/agent-run : option non permise -- -
    /opt/rudder/share/commands/agent-run : option non permise -- n
    /opt/rudder/share/commands/agent-run : option non permise -- o
    Rudder agent 3.1.19.rc1.git201702210714 (CFEngine Core 3.6.5)
    Node uuid: 791a6ebe-cfb1-4f54-b9a2-48ca162f64b6
    2017-02-28T12:38:05+0000    error: Remote execution cannot ignore locks
    

So the remote-run API should not try to remote-run the local system, and we need a fix for compatibility with 4.0 and 3.1.
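The findings above amount to a small decision rule, which the sketch below encodes. The function name, the return labels and the version parsing are assumptions. Only the root case and the three tested versions (4.1, 4.0, 3.1) are backed by this ticket; treating versions after 4.1 like 4.1 is a guess.

```python
def remote_run_plan(node_id, agent_version, local_node_id="root"):
    """Classify a remote-run request according to the test results above.

    Returns "skip-local" for the server itself, "remote-run" for agents
    at 4.1 (and, by assumption, later), "incompatible" for older agents.
    """
    if node_id == local_node_id:
        return "skip-local"  # cf-runagent does not seem to support this
    major, minor = (int(part) for part in agent_version.split(".")[:2])
    if (major, minor) >= (4, 1):
        return "remote-run"
    return "incompatible"  # 4.0 and 3.1 rejected the command line


for node, version in [("root", "4.1.0"),
                      ("agent1", "4.0.4.rc1.git201702280322"),
                      ("agent2", "3.1.19.rc1.git201702210714")]:
    print(node, remote_run_plan(node, version))
```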

Updated by François ARMAND almost 9 years ago #11

  • Tags deleted (Blocking 4.1)
  • Category set to Relay server or API
  • Assignee changed from Nicolas CHARLES to Benoît PECCATTE

Updated by François ARMAND almost 9 years ago #12

I'm leaving this ticket open to change the Relay API so that it makes the correct call to rudder agent. I'm opening a subticket to fix the NumberFormatException on the Rudder side, which should not happen.

Updated by Alexis Mousset almost 9 years ago #13

  • Status changed from New to In progress
  • Assignee changed from Benoît PECCATTE to Alexis Mousset

Updated by Alexis Mousset almost 9 years ago #14

  • Status changed from In progress to Pending technical review
  • Assignee changed from Alexis Mousset to Benoît PECCATTE
  • Pull Request set to https://github.com/Normation/rudder-packages/pull/1270

Updated by Alexis Mousset almost 9 years ago #15

  • Status changed from Pending technical review to Pending release

Updated by Nicolas CHARLES almost 9 years ago #16

Updated by François ARMAND almost 9 years ago #17

  • Subject changed from Remote-run exec for root and nodes behind relays fail with "rudder agent was interrupted" to Remote-run exec for root fail with "rudder agent was interrupted"

Updated by Vincent MEMBRÉ almost 9 years ago #18

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 4.1.0~rc1 which was released today.
