Bug #9736

closed

Stack overflow on node generation

Added by François ARMAND about 9 years ago. Updated almost 8 years ago.

Status:
Rejected
Priority:
1 (highest)
Assignee:
-
Category:
Web - Config management
Target version:
Severity:
Major - prevents use of part of Rudder | no simple workaround
UX impact:
User visibility:
Infrequent - complex configurations | third party integrations
Effort required:
Priority:
34
Name check:
Fix check:
Regression:

Description

One user found a case where policy generation never ends, with the following stack trace:

[2016-11-24 10:16:39] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Node's target configuration built in 734 ms, start to update rule values.
[2016-11-24 10:16:39] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - RuleVals updated in 65 ms, start to detect changes in node configuration.
[2016-11-24 10:16:39] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Checked node configuration updates leading to rules serial number updates and serial number updated in 11 ms
[2016-11-24 10:16:40] INFO  com.normation.rudder.services.policies.nodeconfig.NodeConfigurationServiceImpl - Configuration of following nodes were updated, their promises are going to be written: [7357f552-34ee-4b8d-9fa9-54854bbdece0, root]
[2016-11-24 10:16:40] ERROR net.liftweb.actor.ActorLogger - Actor threw an exception
java.lang.StackOverflowError: null
    at com.normation.rudder.services.policies.nodeconfig.NodeConfiguration.nodeInfo(NodeConfiguration.scala:67)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$2.apply(PathComputer.scala:140)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$2.apply(PathComputer.scala:139)
    at net.liftweb.common.Full.map(Box.scala:610)
    at com.normation.rudder.services.policies.write.PathComputerImpl.com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath(PathComputer.scala:139)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$3$$anonfun$apply$2.apply(PathComputer.scala:148)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$3$$anonfun$apply$2.apply(PathComputer.scala:141)
    at net.liftweb.common.Full.flatMap(Box.scala:612)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$3.apply(PathComputer.scala:141)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$3.apply(PathComputer.scala:139)
    at net.liftweb.common.Full.flatMap(Box.scala:612)
    [....]

We don't know the cause for now, but there is a relay, and there were a lot of updates in the node/relay topology. So Rudder must have updated something incorrectly somewhere, and now there is an infinite loop when trying to generate policies.

The problem was seen in 4.0.0, but most likely already exists in previous versions.
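
The repeated `recurseComputePath` frames in the trace, together with the title of #12359, point to a loop in the policy server hierarchy. The following is a minimal sketch of a loop-safe walk from a node up to the root server. It is not Rudder's code: the `policy_server_of` mapping, the function name and the exception are hypothetical names introduced for illustration, and the only assumption taken from this ticket is that the root server's id is `root`.

```python
from typing import Dict, List


class PolicyServerLoopError(Exception):
    """Raised when following policy servers comes back to a node already visited."""


def policy_server_chain(
    node_id: str,
    policy_server_of: Dict[str, str],  # hypothetical: node id -> policy server id
    root_id: str = "root",
) -> List[str]:
    """Return the ids from node_id up to the root server, failing on a loop."""
    chain = [node_id]
    seen = {node_id}
    current = node_id
    # Walk upwards iteratively so that a loop cannot exhaust the call stack.
    while current != root_id:
        parent = policy_server_of.get(current)
        if parent is None:
            raise KeyError(f"no policy server recorded for node {current!r}")
        if parent in seen:
            # Report the full path so the faulty link is visible in the logs.
            raise PolicyServerLoopError(" -> ".join(chain + [parent]))
        chain.append(parent)
        seen.add(parent)
        current = parent
    return chain


# Usage with made-up ids: a healthy topology, then two relays pointing at each other.
healthy = {"root": "root", "relay-a": "root", "node-1": "relay-a"}
print(policy_server_chain("node-1", healthy))

looped = {"relay-a": "relay-b", "relay-b": "relay-a"}
try:
    policy_server_chain("relay-a", looped)
except PolicyServerLoopError as err:
    print("loop detected:", err)
```

Tracking the visited ids turns an unbounded recursion into an explicit error that names the nodes involved, which would have made the faulty entry easier to locate than a `StackOverflowError`.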


Related issues 2 (0 open, 2 closed)

Is duplicate of Rudder - Bug #9735: Stackoverflow on promise generation - Rejected
Is duplicate of Rudder - Bug #12359: Cannot generate policies when there is a loop in policy server hierharchy (stackoverflow) - Released - Nicolas CHARLES

Updated by François ARMAND about 9 years ago Actions #1

  • Is duplicate of Bug #9735: Stackoverflow on promise generation added

Updated by François ARMAND about 9 years ago Actions #2

It turns out that the node leading to the error (7357f552-34ee-4b8d-9fa9-54854bbdece0) has the status "removed":

curl -k -H "X-API-Token: xxxxxxx"  https://yourserver/rudder/api/latest/nodes/7357f552-34ee-4b8d-9fa9-54854bbdece0
{"action": "nodeDetails","id": "7357f552-34ee-4b8d-9fa9-54854bbdece0","result": "success","data": {"nodes": [
{  "id": "7357f552-34ee-4b8d-9fa9-54854bbdece0" 
  , "hostname": "relay.bal.local" 
  , "status": "removed" 
  , "policyServerId": "root" 
  ...
}

Updated by François ARMAND about 9 years ago Actions #3

Fully removing the node by hand with the following LDAP command solved the generation problem:

ldapdelete -x -r -D "cn=Manager,cn=rudder-configuration" -W "nodeId=7357f552-34ee-4b8d-9fa9-54854bbdece0,ou=Nodes,ou=Removed Inventories,ou=Inventories,cn=rudder-configuration" 

It is not clear at all why the node was taken into account for policy generation, nor why there was a stack overflow.

One clue is that the node "relay.bal.local" does exist, but with a different id.
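
To look for other hostnames in the same situation, node records can be grouped by hostname. This is only a sketch under stated assumptions: it takes a list of dicts carrying the `id`, `hostname` and `status` fields shown in the API response above, the function name is hypothetical, and the second id in the sample data is a made-up placeholder, since the real id of the other "relay.bal.local" entry is not recorded in this ticket.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def hostnames_with_several_ids(nodes: Iterable[dict]) -> Dict[str, List[str]]:
    """Map each hostname registered under more than one node id to its sorted ids."""
    ids_by_hostname = defaultdict(set)
    for node in nodes:
        ids_by_hostname[node["hostname"]].add(node["id"])
    return {
        hostname: sorted(ids)
        for hostname, ids in ids_by_hostname.items()
        if len(ids) > 1
    }


# Sample data: the removed node from this ticket plus a placeholder entry.
nodes = [
    {"id": "7357f552-34ee-4b8d-9fa9-54854bbdece0",
     "hostname": "relay.bal.local", "status": "removed"},
    {"id": "00000000-0000-0000-0000-000000000000",  # placeholder id, not from the ticket
     "hostname": "relay.bal.local", "status": "accepted"},
    {"id": "root", "hostname": "server.bal.local", "status": "accepted"},  # made-up hostname
]
print(hostnames_with_several_ids(nodes))
```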

Updated by Vincent MEMBRÉ about 9 years ago Actions #4

  • Target version changed from 4.0.1 to 4.0.2

Updated by Vincent MEMBRÉ about 9 years ago Actions #5

  • Target version changed from 4.0.2 to 4.0.3

Updated by Nicolas CHARLES about 9 years ago Actions #6

Comment removed: this is not the issue I was checking.

Updated by Vincent MEMBRÉ almost 9 years ago Actions #7

  • Target version changed from 4.0.3 to 4.0.4

Updated by Benoît PECCATTE almost 9 years ago Actions #8

  • Severity set to Major - prevents use of part of Rudder | no simple workaround
  • User visibility set to Infrequent - complex configurations | third party integrations
  • Priority set to 24

Updated by François ARMAND almost 9 years ago Actions #9

  • Status changed from In progress to New

Updated by Vincent MEMBRÉ almost 9 years ago Actions #10

  • Target version changed from 4.0.4 to 4.0.5

Updated by Jonathan CLARKE over 8 years ago Actions #11

  • Assignee deleted (François ARMAND)

Updated by Vincent MEMBRÉ over 8 years ago Actions #12

  • Target version changed from 4.0.5 to 4.0.6
  • Priority changed from 24 to 23

Updated by Vincent MEMBRÉ over 8 years ago Actions #13

  • Target version changed from 4.0.6 to 4.0.7

Updated by Vincent MEMBRÉ over 8 years ago Actions #14

  • Target version changed from 4.0.7 to 357

Updated by Benoît PECCATTE over 8 years ago Actions #15

  • Priority changed from 23 to 37

Updated by Alexis Mousset over 8 years ago Actions #16

  • Target version changed from 357 to 4.1.6

Updated by Vincent MEMBRÉ over 8 years ago Actions #17

  • Target version changed from 4.1.6 to 4.1.7
  • Priority changed from 37 to 36

Updated by Vincent MEMBRÉ over 8 years ago Actions #18

  • Target version changed from 4.1.7 to 4.1.8

Updated by Vincent MEMBRÉ about 8 years ago Actions #19

  • Target version changed from 4.1.8 to 4.1.9
  • Priority changed from 36 to 35

Updated by Benoît PECCATTE about 8 years ago Actions #20

  • Priority changed from 35 to 34

Updated by Vincent MEMBRÉ about 8 years ago Actions #21

  • Target version changed from 4.1.9 to 4.1.10

Updated by Vincent MEMBRÉ almost 8 years ago Actions #22

  • Target version changed from 4.1.10 to 4.1.11

Updated by François ARMAND almost 8 years ago Actions #23

  • Status changed from New to Rejected

I'm closing this one because #12359 explains much more clearly what led to the stack overflow.

Updated by François ARMAND almost 8 years ago Actions #24

  • Related to Bug #12359: Cannot generate policies when there is a loop in policy server hierharchy (stackoverflow) added

Updated by François ARMAND almost 8 years ago Actions #25

  • Related to deleted (Bug #12359: Cannot generate policies when there is a loop in policy server hierharchy (stackoverflow))

Updated by François ARMAND almost 8 years ago Actions #26

  • Is duplicate of Bug #12359: Cannot generate policies when there is a loop in policy server hierharchy (stackoverflow) added