Bug #9736

closed

Stack overflow on node generation

Added by François ARMAND over 7 years ago. Updated about 6 years ago.

Status: Rejected
Priority: 1
Assignee: -
Category: Web - Config management
Target version:
Severity: Major - prevents use of part of Rudder | no simple workaround
UX impact:
User visibility: Infrequent - complex configurations | third party integrations
Effort required:
Priority: 34
Name check:
Fix check:
Regression:

Description

One user found a case where policy generation never ends, with the following stack trace:

[2016-11-24 10:16:39] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Node's target configuration built in 734 ms, start to update rule values.
[2016-11-24 10:16:39] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - RuleVals updated in 65 ms, start to detect changes in node configuration.
[2016-11-24 10:16:39] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Checked node configuration updates leading to rules serial number updates and serial number updated in 11 ms
[2016-11-24 10:16:40] INFO  com.normation.rudder.services.policies.nodeconfig.NodeConfigurationServiceImpl - Configuration of following nodes were updated, their promises are going to be written: [7357f552-34ee-4b8d-9fa9-54854bbdece0, root]
[2016-11-24 10:16:40] ERROR net.liftweb.actor.ActorLogger - Actor threw an exception
java.lang.StackOverflowError: null
    at com.normation.rudder.services.policies.nodeconfig.NodeConfiguration.nodeInfo(NodeConfiguration.scala:67)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$2.apply(PathComputer.scala:140)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$2.apply(PathComputer.scala:139)
    at net.liftweb.common.Full.map(Box.scala:610)
    at com.normation.rudder.services.policies.write.PathComputerImpl.com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath(PathComputer.scala:139)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$3$$anonfun$apply$2.apply(PathComputer.scala:148)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$3$$anonfun$apply$2.apply(PathComputer.scala:141)
    at net.liftweb.common.Full.flatMap(Box.scala:612)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$3.apply(PathComputer.scala:141)
    at com.normation.rudder.services.policies.write.PathComputerImpl$$anonfun$com$normation$rudder$services$policies$write$PathComputerImpl$$recurseComputePath$3.apply(PathComputer.scala:139)
    at net.liftweb.common.Full.flatMap(Box.scala:612)
    [....]

We don't know the cause yet, but there is a relay, and there were a lot of updates to the node/relay topology. So Rudder must have badly updated something somewhere, and now there is an infinite loop when trying to generate policies.

The problem was seen in 4.0.0, but most likely already exists in previous versions.
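
The stack trace shows recurseComputePath calling itself until the stack is exhausted. The following is a minimal sketch, not Rudder's actual Scala code, of how a walk up the policy server hierarchy can detect a loop instead of recursing forever. It assumes the loop sits in the node-to-policy-server relation, as the title of the related issue #12359 suggests. The names policy_server_chain, policy_server_of and PolicyServerLoopError are hypothetical and exist only for this illustration.

```python
from typing import Dict, List


class PolicyServerLoopError(Exception):
    """Raised when a node's policy server chain loops back on itself."""


def policy_server_chain(
    node_id: str,
    policy_server_of: Dict[str, str],
    root_id: str = "root",
) -> List[str]:
    """Return [node_id, its policy server, ..., root_id].

    The walk is iterative and remembers every node already visited, so a
    loop in the hierarchy raises an error instead of overflowing the stack.
    """
    chain = [node_id]
    seen = {node_id}
    current = node_id
    while current != root_id:
        parent = policy_server_of.get(current)
        if parent is None:
            raise KeyError(f"no policy server known for node {current!r}")
        if parent in seen:
            # Report the full path that closes the loop.
            raise PolicyServerLoopError(" -> ".join(chain + [parent]))
        chain.append(parent)
        seen.add(parent)
        current = parent
    return chain


if __name__ == "__main__":
    # Hypothetical topology: one node behind one relay, relay behind root.
    topology = {"node1": "relay1", "relay1": "root", "root": "root"}
    print(policy_server_chain("node1", topology))

    # Hypothetical broken topology: two relays pointing at each other.
    try:
        policy_server_chain("a", {"a": "b", "b": "a"})
    except PolicyServerLoopError as err:
        print("loop detected:", err)
```

Failing fast with the offending path would at least name the nodes involved, which the bare StackOverflowError above does not.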


Related issues 2 (0 open, 2 closed)

Is duplicate of Rudder - Bug #9735: Stackoverflow on promise generation (Rejected, 2016-11-25)
Is duplicate of Rudder - Bug #12359: Cannot generate policies when there is a loop in policy server hierharchy (stackoverflow) (Released, Nicolas CHARLES)
Actions #1

Updated by François ARMAND over 7 years ago

  • Is duplicate of Bug #9735: Stackoverflow on promise generation added
Actions #2

Updated by François ARMAND over 7 years ago

It turns out that the node leading to the error (7357f552-34ee-4b8d-9fa9-54854bbdece0) has the status "removed":

curl -k -H "X-API-Token: xxxxxxx"  https://yourserver/rudder/api/latest/nodes/7357f552-34ee-4b8d-9fa9-54854bbdece0
{"action": "nodeDetails","id": "7357f552-34ee-4b8d-9fa9-54854bbdece0","result": "success","data": {"nodes": [
{  "id": "7357f552-34ee-4b8d-9fa9-54854bbdece0" 
  , "hostname": "relay.bal.local" 
  , "status": "removed" 
  , "policyServerId": "root" 
  ...
}
Actions #3

Updated by François ARMAND over 7 years ago

Fully removing the node by hand with the following LDAP command solved the generation problem:

ldapdelete -x -r -D "cn=Manager,cn=rudder-configuration" -W "nodeId=7357f552-34ee-4b8d-9fa9-54854bbdece0,ou=Nodes,ou=Removed Inventories,ou=Inventories,cn=rudder-configuration" 

It is not clear at all why the node was taken into account for policy generation, nor why there was a stack overflow.

One clue is that a node named "relay.bal.local" does exist, but with a different ID.
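
To follow up on that clue, a check along these lines could list hostnames that appear under more than one node ID. This is only a sketch under stated assumptions: the input is a plain list of (id, hostname, status) tuples gathered by whatever means is at hand, the function name hostnames_with_several_ids is hypothetical, and the second ID and its "accepted" status in the usage example are made up for illustration (the real ID of the other "relay.bal.local" node is not given in this ticket).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

NodeEntry = Tuple[str, str, str]  # (node_id, hostname, status)


def hostnames_with_several_ids(
    nodes: Iterable[NodeEntry],
) -> Dict[str, List[Tuple[str, str]]]:
    """Group nodes by hostname and keep hostnames seen with 2+ distinct IDs."""
    by_hostname: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for node_id, hostname, status in nodes:
        by_hostname[hostname].append((node_id, status))
    return {
        hostname: entries
        for hostname, entries in by_hostname.items()
        if len({node_id for node_id, _ in entries}) > 1
    }


if __name__ == "__main__":
    nodes = [
        # ID, hostname and status taken from the API answer in this ticket.
        ("7357f552-34ee-4b8d-9fa9-54854bbdece0", "relay.bal.local", "removed"),
        # Placeholder entry: the real ID and status are not in the ticket.
        ("hypothetical-other-id", "relay.bal.local", "accepted"),
        ("root", "server.example.invalid", "accepted"),
    ]
    for hostname, entries in hostnames_with_several_ids(nodes).items():
        print(hostname, entries)
```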

Actions #4

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 4.0.1 to 4.0.2
Actions #5

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 4.0.2 to 4.0.3
Actions #6

Updated by Nicolas CHARLES over 7 years ago

Comment removed: this is not the issue I was checking.

Actions #7

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 4.0.3 to 4.0.4
Actions #8

Updated by Benoît PECCATTE about 7 years ago

  • Severity set to Major - prevents use of part of Rudder | no simple workaround
  • User visibility set to Infrequent - complex configurations | third party integrations
  • Priority set to 24
Actions #9

Updated by François ARMAND about 7 years ago

  • Status changed from In progress to New
Actions #10

Updated by Vincent MEMBRÉ about 7 years ago

  • Target version changed from 4.0.4 to 4.0.5
Actions #11

Updated by Jonathan CLARKE about 7 years ago

  • Assignee deleted (François ARMAND)
Actions #12

Updated by Vincent MEMBRÉ about 7 years ago

  • Target version changed from 4.0.5 to 4.0.6
  • Priority changed from 24 to 23
Actions #13

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 4.0.6 to 4.0.7
Actions #14

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 4.0.7 to 357
Actions #15

Updated by Benoît PECCATTE almost 7 years ago

  • Priority changed from 23 to 37
Actions #16

Updated by Alexis Mousset almost 7 years ago

  • Target version changed from 357 to 4.1.6
Actions #17

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 4.1.6 to 4.1.7
  • Priority changed from 37 to 36
Actions #18

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.7 to 4.1.8
Actions #19

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.8 to 4.1.9
  • Priority changed from 36 to 35
Actions #20

Updated by Benoît PECCATTE over 6 years ago

  • Priority changed from 35 to 34
Actions #21

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.9 to 4.1.10
Actions #22

Updated by Vincent MEMBRÉ about 6 years ago

  • Target version changed from 4.1.10 to 4.1.11
Actions #23

Updated by François ARMAND about 6 years ago

  • Status changed from New to Rejected

I'm closing this one because #12359 is much clearer about what led to the stack overflow.

Actions #24

Updated by François ARMAND about 6 years ago

  • Related to Bug #12359: Cannot generate policies when there is a loop in policy server hierharchy (stackoverflow) added
Actions #25

Updated by François ARMAND about 6 years ago

  • Related to deleted (Bug #12359: Cannot generate policies when there is a loop in policy server hierharchy (stackoverflow))
Actions #26

Updated by François ARMAND about 6 years ago

  • Is duplicate of Bug #12359: Cannot generate policies when there is a loop in policy server hierharchy (stackoverflow) added