Bug #13256

LDAP IO error on generation with a lot of nodes

Added by François ARMAND 4 months ago. Updated about 1 month ago.

Status: Released
Priority: N/A
Category: Performance and scalability
Target version:
Severity:
User visibility:
Effort required:
Priority: 0

Description

A bug quite similar to #10646 was reported on the same heavily loaded installation, but with a different part of the system failing and with the following error message:

[2018-08-21 18:35:57] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Policy generation completed in 1124 ms
[2018-08-21 18:35:57] ERROR com.normation.rudder.batch.AsyncDeploymentAgent$DeployerAgent - Error when updating policy, reason Cannot get the Configuration Cache <- Can't execute LDAP request
[2018-08-21 18:35:57] ERROR com.normation.rudder.batch.AsyncDeploymentAgent - Policy update error for process '13637' at 2018-08-21 18:35:57: Cannot get the Configuration Cache
[2018-08-21 18:36:05] INFO  com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Start policy generation, checking updated rules
[2018-08-21 18:36:05] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Pre-policy-generation scripts hooks ran in 4 ms
[2018-08-21 18:36:05] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Pre-policy-generation modules hooks in 0 ms, start getting all generation related data.
[2018-08-21 18:36:06] ERROR com.normation.ldap.sdk.RWPooledSimpleAuthConnectionProvider - Can't execute LDAP request
com.unboundid.ldap.sdk.LDAPSearchException: The connection to server localhost:389 was closed while waiting for a response to search request SearchRequest(baseDN='cn=Nodes Configuration,ou=Rudder,cn=rudder-configuration', scope=BASE, deref=NEVER, sizeLimit=1, timeLimit=0, filter='(objectClass=*)', attrs={}):  An I/O error occurred while trying to read the response from the server:  IOException(message='The element indicated that it required 20972056 bytes to hold the value, but this is larger than the maximum of 20971520 bytes that the client has been configured to accept.', trace='readLength(ASN1StreamReader.java:390) / beginSequence(ASN1StreamReader.java:918) / readLDAPResponseFrom(LDAPMessage.java:1146) / run(LDAPConnectionReader.java:251)', revision=24201)

The problem is linked to a default parameter in UnboundID: https://docs.ldap.com/ldap-sdk/docs/javadoc/com/unboundid/ldap/sdk/LDAPConnectionOptions.html#PROPERTY_DEFAULT_MAX_MESSAGE_SIZE_BYTES which has a default value of 20971520 bytes (20 MiB).
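
As a quick check of the numbers in the log, the shell arithmetic below shows how far the response went over the client limit. Both values are copied from the error message; the variable names are only illustrative.

```shell
# Values copied from the IOException message in the log.
required=20972056   # bytes the LDAP response element needed
limit=20971520      # client-side maximum (UnboundID default)

# The default limit expressed in MiB.
echo "limit in MiB: $(( limit / 1024 / 1024 ))"

# How far the response went over the limit.
echo "overshoot in bytes: $(( required - limit ))"
```

The response was only 536 bytes over the 20 MiB limit, so an installation can sit just under the threshold for a long time before the error shows up.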


Related issues

Related to Rudder - Bug #10646: "SocketException(message='Socket closed'" error at the end of generation with 500 nodes (Released)

Associated revisions

Revision ab9d54f3 (diff)
Added by François ARMAND 4 months ago

Fixes #13256: LDAP IO error on generation with a lot of nodes

Revision 3ca6e158 (diff)
Added by François ARMAND 4 months ago

Fixes #13256: LDAP IO error on generation with a lot of nodes

History

#1 Updated by François ARMAND 4 months ago

We will need to set a bigger default value to match the one configured in OpenLDAP in #10646.

#2 Updated by François ARMAND 4 months ago

  • Related to Bug #10646: "SocketException(message='Socket closed'" error at the end of generation with 500 nodes added

#3 Updated by François ARMAND 4 months ago

  • Description updated (diff)

#4 Updated by François ARMAND 4 months ago

The workaround was successfully tested, but it can only work for values below Int.MaxValue (i.e. 2147483647), and the value is in bits, so that amounts to about 255 MB.

We have feedback of a size of ~20MB for 1600 nodes and not that many policies, so the 255MB limit can be a showstopper. We need to investigate how to make it less so, either by lifting the limit or by finding a more compact hash strategy. The latter would be good in all cases, because it's not smart to move around hundreds of megabytes of data for that.
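
The 255 MB figure can be reproduced with integer arithmetic, assuming (as stated in this comment) that the configured value is counted in bits. The per-node projection additionally assumes that the entry size grows linearly with the number of nodes, which is only a rough approximation. Variable names are illustrative.

```shell
# Largest value a signed 32-bit integer can hold.
int_max=2147483647

# Assumption taken from the comment: the value is expressed in bits.
max_bytes=$(( int_max / 8 ))
max_mb=$(( max_bytes / 1024 / 1024 ))
echo "upper bound: ${max_mb} MB"

# Rough projection: ~20 MB reported for 1600 nodes, assumed to scale linearly.
bytes_per_node=$(( 20 * 1024 * 1024 / 1600 ))
echo "approx. bytes per node: ${bytes_per_node}"
echo "approx. nodes at the upper bound: $(( max_bytes / bytes_per_node ))"
```

Under these assumptions the upper bound comes out at 255 MB, which is roughly an order of magnitude above the 1600-node installation. That is why a more compact storage strategy looks preferable to simply raising the limit.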

#5 Updated by François ARMAND 4 months ago

  • Status changed from New to In progress

#6 Updated by François ARMAND 4 months ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from François ARMAND to Vincent MEMBRÉ
  • Pull Request set to https://github.com/Normation/scala-ldap/pull/29

#7 Updated by François ARMAND 4 months ago

  • Status changed from Pending technical review to Pending release

#8 Updated by François ARMAND 4 months ago

  • Description updated (diff)

Removing the workaround as it does not work in Rudder 4.1/4.3. The correct fix is in the pull request referenced by this ticket.

#9 Updated by François ARMAND 4 months ago

Updated gist to reproduce: https://gist.github.com/fanf/83f0eab664e785fd2e2449178ec582aa

You can also do it in pure shell with ldapmodify:

# create a 1MB string
str=a
for i in $(seq 1 20); do str="$str$str"; done

# Update the "description" attribute on the "Nodes Configuration" entry. That attribute is retrieved even if it
# is not actually used, so modifying it won't interfere with the actual behavior of Rudder.
# Each iteration adds one more ~1MB value; the "$i-" prefix keeps the values distinct.
for i in $(seq 1 30); do
  ldapmodify -xc -H ldap://localhost:389 -D "cn=manager, cn=rudder-configuration" -w "xxxxxxxxxx" << EOF
dn: cn=Nodes Configuration,ou=Rudder,cn=rudder-configuration
changetype: modify
add: description
description: $i-$str
-
EOF
done

We now have a big entry. Go to the Rudder UI and try Status > "update policies". If it does not blow up almost immediately, the fix is OK.
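
To see why 30 values are enough to exceed the 20971520-byte default, the string-doubling loop can be checked on its own, without touching LDAP. This is only a sanity check of the sizes: `total` ignores the small `$i-` prefix and any LDAP protocol overhead, and the variable names `len` and `total` are illustrative.

```shell
# Same doubling loop as in the reproduction: 20 doublings of a 1-character string.
str=a
for i in $(seq 1 20); do str="$str$str"; done

# Length of one value in characters (plain ASCII, so also bytes).
len=${#str}
echo "one value: ${len} bytes"

# 30 such values are added to the entry.
total=$(( 30 * len ))
echo "30 values: ${total} bytes"

# Compare with the default client-side limit quoted in the description.
if [ "$total" -gt 20971520 ]; then
  echo "entry exceeds the default limit"
fi
```

Each value is 1048576 bytes, so about 21 values would already cross the default limit; using 30 leaves a comfortable margin.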

The description can be safely removed (no need to move 30MB along wires all day long):

ldapmodify -xc -H ldap://localhost:389  -D "cn=manager, cn=rudder-configuration" -w "xxxxxxxxxx" << EOF
dn: cn=Nodes Configuration,ou=Rudder,cn=rudder-configuration
changetype: modify
delete: description
-
EOF

#10 Updated by Vincent MEMBRÉ about 2 months ago

  • Status changed from Pending release to Released

EDIT: Not released, it was in the wrong branch.

#11 Updated by Vincent MEMBRÉ about 2 months ago

  • Tags set to Next minor release
  • Status changed from Released to Pending technical review
  • Target version changed from 4.1.15 to 4.1.16

This was not released. We were misled by the scala-ldap|3ca6e15833f9c5f4433a9a662cf97c383b2fb05c commit, which changed the status of this issue but was not made on a release branch (it was used to make a support/client release with the fix).

Reopening this issue and making a technical review

#12 Updated by François ARMAND about 2 months ago

  • Status changed from Pending technical review to Pending release

#13 Updated by Vincent MEMBRÉ about 1 month ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 4.1.16, 4.3.6 and 5.0.2, which were released today.