Bug #4241

closed

CFEngine refuses to start after upgrade from 2.7.5 to 2.8.1 due to a CFEngine buffer overflow

Added by Daniel Stan about 11 years ago. Updated almost 10 years ago.

Status: Released
Priority: 1 (highest)
Assignee: Jonathan CLARKE
Category: Web - Config management

Description

Hello

We upgraded the Rudder server following your tutorial, and the agent now refuses to start with this error:

/etc/init.d/rudder-agent restart
rudder-agent[6278]: [INFO] Using /etc/default/rudder-agent for configuration
rudder-agent[6281]: [INFO] Using /var/rudder/cfengine-community for CFEngine workdir
rudder-agent[6282]: [INFO] Halting CFEngine Community cf-serverd...
rudder-agent[6283]: [INFO] can't read PID file, not stopping cf-serverd
rudder-agent[6284]: [INFO] Halting CFEngine Community cf-execd...
rudder-agent[6285]: [INFO] can't read PID file, not stopping cf-execd
rudder-agent[6286]: [INFO] Launching CFEngine Community cf-serverd...
input buffer overflow, can't enlarge buffer because scanner uses REJECT
2013-12-10T19:15:11+0000    error: Policy failed validation with command '"/var/rudder/cfengine-community/bin/cf-promises" -c "/var/rudder/cfengine-community/inputs/promises.cf"'

If I manually run the command I get this output:

CT-10112-bash-4.1# "/var/rudder/cfengine-community/bin/cf-promises" -c /var/rudder/cfengine-community/inputs/promises.cf -v
2013-12-10T19:16:37+0000  verbose: Work directory is /var/rudder/cfengine-community
2013-12-10T19:16:37+0000  verbose: Looking for a source of entropy in '/var/rudder/cfengine-community/randseed'
....
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'fusionAgent'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'listInstalledVM'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'generateExtraInformations'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'turnUsersToUnicode'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'addInformationsToInventory'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'moveInventoryToFinalDestination'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'sendInventory'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'add_information_to_inventory'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'add_users_information_to_inventory'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'xmlify'
2013-12-10T19:16:37+0000  verbose: Resolving variables in bundle 'cleanForceInventoryFlagFile'
2013-12-10T19:16:37+0000  verbose: Parsing file '/var/rudder/cfengine-community/inputs/common/1.0/cf-served.cf'
input buffer overflow, can't enlarge buffer because scanner uses REJECT

It seems to fail while loading the /var/rudder/cfengine-community/inputs/common/1.0/cf-served.cf promise file. Running the same command under strace suggests that it fails while reading the ACL list shown here:

 !policy_server::\n      \"acl\" slist => {\n      \"${def.policy_server}\"\n    };\n}\n\n\nbody server control\n{\n        trustkeysfrom     => {\n          \"127.0.0.0/8\" , \"::1\",\n          @{def.acl} ,\n           host2ip(\"hostname1\"), \"hostname1\",  
....
 host2ip(\"hostname256\"), \"hostname256\",  host2i"  host2i", 4096) = 4096
write(2, "input buffer overflow, can't enlarge buffer because scanner uses REJECT\n", 72input buffer overflow, can't enlarge buffer because scanner uses REJECT
) = 72
exit_group(2)                           = ?

It loads around 256 of the hosts and then runs out of memory for that buffer.
We run Rudder with a large number of agents (over 500). Before the upgrade, Rudder ran fine with the same number of hosts, so I think the problem is caused by the newer CFEngine version, which adds that extra check.
The real hostnames have been replaced. Can you please advise whether there is an OS limit that can be increased to get this started, or whether this is a bug that needs to be fixed in CFEngine?
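
A quick way to see whether a generated policy file contains an unusually long line is to measure its longest line. The sketch below is only a diagnostic aid: the helper name `longest_line` is hypothetical, and the idea that line length matters here is an assumption based on the later comments in this thread, not something the strace output proves.

```python
from typing import Tuple


def longest_line(path: str) -> Tuple[int, int]:
    """Return (line_number, length_in_bytes) of the longest line in a file.

    Line endings are not counted. Returns (0, 0) for an empty file.
    """
    longest_no, longest_len = 0, 0
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            length = len(raw.rstrip(b"\r\n"))
            if length > longest_len:
                longest_no, longest_len = number, length
    return longest_no, longest_len
```

For example, calling `longest_line("/var/rudder/cfengine-community/inputs/common/1.0/cf-served.cf")` on the server would report which line of the file named in the log is the longest and how many bytes it holds.
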
Actions #1

Updated by Nicolas CHARLES about 11 years ago

  • Status changed from New to 8

I'm investigating the root cause and how to fix it.

Actions #2

Updated by Nicolas CHARLES about 11 years ago

Actually, I'm struggling to reproduce the issue.
I tried with a list of 659 hostnames, using the following hostname pattern:
veryveryveryveryveryveryveryverylonghostnamethatislong.some.domain.local[id] (so about 76 characters on average)

This converts to a list of 49 kB for the hostnames and 9 kB for the IPs (in terms of string size),
and I'm not experiencing any cf-serverd issues.

In my setup, the hostname/IP mapping is stored in /etc/hosts.

To help me reproduce the issue, could you tell me which OS and architecture you are using?

Thank you

Actions #3

Updated by Nicolas CHARLES about 11 years ago

Oh, I could reproduce it. The issue actually comes from the length of the line: if it is more than 16 KB long, it fails.
We simply need to insert line breaks (carriage returns) into the list.
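
To illustrate the difference, here is a minimal sketch that renders the same host list once on a single line and once with one entry per line, then compares the longest line with the 16 KB figure mentioned above. This is not the Rudder template itself. The function names, the `hostnameN` pattern (borrowed from the anonymised strace output in the description) and the exact value of `LINE_LIMIT` are assumptions for illustration only.

```python
from typing import Iterable, List

# Threshold as reported in this comment; treated as approximate,
# not checked against the CFEngine source.
LINE_LIMIT = 16 * 1024


def host_entries(hosts: Iterable[str]) -> List[str]:
    """Build one 'host2ip("name"), "name"' entry per host."""
    return [f'host2ip("{host}"), "{host}"' for host in hosts]


def render_one_line(hosts: Iterable[str]) -> str:
    """Render the whole list on a single line."""
    return "{ " + ", ".join(host_entries(hosts)) + " };"


def render_one_per_line(hosts: Iterable[str]) -> str:
    """Render the same list with a line break after each entry."""
    body = ",\n".join("    " + entry for entry in host_entries(hosts))
    return "{\n" + body + "\n};"


def max_line_length(text: str) -> int:
    """Length in bytes of the longest line in the rendered text."""
    return max(len(line.encode("utf-8")) for line in text.splitlines())


# Hypothetical host names, 500 of them to match "over 500" agents.
hosts = [f"hostname{i}" for i in range(1, 501)]
single = render_one_line(hosts)
multi = render_one_per_line(hosts)

print("single line exceeds limit:", max_line_length(single) > LINE_LIMIT)
print("one per line exceeds limit:", max_line_length(multi) > LINE_LIMIT)
```

Both renderings contain the same entries in the same order; only the whitespace differs, so the policy keeps its meaning while no individual line grows with the number of hosts.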

Actions #4

Updated by Nicolas CHARLES about 11 years ago

  • Status changed from 8 to In progress
  • Assignee set to Nicolas CHARLES
  • Priority changed from N/A to 1 (highest)
  • Target version set to 2.8.2

The issue occurs only with CFEngine 3.5.

Actions #5

Updated by Nicolas CHARLES about 11 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Nicolas CHARLES to Jonathan CLARKE
  • Pull Request set to https://github.com/Normation/rudder-techniques/pull/252
Actions #6

Updated by Daniel Stan about 11 years ago

Thank you for getting back to us on this matter.
I tried to split those lines into smaller batches using CR, but it still didn't start. I see you are doing it differently and put a CR after each host name. Can you tell us how we can apply the workaround manually until the fix is included in the RPM? Is it a matter of replacing techniques/system/common/1.0/cf-served.st with the one from your repo, or do you recommend waiting until you add it to the RPM? At the moment our server is down and can't regenerate the promises for each node.

Actions #7

Updated by Nicolas CHARLES about 11 years ago

Hi Daniel

You can simply copy cf-served.st from the pull request ( https://raw.github.com/ncharles/rudder-techniques/30b073c03fe46c872fc46133571e93c480b241ee/techniques/system/common/1.0/cf-served.st ) over your version in /var/rudder/configuration-repository/techniques/system/common/1.0
Then run the following commands in the /var/rudder/configuration-repository/techniques directory:

git add system/common/1.0
git commit -m "Correcting cf-serverd according to issue #4241"

Then, in the web interface, go to the Administration/Settings page and click the "Update Techniques now" button.
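
For anyone who prefers to script the two git steps, they could be wrapped as in the sketch below. It assumes the replacement cf-served.st has already been copied into place. The function names and the `dry_run` flag are hypothetical conveniences, and the final "Update Techniques now" click in the web interface remains a manual step.

```python
import subprocess
from typing import List

# Directory named in the instructions above.
TECHNIQUES_DIR = "/var/rudder/configuration-repository/techniques"


def commit_commands(message: str) -> List[List[str]]:
    """Return the two git commands from the instructions, in order."""
    return [
        ["git", "add", "system/common/1.0"],
        ["git", "commit", "-m", message],
    ]


def apply_workaround(message: str, cwd: str = TECHNIQUES_DIR,
                     dry_run: bool = True) -> None:
    """Print the commands, or run them in `cwd` when dry_run is False."""
    for command in commit_commands(message):
        if dry_run:
            print("would run:", " ".join(command))
        else:
            # check=True stops at the first failing command.
            subprocess.run(command, cwd=cwd, check=True)
```

By default nothing is executed: `apply_workaround("my commit message")` only prints the commands, and passing `dry_run=False` runs them in the techniques directory.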

Actions #8

Updated by Daniel Stan about 11 years ago

Hello, it seems that you fixed only half of the problem: the ACLs. There is another big list in this section:


#######################################################

body runagent control
{
        hosts => {

Can you please fix this one too?

Actions #9

Updated by Nicolas CHARLES about 11 years ago

Oh, sorry about this.
I updated the pull request and the file, which you can download here:
https://raw.github.com/ncharles/rudder-techniques/f44201e64e634bbd13e39eb96a047ecdcf450636/techniques/system/common/1.0/cf-served.st

Actions #10

Updated by Nicolas CHARLES about 11 years ago

  • Status changed from Pending technical review to Pending release
  • % Done changed from 0 to 100

Applied in changeset policy-templates:commit:f44201e64e634bbd13e39eb96a047ecdcf450636.

Actions #11

Updated by Jonathan CLARKE about 11 years ago

Applied in changeset policy-templates:commit:698ea0e0ab275987f28a40d3ceddfad8597be89a.

Actions #12

Updated by Vincent MEMBRÉ about 11 years ago

  • Subject changed from cf-engine refuses to start after upgrade from 2.7.5 to 2.8.1 with this error "input buffer overflow, can't enlarge buffer because scanner uses REJECT" to CFEngine refuses to start after upgrade from 2.7.5 to 2.8.1 with this error "input buffer overflow, can't enlarge buffer because scanner uses REJECT"
Actions #13

Updated by Vincent MEMBRÉ about 11 years ago

  • Subject changed from CFEngine refuses to start after upgrade from 2.7.5 to 2.8.1 with this error "input buffer overflow, can't enlarge buffer because scanner uses REJECT" to CFEngine refuses to start after upgrade from 2.7.5 to 2.8.1 due to a CFEngine buffer overflow
Actions #14

Updated by Vincent MEMBRÉ about 11 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 2.8.2, which was released today.
Check out:

Actions #15

Updated by Benoît PECCATTE almost 10 years ago

  • Category changed from 14 to Web - Config management