Saltstack [SOLVED]: Salt masters behind ELB have flaky connection to minions

  • #183718

    Cloudy Point
    Keymaster

    Question

    I am running the following setup at AWS:

    • An Elastic Load Balancer in front of two EC2 machines (Amazon Linux), each running the salt-master in a Docker container
    • Two EC2 instances with salt-minions installed

    • The ‘master’ value in the minion config is set to the DNS name of the load balancer (SaltMaster-env-vpc-test.szfegmankg.us-east-1.elasticbeanstalk.com)

    • The ELB accepts all traffic from the minions

    • The Salt-masters accept all traffic from the ELB as well as from the minions

    • The Salt-masters PKI Folder is shared between the two masters

    • The Salt-masters have the same private+public keys

    • The Salt-masters run on 2017.7.1

    • The Salt-minions run on 2016.11.5 (I tried it with 2017.7.1, but got the same results)

    • The Salt-minions accept all traffic from the ELB as well as from the masters

    • The master config looks as follows:

      open_mode: True 
      worker_threads: 20
      auto_accept: True 
      log_level: error 
      log_level_logfile: debug 
      extension_modules: srv/salt/ext 
      rest_cherrypy:
        port: 8000
        disable_ssl: True
        debug: True
      external_auth:   
        pam:
          saltdev:
            - .*
            - '@runner'
      # Setting the job_cache to redis.
      # The redis config settings are generated at the start of the docker container and
      # will be written into /etc/salt/master.d/redis.conf 
      master_job_cache: redis 
      cache: redis 
      pki_dir: /etc/salt/pki/master/efs
      
    • The minion config looks as follows:

      id: WIN-AB3GO7BJ72I
      log_file: C:salt.log
      multiprocessing: False
      log_level_logfile: debug
      pki_dir: /conf/pki/minion
      master: SaltMaster-env-vpc-test.szfegmankg.us-east-1.elasticbeanstalk.com
      master_type: str
      master_alive_interval: 30
      open_mode: True
      root_dir: c:salt
      ipc_mode: tcp
      recon_default: 1000
      recon_max: 199000
      recon_randomize: True
      
    • In the master log files, I can see on both masters:

      2017-09-05 10:06:18,118 [salt.utils.verify][DEBUG   ][35] This salt-master instance has accepted 2 minion keys.
      
    • Running salt-key -L on both masters yields the same result:

      Accepted Keys:
      WIN-AB3GO7BJ72I
      WIN-EDMP9VB716B
      Denied Keys:
      Unaccepted Keys:
      Rejected Keys:
      

    So it looks like everything is fine and should work. However, test.ping is extremely flaky: sometimes it works, but most of the time it doesn't.
    Most of the time neither master gets any return from the minion, and the minion log shows that the minion never receives the message to execute ‘test.ping’ from the master.
    Example 1:
    test.ping from Master1:

    root@d7383ff8f8bf:/# salt 'WIN-EDMP9VB716B' test.ping
    [ERROR   ] Exception raised when processing __virtual__ function for salt.loaded.int.cache.consul. Module will not be loaded: 'module' object has no attribute 'Consul'
    [ERROR   ] An un-handled exception was caught by salt's global exception handler:
    KeyError: 'redis.ls'
    Traceback (most recent call last):
      File "/usr/bin/salt", line 10, in <module>
        salt_main()
      File "/usr/lib/python2.7/dist-packages/salt/scripts.py", line 476, in salt_main
        client.run()
      File "/usr/lib/python2.7/dist-packages/salt/cli/salt.py", line 173, in run
        for full_ret in cmd_func(**kwargs):
      File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 805, in cmd_cli
        **kwargs):
      File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 1597, in get_cli_event_returns
        connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
      File "/usr/lib/python2.7/dist-packages/salt/utils/minions.py", line 577, in connected_ids
        search = self.cache.ls('minions')
      File "/usr/lib/python2.7/dist-packages/salt/cache/__init__.py", line 244, in ls
        return self.modules[fun](bank, **self._kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1113, in __getitem__
        func = super(LazyLoader, self).__getitem__(item)
      File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 101, in __getitem__
        raise KeyError(key)
    KeyError: 'redis.ls'
    Traceback (most recent call last):
      File "/usr/bin/salt", line 10, in <module>
        salt_main()
      File "/usr/lib/python2.7/dist-packages/salt/scripts.py", line 476, in salt_main
        client.run()
      File "/usr/lib/python2.7/dist-packages/salt/cli/salt.py", line 173, in run
        for full_ret in cmd_func(**kwargs):
      File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 805, in cmd_cli
        **kwargs):
      File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 1597, in get_cli_event_returns
        connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
      File "/usr/lib/python2.7/dist-packages/salt/utils/minions.py", line 577, in connected_ids
        search = self.cache.ls('minions')
      File "/usr/lib/python2.7/dist-packages/salt/cache/__init__.py", line 244, in ls
        return self.modules[fun](bank, **self._kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1113, in __getitem__
        func = super(LazyLoader, self).__getitem__(item)
      File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 101, in __getitem__
        raise KeyError(key)
    KeyError: 'redis.ls'
    

    I am aware that the redis error will be fixed soon: https://github.com/saltstack/salt/issues/43295

    Example 2:
    test.ping from Master1, about 1 minute after Example 1:

    root@d7383ff8f8bf:/# salt 'WIN-EDMP9VB716B' test.ping
    WIN-EDMP9VB716B:
        True
    

    Also during my tests, a test.ping from Master2 never succeeded.

    I would like to know whether there is some flaw in my setup that I am not seeing, or whether Salt only works with HAProxy as the load balancer.

    Or maybe Salt doesn’t work at all behind an ELB?

    #183719

    Cloudy Point
    Keymaster

    Accepted Answer

    See https://github.com/saltstack/salt/issues/43368 for more answers.

    TL;DR
    Because there is no session stickiness for TCP connections, it is currently not possible to work with a Salt master that is behind an ELB if you use the ELB’s IP/name as the entry point.
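
    As a hedged illustration of the alternative the TL;DR implies, the minions can be pointed at the masters directly rather than at the ELB name. A minion keeps separate long-lived TCP connections to a master's publish port (4505 by default) and return port (4506 by default), and a load balancer without stickiness may route them to different masters. The sketch below uses Salt's multi-master minion settings; the two hostnames are hypothetical placeholders, not values from the setup above, and the rest of the minion config is assumed to stay unchanged.

    ```yaml
    # Minion config sketch (assumption: each master is reachable directly
    # from the minions, bypassing the ELB).
    # The hostnames are hypothetical placeholders.
    master:
      - salt-master-1.example.internal
      - salt-master-2.example.internal

    # 'str' is the default; with a list of masters the minion connects to
    # every master in the list, so a job published from either one reaches it.
    master_type: str

    # Alternative: stay connected to one master at a time and switch only
    # when it stops responding.
    # master_type: failover
    # master_alive_interval: 30
    ```

    Multi-master operation expects the masters to share the same master key pair, which the setup described in the question already does.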

    Source: https://stackoverflow.com/questions/46052796/salt-masters-behind-elb-have-flaky-connection-to-minions
    Author: Florian
    Creative Commons License
    This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
