Network namespaces – part 1

Linux namespaces are a relatively new kernel feature which is essential for implementation of containers. A namespace wraps a global system resource into an abstraction which will be bound only to processes within the namespace, providing resource isolation. In this article I discuss network namespace and show a practical example.

What is namespace?

A namespace is a way of scoping a particular set of identifiers. Using a namespace, you can use the same identifier multiple times in different namespaces. You can also restrict an identifier set visible to particular processes.

For example, Linux provides namespaces for networking and processes, among other things. If a process is running within a process namespace, it can only see and communicate with other processes in the same namespace. So, if a shell in a particular process namespace ran ps waux, it would only show the other processes in the same namespace.

Linux network namespaces

In a network namespace, the scoped ‘identifiers’ are network devices; so a given network device, such as eth0, exists in a particular namespace. Linux starts up with a default network namespace, so if your operating system does not do anything special, that is where all the network devices will be located. But it is also possible to create further non-default namespaces, and create new devices in those namespaces, or to move an existing device from one namespace to another.

Each network namespace also has its own routing table, and in fact this is the main reason for namespaces to exist. A routing table is keyed by destination IP address, so network namespaces are what you need if you want the same destination IP address to mean different things at different times – which is something that OpenStack Networking requires for its feature of providing overlapping IP addresses in different virtual networks.

Each network namespace also has its own set of iptables (for both IPv4 and IPv6). So, you can apply different security to flows with the same IP addressing in different namespaces, as well as different routing.

Any given Linux process runs in a particular network namespace. By default this is inherited from its parent process, but a process with the right capabilities can switch itself into a different namespace; in practice this is mostly done using the ip netns exec NETNS COMMAND… invocation, which starts COMMAND running in the namespace named NETNS. Suppose such a process sends out a message to IP address A.B.C.D, the effect of the namespace is that A.B.C.D will be looked up in that namespace’s routing table, and that will determine the network device that the message is transmitted through.

Lets play with ip namespaces

By convention a named network namespace is an object at /var/run/netns/NAME that can be opened. The file descriptor resulting from opening /var/run/netns/NAME refers to the specified network namespace.

create a namespace

power up loopback device

open up a namespace shell

now we can use this shell like user shell where it uses ns1 namespace only

 

In part-2  , I will explain how to connect to internet from ns1 namespace and adding custom routes.

Speed up Ansible

Update to the latest version. Ansible 2.0 is slower than Ansible 1.9 because it included an important change to the execution engine to allow any user to choose the execution algorithm to be used. In the versions that followed, and mostly in 2.1, big optimizations have been done to increase execution speed, so be sure to be running the latest possible version.

Profiling Tasks

The best way I’ve found to time the execution of Ansible playbooks is by enabling the profile_tasks callback. This callback is included with Ansible and all you need to do to enable it is add callback_whitelist = profile_tasks to the [defaults] section of your ansible.cfg:
# ansible.cfg

 

Enable pipelining

You can enable pipelining by simply adding pipelining = True to the [ssh_connection]area of your ansible.cfg or by by using the ANSIBLE_PIPELINING and ANSIBLE_SSH_PIPELINING environment variables.
# ansible.cfg
You’ll also need to make sure that requiretty is disabled in /etc/sudoers on the remote host, or become won’t work with pipelining enabled.

Enable Mitogen for Ansible

Enabling Mitogen for Ansible is as simple as downloading and extracting the plugin, then adding 2 lines to the [defaults] section of your ansible.cfg:
# ansible.cfg

SSH multiplexing

The first thing to check is whether SSH multiplexing is enabled and used. This gives a tremendous speed boost because Ansible can reuse opened SSH sessions instead of negotiating new one (actually more than one) for every task. Ansible has this setting turned on by default. It can be set in configuration file as follows:

But be careful to override  ssh_args  — if you don’t set ControlMaster   and ControlPersist  while overriding, Ansible will “forget” to use them.

To check whether SSH multiplexing is used, start Ansible with  -vvvv  option:
ansible test -vvvv -m ping

UseDNS

UseDNS is an SSH-server setting (/etc/ssh/sshd_config file) which forces a server to check a client’s PTR-record upon connection. It may cause connection delays especially with slow DNS servers on the server side. In modern Linux distribution, this setting is turned off by default, which is correct.

PreferredAuthentications

It is an SSH-client setting which informs server about preferred authentication methods. By default Ansible uses:
-o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
So if GSSAPI Authentication is enabled on the server (at the time of writing this it is turned on in RHEL EC2 AMI) it will be tried as the first option, forcing the client and server to make PTR-record lookups. But in most cases, we want to use only public key auth. We can force Ansible to do so by changing ansible.cfg:

 

Facts Gathering

At the start of playbook execution, Ansible collects facts about remote system (this is default behaviour for ansible-playbook but not relevant to ansible ad-hoc commands). It is similar to calling “setup” module thus requires another ssh communication step. If you don’t need any facts in your playbook (e.g. our test playbook) you can disable fact gathering:

Fork

Until this moment we discussed how to speed up playbook execution on a given remote host. But if you run playbook against tens or hundreds of hosts, Ansible internal performance becomes a bottleneck. For example, there’s preconfigured number of forks – number of hosts that can be interacted simultaneously. You can change this value in  ansible.cfg file:

 

The default value is 5, which is quite conservative. You can experiment with this setting depending on your local CPU and network bandwidth resources.
Another thing about forks is that if you have a lot of servers to work with and a low number of available forks, your master ssh-sessions may expire between tasks. Ansible uses linear strategy by default, which executes one task for every host and then proceeds to the next task. This way if time between task execution on the first server and on the last one is greater than ControlPersist then master socket will expire by the time Ansible starts execution of the following task on the first server, thus new ssh connection will be required.

Poll Interval

When module is executed on remote host, Ansible starts to poll for its result. The lower is interval between poll attempts, the higher is CPU load on Ansible control host. But we want to have CPU available for greater forks number (see above). You can tweak poll interval in  ansible.cfg:

 

If you run “slow” jobs (like backups) on multiple hosts, you may want to increase the interval to 0.05   to use less CPU.
Hope this helps you to speed up your setup. Seems like there are no more items in environment check-list and further speed gains only possible by optimizing your playbook code.

Asynchronous Actions and Polling

By default tasks in playbooks block, meaning the connections stay open until the task is done on each node. This may not always be desirable, or you may be running operations that take longer than the SSH timeout.
To avoid blocking or timeout issues, you can use asynchronous mode to run all of your tasks at once and then poll until they are done.
The behaviour of asynchronous mode depends on the value of poll.

Avoid connection timeouts: poll > 0

When poll is a positive value, the playbook will still block on the task until it either completes, fails or times out.
In this case, however, async explicitly sets the timeout you wish to apply to this task rather than being limited by the connection method timeout.
To launch a task asynchronously, specify its maximum runtime and how frequently you would like to poll for status. The default poll value is 15 seconds if you do not specify a value for poll:

 

Concurrent tasks: poll = 0

When poll is 0, Ansible will start the task and immediately move on to the next one without waiting for a result.
From the point of view of sequencing this is asynchronous programming: tasks may now run concurrently.
The playbook run will end without checking back on async tasks.
The async tasks will run until they either complete, fail or timeout according to their async value.
If you need a synchronization point with a task, register it to obtain its job ID and use the async_status module to observe it.
You may run a task asynchronously by specifying a poll value of 0:

 

Enable fact_caching

By enabling this value we’re telling Ansible to keep the facts it gathers in a local file. You can also set this to a redis cache. See the documentation for details.
Fact_caching is what happens when Ansible says, “Gathering facts” about your target hosts. If we don’t change our targets hardware (or virtual hardware) very often this can be very helpful. Enable it by adding this to your ansible.cfg file:
Enable facts caching mechanism
If you still need some of the facts groups, but at the same time the gathering process is still slow for you, you could try use fact caching.
Caching enables Ansible to cache the facts for a given host in some kind of backend.
Currently the caching plugin supports the following cache backend:

  •  
More information on the caching plugin, could be found here:
This is an example configuration of facts caching in json files

References:

1.https://dzone.com/articles/speed-up-ansible

2.https://habr.com/en/post/453446/

3.https://www.toptechskills.com/ansible-tutorials-courses/speed-up-ansible-playbooks-pipelining-mitogen/

4.https://www.youtube.com/watch?v=NZUYAbGs-ec

How to use rsync with ssh

Rsync is a fast and extraordinarily versatile file copying tool. It can copy locally, to/from another host over any remote shell, or to/from a remote rsync daemon. It offers a large number of options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.

Rsync finds files that need to be transferred using a lqquick checkrq algorithm (by default) that looks for files that have changed in size or in last-modified time. Any changes in the other preserved attributes (as requested by options) are made on the destination file directly when the quick check indicates that the file’s data does not need to be updated.

While tar over ssh is ideal for making remote copies of parts of a filesystem, rsync is even better suited for keeping the filesystem in sync between two machines. Typically, tar is used for the initial copy, and rsync is used to pick up whatever has changed since the last copy. This is because tar tends to be faster than rsync when none of the destination files exist, but rsync is much faster than tar when there are only a few differences between the two filesystems.
To run an rsync over ssh, pass it the -e switch, like this:
Notice the trailing / on the file spec from the source side  On the source specification, a trailing / tells rsync to copy the contents of the directory, but not the directory itself. To include the directory as the top level of whatever is being copied, leave off the /:
By default, rsync will only copy files and directories, but not remove them from the destination copy when they are removed from the source. To keep the copies exact, include the — delete flag:
If you run a command like this in cron, leave off the v switch. This will keep the output quiet (unless rsync has a problem running, in which case you’ll receive an email with the error output).
Using ssh as your transport for rsync traffic has the advantage of encrypting the data over the network and also takes advantage of any trust relationships you already have established using ssh client keys. For keeping large, complex directory structures in sync between two machines (especially when there are only a few differences between them), rsync is a very handy (and fast) tool to have at your disposal.

5 Ways to Speed Up SSH Connections in Linux

SSH is the most popular and secure method for managing Linux servers remotely. One of the challenges with remote server management is connection speeds, especially when it comes to session creation between the remote and local machines.

There are several bottlenecks to this process, one scenario is when you are connecting to a remote server for the first time; it normally takes a few seconds to establish a session. However, when you try to start multiple connections in succession, this causes an overhead (combination of excess or indirect computation time, memory, bandwidth, or other related resources to carry out the operation).

In this article, we will share four useful tips on how to speed up remote SSH connections in Linux.

1.Use Compression option in SSH

From the ssh man page (type man ssh to see the whole thing):

 

2.Force SSH Connection Over IPV4

OpenSSH supports both IPv4/IP6, but at times IPv6 connections tend to be slower. So you can consider forcing ssh connections over IPv4 only, using the syntax below:

Alternatively, use the AddressFamily (specifies the address family to use when connecting) directive in your ssh configuration file  (global configuration) or ~/.ssh/config (user specific file).

The accepted values are “any”, “inet” for IPv4 only, or “inet6”.

AddressFamily inet

3. Reuse SSH Connection

An ssh client program is used to establish connections to an sshd daemon accepting remote connections. You can reuse an already-established connection when creating a new ssh session and this can significantly speed up subsequent sessions.

You can enable this in your ~/.ssh/config file.

ControlMaster auto
ControlPath /home/akhil/.ssh/sockets/ssh_mux_%x_%p_%r
ControlPersist yes

openssh doesn’t support %x(ip address in control paths),  use my repo instead

https://github.com/akhilin/openssh-portable.git

or use %h to use hostname instead of ip address

using ip address is recommended so that even if you connect using different hostnames it uses same socket ( very useful when using ansible , pdsh )

4. Use Specific SSH Authentication Method

Another way of speeding up ssh connections is to use a given authentication method for all ssh connections, and here we recommend configuring ssh passwordless login using ssh keygen in 5 easy steps.

Once that is done, use the PreferredAuthentications directive, within ssh_config files (global or user specific) above. This directive defines the order in which the client should try authentication methods (you can specify a command separated list to use more than one method).

PreferredAuthentications=publickey

If you prefer password authentication which is deemed unsecure, use this.

5.Disable DNS Lookup On Remote Machine

By default, sshd daemon looks up the remote host name, and also checks that the resolved host name for the remote IP address maps back to the very same IP address. This can result into delays in connection establishment or session creation.

The UseDNS directive controls the above functionality; to disable it, search and uncomment it in the /etc/ssh/sshd_config file. If it’s not set, add it with the value no.

UseDNS=no