Network namespaces – part 1

Linux namespaces are a relatively new kernel feature which is essential for implementation of containers. A namespace wraps a global system resource into an abstraction which will be bound only to processes within the namespace, providing resource isolation. In this article I discuss network namespace and show a practical example.

What is namespace?

A namespace is a way of scoping a particular set of identifiers. Using a namespace, you can use the same identifier multiple times in different namespaces. You can also restrict an identifier set visible to particular processes.

For example, Linux provides namespaces for networking and processes, among other things. If a process is running within a process namespace, it can only see and communicate with other processes in the same namespace. So, if a shell in a particular process namespace ran ps waux, it would only show the other processes in the same namespace.

Linux network namespaces

In a network namespace, the scoped ‘identifiers’ are network devices; so a given network device, such as eth0, exists in a particular namespace. Linux starts up with a default network namespace, so if your operating system does not do anything special, that is where all the network devices will be located. But it is also possible to create further non-default namespaces, and create new devices in those namespaces, or to move an existing device from one namespace to another.

Each network namespace also has its own routing table, and in fact this is the main reason for namespaces to exist. A routing table is keyed by destination IP address, so network namespaces are what you need if you want the same destination IP address to mean different things at different times – which is something that OpenStack Networking requires for its feature of providing overlapping IP addresses in different virtual networks.

Each network namespace also has its own set of iptables (for both IPv4 and IPv6). So, you can apply different security to flows with the same IP addressing in different namespaces, as well as different routing.

Any given Linux process runs in a particular network namespace. By default this is inherited from its parent process, but a process with the right capabilities can switch itself into a different namespace; in practice this is mostly done using the ip netns exec NETNS COMMAND… invocation, which starts COMMAND running in the namespace named NETNS. Suppose such a process sends out a message to IP address A.B.C.D, the effect of the namespace is that A.B.C.D will be looked up in that namespace’s routing table, and that will determine the network device that the message is transmitted through.

Lets play with ip namespaces

By convention a named network namespace is an object at /var/run/netns/NAME that can be opened. The file descriptor resulting from opening /var/run/netns/NAME refers to the specified network namespace.

create a namespace

power up loopback device

open up a namespace shell

now we can use this shell like user shell where it uses ns1 namespace only

 

In part-2  , I will explain how to connect to internet from ns1 namespace and adding custom routes.

Speed up Ansible

Update to the latest version. Ansible 2.0 is slower than Ansible 1.9 because it included an important change to the execution engine to allow any user to choose the execution algorithm to be used. In the versions that followed, and mostly in 2.1, big optimizations have been done to increase execution speed, so be sure to be running the latest possible version.

Profiling Tasks

The best way I’ve found to time the execution of Ansible playbooks is by enabling the profile_tasks callback. This callback is included with Ansible and all you need to do to enable it is add callback_whitelist = profile_tasks to the [defaults] section of your ansible.cfg:
# ansible.cfg

 

Enable pipelining

You can enable pipelining by simply adding pipelining = True to the [ssh_connection]area of your ansible.cfg or by by using the ANSIBLE_PIPELINING and ANSIBLE_SSH_PIPELINING environment variables.
# ansible.cfg
You’ll also need to make sure that requiretty is disabled in /etc/sudoers on the remote host, or become won’t work with pipelining enabled.

Enable Mitogen for Ansible

Enabling Mitogen for Ansible is as simple as downloading and extracting the plugin, then adding 2 lines to the [defaults] section of your ansible.cfg:
# ansible.cfg

SSH multiplexing

The first thing to check is whether SSH multiplexing is enabled and used. This gives a tremendous speed boost because Ansible can reuse opened SSH sessions instead of negotiating new one (actually more than one) for every task. Ansible has this setting turned on by default. It can be set in configuration file as follows:

But be careful to override  ssh_args  — if you don’t set ControlMaster   and ControlPersist  while overriding, Ansible will “forget” to use them.

To check whether SSH multiplexing is used, start Ansible with  -vvvv  option:
ansible test -vvvv -m ping

UseDNS

UseDNS is an SSH-server setting (/etc/ssh/sshd_config file) which forces a server to check a client’s PTR-record upon connection. It may cause connection delays especially with slow DNS servers on the server side. In modern Linux distribution, this setting is turned off by default, which is correct.

PreferredAuthentications

It is an SSH-client setting which informs server about preferred authentication methods. By default Ansible uses:
-o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
So if GSSAPI Authentication is enabled on the server (at the time of writing this it is turned on in RHEL EC2 AMI) it will be tried as the first option, forcing the client and server to make PTR-record lookups. But in most cases, we want to use only public key auth. We can force Ansible to do so by changing ansible.cfg:

 

Facts Gathering

At the start of playbook execution, Ansible collects facts about remote system (this is default behaviour for ansible-playbook but not relevant to ansible ad-hoc commands). It is similar to calling “setup” module thus requires another ssh communication step. If you don’t need any facts in your playbook (e.g. our test playbook) you can disable fact gathering:

Fork

Until this moment we discussed how to speed up playbook execution on a given remote host. But if you run playbook against tens or hundreds of hosts, Ansible internal performance becomes a bottleneck. For example, there’s preconfigured number of forks – number of hosts that can be interacted simultaneously. You can change this value in  ansible.cfg file:

 

The default value is 5, which is quite conservative. You can experiment with this setting depending on your local CPU and network bandwidth resources.
Another thing about forks is that if you have a lot of servers to work with and a low number of available forks, your master ssh-sessions may expire between tasks. Ansible uses linear strategy by default, which executes one task for every host and then proceeds to the next task. This way if time between task execution on the first server and on the last one is greater than ControlPersist then master socket will expire by the time Ansible starts execution of the following task on the first server, thus new ssh connection will be required.

Poll Interval

When module is executed on remote host, Ansible starts to poll for its result. The lower is interval between poll attempts, the higher is CPU load on Ansible control host. But we want to have CPU available for greater forks number (see above). You can tweak poll interval in  ansible.cfg:

 

If you run “slow” jobs (like backups) on multiple hosts, you may want to increase the interval to 0.05   to use less CPU.
Hope this helps you to speed up your setup. Seems like there are no more items in environment check-list and further speed gains only possible by optimizing your playbook code.

Asynchronous Actions and Polling

By default tasks in playbooks block, meaning the connections stay open until the task is done on each node. This may not always be desirable, or you may be running operations that take longer than the SSH timeout.
To avoid blocking or timeout issues, you can use asynchronous mode to run all of your tasks at once and then poll until they are done.
The behaviour of asynchronous mode depends on the value of poll.

Avoid connection timeouts: poll > 0

When poll is a positive value, the playbook will still block on the task until it either completes, fails or times out.
In this case, however, async explicitly sets the timeout you wish to apply to this task rather than being limited by the connection method timeout.
To launch a task asynchronously, specify its maximum runtime and how frequently you would like to poll for status. The default poll value is 15 seconds if you do not specify a value for poll:

 

Concurrent tasks: poll = 0

When poll is 0, Ansible will start the task and immediately move on to the next one without waiting for a result.
From the point of view of sequencing this is asynchronous programming: tasks may now run concurrently.
The playbook run will end without checking back on async tasks.
The async tasks will run until they either complete, fail or timeout according to their async value.
If you need a synchronization point with a task, register it to obtain its job ID and use the async_status module to observe it.
You may run a task asynchronously by specifying a poll value of 0:

 

Enable fact_caching

By enabling this value we’re telling Ansible to keep the facts it gathers in a local file. You can also set this to a redis cache. See the documentation for details.
Fact_caching is what happens when Ansible says, “Gathering facts” about your target hosts. If we don’t change our targets hardware (or virtual hardware) very often this can be very helpful. Enable it by adding this to your ansible.cfg file:
Enable facts caching mechanism
If you still need some of the facts groups, but at the same time the gathering process is still slow for you, you could try use fact caching.
Caching enables Ansible to cache the facts for a given host in some kind of backend.
Currently the caching plugin supports the following cache backend:

  •  
More information on the caching plugin, could be found here:
This is an example configuration of facts caching in json files

References:

1.https://dzone.com/articles/speed-up-ansible

2.https://habr.com/en/post/453446/

3.https://www.toptechskills.com/ansible-tutorials-courses/speed-up-ansible-playbooks-pipelining-mitogen/

4.https://www.youtube.com/watch?v=NZUYAbGs-ec

How to configure django app using gunicorn?

Django

Django is a python web framework used for developing web applications. It is fast, secure and scalable. Let us see how to configure the Django app using gunicorn.

Before proceeding to actual configuration, let us see some intro on the Gunicorn.

Gunicorn

Gunicorn (Green Unicorn) is a WSGI (Web Server Gateway Interface) server implementation commonly used to run python web applications and implements PEP 3333 server standard specifications, therefore, it can run web applications that implement application interface. Web applications written in Django, Flask or Bottle implements application interface.

Installation

Gunicorn coupled with Nginx or any web server works as a bridge between the web server and web framework. Web server (Nginx or Apache) can be used to serve static files and Gunicorn to handle requests to and responses from Django application. I will try to write another blog in detail on how to set up a django application with Nginx and Gunicorn.

Prerequisites

Please make sure you have below packages installed in your system and a basic understanding of Python, Django and Gunicorn are recommended.

  • Python > 3.5
  • Gunicorn > 15.0
  • Django > 1.11

Configure Django App Using Gunicorn

There are different ways to configure the Gunicron, I am going to demonstrate more on running the Django app using the gunicorn configuration file.

First, let us start by creating the Django project, you can do so as follows.

After starting the Django project, the directory structure looks like this.

The simplest way to run your django app using gunicorn is by using the following command, you must run this command from your manage.py folder.

This will run your Django project on 8000 port locally.

Configuration

Now let’s see, how to configure the django app using gunicorn configuration file. A simple Gunicorn configuration with worker class sync will look like this.

Let us see a few important details in the above configuration file.

  1. Append the base directory path in your systems path.
  2. You can bind the application to a socket using bind.
  3. backlog Maximum number of pending connections.
  4. workers number of workers to handle requests. This is based on your machine’s CPU count. This can be varied based on your application workload.
  5. worker_class, there are different types of classes, you can refer here for different types of classes. sync is the default and should handle normal types of loads.

You can refer more about the available Gunicorn settings here.

Running Django with gunicorn as a daemon PROCESS

Here is the sample systemd file,

After adding the file to the location /etc/systemd/system/. To reload new changes in file execute the following command.

NOTE: MAKE SURE TO INSTALL REQUIRED PACKAGES, GUNICORN FAILS TO START IF THERE ARE ANY MISSING PACKAGES. YOU CAN REFER TO MORE INFO IN ERROR LOGFILE MENTIONED IN CONFIGURATION FILE.

Start, Stop and Status of Application using systemctl

Now you can simply execute the following commands for your application.

To start your application

To stop your application.

To check the status of your application.

Please refer to a short complete video tutorial to configure the Django app below.

Setup Xamarin Environment on Mac & Visual Studio

Below I have explained how to setup Xamarin environment on mac operating system step by step.

1. Download Visual studio : 

      Download Visual Studio with below link

     https://visualstudio.microsoft.com/downloads/

      

At Microsoft website, you will have three options of  Visual Studio edition to choose from. Choose one according to your need. To download Visual Studio just click on download button and an installer .dmg file will be downloaded.

2. Install Visual Studio:

   Click on downloaded dmg file and below screen will be presented

    

Select from the different Platforms you need to develop apps for on Xamarin and press the Install button. Once Visual Studio installation is complete, we need to setup environment for both Android and Apple.

3. Setup Android SDK:

    To setup Android SDK open Visual Studio and go to :-

    Tools -> SDK Manager ->Android -> Locations

    

Set path for SDK ,NDK and JDK to your local machine locations. Once correct  path is given a green tick will appear on right.This completes our Android SDK   setup.

4. Apple Setup (for both iOS and Mac apps development):

You need latest Xcode to setup Apple environment. If you have Xcode preinstalled on your machine then it automatically configures and we don’t  have to do anything. If you are installing Xcode after installation of Visual  Studio then follow below steps to setup.

a. Download latest Xcode from apple store and Install it on your machine.

b. Go to Tool -> SDK Manager -> Apple

 

Give path to your Xcode.app . You will see green check mark once the correct path is given. This completes Apple environment setup.

That is all.  Now you can start your Android and iOS development on Xamarin. Happy Coding!

How to create Gridview using Recylerview Android

First let’s understand what Gridview and Recylerview are, in Android.

Gridview

A view that shows items in two-dimensional scrolling grid is known as Gridview. GridView layout in one of the most useful layouts in android to create a scrolling grid (rows & columns).

Recylerview

Recylerview is introduced in Android 5.0 Lollipop. The Recylerview widget is a more advanced and flexible version of Listview. It is a container used to display a large number of data sets that can be scrolled very efficiently by maintaining a limited number of views.

Now let’s start implementing Gridview

First, we need to add below dependency in build.gradle file at app level module.

After that, we need to add Recylerview widget in your main XML file.

Now we need to create item_logo.xml for Gridview row item.

We need to create Adapter Object. An adapter in Android carries the data from a source (e.g. List<> ) and delivers it to a layout (.xml file).  The Adapter provides access to the data items.

To display images we can use Glide dependency.

Now we need to set data into Adapter.

GridLayoutManager is a Recylerview Layout Manager implementation to lay out items in a grid.

In the above code “3” is a number of columns in per row.

Output

 

 

 

 

 

 

 

 

 

 

That’s it, Happy Coding 🙂

Reference:-  https://developer.android.com/guide/topics/ui/layout/recyclerview

Country Picker With Flag jQuery plugin

About single/multiple country picker jQuery plugin :-

This single/multiple country picker jQuery plugin allows you to easily display a list of countries with flag in your Bootstrap form.

Dependencies :-

Usage :-

Create your <select> with the .country_selector class and add option of required countries.

Improtant Notes:
For multiple country picker add multiple attribute in <select> tag.

Add CSS class under <head> tag.


Add JS function under <script> tag in bottom or add in your JS file.

Configuration :-

Refer to this documentation for more configuration.

5 Ways to Speed Up SSH Connections in Linux

SSH is the most popular and secure method for managing Linux servers remotely. One of the challenges with remote server management is connection speeds, especially when it comes to session creation between the remote and local machines.

There are several bottlenecks to this process, one scenario is when you are connecting to a remote server for the first time; it normally takes a few seconds to establish a session. However, when you try to start multiple connections in succession, this causes an overhead (combination of excess or indirect computation time, memory, bandwidth, or other related resources to carry out the operation).

In this article, we will share four useful tips on how to speed up remote SSH connections in Linux.

1.Use Compression option in SSH

From the ssh man page (type man ssh to see the whole thing):

 

2.Force SSH Connection Over IPV4

OpenSSH supports both IPv4/IP6, but at times IPv6 connections tend to be slower. So you can consider forcing ssh connections over IPv4 only, using the syntax below:

Alternatively, use the AddressFamily (specifies the address family to use when connecting) directive in your ssh configuration file  (global configuration) or ~/.ssh/config (user specific file).

The accepted values are “any”, “inet” for IPv4 only, or “inet6”.

AddressFamily inet

3. Reuse SSH Connection

An ssh client program is used to establish connections to an sshd daemon accepting remote connections. You can reuse an already-established connection when creating a new ssh session and this can significantly speed up subsequent sessions.

You can enable this in your ~/.ssh/config file.

ControlMaster auto
ControlPath /home/akhil/.ssh/sockets/ssh_mux_%x_%p_%r
ControlPersist yes

openssh doesn’t support %x(ip address in control paths),  use my repo instead

https://github.com/akhilin/openssh-portable.git

or use %h to use hostname instead of ip address

using ip address is recommended so that even if you connect using different hostnames it uses same socket ( very useful when using ansible , pdsh )

4. Use Specific SSH Authentication Method

Another way of speeding up ssh connections is to use a given authentication method for all ssh connections, and here we recommend configuring ssh passwordless login using ssh keygen in 5 easy steps.

Once that is done, use the PreferredAuthentications directive, within ssh_config files (global or user specific) above. This directive defines the order in which the client should try authentication methods (you can specify a command separated list to use more than one method).

PreferredAuthentications=publickey

If you prefer password authentication which is deemed unsecure, use this.

5.Disable DNS Lookup On Remote Machine

By default, sshd daemon looks up the remote host name, and also checks that the resolved host name for the remote IP address maps back to the very same IP address. This can result into delays in connection establishment or session creation.

The UseDNS directive controls the above functionality; to disable it, search and uncomment it in the /etc/ssh/sshd_config file. If it’s not set, add it with the value no.

UseDNS=no

Threads usage in C programming

Threads usage in C programming

If you want to write tables from two – five at the same time and using pencils and papers we need at least four writing hands ( four people), four pencils and four papers one for each to write a table. This method is called parallelism. In this method, we obtain the result in a short time. If one person does the same task it takes four times longer. (To understand threads)

In computer C programming, this process is called threads. By using these we get efficiency in programs to solve complex issues. In the Linux environment, POSIX threads have appeared. These are called pthreads and having a library named pthread.

Types of threads:

These are generally two types.

1. Joinable threads
2. Detachable threads

Joinable threads need to join them, whereas Detachable threads run their self. In every program main function, itself is the main thread.

To create pthread in C program we using phtread_create() function. In this function, it takes four arguments.

1. thread id
2. attribute
3. function to call
4. only argument to calling function

Example 1:

Here is an example C program to demonstrate joinable.

To compile below program
gcc thread.c -o thread -lpthread

To execute the program
./thread

The result is almost like this:

result of thread.c

result of thread.c For more information on the pthread_create function refer the below link

https://linux.die.net/man/3/pthread_create

Example 2:

Here is another program to demonstrate the detachable type.

To compile below program
gcc thread_detach.c -o thread_detach -lpthread

To execute the program
./thread_detach

The result is almost like this:

thread_detach.c

For more information on the pthread_detach function refer the below link:

https://linux.die.net/man/3/pthread_detach

 

Installing ELK Stack(Elasticsearch,Logstash,Kibana) on CentOS with Sentinl plugin

ELK stack is also known as the Elastic stack, consists of Elasticsearch, Logstash, and Kibana. It helps you to have all of your logs stored in one place and analyze the issues by correlating the events at a particular time.

This guide helps you to install ELK stack on CentOS 7 / RHEL 7.

Components

Logstash – It does the processing (Collect, enrich and send it to Elasticsearch) of incoming logs sent by beats (forwarder).

Elasticsearch – It stores incoming logs from Logstash and provides an ability to search the logs/data in a real-time

Kibana – Provides visualization of logs.

Sentinl –  Sentinl extends Siren Investigate and Kibana with Alerting and Reporting functionality to monitor, notify and report on data series changes using standard queries, programmable validators and a variety of configurable actions – Think of it as a free an independent “Watcher” which also has scheduled “Reporting” capabilities (PNG/PDFs snapshots).

SENTINL is also designed to simplify the process of creating and managing alerts and reports in Siren Investigate/Kibana 6.xvia its native App Interface, or by using native watcher tools in Kibana 6.x+.

 

Beats – Installed on client machines, send logs to Logstash through beats protocol.

Environment

To have a full-featured ELK stack, we would need two machines to test the collection of logs.

ELK Stack

Filebeat

Prerequisites

Install Java

Since Elasticsearch is based on Java, make sure you have either OpenJDK or Oracle JDK is installed on your machine.

Here, I am using OpenJDK 1.8.

Verify the Java version.

Output:

Configure ELK repository

Import the Elastic signing key.

Setup the Elasticsearch repository and install it.

Add the below content to the elk.repo file.

Install Elasticsearch

Elasticsearch is an open source search engine, offers a real-time distributed search and analytics with the RESTful web interface. Elasticsearch stores all the data are sent by the Logstash and displays through the web interface (Kibana) on users request.

Install Elasticsearch.

Configure Elasticsearch to start during system startup.

Use CURL to check whether the Elasticsearch is responding to the queries or not.

Output:

Install Logstash

Logstash is an open source tool for managing events and logs, it collects the logs, parse them and store them on Elasticsearch for searching. Over 160+ plugins are available for Logstash which provides the capability of processing the different type of events with no extra work.

Install the Logstash package.

Create SSL certificate (Optional)

Filebeat (Logstash Forwarder) are normally installed on client servers, and they use SSL certificate to validate the identity of Logstash server for secure communication.

Create SSL certificate either with the hostname or IP SAN.

(Hostname FQDN)

If you use the Logstash server hostname in the beats (forwarder) configuration, make sure you have A record for Logstash server and also ensure that client machine can resolve the hostname of the Logstash server.

Go to the OpenSSL directory.

Now, create the SSL certificate. Replace green one with the hostname of your real Logstash server.

Configure Logstash

Logstash configuration can be found in /etc/logstash/conf.d/. Logstash configuration file consists of three sections input, filter, and the output. All three sections can be found either in a single file or separate files end with .conf.

I recommend you to use a single file for placing input, filter and output sections.

In the first section, we will put an entry for input configuration. The following configuration sets Logstash to listen on port 5044 for incoming logs from the beats (forwarder) that sit on client machines.

Also, add the SSL certificate details in the input section for secure communication – Optional.

In the filter section. We will use Grok to parse the logs ahead of sending it to Elasticsearch. The following grok filter will look for the syslog labeled logs and tries to parse them to make a structured index.

For more filter patterns, take a look at grokdebugger page.

In the output section, we will define the location where the logs to get stored; obviously, it should be Elasticsearch.

Now start and enable the Logstash service.

You can troubleshoot any issues by looking at Logstash logs.

Install & Configure Kibana

Kibana provides visualization of logs stored on the Elasticsearch. Install the Kibana using the following command.

Edit the kibana.yml file.

By default, Kibana listens on localhost which means you can not access Kibana interface from external machines. To allow it, edit the below line with your machine IP.

Uncomment the following line and update it with the Elasticsearch instance URL. In my case, it is localhost.

Start and enable kibana on system startup.

Install Sentinl plugin:

Install and Configure Filebeat

There are four beats clients available

  1. Packetbeat – Analyze network packet data.
  2. Filebeat – Real-time insight into log data.
  3. Topbeat – Get insights from infrastructure data.
  4. Metricbeat – Ship metrics to Elasticsearch.

To analyze the system logs of the client machine (Ex. client.lintel.local), we need to install filebeat. Create beats.repo file.

Add the below content to the above repo file.

Now, install Filebeat using the following command.

Set up a host entry on the client machine in case your environment does not have DNS server.

Make an host entry like below on the client machine.

Filebeat (beats) uses SSL certificate for validating Logstash server identity, so copy the logstash-forwarder.crt from the Logstash server to the client.

Skip this step, in case you are not using SSL in Logstash.

Filebeat configuration file is in YAML format, which means indentation is very important. Make sure you use the same number of spaces used in the guide.

Open up the filebeat configuration file.

On top, you would see the prospectors section. Here, you need to specify which logs should be sent to Logstash and how they should be handled. Each prospector starts with – character.

For testing purpose, we will configure filebeat to send /var/log/messages to Logstash server. To do that, modify the existing prospector under paths section.

Comment out the – /var/log/*.log to avoid sending all .log files present in that directory to Logstash.

Comment out the section output.elasticsearch: as we are not going to store logs directly to Elasticsearch.

Now, find the line output.logstash and modify the entries like below. This section defines filebeat to send logs to Logstash server server.lintel.local on port 5044 and mention the path where the copied SSL certificate is placed

Replace server.lintel.local with IP address in case if you are using IP SAN.

Restart the service.

Beats logs are typically found syslog file.

Access Kibana

Access the Kibana using the following URL.

http://your-ip-address:5601/

You would get the Kibana’s home page.

Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 - Kibana Starting Page
Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 – Kibana Starting Page

On your first login, you have to map the filebeat index. Go to Management >> Index Patterns.

Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 - Management
Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 – Management

Type the following in the Index pattern box.

Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 - Create Index Pattren
Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 – Create Index Pattern

You should see at least one filebeat index something like above. Click Next step.

Select @timestamp and then click on Create.

Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 - Configure Timestamp
Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 – Configure Timestamp

Verify your index patterns and its mappings.

Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 - Index Mappings
Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 – Index Mappings

Now, click Discover to view the incoming logs and perform search queries.

Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 - Discover Logs
Install Elasticsearch, Logstash, and Kibana (ELK Stack) on CentOS 7 – Discover Logs

You can see sentinl plugin here

sentinl_annotation

That’s All.

 

Reference list:

https://github.com/sirensolutions/sentinl

https://www.itzgeek.com

Configure Celery with SQS and Django on Elastic Beanstalk

 Introduction

Has your users complained about the loading issue on the web app you developed. That might be because of some long I/O bound call or a time consuming process. For example, when a customer signs up to website and we need to send confirmation email which in normal case the email will be sent and then reply 200 OK response is sent on signup POST. However we can send email later, after sending 200 OK response, right?. This is not so straight forward when you are working with  a framework like Django, which is tightly binded to MVC paradigm.

So, how do we do it ? The very first thought in mind would be python threading module. Well, Python threads are implemented as pthreads (kernel threads), and because of the global interpreter lock (GIL), a Python process only runs one thread at a time. And again threads are hard to manage, maintain code and scale it.

Perequisite

Audience for this blog requires to have knowledge about Django and AWS elastic beanstalk.

Celery

Celery is here to rescue. It can help when you have a time consuming task (heavy compute or I/O bound tasks) between request-response cycle. Celery is an open source asynchronous task queue or job queue which is based on distributed message passing. In this post I will walk you through the celery setup procedure with django and SQS on elastic beanstalk.

Why Celery ?   

Celery is very easy to integrate with existing code base. Just write a decorator above the definition of a function declaring a celery task and call that function with a .delay method of that function.

Broker

To work with celery, we need a message broker. As of writing this blog, Celery supports RabbitMQ, Redis, and Amazon SQS (not fully) as message broker solutions. Unless you don’t want to stick to AWS ecosystem (as in my case), I recommend to go with RabbitMQ or Redis because SQS does not yet support remote control commands and events. For more info check here. One of the reason to use SQS is its pricing. One million SQS free request per month for every user.

Proceeding with SQS, go to AWS SQS dashboard and create a new SQS queues. Click on create new queue button.

Depending upon the requirement we can select any type of the queue. We will name queue as dev-celery.

Installation

Celery has a very nice documentation. Installation and configuration is described here. For convenience here are the steps

Activate your virtual environment, if you have configured one and install cerely.

pip install celery[sqs]

Configuration

Celery has built-in support of django. It will pick its setting parameter from django’s settings.py which are prepended by CELERY_ (‘CELERY’ word needs to be defined while initializing celery app as namespace). So put below setting parameter in settings.py

AWS login credentials should be present in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY

Now let’s configure celery app within django code. Create a celery.py file besides django’s settings.py.

Now put below code in projects __init__.py

Testing

Now let’s test the configuration. Open terminal start celery

Terminal 1

 

All the task which are registered to use celery using celery decorators appear here while starting celery. If you find that your task does not appear here then make sure that the module containing the task is imported on startup.

Now open django shell in another terminal

Terminal 2

After executing the task function with delay method, that task should run in the worker process which is listening to events in other terminal. Here celery sent a message to SQS with details of the task and worker process which was listening to SQS, received it and task was executed in worker process. Below is what you should see in terminal 1

Terminal 1

Deploy celery worker process on AWS elastic beanstalk

Celery provides “multi” sub command to run process in daemon mode, but this cannot be used on production. Celery recommends various daemonization tools http://docs.celeryproject.org/en/latest/userguide/daemonizing.html

AWS elastic beanstalk already use supervisord for managing web server process. Celery can also be configured using supervisord tool. Celery’s official documentation has a nice example of supervisord config for celery. https://github.com/celery/celery/tree/master/extra/supervisord. Based on that we write quite a few commands under .ebextensions directory.

Create two files under .ebextensions directory. Celery.sh file extract the environment variable and forms celery configuration, which copied to /opt/python/etc/celery.conf file and supervisord is restarted. Here main celery command:

At the time if writing this blog celery had https://github.com/celery/celery/issues/3759 issue. As a work around to this issue we add “-P solo”. This will run task sequentially for a single worker process.

Now create elastic beanstalk configuration file as below. Make sure you have pycurl and celery in requirements.txt. To install pycurl libcurl-devel needs to be installed from yum package manager.

Add these files to git and deploy to elastic beanstalk.

Below is the figure describing the architecture with django, celery and elastic beanstalk.