Configure Celery with SQS and Django on Elastic Beanstalk

 Introduction

Has your users complained about the loading issue on the web app you developed. That might be because of some long I/O bound call or a time consuming process. For example, when a customer signs up to website and we need to send confirmation email which in normal case the email will be sent and then reply 200 OK response is sent on signup POST. However we can send email later, after sending 200 OK response, right?. This is not so straight forward when you are working with  a framework like Django, which is tightly binded to MVC paradigm.

So, how do we do it ? The very first thought in mind would be python threading module. Well, Python threads are implemented as pthreads (kernel threads), and because of the global interpreter lock (GIL), a Python process only runs one thread at a time. And again threads are hard to manage, maintain code and scale it.

Perequisite

Audience for this blog requires to have knowledge about Django and AWS elastic beanstalk.

Celery

Celery is here to rescue. It can help when you have a time consuming task (heavy compute or I/O bound tasks) between request-response cycle. Celery is an open source asynchronous task queue or job queue which is based on distributed message passing. In this post I will walk you through the celery setup procedure with django and SQS on elastic beanstalk.

Why Celery ?   

Celery is very easy to integrate with existing code base. Just write a decorator above the definition of a function declaring a celery task and call that function with a .delay method of that function.

Broker

To work with celery, we need a message broker. As of writing this blog, Celery supports RabbitMQ, Redis, and Amazon SQS (not fully) as message broker solutions. Unless you don’t want to stick to AWS ecosystem (as in my case), I recommend to go with RabbitMQ or Redis because SQS does not yet support remote control commands and events. For more info check here. One of the reason to use SQS is its pricing. One million SQS free request per month for every user.

Proceeding with SQS, go to AWS SQS dashboard and create a new SQS queues. Click on create new queue button.

Depending upon the requirement we can select any type of the queue. We will name queue as dev-celery.

Installation

Celery has a very nice documentation. Installation and configuration is described here. For convenience here are the steps

Activate your virtual environment, if you have configured one and install cerely.

pip install celery[sqs]

Configuration

Celery has built-in support of django. It will pick its setting parameter from django’s settings.py which are prepended by CELERY_ (‘CELERY’ word needs to be defined while initializing celery app as namespace). So put below setting parameter in settings.py

AWS login credentials should be present in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY

Now let’s configure celery app within django code. Create a celery.py file besides django’s settings.py.

Now put below code in projects __init__.py

Testing

Now let’s test the configuration. Open terminal start celery

Terminal 1

 

All the task which are registered to use celery using celery decorators appear here while starting celery. If you find that your task does not appear here then make sure that the module containing the task is imported on startup.

Now open django shell in another terminal

Terminal 2

After executing the task function with delay method, that task should run in the worker process which is listening to events in other terminal. Here celery sent a message to SQS with details of the task and worker process which was listening to SQS, received it and task was executed in worker process. Below is what you should see in terminal 1

Terminal 1

Deploy celery worker process on AWS elastic beanstalk

Celery provides “multi” sub command to run process in daemon mode, but this cannot be used on production. Celery recommends various daemonization tools http://docs.celeryproject.org/en/latest/userguide/daemonizing.html

AWS elastic beanstalk already use supervisord for managing web server process. Celery can also be configured using supervisord tool. Celery’s official documentation has a nice example of supervisord config for celery. https://github.com/celery/celery/tree/master/extra/supervisord. Based on that we write quite a few commands under .ebextensions directory.

Create two files under .ebextensions directory. Celery.sh file extract the environment variable and forms celery configuration, which copied to /opt/python/etc/celery.conf file and supervisord is restarted. Here main celery command:

At the time if writing this blog celery had https://github.com/celery/celery/issues/3759 issue. As a work around to this issue we add “-P solo”. This will run task sequentially for a single worker process.

Now create elastic beanstalk configuration file as below. Make sure you have pycurl and celery in requirements.txt. To install pycurl libcurl-devel needs to be installed from yum package manager.

Add these files to git and deploy to elastic beanstalk.

Below is the figure describing the architecture with django, celery and elastic beanstalk.

What is DBF file? How to read it in linux and python?

What is DBF files ?

A DBF file is a standard database file used by dBASE, a database management system application. It organises data into multiple records with fields stored in an array data type. DBF files are also compatible with other “xBase” database programs, which became an important feature because of the file format’s popularity.

Tools which can read or open DBF files

Below are list of program which can read and open dbf file.

  • Windows
    1. dBase
    2. Microsoft Access
    3. Microsoft Excel
    4. Visual Foxpro
    5. Apache OpenOffice
    6. dbfview
    7. dbf Viewer Plus
  • Linux
    1. Apache OpenOffice
    2. GTK DBF Editor

How to read file in linux ?

“dbview” command available in linux, which can read dbf files.

Below code snippet show how to use dbview command.

 How to read it using python ?

dbfread” is the library available in python to read dbf files. This library reads DBF files and returns the data as native Python data types for further processing.

dbfread requires python 3.2 or 2.7.  dbfread is a pure python module, so doesn’t depend on any packages outside the standard library.

You can install library by the command below.

The below code snippet can read dbf file and retrieve data as python dictionary.

You can also use the with statement:

By default the records are streamed directly from the file.  If you have enough memory you can load them into a list instead. This allows random access

 How to Write content in DBF file using python ?

dbfpy is a python-only module for reading and writing DBF-files.  dbfpy can read and write simple DBF-files.

You can install it by using below command

The below example shows how to create dbf files and write records in to it.

Also you can update a dbf file record using dbf module.

The below example shows how to update a record in a .dbf file.

 

What is milter?

Every one gets tons of email these days. This includes emails about super duper offers from amazon to princess and wealthy businessmen trying to offer their money to you from some African country that you have never heard of. In all these emails in your inbox there lies one or two valuable emails either from your friends, bank alerts, work related stuff. Spam is a problem that email service providers are battling for ages. There are a few opensource spam fighting tools available like SpamAssasin or SpamBayes.

What is milter ?

Simply put – milter is mail filtering technology. Its designed by sendmail project. Now available in other MTAs also. People historically used all kinds of solutions for filtering mails on servers using procmail or MTA specific methods. The current scene seems to be moving forward to sieve. But there is a huge difference between milter and sieve. Sieve comes in to picture when mail is already accepted by MTA and had been handed over to MDA. On the other hand milter springs into action in the mail receiving part of MTA. When a new connection is made by remote server to your MTA, your MTA will give you an opportunity to accept of reject the mail every step of the way from new connection, reception of each header, and reception of body.

milter stages
milter protocol various stages

The above picture depicts simplified version of milter protocol working. Full details of milter protocol can be found here https://github.com/avar/sendmail-pmilter/blob/master/doc/milter-protocol.txt  . Not only filtering; using milter, you can also modify message or change headers.

HOW DO I GET STARTED WITH CODING MILTER PROGRAM ?

If you want to get started in C you can use libmilter.  For Python you have couple of options:

  1. pymilter –  https://pythonhosted.org/milter/
  2. txmilter – https://github.com/flaviogrossi/txmilter

Postfix supports milter protocol. You can find every thing related to postfix’s milter support in here – http://www.postfix.org/MILTER_README.html

WHY NOT SIEVE WHY MILTER ?

I found sieve to be rather limited. It doesn’t offer too many options to implement complex logic. It was purposefully made like that. Also sieve starts at the end of mail reception process after mail is already accepted by MTA.

Coding milter program in your favorite programming language gives you full power and allows you to implement complex , creative stuff.

WATCHOUT!!!

When writing milter programs take proper care to return a reply to MTA quickly. Don’t do long running tasks in milter program when the MTA is waiting for reply. This will have crazy side effects like remote parties submitting same mail multiple time filling up your inbox.

FreeSWITCH status on LED display using socket connection

It is a simple experiment to show  FreeSWITCH  status on LED display using socket connection. Here is Video :

What You Need

1.Raspberry pi-3

2.MAX-7219 based 8×8 LED Matrix Displays(4.No’s or more).

Those available in kit form and assembled form. And we can purchase through on- line marketing like Amazon etc.

In my case 4 modules are powered from GPIO pins of Raspberry . It is good to use separate power for modules for more than 2 modules.

3.Female to Female connector wires

to connect GPIO pins and MAX7219 LED modules.

Next What to do(installing FreeSWITCH)

1.Prepare SD card and load Raspbian and install FreeSWITCH.  For details

https://www.algissalys.com/how-to/freeswitch-1-7-raspberry-pi-2-voip-sip-server

2.Install Display drivers for MAX7219. 

git clone https://github.com/rm-hull/max7219.git
sudo python max7219/setup.py install

3.Do wiring.

(as given below) between GPIO of Raspberry pi and MAX 7219 matrix LED displays.

Pin        Name       Remarks            RPi Pin          RPi Function

1            Vcc          +5V Power              2                        5V0

2            Gnd           Ground                  6                        Gnd

3            DIN            Data In                 19                GPIO 10 (MOSI)

4             CS          Chip Select              24                 GPIO  8 (SPI CS0)

5            CLK           Clock                      23                GPIO 11 (SPI CLK)

4.Run demo program.

Edit matrix_demo.py according to no. of matrix devices used  i.e cascaded= n, in my case n=4.

device = max7219(serial, cascaded=4 or 1, block_orientation=block_orientation).

sudo python max7219/examples/matrix_demo.py

At Last

Use ESL connection between FreeSWITCH and Max7219demo program. For details

https://freeswitch.org/confluence/display/FREESWITCH/Python+ESL

Here is my source file.

 

 

InlineCallbacks

Twisted features a decorator named inlineCallbacks which allows you to work with deferreds without writing callback functions.

This is done by writing your code as generators, which yield deferreds instead of attaching callbacks.

Consider the following function written in the traditional deferred style:

using inlineCallbacks, we can write this as:

from twisted.internet.defer import inlineCallbacks

Instead of calling addCallback on the deferred returned by redis.Connection, we yield it. this causes Twisted to return the deferred‘s result to us.

Though the inlineCallbacks looks like synchronous code, which blocks while waiting for the request to finish, each yield statement allows other code to run while waiting for the deferred being yielded to fire.

inlineCallbacks become even more powerful when dealing with complex control flow and error handling.

txRedis

txredis is a non-blocking client for the redis database, written in Python. It uses twisted for the asynchronous communication with redis.

Install

Now to check if txredis is properly installed or not goto python prompt and import txredis :

Example :

Using redis-cli, add list of user’s in redis-server with hmset :

hmset, set key to value within hash name for each corresponding key and value from the mapping dict.

Now, using python twisted get user’s :

Above code do connection with redis-server.

Here, rediscreator.connectTCP return defer and register appropriate method in addCallback, addErrback

Here, define callback function onRedisConnect and errRedisConnect.

Here, hget return the value of key within the hase name and return defer

Here, define callback function onSuccess and onFailur and log the appropriate result.

It give output 123, it is correspond to user abc

How Implement Multiservice in Twisted.

Multiservice module is service collection provided by twisted, which is useful for creating a new service and combines with two or more existing services.

The major tools that manages Twisted application is a command line utility called twistd. twistd is a cross-platform, and is the recommended tool for running twisted applications.

The core component of the Twisted Application infrastructure is the

object. which represents your application. Application acts as a container of any “Services” that your application provides. This will be done through Services.

Services manages application that can be started and stopped. In Application object can contain many services, or can even hierarchies of Services using “Multiservice” or your own custom IServiceCollection implementations.

Multiservice Implementaion:

To use multiserivce, which implements IService. For this, import internet and service module.

 

Example :

To run, Save above code in a file as serviceexample.tac . Here, “tac ” file is regular python file. Twisted application infrastructure, protocol implementations live in a module, services, using those protocols are registered in a Twisted Application Configuration(TAC) file and the reactor and configuration are managed by an external utility.

Here, I use multiservice functionality from service. agentservice create object of multiservice. Then add services using add service method. In service, you can add web servers, FTP servers and SSH clients. After this, set application name and pass application to serviceparent method.

now, add service on port 8082 as :

add another service same as above on port 8083 as:

To run serviceexample.tac file using twistd program, use command twistd -y serviceexample.tac -n. After this, open browser and enter url localhost:8082 and localhost:8083. You can see result on web page and both TCP servers are active.

Asynchronous DB Operations in Twisted

Twisted is an asynchronous networking framework. Other Database API Implementations have blocking interfaces.

For this reason, twisted.enterprise.adbapi was created. It is a non-blocking interface,which allows you to access a number of different RDBMSes.

General Method to access DB API.

1 ) Create a Connection with db.

2) create a cursor.

3) do a query.

Cursor blocks to response in asynchronous framework. Those delays are unacceptable when using an asynchronous framework such as Twisted.
To Overcome blocking interface, twisted provides asynchronous wrapper for db module such as twisted.enterprise.adbapi

Database Connection using adbapi API.

To use adbapi, we import dependencies as below

1) Connect Database using adbapi.ConnectionPool

Here, We do not need to import dbmodule directly.
dbmodule.connect are passed as extra arguments to adbapi.ConnectionPool’s Constructor.

2) Run Database Query

Here, I used ‘%s’ paramstyle for mysql. if you use another database module, you need to use compatible paramstyle. for more, use DB-API specification.

Twisted doesn’t attempt to offer any sort of magic parameter munging – runQuery(query,params,…) maps directly onto cursor.execute(query,params,…).

This query returns Deferred, which allows arbitrary callbacks to be called upon completion (or failure).

Demo : Select, Insert and Update query in Database.

Here, I have used MySQLdb api, agentdata as a database name, root as a user, 123456 as a password.
Also, I have created select, insert and update query for select, insert and update operation respectively.
runQuery method returns deferred. For this, add callback and error back to handle success and failure respectively.

How to upload a file in Zoho using python?

UploadFile API Method of Zoho CRM

Table of Contents

  • Purpose
  • Request URL
  • Request Parameters
  • Python Code to Upload a file to a record
  • Sample Response

Purpose

You can use this method to attach files to records.

Request URL

XML Format:
https://crm.zoho.com/crm/private/xml/Leads/uploadFile?authtoken=Auth Token&scope=crmapi&id=Record Id&content=File Input Stream

JSON Format:
https://crm.zoho.com/crm/private/json/Leads/uploadFile?authtoken=Auth Token&scope=crmapi&id=Record Id&content=File Input Stream

Request Parameters

Parameter Data Type Description
authtoken* String Encrypted alphanumeric string to authenticate your Zoho credentials.
scope* String Specify the value as crmapi
id* String Specify unique ID of the “record” or “note” to which the file has to be attached.
content FileInputStream Pass the File Input Stream of the file
attachmentUrl String Attach a URL to a record.

* – Mandatory parameter

Important Note:

  • The total file size should not exceed 20 MB.
  • Your program can request only up to 60 uploadFile calls per min. If API User requests more than 60 calls, system will block the API access for 5 min.
  • If the size exceeds 20 MB, you will receive the following error message: “File size should not exceed 20 MB“. This limit does not apply to URLs attached via attachmentUrl.
  • The attached file will be available under the Attachments section in the Record Details Page.
  • Files can be attached to records in all modules except Reports, Dashboards and Forecasts.
  • In the case of the parameter attachmentUrl, content is not required as the attachment is from a URL.
    Example for attachmentUrl: crm/private/xml/Leads/uploadFile?authtoken=*****&scope=crmapi&id=<entity_id>&attachmentUrl=<insert_ URL>

Python Code to Upload a file to a record

Here’s a simple script that you can use to upload a file in zoho using python.

Go to https://pypi.python.org/pypi/MultipartPostHandler2/0.1.5 and get the egg file and install it.

In the program, you need to specify values for the following:
  • Your Auth Token
  • The ID of the Record
  • The uploadFile Request URL in the format mentioned above
  • The File Path i.e the location of the File

Sample Response

 

Implementing Webhook Handler in Python.

What is Webhook ?

Webhook is an asynchronous HTTP callback on an event occurrence. It is a simple server to server communication for reporting a specific event occurred on a server. The server on which event occurred will fire a HTTP POST request to another server on a URL which is provided by receiving server.

For example, whenever your colleague pushes code commits to github, an event has occurred on github’s server. Now if a webhook URL is provided in github settings, a webhook will be fired to that URL. This webhook will be a HTTP POST request with commit details inside the body in a specified format.  More details on github webhook can be found here.

In this post, I will share my experience of implementing webhook handler in python. For the readers, basic knowledge on implementing web application in python would be better.

Webhook Handler

A Webhook can be handled by simply providing a URL endpoint in a web application. Following is an example using Django. Add webhook url in urls.py

Now create view function in views.py which will parse the data and process it.  In most of the cases, webhook data is sent in JSON format. So lets load the webhook data and sent the data to process_webhook function.

Most of the web applications accept POST request after verifying CSRF token, but here we need to exempt it from this check. So put @csrf_token decorator above the view function. Also put an @require_post decorator to ensure the request is only POST.

The above implementation of URL endpoint will remain different for various other python web framework like Flask, tornado, twisted. But the below code  process_webhook function implementation will remain same irrespective of any framework.

Processing event

There may be different type events we need to handle. So, before proceeding to implement process_webhook function, lets create a python module named webhook_events.py, which will contain a single function for each type of event wherein will be the logic for that particular event. In other words, we are going to map event name with its function, which will handle the logic for that particular type of webhook event.

There are many ways to implement process_webhook function and how we map a webhook event with its function. We are going to discuss different implementation of process_webhook based on extendability. Most basic version of that is below.

A Better way

Now suppose, there are 10s of webhook to be served. We certainly don’t want to write repetitive code. So below is a better way of implementing process_webhook. Here we just replace dot in event name with underscore, so that we get the function name written in webhook_events.py for that event. If the function is not found that means event is not registered (not being served). In this way, no matter the number webhook to be served, just write the function to handle it, in webhook_events.py

Decorators

More robust and pythonic way of implementing process_webhook is by using decorators. Lets define a decorator in webhook_events.py which will map the event_name to its function. Here the EVENT_MAP is dictionary inside a setting module, which will contain event name as key and event function as its value.

In this case, the process_webhook will look like below:

This is the way which I prefer to implement webhook handler in python. How would you prefer ? Please feel free to comment below.