How to shuffle lines in a file in Linux

We can shuffle the lines of a file in Linux using any of the following commands:

  • shuf
  • sed and sort
  • awk
  • python

As an example, we will use a file shuffle_mylines.txt containing the numbers 1 to 10, one per line.

Create the file using the following command:

$ seq 10  > shuffle_mylines.txt

Command shuf

This command is lightweight and straightforward. You just call it with the file name as an argument, for example shuf shuffle_mylines.txt.

Shuffle lines using sed

You may already know the sed (Stream Editor) command. It is one of the most widely used commands for text processing in Unix/Linux. We can’t shuffle lines with a single sed command, but we can do it by combining sed with other commands. Let’s take a look at the following command:

How does it work?

Breakdown of the above command:

The commands used in the above example are:

  • cat
  • while loop
  • $RANDOM shell variable
  • sort
  • tail
  • sed


Now let’s see how this command works. First, cat reads the file content and pipes it to a shell while loop.

The while loop reads each piped line into the variable x and prints it as <random_number>:<line>, which is what $RANDOM:$x produces. $RANDOM is a shell variable that returns a new random number each time you read it, which is what makes it useful for shuffling lines.

Then we sort the output of the while loop using the sort command.

The output of this command is always a randomly shuffled set of lines, because the lines are ordered by their $RANDOM prefixes.

The output at this point would look like:

To remove the random prefixes, we use sed.

That’s it. Every execution of this command gives you shuffled lines. To store the result, redirect the output to a new file using > or >>.
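The same decorate, sort and strip idea can be sketched in Python. This is only an illustration of the technique, not the shell pipeline itself; the function name shuffle_by_random_key and the seed parameter are made up for this example.

```python
import random


def shuffle_by_random_key(lines, seed=None):
    """Shuffle lines the way the $RANDOM:$x pipeline does."""
    rng = random.Random(seed)
    # Decorate: pair every line with a random number (like $RANDOM:$x).
    # bash's $RANDOM yields integers from 0 to 32767.
    decorated = [(rng.randrange(32768), line) for line in lines]
    # Sort on the random prefix only (the job of sort in the pipeline).
    decorated.sort(key=lambda pair: pair[0])
    # Undecorate: drop the prefix again (the job of sed in the pipeline).
    return [line for _, line in decorated]


if __name__ == "__main__":
    numbers = [str(n) for n in range(1, 11)]  # same content as seq 10
    print("\n".join(shuffle_by_random_key(numbers)))
```

Passing a seed makes the order repeatable, which is handy for testing; leave it out to get a different order on every run.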


Shuffle lines using awk

awk is a programming language designed specifically for text processing. We will use it to shuffle lines.


Here is another example using awk. It is similar to the sed and sort example.

Shuffle lines in a file using Python

Python is a popular scripting language, widely used today for everything from big projects to small scripts. Let’s see how you can shuffle lines using Python.

Python Example 1

In this example, we pass the file name as a command-line argument, read the file, shuffle its lines, and print them to the terminal.

The output can be redirected to a file using a redirect operator (> or >>).
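A minimal sketch of such a script could look like the following. The function name shuffled_lines is made up for illustration, and the sketch assumes the file is small enough to fit in memory.

```python
import random
import sys


def shuffled_lines(path):
    """Return the lines of the file at `path` in random order."""
    with open(path) as handle:
        lines = handle.readlines()
    random.shuffle(lines)  # shuffles the list in place
    return lines


if __name__ == "__main__" and len(sys.argv) == 2:
    # Expects exactly one argument: the file whose lines should be shuffled.
    sys.stdout.writelines(shuffled_lines(sys.argv[1]))
```

Saved under a name of your choice, it would be called with shuffle_mylines.txt as its only argument.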


If you are looking for a quick shuffle command, shuf is the best choice. Otherwise, you can have some fun with the other ways to shuffle lines in a file.



How to see stashed changes using git stash

The command git stash is used to stash the changes in a dirty working directory away.

You can list all stashed changes using the command git stash list:

Every time you stash your working directory, git saves its state as a kind of commit in a stash tree that maintains the stash history. All stashed changes are stacked in order, each with its own reference. stash@{0} is the topmost, most recent stash. To refer to a particular stash, use its reference, such as stash@{3} or stash@{5}.

Think of each stash as a separate commit. These commits are stored and stacked separately and do not overlap with the conventional git commit history.

A stash is represented as a commit whose tree records the state of the working directory, and its first parent is the commit at HEAD when the stash was created.

The following commands can be used to extract the diff of a stashed change against any other stash, commit, branch or HEAD.

  • git stash show
  • git show
  • git diff
  • git difftool

Let’s see how we can use each of the above commands.

Command git stash show

The simple command git stash show gives a very brief summary of the changed files, but it does not show the diff of the changes against the current HEAD. The output looks something like this:

Sometimes this is not so useful. You may want to see the difference against the current HEAD, a specific commit, or the current working directory.

If you use git stash show with the -p option, it shows the full diff of the changes.


To check the diff of a selected stash:

Command git show

The command git-show is used to view various types of objects. It is not limited to stash changes: it can also show one or more objects such as blobs, trees, tags and commits. For more information, check git-show.


To see the diff of the topmost stash against HEAD:

To get the diff of a selected stash against HEAD:

To see a complete file as it is stored in the selected stash:


stash@{0} is the reference of the stash. It could be any one of stash@{0}, stash@{1}, stash@{2}, and so on.

<file_name> is the name of the file, relative to the root of the project/git repository.

Command git diff 

The command git-diff is another common command, used to show changes between commits, between a commit and the working tree, and so on.

By default, git diff shows the diff of the selected stash against the current state of the repository (the modified files), unless another stash reference or commit is specified.

To get the difference between the topmost stash stash@{0} and the master branch:

To display only the names of the changed files, not the diff itself:

See the diff between selected stashes for a selected file:


Command git difftool

The command git-difftool can also be used to view the diff between a selected stash and a selected commit, branch or stash.

See the difference between latest two stashes:
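The command forms described in this article can be collected in a small Python helper. This is only a sketch: the function names are made up for illustration, and each builder just returns the argument list for the corresponding git command so that it can be inspected before it is run.

```python
import subprocess


def stash_ref(index=0):
    """Return the reference of the stash at `index`, e.g. stash@{0}."""
    return "stash@{%d}" % index


def stash_show_cmd(index=0, patch=True):
    """git stash show, optionally with -p for the full diff."""
    cmd = ["git", "stash", "show"]
    if patch:
        cmd.append("-p")
    return cmd + [stash_ref(index)]


def stash_file_cmd(index, file_name):
    """git show <stash>:<file_name> prints the file as stored in the stash."""
    return ["git", "show", "%s:%s" % (stash_ref(index), file_name)]


def stash_diff_cmd(index, other="HEAD", names_only=False, file_name=None):
    """git diff between a stash and another stash, commit or branch."""
    cmd = ["git", "diff"]
    if names_only:
        cmd.append("--name-only")
    cmd += [stash_ref(index), other]
    if file_name is not None:
        cmd += ["--", file_name]
    return cmd


def run(cmd):
    """Run a command in the current repository and return its output."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, run(stash_diff_cmd(0, "master", names_only=True)) would list the files that differ between the topmost stash and the master branch, provided it is called from inside a repository that has at least one stash.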




The commands useful for extracting the diff from a selected stash are git stash show, git show, git diff and git difftool.

See difference using command git stash show,

See the changes in the stash using command git show,

See the difference between latest stash and selected commit using command git diff,





How to delete files older than specified number of days using find command

The Linux find utility lets us run arbitrary commands on the files it matches. We can use this to delete files older than a specified number of days by passing either a command or an action to find.

The find command syntax looks like this:

How to delete files older than specified days

To delete files older than a specified number of days, we filter those files with a test and add an action that deletes them.

We can filter files older than a specified number of days using the -mtime test. The matched files can be deleted using either of the following actions: -exec or -delete.

The command we need would look like,

find <path-to-files> -mtime +n -exec rm {} \;


find <path-to-files> -mtime +n -delete

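n specifies how many days old a file must be to match (that is, to be included in the output). find ignores any fractional part when it computes a file’s age in days, so -mtime +7 matches files last modified at least 8 days ago.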

Let’s consider an example to  delete files which are older than 7 days.

Example: Delete files older than 7 days

Or, as mentioned, we can use the -delete action.

As you can see in the above command, we can also use other filters (tests) such as -name along with -mtime to control which files are deleted.

Break Down Of Command

First argument: an absolute path, a relative path, or a wildcard that specifies where to search for files.

Second argument: the criteria used to filter files, based on name, path, pattern, age in days, and so on. More tests can be added to narrow down the final set of files. Here we used -mtime +7 to match all files older than 7 days.

Third argument: the action, which specifies what should be performed on the files found. The default is -print, which just prints the result. For our requirement we use the -delete or -exec action to delete the files.

The -exec action is generic and can be used to run any command on each file that is found. Here we are using rm {} \; where {} represents the current file and expands to the name/path of the found file.
Note: There is a space between {} and \;. If you omit this space, find will throw an error.
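To make the age test concrete, here is a rough Python sketch of the same filtering logic. It is an illustration only: the function names are made up, and it only considers the entries os.walk reports as files, while find without -type f would also test directories.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def files_older_than(root, days, now=None):
    """Yield files under `root` that find's -mtime +days test would match."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            age_seconds = now - os.lstat(path).st_mtime
            # find drops the fractional part of the age in days, and +days
            # means strictly more than `days` whole days.
            if int(age_seconds // SECONDS_PER_DAY) > days:
                yield path


def delete_older_than(root, days, dry_run=True):
    """Delete matching files; with dry_run=True only report them."""
    matched = list(files_older_than(root, days))
    if not dry_run:
        for path in matched:
            os.remove(path)
    return matched
```

Calling delete_older_than with dry_run=True first plays the same role as running find with the default -print action before switching to -delete.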


How to build and install FreeSWITCH 1.6 on Debian 8 Jessie

FreeSWITCH is an open-source telephony softswitch created in 2006. As per the official wiki page,

It is a scalable open source cross-platform telephony platform designed to route and interconnect popular communication protocols using audio, video, text or any other form of media.

Sounds good, right?

We are using Debian for this tutorial because it is a very stable and mature Linux distribution, and it is the distribution of choice of the FreeSWITCH core developers.
You can read more about FreeSWITCH on their wiki page.

Now let’s cut the crap and get started, assuming you already have a working Debian 8 system.

Build & Install FreeSWITCH

There are different ways to install FreeSWITCH. In this tutorial, we will see how to install it from source.

  1. First, update your Debian box and install curl and git.

  2. Add the FreeSWITCH GPG key to the APT keyring.

  3. Add the FreeSWITCH repository to the APT sources.

  4. Update your system once again.

  5. Now let’s install the FreeSWITCH dependencies.

  6. Though the above step takes care of most dependencies, a few more are still required to compile mod_fsv. Install them as follows:

  7. Grab the FreeSWITCH source code as follows:

  8. Now let’s compile the FreeSWITCH source for version 1.6.

  9. Now let’s compile the sounds.

  10. Let’s create symlinks to the required binaries so that they can be accessed from anywhere.


Set Owner & Permissions

Starting FreeSWITCH service on boot automatically

To start FreeSWITCH automatically after each boot, we need to set up an init script. An init script is a script used by the init system to manage services. Since Debian 8 has migrated to the systemd init system, we will add a systemd unit file.

Copy the following content to ‘/lib/systemd/system/freeswitch.service’

Now execute the following commands in your shell

Start FreeSWITCH

Now we are all set. Let’s start hacking FreeSWITCH.



  1. If something goes wrong and you retry the compilation after ‘make clean’, you may sometimes get errors regarding ‘spandsp’. To resolve them, try cleaning with
    ‘git clean -fdx’. For more info check this ticket –


How to integrate Celery into Django project

What is Celery?

Celery is a distributed task queue that allows us to execute jobs in background. This article explains how to set up Celery with Django to perform a background task.


  • Large or small, Celery makes scheduling such periodic tasks easy.
  • You never want end users to have to wait unnecessarily for pages to load or actions to complete. If a long process is part of your application’s workflow, you can use Celery to execute that process in the background, as resources become available, so that your application can continue to respond to client requests.

Celery uses brokers to pass messages between a Django Project and the Celery workers. We will use Redis as the message broker.


Before diving into Celery, complete the following setup steps.

Create a new virtualenv ‘venv’ using the following command:

To activate the environment, use the command:

Install Django and create a Django project ‘myproject’. Make sure to activate the virtualenv, create a requirements.txt file and run the migrations. Then fire up the server and navigate to http://localhost:8000/ in your browser. You should see the familiar “Congratulations on your first Django-powered page” text. When done, kill the server.

Let’s install Celery:

Now we will integrate Celery into our Django project with the following steps:

Step 1:

Inside the myproject directory i.e beside your create a new file called and add the following code in that:

Let’s break down what happens in this module. First, we import absolute_import from __future__, so that our module will not clash with the Celery library:

Then we set the default DJANGO_SETTINGS_MODULE for the celery command-line program:

Specifying the settings here means the celery command line program will know where your Django project is. This statement must always appear before the app instance is created, which is what we do next:

This is your instance of the library. You can have many instances, but there’s probably no reason for that when using Django.

We also add the Django settings module as a configuration source for Celery. This means that you don’t have to use multiple configuration files, and instead configure Celery directly from the Django settings.

You can pass the object directly here, but using a string is better since then the worker doesn’t have to serialize the object when using Windows or execv:

Next, a common practice for reusable apps is to define all tasks in a separate module, and Celery does have a way to autodiscover these modules:

Step 2:

To ensure that the Celery app is loaded when Django starts, add the following code into the file that sits next to your file:

Project layout should look like:

Step 3:

Celery uses “brokers” to pass messages between a Django Project and the Celery workers. In this article, we will use Redis as the message broker.

First, install Redis from the official download page. Then open a new terminal window and fire up the server:

You can test that Redis is working properly by typing this into your terminal:

Redis should reply with PONG – try it!

Once Redis is up, add the following code to your file:
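As a rough sketch, the broker settings might look something like the fragment below. It assumes Redis is running locally on its default port 6379 and uses the older, Celery 3.x style setting names that match loading the configuration straight from the Django settings; newer Celery versions name these settings differently, so check the documentation for the version you install.

```python
# Celery settings (sketch). The host, port and serializer choices below
# are assumptions for a local development setup, not required values.
BROKER_URL = "redis://localhost:6379"
CELERY_RESULT_BACKEND = "redis://localhost:6379"
CELERY_ACCEPT_CONTENT = ["application/json"]
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"
```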

You also need to add Redis as a dependency in the Django Project:

Test that the Celery worker is ready to receive tasks:

Kill the process with CTRL-C. Now, test that the Celery task scheduler is ready for action:

That’s it! You can now use Celery with Django. For more information on setting up Celery with Django, please check out the official Celery documentation.

Sending emails asynchronously using Twisted – Part 2

In Part 1 of this article, we saw how to send blocking emails using the ‘smtplib’ module and non-blocking emails using the Twisted framework. In this part, we will see how to send asynchronous emails to multiple recipients using Twisted.

  • Sending multiple emails

    Refer to the following script. It sends emails to the given recipients asynchronously. Here we have used the twisted.internet.defer.DeferredList API, which is very useful in some scenarios. Suppose you have to finish multiple tasks asynchronously and then perform one final task. For example, your program is connected to 4 different clients and, before shutting down, you have to make sure that all connections are closed properly. In such cases, DeferredList is used. Create a Deferred for each task and collect them in a list. Pass this list to ‘DeferredList’, which returns another Deferred. This final Deferred fires once all the Deferreds in the list have fired.

  • Sending multiple emails using coiterator

    Though the above script runs fine, there is one problem. Here, the number of recipients is very small. But suppose you have to send emails to millions of recipients: will this code still work? Refer to the function ‘send_multiple_emails’.

    Here we have used a ‘for’ loop, which is blocking. Until this ‘for’ loop has finished iterating, the program will not move on to the next line of code. For 3 recipients the iteration does not take much time, but for millions of recipients it will not work well.
    So let’s modify our code to work like a generator.

    Here, we have used the twisted.internet.task.coiterate API. This API iterates over an iterator while sharing the reactor’s run time among all running iterators. Thus we can send millions of emails asynchronously.

Introduction to Riak

Riak is a distributed database designed to deliver maximum data availability by distributing data across multiple servers. Riak is an open-source, distributed key/value database for high availability, fault-tolerance, and near-linear scalability.

Riak Components

Riak is a Key/Value (KV) database, built from the ground up to safely distribute data across a cluster of physical servers, called nodes.

Riak functions similarly to a very large hash space. Depending on your background, you may call it a hashtable, a map, a dictionary, or an object. But the idea is the same: you store a value with an immutable key, and retrieve it later.

1) Key and value

2) Buckets

Key and Value

Key/value is the most basic construct in all of computerdom.


Buckets in Riak provide logical namespaces so that identical keys in different buckets will not conflict.

Buckets are so useful in Riak that all keys must belong to a bucket. There is no global namespace. The true definition of a unique key in Riak is actually bucket + key.

For convenience, we call a bucket/key + value pair an object, sparing ourselves the verbosity of “X key in the Y bucket and its value”.

Replication and Partitions

Distributing data across several nodes is how Riak is able to remain highly available, tolerating outages and partitioning. Riak combines two styles of distribution to achieve this: replication and partitions.


Replication is the act of duplicating data across multiple nodes. Riak replicates by default.

The obvious benefit  of replication is that if one node goes down, nodes that contain replicated data remain available to serve requests. In other words, the system remains available with no down time.

The downside with replication is that you are multiplying the amount of storage required for every duplicate. There is also some network overhead with this approach, since values must also be routed to all replicated nodes on write.


A partition is how we divide a set of keys onto separate physical servers. Rather than duplicate values, we pick one server to exclusively host a range of keys, and the other servers to host remaining non-overlapping ranges.

With partitioning, our total capacity can increase without any big expensive hardware, just lots of cheap commodity servers. If we decided to partition our database into 1000 parts across 1000 nodes, we have (hypothetically) reduced the amount of work any particular server must do to 1/1000th.

There’s also another downside. Unlike replication, simple partitioning of data actually decreases uptime.

If one node goes down, that entire partition of data is unavailable. This is why Riak uses both replication and partitioning.


Since partitions allow us to increase capacity, and replication improves availability, Riak combines them. We partition data across multiple nodes, as well as replicate that data into multiple nodes.


The Riak team suggests a minimum of 5 nodes for a Riak cluster, and replicating to 3 nodes (this setting is called n_val, for the number of nodes on which to replicate each object).

The Ring

Riak applies consistent hashing to map objects along the edge of a circle (the ring).

Riak partitions are not mapped alphabetically (as we used in the examples above), but instead a partition marks a range of key hashes (SHA-1 function applied to a key). The maximum hash value is 2^160, and the hash space is divided into some number of partitions, 64 by default (the Riak config setting is ring_creation_size).

The Ring is more than just a circular array of hash partitions. It’s also a system of metadata that gets copied to every node. Each node is aware of every other node in the cluster, which nodes own which vnodes, and other system data.
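The mapping from a key to a partition can be sketched in a few lines of Python. This is a simplified illustration: Riak hashes an internal encoding of the bucket/key pair, while this sketch simply hashes the string "bucket/key", so the partition numbers it produces will not match a real cluster. The function names are made up for the example.

```python
import hashlib

RING_SIZE = 2 ** 160       # size of the SHA-1 hash space
RING_CREATION_SIZE = 64    # default number of partitions


def key_hash(bucket, key):
    """Hash a bucket/key pair to an integer position on the ring."""
    data = ("%s/%s" % (bucket, key)).encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def partition_for(bucket, key, partitions=RING_CREATION_SIZE):
    """Return the index of the partition whose hash range holds the key."""
    partition_width = RING_SIZE // partitions
    return key_hash(bucket, key) // partition_width


print(partition_for("food", "favorite"))
```

Because the hash spreads keys evenly over the ring, each of the 64 partitions ends up responsible for roughly the same share of keys.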

N/R/W Values


With our 5 node cluster, having an n_val=3 means values will eventually replicate to 3 nodes, as we’ve discussed above. This is the N value. You can set other values (R,W) to equal the n_val number with the shorthand all.


Reading involves similar tradeoffs. To ensure you have the most recent value, you can read from all 3 nodes containing objects (r=all). Even if only 1 of 3 nodes has the most recent value, we can compare all nodes against each other and choose the latest one, thus ensuring some consistency. Remember when I mentioned that RDBMS databases were write consistent? This is close to read consistency. Just like w=all, however, the read will fail unless 3 nodes are available to be read. Finally, if you only want to quickly read any value, r=1 has low latency, and is likely consistent if w=all.


But you may not wish to wait for all nodes to be written to before returning. You can choose to wait for all 3 to finish writing (w=3 or w=all), which means your values are more likely to be consistent. Or you could choose to wait for only 1 complete write (w=1), and allow the remaining 2 nodes to write asynchronously, which returns a response quicker but increases the odds of reading an inconsistent value in the short term. This is the W value.

Since Riak is a KV database, the most basic commands are setting and getting values. We’ll use the HTTP interface, via curl, but we could just as easily use Erlang, Ruby, Java, or any other supported language. The basic structure of a Riak request is setting a value, reading it, and maybe eventually deleting it. The actions are related to HTTP methods (PUT, GET, POST, DELETE).



The simplest write command in Riak is putting a value. It requires a key, value, and a bucket. In curl, all HTTP methods are prefixed with -X. Putting the value pizza into the key favorite under the food bucket is done like this:

The -d flag denotes that the next string will be the value. We declare it as text with the header option -H ‘Content-Type:text/plain’.

This defines the HTTP MIME type of this value as plain text. We could have set any value at all, be it XML or JSON, or even an image or a video. Riak does not care at all what data is uploaded, so long as the object size doesn’t get much larger than 4MB.
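The same PUT can be sketched with Python’s standard library. The node address http://localhost:8098 and the /riak/<bucket>/<key> URL layout are assumptions about a local test node; adjust them to match your installation. The sketch only builds the request, so it can be inspected without a running cluster.

```python
import urllib.request

RIAK_URL = "http://localhost:8098"  # assumed address of a local Riak node


def put_request(bucket, key, value, content_type="text/plain"):
    """Build (but do not send) an HTTP PUT storing value under bucket/key."""
    url = "%s/riak/%s/%s" % (RIAK_URL, bucket, key)
    return urllib.request.Request(
        url,
        data=value.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": content_type},
    )


request = put_request("food", "favorite", "pizza")
# To send it against a running node:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```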


The next command reads the value pizza under the bucket/key food/favorite.

This is the simplest form of read, responding with only the value. Riak contains much more information, which you can access if you read the entire response, including the HTTP header. In curl you can access a full response by way of the -i flag.


Similar to PUT, POST will save a value. But with POST a key is optional. All it requires is a bucket name, and it will generate a key for you.

Let’s add a JSON value to represent a person under the people bucket. The response header is where a POST will return the key it generated for you.

You can extract this key from the Location value. Other than not being pretty, this key is treated the same as if you defined your own key via PUT.


You may note that no body was returned with the response. For any kind of write, you can add the returnbody=true parameter to force a value to return, along with value-related headers like X-Riak-Vclock and ETag.


The final basic operation is deleting keys, which is similar to getting a value, but sends the DELETE method to the url/bucket/key.

A deleted object in Riak is internally marked as deleted, by writing a marker known as a tombstone. Unless configured otherwise, another process called a reaper will later finish deleting the marked objects.

read/write ratios.


Riak provides two kinds of lists. The first lists all buckets in your cluster, while the second lists all keys under a specific bucket. Both of these actions are called in the same way, and come in two varieties.

The following will give us all of our buckets as a JSON object.

And this will give us all of our keys under the food bucket.

Adjusting N/R/W to our needs

N is the number of total nodes that a value should be replicated to, defaulting to 3. But we can set this n_val to less than the total number of nodes.

Any bucket property, including n_val, can be set by sending a props value as a JSON object to the bucket URL. Let’s set the n_val to 5 nodes, meaning that objects written to cart will be replicated to 5 nodes.

Symbolic Values

A quorum is one more than half of all the total replicated nodes (floor(N/2) + 1). This figure is important, since if more than half of all nodes are written to, and more than half of all nodes are read from, then you will get the most recent value (under normal circumstances).
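The arithmetic can be written out as a small Python sketch. The function names are made up for illustration; the second function states the usual overlap condition behind the sentence above, namely that a read of R nodes and a write of W nodes must share at least one node when R + W is greater than N.

```python
def quorum(n_val):
    """One more than half of the replicas: floor(N/2) + 1."""
    return n_val // 2 + 1


def reads_overlap_writes(n_val, r, w):
    """True when any R nodes read must include one of the W nodes written."""
    return r + w > n_val


print(quorum(3), quorum(5))           # 2 3
print(reads_overlap_writes(3, 2, 2))  # True
print(reads_overlap_writes(3, 1, 1))  # False
```

With n_val=3, reading and writing with a quorum of 2 nodes each is enough for the read set and the write set to overlap, while r=1 with w=1 is not.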


Another use of buckets is their ability to enforce behaviors on writes by way of hooks. You can attach functions to run either before, or after, a value is committed to a bucket.

Functions that run before a write are called precommit hooks, and they have the ability to cancel a write altogether if the incoming data is considered bad in some way. A simple precommit hook is to check if a value exists at all.

I put my custom Erlang code files under the riak installation ./custom/my_validators.erl.


Then compile the file. (You need to install Erlang before installing Riak.)

erlc my_validators.erl

Install the file by informing the Riak installation of your new code via app.config (restart Riak).

Then you need to set the Erlang module (my_validators) and function (value_exists) as a JSON value in the bucket’s precommit array: {"mod":"my_validators","fun":"value_exists"}.

If you try and post to the cart bucket without a value, you should expect a failure.


Siblings occur when you have conflicting values, with no clear way for Riak to know which value is correct. Riak will try to resolve these conflicts itself if the allow_mult parameter is configured to false, but you can instead ask Riak to retain siblings to be resolved by the client if you set allow_mult to true.

Siblings arise in a couple cases.

We used the second scenario to manufacture a conflict in the previous chapter when we introduced the concept of vector clocks, and we’ll do so again here.

Resolving Conflicts

When we have conflicting writes, we want to resolve them. Since that problem is typically use-case specific, Riak defers it to us, and our application must decide how to proceed.

For our example, let’s merge the values into a single result set, taking the larger count if the item is the same. When done, write the new results back to Riak with the vclock of the multipart object, so Riak knows you’re resolving the conflict, and you’ll get back a new vector clock.

Successive reads will receive a single (merged) result.
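The merge rule described above can be sketched in Python. It assumes each sibling value has already been decoded into a dictionary that maps item names to counts; the function name and the sample items are made up, and writing the result back to Riak with the vclock is not shown.

```python
def merge_siblings(siblings):
    """Merge sibling values (item -> count), keeping the larger count."""
    merged = {}
    for sibling in siblings:
        for item, count in sibling.items():
            merged[item] = max(count, merged.get(item, 0))
    return merged


first = {"bottles": 10, "apples": 2}
second = {"bottles": 20, "pizza": 1}
print(merge_siblings([first, second]))
# {'bottles': 20, 'apples': 2, 'pizza': 1}
```

Taking the maximum is only one possible policy. Whether it is the right one depends on the application, which is exactly why Riak leaves the decision to the client.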

I will share more on this article soon.

About SCM Source Code Management System

GIT The Source Code Management System(SCM)

Many people use a version control system but have no idea why they are using it. Since the team uses it, they use it too, just to get the work done.

Why do we need a version control system? Here are a few of the requirements from which the idea of SCM comes.

Consider a project directory without a version control system. (Figure: project directory with no SCM)

Whenever you want a snapshot of your source code at a particular moment, you have to copy your source code directory and name it after that moment. Likewise, you may do it with tarballs to save disk space. It’s horrible, isn’t it? You will end up with endless tarballs or directories for the project.

Just think: another developer asks about the code you released on some day x. You have to send him the whole project directory if every file has changed and he does not have those changes. Or say you make some performance fixes that take 70% refactoring, and it’s an experiment. You should back up your project directory before you start on these fixes. You also need a directory/tarball once you have completed your changes, because you may not be able to undo 70% of your changes if you want the last stable code you had before the refactoring. If these changes are not stable and will take a long time to stabilize, you will replace your production code with the directory that you copied before the performance fixes.

If we go on like this, a single project will fill your whole hard disk. Even then, you will not dare to look for a particular directory (if you think of a directory as a snapshot or save point).

Now come to team collaboration. Without SCM, there is no good way to keep the team up to date with code changes over time. Sometimes your team will have to wait for fixes being implemented by others before they can get the source code. In this case one person can work faster than 10 people, because everybody else ends up doing nothing other than giving and taking changed files, looking at differences, asking others about changes and finally preparing a single file by combining the working changes.

That is the whole story when we don’t have a source code management system. Now imagine doing all your development without SCM.

That’s why version control systems come into the picture.

Version control, also known as revision control or source control management, is the management of changes to files, programs and other information. A version control system allows us to track incremental changes in files or content. It also lets many developers work on a single file or project concurrently.

The project directory with SCM (GIT). (Figure: project with SCM)

There are a lot of SCM tools out there, both open source and commercial. Here are a few:

  • CVS
  • SVN
  • BitKeeper
  • GIT
  • Mercurial

There are many more tools than the ones I have mentioned. You can find them in the list of version control systems.

We can classify version control systems by their model: centralized or distributed. You may also call the centralized model the client-server model. Although there are many version control systems, no single one meets every requirement.

Here are a few characteristics that vary from one version control system to another.

  • User interface
  • Performance
  • Memory management
  • Learning curve
  • Maintenance

Apart from these, there are two other things to consider: open source vs. proprietary, and centralized vs. distributed.

Let’s have a look at GIT. GIT is a distributed revision control system designed especially for speed. It was designed by Linus Torvalds in early 2005 to manage the Linux kernel source code, as a replacement for BitKeeper. The Linux kernel was managed with BitKeeper before GIT was invented. Its initial release was on 7 April 2005. GIT is really helpful for open source projects, since it handles merges much better than many other SCMs.

Here is what the GIT distributed (decentralized) model looks like. (Figure: GIT decentralized model)

And the centralized model looks like this. (Figure: GIT centralized model)

In the centralized model, the whole history resides only in the central repository. We need to be connected via the network to commit our changes, unlike with git (distributed). It requires additional maintenance, and source code backups are needed to recover the central repository from a hardware failure or the like.

First time with GIT:

Every git repository is nothing but a directory, either on a server or locally on your machine.

Creating a GIT repository is very simple: go to the directory that you would like to make your git repository. To make sure you are in that directory, verify it with the command pwd.

figure 1

Here I would like to make my PROJECT directory a GIT repository. It’s simple, with the following command.

figure 2

One more thing: if your directory is a git repository, it will have a .git directory inside.

figure 3

GIT uses this directory to track changes and everything else.

We save our changes in commits. You may think of each commit as one save point. You can go back to that point in history whenever you want. You can tag commits with your version numbers, like v1.0, v2.0, v3.2, and so on. You may call a commit a revision or a version as well.

GIT uses a unique SHA-1 hash to identify each commit, so each revision can be described by a 40-character hexadecimal string. Instead of mentioning this long commit hash in releases and executables, git users tag that specific commit with a version number. That way we can identify and get the specific source for that version from the incremental git source tree.
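To see where the 40 characters come from, here is a small Python sketch of how git derives an object id: it hashes a short header (the object type and the content length) followed by the content itself. The sketch uses a blob (file content) for simplicity, since a commit object has more fields, but commit ids are computed the same way with the type "commit". The function name is made up for illustration.

```python
import hashlib


def git_object_id(object_type, content):
    """Compute the SHA-1 id git assigns to an object with this content."""
    header = ("%s %d" % (object_type, len(content))).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


blob_id = git_object_id("blob", b"")
print(blob_id)       # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(len(blob_id))  # 40
```

The printed value is the id git reports for an empty file, and any change to the content produces a completely different id.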

For each commit we give a descriptive message, which we call the commit message. Here is what a commit looks like; it’s from the git source.

figure 4

Here we see four things, commit, author, date and commit message. In each commit git stores the author name and his email. So we need to configure git to take our name and email and other settings if required to have them in commits we do. There are global and repository wide settings/options.

Here is how you can configure settings

figure 5

If you use --global, you will have global settings configured. Those take effect in every repository. Repository settings take precedence over global ones. Leave --global off the command to configure repository-wide options. Make sure you are inside the repository when you configure repository-wide options. You don’t need to be in a repository to set global options.
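Since the screenshot is missing, here is a hedged sketch of what configuring the name and email might look like. The name and email values are placeholders. The global variant is shown only as comments so that running the sketch does not change your own global settings.

```shell
# Throwaway repository to hold repository-wide settings.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Repository-wide settings: run inside the repository, without --global.
git config user.name "Example User"
git config user.email "user@example.com"

# Global settings use the same command with --global, from any directory:
#   git config --global user.name "Example User"
#   git config --global user.email "user@example.com"

# Read back the value Git will use for commits in this repository.
git config user.name

# List only the settings stored in this repository.
git config --local --list
```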

To view git configured settings, try the command

$ git config --list

figure 6

If you add the command line option --global, it will show all configured global-level options.

$ git config --list --global

There are many more options to customize Git’s behavior, but we use only a few of them often.

GIT First Commit:

There are three states in the Git commit procedure. Your file resides in one of the following states:

  • Modified
  • Staged
  • Committed

Modified means you have changed or added a file and have not yet stored it in the Git database. Staged means you have marked changes in the current version to go into the next snapshot, that is, the next commit. Committed means you have saved your modifications into the database. The middle state is optional and exists to avoid accidental commits. We can skip it if required, but that’s not recommended.

Adding files and modifying committed files both come under the modified state. If you don’t add your new files, they will be treated as untracked files.

To add or stage files use the following command.

git add is a multi-purpose command: we use it both to start tracking new files and to stage changes to files that are already tracked.

Now add the first file to our repository.
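A small sketch of both uses of git add, run in a throwaway repository. The file name notes.txt and the identity values are placeholders. The comments show the two-letter codes that git status --short prints at each step.

```shell
# Throwaway repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# A brand-new file starts out untracked.
echo "hello" > notes.txt
git status --short        # ?? notes.txt

# First use of git add: start tracking the file (it is staged as new).
git add notes.txt
git status --short        # A  notes.txt
git commit --quiet -m "Add notes"

# Changing a tracked file puts it in the modified state.
echo "more" >> notes.txt
git status --short        #  M notes.txt

# Second use of git add: stage the modification for the next commit.
git add notes.txt
git status --short        # M  notes.txt
```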

Create a new file

Git won’t track the changes to newly added files unless we tell it to, that is, unless we track the file. Newly added files are untracked; Git won’t show modifications to those files.

The command git status shows us the status inside the repository. It gives an idea about three things:

  • Untracked files
  • Modified files (modified files are tracked files)
  • Staged files

Here is a screenshot where the new file is shown as untracked.

Current Status

Now add this file to the Git database.

Add Modified File(s)

The status after we added that file:

In the above screenshot, the command git add made the given file staged. The files listed below the section "Changes to be committed" are staged files. This is the second state, as we discussed. Now we have to commit the staged file. The git commit command will do that. It takes the argument -m along with the commit message. If we don’t give a commit message with -m, an editor will open for us to enter the commit message.
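As an illustration of the commit step, here is a minimal sketch in a throwaway repository. The file name, commit message and identity are placeholders.

```shell
# Throwaway repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# Create a file and stage it.
echo "hello" > notes.txt
git add notes.txt

# -m supplies the commit message on the command line.
git commit --quiet -m "Add notes file"

# Without -m, Git opens the configured editor for the message:
#   git commit

# After the commit there is nothing left to stage, so this prints nothing.
git status --short
```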

Finally, to see our history or previous commits we have the command
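The command itself is missing from this copy of the article; the standard Git command for viewing history is git log. A short sketch, again in a throwaway repository with placeholder file names and messages:

```shell
# Throwaway repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# Make two commits so the history has something to show.
echo "one" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes file"
echo "two" >> notes.txt
git add notes.txt
git commit --quiet -m "Extend notes file"

# Full history: commit hash, author, date and message, newest first.
git log

# Compact history: abbreviated hash and first line of the message.
git log --oneline
```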

10 MySQL best practices

When we design a database schema, it’s recommended to follow best practices to use memory optimally and to gain performance. Following are 10 MySQL best practices.

Always try to avoid redundancy

We can say a database schema is well designed if it has no redundancy. If you want to avoid redundancy in your schema, normalize it after you design it.

Normalize tables

Database normalization is the process of organizing the columns and tables of a relational database to avoid redundancy. Find more about normalization here

Use (unique) indexes on foreign key columns

We use foreign keys for data integrity and to represent relations. Sometimes they are the result of the process called normalization. When tables are related to each other, we can’t query the data without using joins, and an index on the foreign key column lets MySQL find the matching rows for those joins without scanning the whole table.

Avoid using VARCHAR for fixed-width columns; use CHAR instead

Choose the right one: CHAR vs VARCHAR. CHAR(15) always allocates space for 15 characters, while VARCHAR(15) allocates only the space required by the characters you store, plus a small length prefix.

Always use EXPLAIN to investigate your queries and learn how MySQL is using indexes

The EXPLAIN statement is very handy in MySQL. I’m sure it will make your head spin. It gives you an analysis report that you can use to improve your queries and schema. It works on both select and update. If you try it on an update query, it will treat that query as a select and give you the report.
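As a rough sketch of how EXPLAIN might be used, the snippet below writes a small SQL file that you could feed to the mysql client. The tables customers and orders, their columns, and the user and database names in the comment are all hypothetical; substitute your own.

```shell
# Write the statement to a file in a temporary working directory.
workdir="$(mktemp -d)"
cat > "$workdir/explain_orders.sql" <<'SQL'
-- Hypothetical tables: customers(id, name) and orders(id, customer_id).
EXPLAIN
SELECT o.id, c.name
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE c.name = 'Alice';
SQL

# Run it with the mysql client against your own database, for example:
#   mysql -u your_user -p your_database < "$workdir/explain_orders.sql"
cat "$workdir/explain_orders.sql"
```

In the report, the possible_keys and key columns show which indexes MySQL considered and which one it chose, and the rows column gives its estimate of how many rows it has to examine.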

Use the right data type

Choosing the right data type for your columns will help you get rid of many bottlenecks. The MySQL query optimizer chooses indexes based on the data type you use in the query and the data type of the column. MySQL offers many data types.

Use ENUM if required

ENUM is one of the data types that MySQL supports. By using it you can save a lot of space if a column holds a predefined and predictable set of values.
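To tie the column type advice together, here is an illustrative schema sketch written to a SQL file. The table and column names, the list of status values, and the database name in the comment are assumptions made up for this example.

```shell
# Write a sample schema to a file in a temporary working directory.
workdir="$(mktemp -d)"
cat > "$workdir/schema.sql" <<'SQL'
-- Hypothetical tables, for illustration only.
CREATE TABLE customers (
  id           INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  country_code CHAR(2)      NOT NULL,  -- fixed width: always two characters
  name         VARCHAR(100) NOT NULL,  -- variable width: length differs per row
  status       ENUM('active', 'suspended', 'closed') NOT NULL DEFAULT 'active'
) ENGINE=InnoDB;

CREATE TABLE orders (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  INDEX idx_orders_customer (customer_id),  -- index on the foreign key column
  CONSTRAINT fk_orders_customer
    FOREIGN KEY (customer_id) REFERENCES customers (id)
) ENGINE=InnoDB;
SQL

# Load it with the mysql client into a database of your choice, for example:
#   mysql -u your_user -p your_database < "$workdir/schema.sql"
cat "$workdir/schema.sql"
```

The status column can hold only one of the three listed values, which is the kind of predefined set ENUM is meant for.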

Don’t use too many indexes; they slow down inserts and updates. Use indexes only on selected columns

As you know, indexes help you query data much faster. It’s very tempting to use indexes on columns that don’t need them. Putting an index on every column, or on unnecessary columns, will get you slow inserts and updates. Think of an index as a separate table: MySQL needs to add an index entry in that separate table/file for every insert. It’s extra overhead.

Tune MySQL default parameters

MySQL comes with default parameters. These defaults are not suitable if you want to use MySQL on a dedicated machine or in production, so you have to tune them. Formally, we call them system variables.

Always create an account with associated hosts instead of wildcard %

MySQL manages users together with their associated hosts. For example, the user root@localhost can log in to MySQL only from localhost, but root@% can log in from anywhere. Using only associated hosts will mitigate many attacks that are in your blind spot.
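A hedged sketch of creating an account tied to one host rather than the % wildcard. The account name app_user, the address 192.0.2.10 (a documentation address), the password and the database name shop are all placeholders.

```shell
# Write the account statements to a file in a temporary working directory.
workdir="$(mktemp -d)"
cat > "$workdir/create_user.sql" <<'SQL'
-- Placeholder account: it may connect only from the host 192.0.2.10.
CREATE USER 'app_user'@'192.0.2.10' IDENTIFIED BY 'replace-with-a-strong-password';

-- Grant only the privileges the application needs on a hypothetical database.
GRANT SELECT, INSERT, UPDATE ON shop.* TO 'app_user'@'192.0.2.10';
SQL

# Apply it as an administrative user with the mysql client, for example:
#   mysql -u root -p < "$workdir/create_user.sql"
cat "$workdir/create_user.sql"
```

A connection attempt as app_user from any other host is rejected, because no account matches that user and host combination.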