
How to edit file with sed

This tutorial shows how to replace a pattern with the sed command and save the result back into the same file you are reading from.

What is sed?
Sed stands for stream editor, as per the Linux man page.
But sed is more than that. With sed we can write powerful one-line commands in bash scripts. It can fetch specific lines or ranges of lines, replace text that matches a pattern, and much more. Basically, it manipulates a stream of characters with whatever logic you provide.

Let's get started.
We learn best with a task in hand, so our task for this tutorial is to replace some characters in a file with others.
Suppose we have a file with the content below:

  304  PULSEaudio -k
  306  PULSEaudio --cleanup-shm 
  310  PULSEaudio --check 
  311  PULSEaudio --start 
  312  PULSEaudio --kill 
  322  PULSEaudio -k
  323  PULSEaudio --check 
  324  killall PULSEaudio
  325  PULSEaudio --check 
  331  ls * | grep -e PULSE
  332  cd PULSE/
  340  PULSEaudio -k
  344  PULSEaudio -D
  345  PULSEaudio -d
  346  service PULSEaudio status
  348  ps -eo "user args" | grep PULSE
  350  ps -eo "user args" | grep PULSE
  351  PULSEaudio -k
  352  killall PULSEaudio 

We are supposed to replace the characters PULSE with pulse.
One way is to open the file in Vim and type a command like

 :%s/PULSE/pulse/gc

If you are familiar with Vim, you'll know what I am talking about.
But if you need this output in a bash script, you have to do it with a single non-interactive command.

Here comes our savior sed.

sed s/pattern/replace_char/ <file_name>

This command does the job, but it writes the result to stdout and leaves the file itself unchanged.
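To see that sed only prints the edited text and does not touch the input, here is a small sketch. The temporary file and its two sample lines are made up for illustration.

```shell
# Work on a throwaway file so nothing real is modified.
tmp=$(mktemp)
printf '%s\n' 'PULSEaudio -k' 'killall PULSEaudio' > "$tmp"

# sed prints the edited text to stdout...
edited=$(sed 's/PULSE/pulse/' "$tmp")
echo "$edited"

# ...while the file on disk still holds the original text.
original=$(cat "$tmp")
echo "$original"

rm -f "$tmp"
```

Without a flag, s/// replaces only the first match on each line. Add the g flag (s/PULSE/pulse/g) to replace every occurrence on a line.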

Common error: we usually try to redirect that output to the file we are editing.
If we are editing a file named replace.txt, the command will be
sed s/PULSE/pulse/ replace.txt > replace.txt

But this does not work. The fault is not sed's; it comes from the order in which the shell sets up the file descriptors.

This mistake is common: we want to modify a file with a tool that reads the file and writes its result to stdout, so we redirect stdout to the very file we are reading. The problem is that the shell sets up redirections before it runs the command.
So BEFORE sed starts, standard output has already been redirected, and because we used >, the file has already been truncated. When sed opens the file to read it, the file is empty
(if redirection is new to you, read https://wiki.bash-hackers.org/howto/redirection_tutorial).
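The truncation can be reproduced safely on a throwaway file. The sketch below (file name and content are illustrative) first makes the mistake, then shows one common workaround: write to a second file and move it over the original.

```shell
tmp=$(mktemp)
printf '%s\n' 'PULSEaudio -k' > "$tmp"

# Wrong: the shell truncates "$tmp" before sed gets to read it.
sed 's/PULSE/pulse/' "$tmp" > "$tmp"
size_after_mistake=$(wc -c < "$tmp")
echo "bytes left after the mistake: $size_after_mistake"

# Safer: write to a second file, then move it over the original.
printf '%s\n' 'PULSEaudio -k' > "$tmp"
sed 's/PULSE/pulse/' "$tmp" > "$tmp.new" && mv "$tmp.new" "$tmp"
fixed=$(cat "$tmp")
echo "$fixed"

rm -f "$tmp"
```

The && matters: mv runs only if sed succeeded, so a failed sed never replaces the original with a broken file.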

GNU sed has an option that takes care of this for us: -i (in-place). With -i, sed writes its output to a temporary file and then renames that file over the original, so the file is never truncated while it is still being read.

The final command will be

sed -i 's/PULSE/pulse/' replace.txt
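Because -i overwrites the file, it can be worth keeping a backup. A minimal sketch, assuming GNU sed (where a suffix attached to -i names the backup file) and an illustrative temporary file:

```shell
tmp=$(mktemp)
printf '%s\n' 'PULSEaudio --check' > "$tmp"

# -i.bak edits "$tmp" in place and saves the original as "$tmp.bak".
sed -i.bak 's/PULSE/pulse/' "$tmp"

edited=$(cat "$tmp")
backup=$(cat "$tmp.bak")
echo "edited: $edited"
echo "backup: $backup"

rm -f "$tmp" "$tmp.bak"
```

On BSD and macOS sed, -i always expects a suffix argument, so a plain in-place edit without a backup is written as sed -i '' 's/PULSE/pulse/' file there.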

How to truncate empty lines using sed

The sed command is very helpful for text processing; most of the work is in choosing the right regular expression. Here is how you can remove empty lines from a file using sed.

$ sed   '/^\s*$/d' <file_name>

The above command prints the file's content with the empty lines deleted. If you want to modify the original file itself, use the -i option.

$ sed  -i '/^\s*$/d' <file_name>

Example:

I have a file named numbers with the following data

$ cat numbers
2989239823
8239823922

3892389239
2938923829
923892389

838888888



000000000

Now see how to remove the empty lines using sed

$ sed '/^\s*$/d' numbers 
2989239823
8239823922
3892389239
2938923829
923892389
838888888
000000000

The above command won't change the original file. If you want to change the original, use the -i option

sed -i '/^\s*$/d' numbers 

Explanation:

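Sed interprets '/^\s*$/d' as an address followed by a command. The text between the slashes (/regular expression/) is the regular expression, and the sed command "d" that follows it means delete. Here ^\s*$ matches a line that contains nothing but optional whitespace from its start (^) to its end ($). Sed checks every line against the regular expression; if the line matches, sed applies the command and deletes it, otherwise it prints the line unchanged. Note that \s is a GNU extension.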
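For a sed that does not understand \s, the POSIX character class [[:space:]] expresses the same idea. A minimal sketch with made-up input that contains one empty line and one line of only spaces and a tab:

```shell
# Sample input: a data line, an empty line, a whitespace-only line, a data line.
input=$(printf 'first\n\n \t \nsecond\n')

# [[:space:]] is the POSIX spelling of the whitespace class.
result=$(printf '%s\n' "$input" | sed '/^[[:space:]]*$/d')
echo "$result"
```

Both the empty line and the whitespace-only line are removed, because * also matches zero whitespace characters.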

 

How to shuffle lines in the file in linux

We can shuffle the lines of a file on Linux with any of the following:

  • shuf
  • sed and sort
  • awk
  • python

As an example, we will use a file shuffle_mylines.txt that contains the numbers 1 to 10, one per line.

Create the file with the following command

$ seq 10  > shuffle_mylines.txt

Command shuf

This command is lightweight and straightforward. Just call it with the file name as an argument.

$ shuf shuffle_mylines.txt
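A quick way to convince yourself that shuf only reorders lines is to sort its output again and compare it with the input. The sketch below assumes GNU coreutils shuf and uses a temporary file in place of shuffle_mylines.txt.

```shell
tmp=$(mktemp)
seq 10 > "$tmp"

# Shuffle, then sort numerically again: the result should equal the input,
# which shows shuf neither drops nor repeats a line.
restored=$(shuf "$tmp" | sort -n)
expected=$(cat "$tmp")
[ "$restored" = "$expected" ] && echo "same lines, new order"

# -n limits the output to that many randomly chosen lines.
sample_count=$(shuf -n 3 "$tmp" | wc -l)
echo "sampled lines: $sample_count"

rm -f "$tmp"
```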

Shuffle lines using sed

You may already know the sed (stream editor) command, one of the most widely used text-processing commands on Unix/Linux. sed cannot shuffle lines on its own, but we can do it by combining sed with other commands. Take a look at the following command:

$ cat shuffle_mylines.txt | while read x; do    echo $RANDOM:$x; done | sort -t: -k1 -n  | sed 's/^[0-9]*://'

How does it work?

Breakdown of the above command:

The commands and shell features used in this example are:

  • cat
  • while loop
  • $RANDOM   shell variable
  • sort
  • sed

 

Now let's see how this command works. First, cat reads the file and pipes its content to the shell while loop

while read x; do    echo $RANDOM:$x; done

The while loop reads each line of the piped input into the variable x and prints it as <random_number>:<line>, which is what $RANDOM:$x produces. $RANDOM is a bash shell variable that returns a new pseudo-random integer between 0 and 32767 each time it is read, and that is what lets us shuffle the lines.

Then, we will sort output of above while loop using sort command

sort -t: -k1 -n

The output of this command is the lines in random order, because they are sorted by the random number that $RANDOM put in front of each one.

Output here would look like,

$ cat shuffle_mylines.txt | while read x; do echo  $RANDOM:$x; done | sort -t: -k1 -n
7966:1
9825:6
16019:10
18495:4
22349:5
23058:8
23099:7
26017:3
31133:9
32683:2

To remove the random values in front of each line, we use sed.

sed 's/^[0-9]*://'

That's it. Every run of this command prints the lines in a new random order. To store the result, redirect the output to a new file with > (overwrite) or >> (append).
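The same decorate, sort, undecorate idea can be written as a small script. This is a sketch rather than the original one-liner: it reads the file directly instead of through cat, and it uses IFS= read -r so that leading spaces and backslashes in a line survive. The variable names and the temporary file are illustrative.

```shell
#!/usr/bin/env bash
# $RANDOM is a bash feature; run this with bash rather than plain sh.
tmp=$(mktemp)
seq 10 > "$tmp"

# IFS= and -r keep leading spaces and backslashes in each line intact.
shuffled=$(
    while IFS= read -r line; do
        printf '%s:%s\n' "$RANDOM" "$line"   # decorate: <random>:<line>
    done < "$tmp" |
    sort -t: -k1,1n |                        # order by the random key only
    sed 's/^[0-9]*://'                       # undecorate: drop the <random>: prefix
)
echo "$shuffled"

rm -f "$tmp"
```

Writing the key as -k1,1n restricts the sort key to the first field; lines that happen to get the same random number are still kept, only their relative order is not random.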

 

Shuffle lines using awk

awk is a programming language designed for text processing. We will use it to shuffle lines.

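awk 'BEGIN { srand() }                  # seed the random number generator
{ lines[++d] = $0 }                     # store every line; d counts them
END {
    while (1) {
        if (e == d) { break }           # stop once all d lines are printed
        RANDOM = int(1 + rand() * d)    # pick an index in 1..d
        if (RANDOM in lines) {          # skip indexes already printed
            print lines[RANDOM]
            delete lines[RANDOM]
            ++e
        }
    }
}' shuffle_mylines.txt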
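The loop above keeps drawing random indexes until it happens to hit every line, so the last few lines can take many attempts on a large file. As an alternative, not the method used above, here is a sketch of the Fisher-Yates shuffle in awk, which needs exactly one random draw per line. The temporary file stands in for shuffle_mylines.txt.

```shell
tmp=$(mktemp)
seq 10 > "$tmp"

shuffled=$(awk '
BEGIN { srand() }                 # seed from the current time
{ lines[NR] = $0 }                # remember every input line
END {
    # Fisher-Yates: walk from the last slot down, swapping each slot
    # with a randomly chosen slot at or below it.
    for (i = NR; i > 1; i--) {
        j = int(1 + rand() * i)   # random index in 1..i
        tmpline = lines[i]; lines[i] = lines[j]; lines[j] = tmpline
    }
    for (i = 1; i <= NR; i++) print lines[i]
}' "$tmp")
echo "$shuffled"

rm -f "$tmp"
```

srand() without an argument seeds from the time of day, so two runs within the same second can produce the same order.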

 

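Another example using awk. It is similar to the sed and sort example: awk puts a random number in front of each line, sort orders the lines by that number, and cut removes it.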

cat shuffle_mylines.txt | awk 'BEGIN{srand();}{print rand()"\t"$0}' | sort -k1 -n | cut -f2- > shuffled_linex.txt

Shuffle lines in file using python

Python is a popular scripting language, used today in everything from big projects to small scripts. Let's see how you can shuffle lines using Python.

Python Example 1

$ python3 -c "import random, sys; x = open(sys.argv[1]).readlines(); random.shuffle(x); print(''.join(x), end='')" shuffle_mylines.txt

In this example, we pass the file name as a command-line argument. The script reads the file, shuffles its lines, and prints them to the terminal.

The output can be redirected to a file using a redirect operator (> or >>)

python3 -c "import random, sys; x = open(sys.argv[1]).readlines(); random.shuffle(x); print(''.join(x), end='')" shuffle_mylines.txt > shuffled_lines

Conclusion:

If you are looking for a quick way to shuffle lines, shuf is the best choice; otherwise you can have some fun with the other ways of shuffling the lines of a file.