A filter is a Unix command that reads text, transforms it in some way, and writes the result. Filters typically receive their input and pass on their output through pipes. Some filters can be used on their own, but the real power to shape a stream of data into the desired output comes from combining pipes and filters. Below are some of the more useful advanced Unix commands used in shell scripting.
1. HEAD Command
The head command displays the starting portion of a file. By default, it displays the first 10 lines.
$ head /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
news:x:9:13:news:/etc/news:
We can override the default behaviour of the head command as follows:
$ head -5 /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
Note: Instead of 5 we can give any number, whether it is less than or greater than 10. The form head -n 5 is equivalent and is the more portable spelling.
2. TAIL Command
The tail command is the counterpart of the head command on Unix-like systems: it displays the ending portion of a file. By default it also displays 10 lines, and we can override this behaviour in the same way:
$ tail -3 /etc/passwd
raj:x:7276:7276::/home/raj:/bin/bash
ram:x:7277:7277::/home/ram:/bin/bash
suhas:x:7278:7278::/home/suhas:/bin/bash
Suppose we want to retrieve the line at a particular position in a file. A combination of the head and tail commands can be used as follows:
$ cat /etc/passwd | head -50 | tail -1
mahesh:x:523:501::/home/mahesh:/bin/bash
Here head keeps the first 50 lines and tail keeps the last of those, so the 50th line of the /etc/passwd file is displayed.
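The head and tail combination above can be wrapped in a small helper. The following is only a minimal sketch: the function name nth_line and the five-line sample file are assumptions made for illustration and are not part of the original example.

```shell
#!/bin/sh
# nth_line N FILE: print line N of FILE (hypothetical helper for illustration).
nth_line() {
    # head keeps the first N lines; tail then keeps the last of those.
    head -n "$1" "$2" | tail -n 1
}

# Build a throwaway sample file with five numbered lines.
sample=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 > "$sample"

third=$(nth_line 3 "$sample")
echo "$third"   # line 3

rm -f "$sample"
```

One caveat of this approach: if the file has fewer than N lines, head returns the whole file and the last line is printed instead of nothing.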
3. WC Command
In Unix, to get the line, word, or character count of a document, use the wc command. At the Unix shell prompt, enter:
wc filename
Replace filename with the file or files for which we want information. For each file, wc outputs three numbers: the first is the line count, the second the word count, and the third the byte count (which equals the character count for single-byte text).
$ wc login
38 135 847 login
To narrow the focus of the query, we may use one or more of the following wc options:
Option Entities counted
- -c : bytes
- -l : lines
- -m : characters
- -w : words
Note: In some versions of wc, the -m option is not available, or -c reports characters. For text in a single-byte encoding such as ASCII, the values for -c and -m are equal; they differ only when the file contains multibyte characters.
The -c option reports the size of a file. The following command counts the bytes (and so, for single-byte text, the characters) in the file abc.txt:
$ wc -c abc.txt
Similarly, to find out how many bytes are in the .login file, we could enter:
$ wc -c .login
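We may also pipe standard output into wc to determine the size of a stream. For example, to find out how many entries are in the current directory, enter:
$ /bin/ls | wc -l
When its output goes to a pipe, ls prints one name per line, so the line count equals the number of entries. With ls -l the result would be one too high, because the long listing usually starts with a "total" line.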
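In a script it is often handy to capture the individual counts in variables. The sketch below assumes a small piece of sample text and hypothetical variable names; it shows that wc prints only the number, with no file name, when it reads from a pipe.

```shell
#!/bin/sh
# Sample text is an assumption made for illustration: 3 lines, 6 words.
text='one two
three four five
six'

# Reading from a pipe means wc prints only the number, with no file name.
# The arithmetic expansion $(( ... )) strips the padding some wc versions add.
lines=$(( $(printf '%s\n' "$text" | wc -l) ))
words=$(( $(printf '%s\n' "$text" | wc -w) ))
bytes=$(( $(printf '%s\n' "$text" | wc -c) ))

echo "lines=$lines words=$words bytes=$bytes"   # lines=3 words=6 bytes=28
```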
4. SORT Command
sort is a standard Unix command-line program that prints the lines of its input, or the concatenation of all files listed in its argument list, in sorted order. The -r flag reverses the sort order.
1. By default, the sort command sorts in ascending order. Example:
$ cat phonebook
Smith,Brett 5554321
Doe,John 5551234
Doe,Jane 5553214
Avery,Cory 5554321
Fogarty,Suzie 5552314
$ cat phonebook | sort
Avery,Cory 5554321
Doe,Jane 5553214
Doe,John 5551234
Fogarty,Suzie 5552314
Smith,Brett 5554321
2. The -n option makes the program sort according to numerical value:
$ du /bin/* | sort -n
4 /bin/domainname
4 /bin/echo
4 /bin/hostname
4 /bin/pwd
...
24 /bin/ls
30 /bin/ps
44 /bin/ed
54 /bin/rmail
80 /bin/pax
102 /bin/sh
304 /bin/csh
3. With -n alone, sort compares lines from their beginning, so if the first column of the file does not contain numerical data the lines are not ordered by number. In that case we have to give the position of the numeric column with the -k option:
$ cat student.txt
harsh 10
mahesh 5
uday 55
$ sort -nk2 student.txt
mahesh 5
harsh 10
uday 55
By default, the -k option treats blanks (spaces or tabs) as the column separator. If the columns are separated by a different delimiter, specify it with the -t option:
$ cat student.txt
harsh:10
mahesh:5
uday:55
$ sort -t ":" -nk2 student.txt
mahesh:5
harsh:10
uday:55
4. The -r option just reverses the order of the sort. Combined here with -n and -k2, it lists the entries by descending zip code:
$ cat zipcode | sort -nrk2
Joe 56789
Sam 45678
Bob 34567
Wendy 23456
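The options described above can be combined freely. The following sketch reuses the colon-separated student data from the earlier example (written to a temporary file, which is an assumption for illustration) to compare numeric, reversed, and plain text ordering on the second field.

```shell
#!/bin/sh
# Sample data modelled on the student.txt example above.
data=$(mktemp)
printf '%s\n' 'harsh:10' 'mahesh:5' 'uday:55' > "$data"

# -t ':' sets the field separator, -k2 selects the second field,
# -n compares numerically, and -r reverses the result.
ascending=$(sort -t ':' -k2 -n "$data")
descending=$(sort -t ':' -k2 -n -r "$data")

# Without -n the second field is compared as text, so "10" sorts before "5".
as_text=$(sort -t ':' -k2 "$data")

printf 'numeric ascending:\n%s\n' "$ascending"
printf 'numeric descending:\n%s\n' "$descending"
printf 'text order:\n%s\n' "$as_text"

rm -f "$data"
```

The text-order result shows why -n matters: compared character by character, "10" comes before "5".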
5. CUT Command
cut is a Unix command that is typically used to extract a certain range of characters, or certain fields, from each line of its input, usually a file. Syntax:
cut [-c list] [-f list] [-d delim] [file]
The examples below use the following file:
$ cat company.data
406378:Sales:Itorre:Jan
031762:Marketing:Nasium:Jim
636496:Research:Ancholie:Mel
396082:Sales:Jucacion:Ed
Flags which may be used include:
-c :- Characters. The list following -c specifies the character positions to be returned, either as a range (such as 1-6) or as a comma-separated list (such as 4,8). Examples:
1. If you want to print just columns 1 to 6 of each line (the employee serial numbers), use the -c1-6 flag, as in this command:
$ cut -c1-6 company.data
406378
031762
636496
396082
2. If you want to print just columns 4 and 8 of each line (the fourth digit of the serial number and the first letter of the department), use the -c4,8 flag, as in this command:
$ cut -c4,8 company.data
3S
7M
4R
0S
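The difference between a range and a list is easy to get wrong, so here is a short sketch that applies both forms to a single record in the company.data format. The variable names are hypothetical.

```shell
#!/bin/sh
# One record in the format of company.data.
record='406378:Sales:Itorre:Jan'

# A hyphen selects a range: characters 1 through 6.
range=$(printf '%s\n' "$record" | cut -c1-6)

# A comma selects individual positions: characters 4 and 8 only.
list=$(printf '%s\n' "$record" | cut -c4,8)

# The two forms can be mixed: characters 1 to 3 plus character 8.
mixed=$(printf '%s\n' "$record" | cut -c1-3,8)

echo "$range"   # 406378
echo "$list"    # 3S
echo "$mixed"   # 406S
```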
-f :- Fields. Specifies a list of fields, separated by a delimiter, to be returned. The list is a comma-separated or blank-separated list of field numbers in increasing order. A hyphen may be used as shorthand to include a range of fields.
-d :- Delimiter. The character immediately following the -d option is the field delimiter for use in conjunction with the -f option; the default delimiter is tab. Space and other characters with special meanings within the context of the shell in use must be quoted or escaped.
If you want to access a single field:
$ cut -d ":" -f3 company.data
Itorre
Nasium
Ancholie
Jucacion
If you want to access multiple fields:
$ cut -d ":" -f1,3 company.data
406378:Itorre
031762:Nasium
636496:Ancholie
396082:Jucacion
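Field selection also accepts ranges, and cut combines naturally with the other filters covered earlier. The sketch below recreates company.data in a temporary file (an assumption for illustration) and shows a field range and a cut-then-sort pipeline.

```shell
#!/bin/sh
# Recreate the company.data sample in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
406378:Sales:Itorre:Jan
031762:Marketing:Nasium:Jim
636496:Research:Ancholie:Mel
396082:Sales:Jucacion:Ed
EOF

# -f2-3 selects fields 2 through 3; the delimiter is kept between them.
dept_and_name=$(cut -d ':' -f2-3 "$data")

# Extract the fourth field (first names) and pass it through sort.
first_names=$(cut -d ':' -f4 "$data" | sort)

printf '%s\n' "$dept_and_name"
printf '%s\n' "$first_names"

rm -f "$data"
```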
6. GREP Command
grep, one of the most frequently used text-processing tools, stands for "Global Regular Expression Print". The grep command searches the given files for lines containing a match to one or more regular expressions and, by default, prints only the matching lines.
1. If you want to count how often a particular word shows up in a log file, use the grep -c option. It counts matching lines rather than individual occurrences, so a line that contains the word twice is counted once. The command below prints how many lines of logfile.txt contain the word "Error":
$ grep -c "Error" logfile.txt
2. Sometimes we are interested not just in the matching line but also in the lines around it, which is particularly useful for seeing what happened before an Error or Exception. The grep --context option prints lines around each match. The command below prints six lines of context before and after each line containing the word "successful" in logfile.txt:
$ grep --context=6 successful logfile.txt
The additional lines are very useful for seeing what surrounds a match and for printing a whole message when it is split across multiple lines. You can also use the short option -C instead of --context, followed by the number of lines. For example:
$ grep -C 2 'hello' logfile
prints two lines of context around each matching line.
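To make the effect of context options concrete, the sketch below runs them against a five-line sample log (an assumption for illustration). It also uses -B and -A, which are not mentioned in the text above: in GNU and BSD grep they are the one-sided counterparts of -C, printing lines only before or only after a match.

```shell
#!/bin/sh
# Throwaway sample log; the contents are illustrative.
log=$(mktemp)
printf '%s\n' 'step 1' 'step 2' 'Error: disk full' 'step 3' 'step 4' > "$log"

# -B 1 prints one line before each match, -A 1 one line after,
# and -C 1 one line on each side.
before=$(grep -B 1 'Error' "$log")
after=$(grep -A 1 'Error' "$log")
around=$(grep -C 1 'Error' "$log")

printf 'before:\n%s\n' "$before"
printf 'after:\n%s\n' "$after"
printf 'around:\n%s\n' "$around"

rm -f "$log"
```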
3. If you want to do a case-insensitive search, use the -i option. grep -i finds occurrences of Error, error and ERROR alike, which is quite useful for displaying any sort of error from a log file.
$ grep -i Error logfile
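4. Use grep -w if you want to match a whole word instead of just a pattern. (The -o option is different: it prints only the matched part of each line.)
$ grep -w ERROR logfile
The above command searches only for instances of 'ERROR' that are entire words; it does not match ERROR where it appears as part of a longer word.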
5. Another useful option is grep -l, which displays only the names of the files that contain a match for the given pattern. The first command below prints the name logfile if that file contains ERROR:
$ grep -l ERROR logfile
$ grep -l 'main' *.java
The second command lists the names of all Java files in the current directory whose contents mention 'main'.
6. If you want to see the line numbers of matching lines, use the grep -n option. The command below shows on which lines ERROR appears:
$ grep -n ERROR logfile
7. grep can show the matching pattern in color, which is quite useful for highlighting the matching section. To see the matching pattern in color, use the command below:
$ grep --color Exception today.log
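Several of the options above can be combined in a single call. The following sketch uses a small sample log, whose contents and variable names are assumptions for illustration, to show how -c, -i, -w and -n change what is matched and reported.

```shell
#!/bin/sh
# Throwaway sample log; the contents are illustrative.
log=$(mktemp)
cat > "$log" <<'EOF'
INFO start
ERROR disk full
error retrying
INFO SysERROR ignored
ERROR timeout
EOF

# -c counts matching lines; the plain pattern also matches inside SysERROR.
count_exact=$(grep -c 'ERROR' "$log")

# -i ignores case, so the lowercase "error" line is counted as well.
count_any_case=$(grep -ci 'error' "$log")

# -w requires a whole-word match, which excludes SysERROR.
count_word=$(grep -cw 'ERROR' "$log")

# -n prefixes each matching line with its line number.
numbered=$(grep -nw 'ERROR' "$log")

echo "exact=$count_exact any_case=$count_any_case word=$count_word"
printf '%s\n' "$numbered"

rm -f "$log"
```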