
Regular expressions in Unix

 Regular Expressions:



The term regular expression comes from theoretical computer science. In its simplest form, a regular expression is defined as a language for specifying patterns that match a sequence of characters. Unix evaluates text against the pattern to determine whether the text and the pattern match. Some of the most powerful Unix utilities, such as grep and sed, use regular expressions.
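
As a minimal sketch of how text is evaluated against a pattern, the following uses grep and sed on a made-up line of text. The sample text and the variable names are assumptions chosen for illustration only.

```shell
# Hypothetical sample text; any line of text would do.
line='Murthy lives in India'

# grep -q prints nothing and reports the result through its exit status:
# 0 if the pattern matches the text, 1 if it does not.
if printf '%s\n' "$line" | grep -q 'India'; then
    result='match'
else
    result='no match'
fi
echo "$result"

# sed uses the same kind of pattern, here to replace the matched text.
replaced=$(printf '%s\n' "$line" | sed 's/India/Asia/')
echo "$replaced"
```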

In Unix, regular expressions are constructed using alphanumeric characters along with certain metacharacters such as ^ (caret), $ (dollar), . (dot) and * (asterisk).

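Metacharacters and their meaning: Special characters, or metacharacters, have a special meaning rather than standing for themselves. The shell has metacharacters of its own, which can be used as wildcards to specify the name of a file without having to type out the file’s full name. To keep the shell from interpreting the metacharacters of a regular expression, the search pattern is normally enclosed in single quotes.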

(^) The Caret or Circumflex Character: This metacharacter is used to search for and extract lines or records that begin with a specific pattern. For example, if all the lines or records that begin with the word Murthy are to be searched and extracted, then the search pattern will be ‘^Murthy’.
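
The caret anchor can be tried out with grep. This is a small sketch; the records below are invented for the example.

```shell
# Hypothetical records, one per line.
records='Murthy Rao
Anil Murthy
Murthy Kumar'

# ^ anchors the pattern to the start of the line, so only lines
# that begin with Murthy are selected.
begins=$(printf '%s\n' "$records" | grep '^Murthy')
echo "$begins"
# prints:
# Murthy Rao
# Murthy Kumar
```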

($) The Dollar Character: This metacharacter is used to search for and extract lines or records that end with a specific pattern. For example, if all the lines or records that end with the word Murthy are to be searched and extracted, then the search pattern will be ‘Murthy$’.
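
A matching sketch for the dollar anchor, again with invented records:

```shell
# Hypothetical records, one per line.
records='Murthy Rao
Anil Murthy
Murthy Kumar'

# $ anchors the pattern to the end of the line, so only lines
# that end with Murthy are selected.
ends=$(printf '%s\n' "$records" | grep 'Murthy$')
echo "$ends"
# prints: Anil Murthy
```

Note that a trailing space after Murthy in the data would prevent the match, since the anchor refers to the very end of the line.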

(.) The Dot Character: The dot is used to match any single character except a newline character. For example, if the user is interested in extracting all lines or records having the name spelled either as Murthy or Murthi, the search pattern will be ‘Murth.’ (the same pattern also matches any other character in that position, such as Murthu).
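
The behaviour of the dot can be checked with a short sketch. The list of names is an assumption made up for the example; it includes one name with nothing after Murth and one with an unexpected final letter.

```shell
# Hypothetical names, one per line.
names='Murthy
Murthi
Murth
Murthu'

# The dot must match exactly one character, so Murth alone is skipped,
# while Murthu is selected along with Murthy and Murthi.
dot_matches=$(printf '%s\n' "$names" | grep 'Murth.')
echo "$dot_matches"
# prints:
# Murthy
# Murthi
# Murthu
```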

(*) The Asterisk Character: The asterisk is used to match repeated characters. This metacharacter stands for zero or more occurrences of the preceding character. For example, to search for all the lines that contain a run of one or more of the letter M, the search pattern will be ‘MM*’. The pattern ‘M*’ on its own also matches zero occurrences of M, so it matches every line.
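
Because the asterisk allows zero occurrences, a pattern such as ‘M*’ matches even lines that contain no M at all. The sketch below, using invented sample lines, compares it with ‘MM*’, which requires at least one M.

```shell
# Hypothetical lines; the second one contains no capital M.
lines='Madhu
anil
MMM'

# M* can match zero occurrences, so every line counts as a match.
all_lines=$(printf '%s\n' "$lines" | grep -c 'M*')
echo "$all_lines"   # prints: 3

# MM* means one M followed by zero or more further Ms.
with_m=$(printf '%s\n' "$lines" | grep -c 'MM*')
echo "$with_m"      # prints: 2
```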

Character Class: There are situations when it is necessary to match a character from within a set of characters. In Unix, a set of characters out of which only one character is matched is referred to as a character class. This set of characters is presented within a pair of square brackets. For example, ‘Murth[yi]’ matches either Murthy or Murthi.
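
A character class can be sketched with grep as follows. The names are invented for the example.

```shell
# Hypothetical names, one per line.
names='Murthy
Murthi
Murthu'

# [yi] matches exactly one character, which must be either y or i,
# so Murthu is not selected.
class_matches=$(printf '%s\n' "$names" | grep 'Murth[yi]')
echo "$class_matches"
# prints:
# Murthy
# Murthi
```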

Searching for patterns having Metacharacters: Sometimes it is necessary to search for and extract lines containing metacharacters. This can be done by de-specializing the metacharacter that appears in the search pattern. The metacharacter \ (backslash) is used to remove the special meaning associated with the character that immediately follows it.
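
The effect of the backslash can be seen by searching for a literal dot and a literal asterisk. This is a sketch with made-up sample text.

```shell
# Hypothetical sample text, one item per line.
text='file.txt
fileXtxt
a*b'

# Without the backslash, the dot matches any character,
# so both file.txt and fileXtxt are counted.
unescaped=$(printf '%s\n' "$text" | grep -c 'file.txt')
echo "$unescaped"      # prints: 2

# With the backslash, the dot matches only a literal dot.
literal_dot=$(printf '%s\n' "$text" | grep 'file\.txt')
echo "$literal_dot"    # prints: file.txt

# The same applies to the asterisk.
literal_star=$(printf '%s\n' "$text" | grep 'a\*b')
echo "$literal_star"   # prints: a*b
```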

Searching for words that Begin or End with a specific pattern:

All the lines or records having words such as Indonesia, India, Ink and others that begin with the pattern In, anywhere in the line, are searched and extracted by using the regular expression ‘\<In’.
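
The start-of-word anchor is written as a backslash followed by <. The sketch below assumes a grep that supports this anchor, such as GNU grep; the sample lines are invented.

```shell
# Hypothetical lines; McIntosh contains In in the middle of a word.
lines='Visit Indonesia soon
Red Ink spilled
McIntosh apples'

# A plain In matches anywhere, including inside McIntosh.
anywhere=$(printf '%s\n' "$lines" | grep -c 'In')
echo "$anywhere"     # prints: 3

# \< restricts the match to the start of a word.
word_start=$(printf '%s\n' "$lines" | grep '\<In')
echo "$word_start"
# prints:
# Visit Indonesia soon
# Red Ink spilled
```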


All the lines or records having words such as Asia, India, Bolivia and others that end with the pattern ia, anywhere in the line or record, are searched and extracted by using the regular expression ‘ia\>’.
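
The end-of-word anchor can be sketched in the same way, again assuming a grep that supports it, such as GNU grep, and using invented sample lines.

```shell
# Hypothetical lines; Asian contains ia in the middle of a word.
lines='Asia is large
Bolivia and India
The Asian market'

# A plain ia matches anywhere, including inside Asian.
anywhere=$(printf '%s\n' "$lines" | grep -c 'ia')
echo "$anywhere"    # prints: 3

# \> restricts the match to the end of a word.
word_end=$(printf '%s\n' "$lines" | grep 'ia\>')
echo "$word_end"
# prints:
# Asia is large
# Bolivia and India
```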

