
Regular Expression Operations Examples



Regular Expression Search - Match Operators
Operator Description
*
Matches zero or more expressions enclosed in ( ) or [ ]. * may be used by itself, although it is intended to be used around strings. If the * operator is entered alone it will match all characters from the start of the line to the end of the line. You can match characters between two or more strings up to the maximum regular expression size by specifying a range after the * operator. Entering several expressions in a row containing * should be done carefully to avoid overlapping matches which may produce unpredictable results.
       *(is) will match zero or more strings such as: is, crisis

Windows *[0-9] will match Windows 95
 
This operator can also be used to match all characters between two strings, e.g.,

Win*95 will match Windows 1995, Win 95, Windows 95

*(is) will match zero or more strings such as: is, isis
 
Note: Using the * operator at the beginning of the line will match all characters from the start of the line and at the end, to the end of the line. You can match characters between two or more strings up to 32767 characters (32K) apart by specifying a range after the * operator, e.g.,

Windows*[]95 will match up to 32767 characters (on several lines) between Windows and 95

Windows*[\0-ÿ] will also accomplish the same match (older syntax)
 
Note: When * is combined with a numeric range and the %n> or %n>starting value> replacement operators, the search expression above, Windows *[0-9], would be part of a Regular Expression Counter Operation.
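For readers used to Perl-style syntax, the * examples above can be approximated with Python's re module. This is only a rough sketch: the translations in the comments are assumptions chosen for illustration, not output of the Search and Replace engine, and the two dialects differ in many details.

```python
import re

# Assumed, approximate translations of the * examples into Python syntax:
#   Win*95          -> Win.*?95                    (any characters between two strings)
#   Windows *[0-9]  -> Windows [0-9]*              (zero or more digits)
#   Windows*[]95    -> Windows.*?95 with re.DOTALL (may span several lines)
between = re.compile(r"Win.*?95")
digits = re.compile(r"Windows [0-9]*")
multiline = re.compile(r"Windows.*?95", re.DOTALL)

samples = ("Windows 1995", "Win 95", "Windows 95")
between_hits = [between.search(s).group(0) for s in samples]
digit_hit = digits.search("Windows 95").group(0)
multiline_hit = multiline.search("Windows\r\nversion 95").group(0)

print(between_hits)
print(digit_hit)
print(repr(multiline_hit))
```

The lazy quantifier (.*?) is used so that the match stops at the first 95 that follows, which seems closest to the behavior described for Win*95.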
+
One or more expressions enclosed in (), e.g.,
       +(is) will match one or more strings such as: is, crisis.
?
Exactly one expression enclosed in () or any one character, e.g.,
     ?(is) will match the string is.
 
This operator can also be used to match any character between two strings or before or after a string. This is the main use for this operator. E.g.,

Win?95 will match Win 1995, Win-95, Win/95 etc.
 
Note: Using the ? operator by itself will match every character in a file one at a time and should be avoided.
!
Note: This is for version 3.1 and above. See below for syntax for older versions.
A match will be made when both a 'positive' hit component and a !() or ![] component of the expression are found. The complete expression requires both components. The first may be as simple as a single regular expression operator such as * or ?. The ! component should be enclosed in () or []. Additional positive hit strings and/or regular expressions to find may be specified after !() or ![]. Note, however, that regular expressions following the !() or ![] will not be available to the %n operators.
?at!((b|c)at) finds mat & sat but not 'b'at or 'c'at
*file!(beg*file) finds a file & this file but not 'beginning of file'
*98!(Windows 98) finds 98 in 1998 but not in 'Windows 98'
*98!(+[a-z ]98) finds 98 in 1998 but not in 'Windows 98'
#include*!()"C finds "ChildFrm.h" in "#include "ChildFrm.h"" but finds nothing in "#include "Edit.h"". This example uses !() as an AND operator.
 
Note:
More than one ! component can be specified. For example:
<a href="?!(<a href="a)!(<a href="b)!(<a href="c)
and
<a href="?!(<a href="(a|b|c))

finds <a href="., <a href="dada, <a href="foo but not <a href="a, <a href="bas, <a href="cold
CAP*!(ITAL)!('CINE)!(CAP [A-Z])
finds CAP but not when followed by ITAL or 'CINE or when followed by a space and a word beginning with a capital letter. (Turn Case Sensitive on).
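The closest counterpart to the ! component in Perl-style engines is a negative lookahead or lookbehind. The sketch below uses Python's re module; the two patterns are assumed analogues of ?at!((b|c)at) and *98!(Windows 98), written for illustration only, and are not produced by Search and Replace itself.

```python
import re

# Assumed analogue of ?at!((b|c)at): any character followed by "at",
# unless the three characters are "bat" or "cat".
not_bat_or_cat = re.compile(r"(?!(?:b|c)at).at")

# Assumed analogue of *98!(Windows 98): "98" unless preceded by "Windows ".
not_windows_98 = re.compile(r"(?<!Windows )98")

at_hits = not_bat_or_cat.findall("mat sat bat cat")
year_positions = [m.start() for m in not_windows_98.finditer("In 1998, Windows 98 shipped")]

print(at_hits)         # ['mat', 'sat']
print(year_positions)  # [5], the "98" inside "1998"
```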
!
Note: Syntax for pre-3.1 versions.
Does not match if the expression is found, e.g.,
       !(a-c)at will match the string mat, rat or fat but not bat or cat
 
Another use is to look for a string that doesn't contain an expression, e.g.,

!(Windows )95 will find 95 in 1995 but not in Windows 95
^
An expression that starts at the beginning of a line, e.g.,
       ^the finds the at the beginning of a line and The (if case sensitive searching is turned off)
$
An expression that ends at the end of a line, e.g.,
       end$ finds end when it is the last string on a line.
^^
Beginning of file operator - Matches an expression found at the beginning of a file.
       ^^First finds First in "First line of the file" if that string is on the first line in the file.
^^+50[] matches the first 50 characters in the file.
$$
End of file operator
       *$$ finds the very last line in the file.
<table*[]</table>*[]$$ matches the last 'table' in the file.
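The line and file anchors have rough counterparts in Python's re module, sketched below. The mapping is an assumption made for illustration: ^ and $ are paired with Python's ^ and $ under re.MULTILINE, and ^^ and $$ are paired with \A and \Z. The sample text is invented.

```python
import re

text = "First line of the file\nFirst word again\nthe end"

# Line anchors: with re.MULTILINE, ^ and $ match at every line boundary.
line_starts = re.findall(r"^First", text, re.MULTILINE)
line_end = re.search(r"end$", text, re.MULTILINE).group(0)

# File anchors: \A matches only at the very start, \Z only at the very end.
file_start = re.findall(r"\AFirst", text)
last_line = re.search(r"[^\n]*\Z", text).group(0)

print(line_starts)  # ['First', 'First']
print(file_start)   # ['First']
print(last_line)    # the end
```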


Regular Expression Search - Subexpression Operators
Operator Description
[]
When entered alone, will match all characters and is equivalent to ?[]. When entered in combination with the * operator, it will span across multiple lines up to 4096 characters. Alternatively, any one character entered between the brackets will be matched. Ranges are allowed by using the a-z notation. The [] operator can be used after a ?, *, or + operator to modify the range matched by that operator, e.g.,
     t[]e will match: This is line, ttp://www.e

H[]d will match across two lines: Hello (cr-lf) World

*[0-9] will match: 234907, 5795, or an empty string

[niewW] will match one or more strings such as: Win, new, win

[a-z] will match any lower case strings if case sensitive and any words if not case sensitive.
 
You can also enter multiple ranges and the - character itself (preceded by \), e.g.,

[a-zA-Z0-9\-_]
 
When [] is combined with a numeric range, the * operator (e.g., *[0-9]), and the %n> or %n>starting value> replacement operators, a search expression such as Windows *[0-9] would be part of a Regular Expression Counter Operation.
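Character ranges of this kind carry over almost unchanged to Perl-style engines. As a small illustration, the multi-range set shown above can be tried in Python's re module; the trailing + and the sample string are additions made for this sketch.

```python
import re

# Same set as [a-zA-Z0-9\-_]: letters, digits, a literal hyphen, and underscore.
# The trailing + (one or more) is Python syntax added for this example.
name_chars = re.compile(r"[a-zA-Z0-9\-_]+")

tokens = name_chars.findall("my-file_01.txt and page-2.htm")
print(tokens)  # ['my-file_01', 'txt', 'and', 'page-2', 'htm']
```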
()
Denotes one or more sub-expressions. You can specify matching an expression or another by using the | operator, e.g.,
     Win( 95|dows 95) will match strings such as: Windows 95 and Win 95
+n
Denotes the number of columns to match either before or after an expression. Use in combination with other sub-expression operators. A range can be specified, e.g.,
     +4[]w will match llo W and Use W in Hello World and Use Windows

[ ]+5-15[0-9 ] will match "100.01", "123.9", & "543.21" in:
Data1 100.01 Somethin'
Dat2 123.9 Nuthin'
Dataa3 543.21 and junk
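The column count in +4[]w behaves much like a fixed repetition count in Perl-style syntax. The Python sketch below is an assumed analogue (four arbitrary characters followed by w, case insensitive) and is not the Search and Replace engine itself.

```python
import re

# Assumed analogue of +4[]w: exactly four characters of any kind, then "w" or "W".
four_then_w = re.compile(r".{4}w", re.IGNORECASE)

hits = [four_then_w.search(s).group(0) for s in ("Hello World", "Use Windows")]
print(hits)  # ['llo W', 'Use W']
```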

Special Search Characters (Literals)
- + * ? ( ) [ ] \ | $ ^ ! If you wish to search for any of these characters, they must be preceded by the \ character to be interpreted as a literal in a search.
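When search strings are generated by a script, the escaping rule can be applied mechanically. The helper below is a hypothetical sketch, not part of Search and Replace; it assumes the escape character is the backslash, as in the \(\*\) example elsewhere on this page, and the function name is invented.

```python
# The special search characters listed above.
SPECIAL_SEARCH_CHARS = set("-+*?()[]\\|$^!")


def escape_literal(text: str) -> str:
    """Prefix every special search character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_SEARCH_CHARS else ch for ch in text)


print(escape_literal("(*)"))        # \(\*\)
print(escape_literal("price$ a+b")) # price\$ a\+b
```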



Some Example Regular Expression Search Operations
What to Match Operator Effect
Any single character  ? g?t finds get, got, gut
Any string of characters (one or more)  + w+e finds wide, white, write but not we
Any string of characters (or none)  * w*e finds wide, white, write and we
One of the specified characters [] g[eo]t finds get and got but not gut
One of the characters in a range  [-] [b-p]at finds bat, cat, fat, hat, mat but not rat or sat
All characters [] i[] finds line, list, late
One expression or another  (|) W(in|indows) will find Win or Windows
One or more expressions  +() +(at) will find atat in catatonic and at in battle
All characters (perhaps on different lines) *[] h[]d finds helped, Hello World, and Hello (cr lf) Win95 World.
/\**[]\*/ will match C style comments (on several lines if necessary)
(*[] will span across multiple lines up to 32767 characters)
A string that doesn't start with an expression  !() : !(http) finds : in "following:" but not in "http://www.funduc.com" Note: Syntax for pre-3.1 versions would be !(http):
One of the characters not in a range  ![-] [a-z]at!([b-p]at) matches r in "rat" & s in "sat" but nothing in "bat", "cat", "hat".
Note: Syntax for pre-3.1 versions would be ![b-p]at
An expression at the beginning of a line  ^ ^the finds the at the beginning of a line and The (if case sensitive is turned off)
An expression at the end of a line  $ end$ finds end when it is the last string on a line.
One or more column(s) before or after a string +n [h]+4// finds http:// but not https://
Using Special Characters  \ \(\*\) will find (*)





Regular Expression Replacements - Match Operators
Operator Description

%n

Core replacement operators use a %n convention, where n corresponds to a component in the regular expression search string. For example, %1 refers to the first expression value in the search string, %2 refers to the second, and so on. The %n parameters may be used several times, omitted, or used in any order. Up to 24 parameters may be used at once by referring to those over number %9 moving up the ASCII table, e.g., 123456789:;<=>?@ABCDEFGH. However, if your search-replace involves a large number of parameters you may find it easier to use a multi-step script.
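The naming of parameters above %9 follows directly from the ASCII table: parameter n is written with the character whose code is n positions after the code of 0. A small Python sketch makes the sequence explicit; the function name is hypothetical.

```python
def parameter_name(n: int) -> str:
    """Return the %n token for parameter number n (1 to 24)."""
    if not 1 <= n <= 24:
        raise ValueError("parameter number must be between 1 and 24")
    # "0" is ASCII 48, so parameter 10 is ":" (58) and parameter 24 is "H" (72).
    return "%" + chr(ord("0") + n)


all_names = "".join(parameter_name(n)[1] for n in range(1, 25))
print(all_names)           # 123456789:;<=>?@ABCDEFGH
print(parameter_name(10))  # %:
```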
 



Given the Search string:
?include (<|\[)[a-z0-9_].h*(p)+[\]>]
And the replace string:
%1exclude [%3>.H%4>]
The results might be: 
#include [stdafx.h]  to    #exclude [STDAFX.H]
#include <dos.h>  to #exclude [DOS.H]
#include [my_include.hpp]   to #exclude [MY_INCLUDE.HPP]
#include [sr32.h]  to #exclude [SR32.H]
In this example:


Replace Operator    Search Operator
%1 ?
%2 (<|\[)
%3 [a-z0-9_]
%4 *(p)
%5 +[\]>]
These parameters can be used several times, omitted or used in any order.
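For comparison, the same transformation can be sketched with Python's re module, where each numbered group plays the role of a %n parameter and the upper-casing done by > is written as a call to upper(). The pattern is an assumed translation of the search string above, not the engine's own interpretation of it.

```python
import re

# Groups 1 to 5 stand in for %1 to %5 in the example above.
include_line = re.compile(r"(.)include (<|\[)([a-z0-9_]+)\.h(p*)([\]>])")


def exclude_header(line: str) -> str:
    """Rewrite an include line in the style of %1exclude [%3>.H%4>]."""
    def build(match: re.Match) -> str:
        return f"{match.group(1)}exclude [{match.group(3).upper()}.H{match.group(4).upper()}]"
    return include_line.sub(build, line)


for line in ("#include [stdafx.h]", "#include <dos.h>", "#include [my_include.hpp]"):
    print(line, "->", exclude_header(line))
```

Groups 2 and 5 are captured but never used in the replacement, which mirrors the remark that parameters may be omitted.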
< Make lower case operator. To be used in conjunction with %n, e.g., %1< will replace the original first matched expression with its lower case version.
> Make upper case operator. To be used in conjunction with %n, e.g., %1> will replace the original first matched expression with its upper case version.
%n> Counter Operator. When used in conjunction with numeric regular expression search (e.g., *[0-9]), %n> begins incrementing with a value of +1 from the value of the first number found by *[0-9]. For example:


Given the series:   page5.htm, page2.htm, page4.htm
Search Expression:   page*[0-9]
Replacement Expression:   page%1>
The results would be:   page6.htm, page7.htm, page8.htm
%n>#> Counter Operator. This operator allows you to specify a starting value for an incrementing replacement counter. %n>starting value> begins incrementing with a value of +1 from the starting value you supply. This counter operator also respects the number of digit places you supply. To begin incrementing with a value of 1, use the expression %n>0>. The expression %n>000> would begin replacements with a value of 001. Another example is:


Given the series:   Var19, Var82, Var8
Search Expression:   Var*[0-9]
Replacement Expression:   Var%1>99>
The results would be:   Var100, Var101, Var102


Special Regular Expression Replacement Characters (Literals)
% \ < > If you wish to replace any of these characters, they must be preceded by the character to be interpreted as a literal in a replacement.



Regular Expression Search & Replacement Examples

Search Expression   Replacement Expression   Effect
*.* %1>.%2> c:\windows\win.ini ==> C:\WINDOWS\WIN.INI
+[a-z] %1> Windows ==> WINDOWS
7*.htm 5%1.htm 711.htm ==> 511.htm
7days.htm ==> 5days.htm
[253]7[832].htm %15%2.htm 3572.htm ==> 3552.htm
*[253]7[832].htm %15%2.htm 72.htm ==> 52.htm
(homepage|index).htm %11.htm homepage.htm ==> homepage1.htm
index.htm ==> index1.htm
+(12)[0-9] %1%2a 12532 ==> 12532a
1212753 ==> 1212753a
???*(d|m).htm %1%2%3d1.htm card.htm ==> card1.htm
form.htm ==> form1.htm
back2.jpg*[]height="30" back2.jpg%1height="32" A multiline Search/Replace changing the height setting for 'back2.jpg' regardless of differing 'alt' text or how the html editor line breaks the code, e.g.,

src="images/back2.jpg" alt="Go Back"
 border="0" width="57"
 height="30"

Becomes:

src="images/back2.jpg" alt="Go Back"
border="0" width="57"
height="32"
?(Windows) OS/2 Windows ==> OS/2 (just kidding)
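A few of the rows above can be reproduced with Python's re.sub for readers who want to experiment outside Search and Replace. The Python patterns are assumed translations written for this sketch; \1 and \g<1> stand in for %1, and upper() stands in for the > operator.

```python
import re

# 7*.htm -> 5%1.htm
renamed = [re.sub(r"7(.*)\.htm", r"5\1.htm", name) for name in ("711.htm", "7days.htm")]

# (homepage|index).htm -> %11.htm; \g<1> keeps the literal 1 separate from the group number.
numbered = [re.sub(r"(homepage|index)\.htm", r"\g<1>1.htm", name)
            for name in ("homepage.htm", "index.htm")]

# +[a-z] -> %1> (upper case), with case-insensitive matching.
shouted = re.sub(r"[a-z]+", lambda m: m.group(0).upper(), "Windows", flags=re.IGNORECASE)

# back2.jpg*[]height="30" -> back2.jpg%1height="32", spanning several lines.
html = 'src="images/back2.jpg" alt="Go Back"\n border="0" width="57"\n height="30"'
resized = re.sub(r'(back2\.jpg.*?)height="30"', r'\1height="32"', html, flags=re.DOTALL)

print(renamed)   # ['511.htm', '5days.htm']
print(numbered)  # ['homepage1.htm', 'index1.htm']
print(shouted)   # WINDOWS
print(resized)
```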


Regular Expression Counters
Regular Expression search & replace Counter Operations allow you to quickly revise a sequence of numbers in one or more files. You can also insert sequential numbers into text strings where no numbers exist originally. Counter operations make use of the *[0-9] regular expression search operator and either the %n> or %n>user defined starting value> regular expression replacement operators. The %n> replacement operator increments by one, beginning with the value found by your *[0-9] expression plus one (i.e., *[0-9]+1). The %n>user defined starting value> replacement operator increments by one beginning with a value of (user defined starting value+1). This counter operator also respects the number of digit places you supply. Incrementing counter operations may be combined with other regular expression search & replace operators. For example, a search expression such as (file|variable)*[0-9] with a counter replacement expression such as %1%2>100> is perfectly legal.

Regular Expression Counter Examples
 
Initial Contents:   Windows 98 will be released in 5 days.
Search String:   *[0-9]
Replace String:   %1>
Results:   Windows 99 will be released in 100 days.
        
Initial Contents:   file.htm, file.htm, ffillee.htm
Search String:   e*[0-9].htm
Replace String:   e%1>.htm
Results:  file2.htm, file3.htm, ffillee4.htm
 
Initial Contents:   Var22 Var20 Var86 Var30
Search String:   Var*[0-9]
Replace String:   Var%1>49>
Results:   Var50 Var51 Var52 Var53
 
Initial Contents:   Var22 Var20 Var86 Var30
Search String:   Var*[0-9]
Replace String:   Var%1>00>
Results:   Var01 Var02 Var03 Var04
 
Initial Contents:   VarA101 VarB12 VarC0 VarA102 VarB45
Search String:   Var[a-z]*[0-9]
Replace String:   Var%1%2>08>
Results:   VarA09 VarB10 VarC11 VarA12 VarB13
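The counter rules can be modelled in a few lines of Python. This is a sketch of the behavior as described on this page, not the program's actual implementation: the function name, the named group num, and the Python patterns are all assumptions made for illustration.

```python
import re
from typing import Optional


def counter_replace(text: str, pattern: str, start: Optional[str] = None) -> str:
    """Renumber matches; `pattern` must contain a named group called num.

    Without `start`, counting begins at the first number found plus one.
    With `start` (a digit string), counting begins at start plus one and is
    zero-padded to the number of digits supplied.
    """
    counter = None
    width = 0

    def renumber(match: re.Match) -> str:
        nonlocal counter, width
        if counter is None:
            if start is None:
                counter = int(match.group("num") or 0)
            else:
                counter = int(start)
                width = len(start)
        counter += 1
        # Swap only the digits of this match, keeping the text around them.
        lo = match.start("num") - match.start()
        hi = match.end("num") - match.start()
        whole = match.group(0)
        return whole[:lo] + str(counter).zfill(width) + whole[hi:]

    return re.sub(pattern, renumber, text)


print(counter_replace("Var22 Var20 Var86 Var30", r"Var(?P<num>[0-9]*)", start="49"))
print(counter_replace("Var22 Var20 Var86 Var30", r"Var(?P<num>[0-9]*)", start="00"))
```

The width rule is what turns a starting value of 00 into 01, 02, 03, while a starting value of 49 simply continues with 50, 51, 52.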

Special Replacement Operators - %%srpath%% & %%srfile%%
Search and Replace currently has two special replacement operators - %%srpath%% and %%srfile%%. %%srpath%% inserts the path to the file in which the search string was found and %%srfile%% inserts the filename of that file. %%srpath%% and %%srfile%% can be used in ordinary search & replace operations,  Regular Expression operations, Regular Expression counters, & Binary mode operations.


%%srpath%% & %%srfile%% Examples
 
 Ordinary Search/Replace 
      File Searched:   D:\Example\Test.txt

Initial String in File:   Page No.

Search String:   Page No.

Replace String:   %%srpath%%%%srfile%% Page No.

Results:   D:\Example\Test.txt Page No.
 
Complex Search/Replace

Files Searched:   home.htm; index.html

File Mask:   *.htm*

Initial String in File:   Last Updated: 10/10/97 and Last Updated: 10/12/97

Regular Expression Search String:   Last Updated: *[0-9]/*[0-9]/*[0-9]

Binary Mode Replace String:   Last Updated: %1/%2/%3\r\nUrl: %%srfile%%

Results:   Last Updated: 10/10/97
Url: home.htm
    and
Last Updated: 10/12/97
Url: index.html
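The effect of the two operators can be imitated in a short Python sketch. The function name is hypothetical, and the assumption that %%srpath%% ends with a trailing backslash is inferred from the first example above, where %%srpath%%%%srfile%% yields D:\Example\Test.txt.

```python
import ntpath  # Windows path rules, usable on any platform


def expand_file_operators(replacement: str, file_path: str) -> str:
    """Substitute %%srpath%% and %%srfile%% for the given file."""
    folder, name = ntpath.split(file_path)
    # Assumption: the path part keeps its trailing backslash.
    if folder and not folder.endswith("\\"):
        folder += "\\"
    return replacement.replace("%%srpath%%", folder).replace("%%srfile%%", name)


print(expand_file_operators("%%srpath%%%%srfile%% Page No.", r"D:\Example\Test.txt"))
# D:\Example\Test.txt Page No.
```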

Environment Operators
You can search for, or make replacements based on, Environment Variables via the Binary Mode dialog or a Regular Expression string. Searches may be case sensitive or not. The syntax is:

%%envvar=variable name%%

where variable name is the name of the environment variable to use. For example, to search for the environment variable temp, enter the string

%%envvar=temp%%

in the binary mode search block field or your regular expression. If the value of your temp environment variable is c:\windows\temp, search hits would occur wherever the string c:\windows\temp is found. If Case Sensitive is on, c:\windows\temp would be found but C:\WINDOWS\TEMP would not. To insert the value of the environment variable winbootdir in a replacement string, enter the string

%%envvar=winbootdir%%

in the binary mode replace block field or your regular expression. If the value of your winbootdir variable is C:\WINDOWS, the string C:\WINDOWS would be used in replacements.
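The substitution described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's own code: the function name is hypothetical, unknown variables are left untouched (the page does not say what the program does in that case), and lookups in a plain dict are case sensitive.

```python
import os
import re
from typing import Mapping, Optional

ENVVAR_TOKEN = re.compile(r"%%envvar=([^%]+)%%")


def expand_envvars(text: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Replace each %%envvar=name%% token with the variable's value."""
    env = os.environ if environ is None else environ
    # A function is used as the replacement so that backslashes in Windows
    # paths are inserted literally rather than read as escape sequences.
    return ENVVAR_TOKEN.sub(lambda m: env.get(m.group(1), m.group(0)), text)


demo_env = {"winbootdir": r"C:\WINDOWS", "temp": r"c:\windows\temp"}
print(expand_envvars("cd %%envvar=winbootdir%%", demo_env))  # cd C:\WINDOWS
print(expand_envvars("%%envvar=temp%%", demo_env))           # c:\windows\temp
```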


Copyright © 2011 | My Code Logic Designed by Suneel Kumar