I seem to find myself turning to regular expressions more often these days, usually to help out a fellow scripter in one of the many scripting forums I keep tabs on. I thought I’d take a moment and provide a brief explanation of how to use regular expressions in VBScript.
First off, for those who don’t know, a regular expression is a way of identifying or matching a string against a pattern. For example, we all know that 172.16.100.200 looks like an IP address because of its pattern. The string roygbiv@rainbow.net looks like an email address, again based on the pattern. A regular expression can be used to identify strings based on a pattern.
Regular expressions don’t necessarily validate the data. 172.16.100.200 looks like an IP address, but whether it is a valid one is another story. There are some very complex regular expression patterns that will validate as well as match. For our purposes we’ll stick to more basic patterns.
A pattern can be as simple as another string, but usually you don’t know the exact string and you want to find anything that fits a pattern. To accomplish that you need to build a regular expression pattern using elements like these.
Regular Expression Characters

| Character | Description |
|-----------|-------------|
| \b | Match a word boundary |
| \B | Match any non-word boundary |
| \d | Match any digit (0-9) |
| \D | Match any non-digit (anything other than 0-9) |
| \f | Match a form feed |
| \n | Match a newline |
| \r | Match a carriage return |
| \s | Match any whitespace character (space, tab or newline) |
| \S | Match any non-whitespace character |
| \t | Match a tab |
| \v | Match a vertical tab (rarely used) |
| \w | Match any word character (alphanumeric and the underscore character) |
| \W | Match any non-word character |
| (value) | Matches the exact characters anywhere in the string and groups them as a unit. |
| . | Matches any single character except a newline. |
| [value] | Matches any one of the characters in the brackets. |
| [range] | Matches any one of the characters within the range. A hyphen (-) specifies a contiguous range of characters, such as [a-z]. |
| [^value] | Matches any character except those in the brackets. You can also use a range, such as [^m-p]. |
| ^ | Matches the beginning of the string. |
| $ | Matches the end of the string. |
| * | Matches zero or more instances of the preceding character. |
| ? | Matches zero or one instance of the preceding character. |
| \ | Escapes the character that follows so it is matched literally, such as \. for a period. |
| + | Matches one or more instances of the preceding character. In abc+ the + applies only to the c. |
| {n} | Specifies exactly n matches. |
| {n,} | Specifies at least n matches. |
| {n,m} | Specifies at least n, but no more than m matches. |
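The quantifiers are the easiest entries in the table to mix up, so here is a minimal sketch comparing ?, * and +. It relies on the RegExp object that the rest of this post walks through. IsMatch is a hypothetical helper I made up for the illustration, and the sample words are assumptions as well.

```vbscript
' Sketch: comparing the ?, * and + quantifiers.
' IsMatch is a hypothetical helper, not a built-in VBScript function.
Function IsMatch(strText, strPattern)
    Dim re
    Set re = New RegExp
    re.Pattern = strPattern
    IsMatch = re.Test(strText)
End Function

' ? allows zero or one "u"
WScript.Echo IsMatch("color",   "^colou?r$")    ' True
WScript.Echo IsMatch("colour",  "^colou?r$")    ' True
WScript.Echo IsMatch("colouur", "^colou?r$")    ' False

' * allows zero or more "u"
WScript.Echo IsMatch("colouur", "^colou*r$")    ' True

' + requires at least one "u"
WScript.Echo IsMatch("color",   "^colou+r$")    ' False

' A quantifier applies to the preceding character unless you group with ()
WScript.Echo IsMatch("abccc",   "^abc+$")       ' True
WScript.Echo IsMatch("abcabc",  "^abc+$")       ' False
WScript.Echo IsMatch("abcabc",  "^(abc)+$")     ' True
```

The ^ and $ anchors force the whole string to fit the pattern. Without them, Test() would report True whenever any part of the string matched.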
To use a pattern in VBScript, the first step is to create a RegExp object.
Set RegEx = New RegExp
This object has several properties we can set. First, we can configure our regular expression to be case sensitive or not.
RegEx.IgnoreCase = True
Regular expressions can be configured to find the first match or all matches. By setting the Global property to True, we’ll get all matches.
RegEx.Global=True
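To see what these two properties actually change, here is a small sketch. The object name, sample text and patterns are illustrative assumptions. It also sets the Pattern property and calls the Test() and Execute() methods, all of which are covered below.

```vbscript
' Sketch only: reDemo, the sample text and the patterns are illustrative assumptions.
Dim reDemo, strSample
Set reDemo = New RegExp
strSample = "SAPIEN Technologies"

' IgnoreCase defaults to False, so a lowercase pattern misses "SAPIEN".
reDemo.Pattern = "sapien"
WScript.Echo reDemo.Test(strSample)             ' False
reDemo.IgnoreCase = True
WScript.Echo reDemo.Test(strSample)             ' True

' Global defaults to False, so Execute() stops after the first match.
reDemo.Pattern = "\w+"
WScript.Echo reDemo.Execute(strSample).Count    ' 1
reDemo.Global = True
WScript.Echo reDemo.Execute(strSample).Count    ' 2
```

Both properties default to False, so a brand new RegExp object is case sensitive and stops at the first match.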
Finally we need to assign a pattern.
RegEx.Pattern = "\w+"
With this pattern the RegEx object will match any run of one or more word characters. Suppose we want to find the words in “SAPIEN Technologies”. I’ll use the RegEx object’s Test() method to see if the pattern matches.
RegEx.Test("SAPIEN Technologies")
This method returns True if there is a match. Another way to accomplish this, especially if I want to see what the pattern matched, is to use the Execute() method.
Set colMatches = RegEx.Execute("SAPIEN Technologies")
Any matches are returned as a collection of Match objects, which is easy enough to enumerate.
For Each match In colMatches
    WScript.Echo match.Value
Next
If I run this code I’ll see that I got matches on “SAPIEN” and “Technologies”. Nothing matched on the space separating the two words. Let’s modularize what we have so far into a few functions.
Function RegExMatch(strString, strPattern)
    ' Returns True if strString contains a match for strPattern.
    Dim RegEx
    Set RegEx = New RegExp
    RegEx.IgnoreCase = True
    RegEx.Global = True
    RegEx.Pattern = strPattern
    RegExMatch = RegEx.Test(strString)
End Function

Function GetMatch(strString, strPattern)
    ' Returns the collection of Match objects found in strString.
    Dim RegEx, colMatches
    Set RegEx = New RegExp
    RegEx.IgnoreCase = True
    RegEx.Global = True
    RegEx.Pattern = strPattern
    Set colMatches = RegEx.Execute(strString)
    Set GetMatch = colMatches
End Function
The RegExMatch function will return True if the string matches the regular expression pattern.
strString = "123.45.67.89"
strPattern = "^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"
If RegExMatch(strString, strPattern) Then
    WScript.Echo strString & " looks like a valid IP address."
Else
    WScript.Echo strString & " does NOT look like a valid IP address."
End If
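Notice that I’m using the ^ and $ anchors, so the entire string has to fit the pattern. Because my string matches the regular expression pattern exactly, I’ll get a positive result back. Keep in mind that this only checks the shape of the string: 999.999.999.999 would pass as well.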
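If you do want to go a step beyond “looks like” an IP address, one option is to let the regular expression check the shape and then check each octet’s range in code. The following is only a sketch of that idea. IsValidIPv4 is a hypothetical helper name, and it assumes that an octet value of 0 through 255 is all you care about.

```vbscript
' Sketch: pattern check plus a numeric range check on each octet.
' IsValidIPv4 is a hypothetical helper name, not part of the original script.
Function IsValidIPv4(strAddress)
    Dim re, colMatches, i
    IsValidIPv4 = False

    Set re = New RegExp
    ' Parentheses capture each octet so it can be read from SubMatches.
    re.Pattern = "^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$"

    Set colMatches = re.Execute(strAddress)
    If colMatches.Count = 0 Then Exit Function

    ' SubMatches is zero-based: items 0 through 3 hold the four octets.
    For i = 0 To 3
        If CInt(colMatches(0).SubMatches(i)) > 255 Then Exit Function
    Next

    IsValidIPv4 = True
End Function

WScript.Echo IsValidIPv4("123.45.67.89")    ' True
WScript.Echo IsValidIPv4("999.45.67.89")    ' False: first octet is out of range
WScript.Echo IsValidIPv4("123.45.67")       ' False: only three octets
```

Keeping the pattern simple and doing the arithmetic in VBScript is easier to read than one of those very complex validating patterns, at the cost of a few extra lines.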
Another common task is parsing a line from a log file to pull out the IP addresses.
strString = "2009-04-14 15:00:53 172.16.10.1 PROPFIND /admin$. - 80 - 172.16.10.102 Microsoft-WebDAV-MiniRedir/6.0.6001 403 2 0"
strPattern = "\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
If RegExMatch(strString, strPattern) Then
    Set matches = GetMatch(strString, strPattern)
    For Each match In matches
        'ignore the web server IP
        If match.Value <> "172.16.10.1" Then
            WScript.Echo match.Value
        End If
    Next
End If
If the pattern matches, then I call the GetMatch function to retrieve the matching values. Both addresses in the line match the pattern, but because the loop skips the web server’s address, the only one echoed is 172.16.10.102. I’ll be back later to have more fun with regular expressions and VBScript.
In the meantime, download a script file with these functions and examples here.
You can also learn more about regular expressions and VBScript in WSH and VBScript Core: TFM.