Regular Expressions in JMeter
JMeter includes the pattern matching software Apache Jakarta ORO. There is some documentation for this on the Jakarta web-site. A summary of the pattern matching characters can be found at
http://jakarta.apache.org/oro/api/org/apache/oro/text/regex/package-summary.html
There is also documentation on an older incarnation of the product, the OROMatcher User's guide, which might prove useful.
Overview
The pattern matching is very similar to the pattern matching in Perl. A full installation of Perl will include plenty of documentation on regular expressions - look for perlrequick, perlretut, perlre, perlreref.
O'Reilly sell a book called "Mastering Regular Expressions" by Jeffrey Friedl which will tell you all you need to know (and a lot more) about regular expressions.
There are also a couple of sample chapters available on their web-site covering REs in Java and .NET, and the Java chapter has a section on ORO (PDF) - worth a look.
It is worth stressing the difference between "contains" and "matches", as used on the Response Assertion test element:
"contains" means that the regular expression matched at least some part of the target, so 'alphabet' "contains" 'ph.b.' because the regular expression matches the substring 'phabe'.
"matches" means that the regular expression matched the whole target. So 'alphabet' is "matched" by 'al.*t'. In this case, it is equivalent to wrapping the regular expression in ^ and $, i.e. '^al.*t$'. However, this is not always the case. For example, 'alphabet' "contains" the regular expression 'alp|.lp.*', but is not "matched" by it.
Why? Because when the pattern matcher finds the sequence 'alp' in 'alphabet', it stops and does not try any other alternatives. The match 'alp' is not the whole of 'alphabet', as it does not include 'habet'.
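The distinction can be tried out with a small sketch. Note the assumptions: it uses java.util.regex as a stand-in for Jakarta ORO, and the class and helper names (ContainsVsMatches, contains, matchesAsDescribed, firstMatch) are made up for illustration. The matchesAsDescribed helper only imitates the behaviour described above, by taking the first match found at the start of the target and checking whether it reaches the end.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: java.util.regex stands in for Jakarta ORO.
final class ContainsVsMatches {

    // "contains": the pattern matches at least some part of the target.
    static boolean contains(String target, String regex) {
        return Pattern.compile(regex).matcher(target).find();
    }

    // Imitates "matches" as described in the text: accept the first match
    // found at the start of the target, then check that it covers the
    // whole target. Alternatives are not retried to get a longer match.
    static boolean matchesAsDescribed(String target, String regex) {
        Matcher m = Pattern.compile(regex).matcher(target);
        return m.lookingAt() && m.end() == target.length();
    }

    // Returns the first substring matched by the pattern, or null if none.
    static String firstMatch(String target, String regex) {
        Matcher m = Pattern.compile(regex).matcher(target);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("alphabet", "ph.b."));              // phabe
        System.out.println(matchesAsDescribed("alphabet", "al.*t"));      // true
        System.out.println(contains("alphabet", "alp|.lp.*"));            // true
        System.out.println(firstMatch("alphabet", "alp|.lp.*"));          // alp
        System.out.println(matchesAsDescribed("alphabet", "alp|.lp.*"));  // false
    }
}
```

One caveat: java.util.regex's own Matcher.matches() does go on to try the second alternative, so Pattern.matches("alp|.lp.*", "alphabet") returns true there. Engines differ on this point, so it is worth checking such patterns in JMeter itself.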
Note: unlike Perl, there is no need to (i.e. do not) enclose the regular expression in //. So how does one use the Perl modifiers ismx etc. if there is no trailing /? The solution is to use Perl5 extended regular expressions, which put the modifier inside the expression itself, i.e. /abc/i becomes (?i)abc
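As a rough illustration of the inline modifier, the sketch below compares abc with (?i)abc. It assumes java.util.regex as a stand-in for Jakarta ORO (both accept the (?i) syntax), and the class name InlineModifier and the sample text are invented for the example.

```java
import java.util.regex.Pattern;

// Illustrative sketch: java.util.regex stands in for Jakarta ORO.
final class InlineModifier {

    // True if the regex matches some part of the target.
    static boolean contains(String target, String regex) {
        return Pattern.compile(regex).matcher(target).find();
    }

    public static void main(String[] args) {
        // Without a modifier the match is case-sensitive.
        System.out.println(contains("xyzABCxyz", "abc"));      // false
        // (?i) at the start plays the role of Perl's trailing /i.
        System.out.println(contains("xyzABCxyz", "(?i)abc"));  // true
    }
}
```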
Links to regex resources
These resources are not for Jakarta ORO specifically, but are helpful in understanding Regexes in general.
http://tlc.perlarchive.com/articles/perl/pm0001_perlretut.shtml
http://www.visibone.com/regular-expressions/ - quick reference
http://www.regexlib.com/ - regular expressions library
http://java.sun.com/docs/books/tutorial/extra/regex/index.html - Tutorial on java.util.regex
Links to Regex Testers
http://jakarta.apache.org/oro/demo.html - Java applet using Jakarta ORO
Note that the following testers use engines which may work slightly differently from Jakarta ORO (the engine currently used by JMeter). However, a lot of regexes will work the same in all the tools.
http://jakarta.apache.org/regexp/applet.html - online Jakarta regexp package tester
http://weitz.de/regex-coach/ - easy to use regex tester (Linux and Windows, Perl-like expressions)
http://jregexptester.sourceforge.net/index.html - Java Swing (requires JVM 1.4 or above, presumably uses java.util.regex)
http://www.fileformat.info/tool/regex.htm - online java.util.regex tester
http://royo.is-a-geek.com/iserializable/regulator/ - the Regulator (Windows only, .NET regexes)
Examples
Extract single string
Suppose you want to match the following portion of a web-page: name="file" value="readme.txt" and you want to extract readme.txt.
A suitable regular expression would be:
name="file" value="(.+?)"
The special characters above are:
( and ) - these enclose the portion of the match string to be returned
. - match any character
+ - one or more times
? - don't be greedy, i.e. stop when first match succeeds
Note: without the ?, the .+ would continue past the first " until it found the last possible " - probably not what was intended.
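The effect of the ? can be seen in a short sketch. It assumes java.util.regex in place of Jakarta ORO, and the input line adds a hypothetical size="10" attribute (not part of the example above) so that there is a later " for the greedy version to run on to. The class name GreedyVsLazy and the helper extract are likewise invented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: java.util.regex stands in for Jakarta ORO.
final class GreedyVsLazy {

    // Returns group 1 of the first match, or null if there is no match.
    static String extract(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical page fragment; the size attribute is an assumption.
        String html = "name=\"file\" value=\"readme.txt\" size=\"10\"";

        // Non-greedy: stops at the first closing quote.
        System.out.println(extract("name=\"file\" value=\"(.+?)\"", html));
        // prints: readme.txt

        // Greedy: runs on to the last quote on the line.
        System.out.println(extract("name=\"file\" value=\"(.+)\"", html));
        // prints: readme.txt" size="10
    }
}
```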
Extract multiple strings
Suppose you want to match the following portion of a web-page: name="file.name" value="readme.txt" and you want to extract file.name and readme.txt.
A suitable regular expression would be:
name="(.+?)" value="(.+?)"
This would create 2 groups, which could be used in the JMeter Regular Expression Extractor template as $1$ and $2$.
The JMeter Regex Extractor saves the values of the groups in additional variables.
For example, assume:
Reference Name: MYREF
Regex: name="(.+?)" value="(.+?)"
Template: $1$$2$
The following variables would be set:
MYREF: file.namereadme.txt
MYREF_g0: name="file.name" value="readme.txt"
MYREF_g1: file.name
MYREF_g2: readme.txt
These variables can be referred to later on in the JMeter test plan, as ${MYREF}, ${MYREF_g1} etc.
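To make the variable naming concrete, the following sketch builds the same four variables for the example above. It is not JMeter's implementation: it assumes java.util.regex in place of Jakarta ORO, handles only the first match, and the class name ExtractorSketch and its extract method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: this is not JMeter's implementation, and
// java.util.regex stands in for Jakarta ORO.
final class ExtractorSketch {

    // Builds the reference variable and its _gN variables for the first match.
    static Map<String, String> extract(String refName, String regex,
                                       String template, String input) {
        Map<String, String> vars = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return vars; // no match: this sketch sets nothing
        }
        String value = template;
        for (int i = 0; i <= m.groupCount(); i++) {
            // A group that took no part in the match is treated as empty.
            String group = m.group(i) == null ? "" : m.group(i);
            // Replace each $i$ placeholder in the template with group i.
            value = value.replace("$" + i + "$", group);
            vars.put(refName + "_g" + i, group);
        }
        vars.put(refName, value);
        return vars;
    }

    public static void main(String[] args) {
        String page = "name=\"file.name\" value=\"readme.txt\"";
        Map<String, String> vars = extract(
                "MYREF", "name=\"(.+?)\" value=\"(.+?)\"", "$1$$2$", page);
        // Print every variable that was set, one per line.
        vars.forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

Group 0 is always the whole matched text, which is why MYREF_g0 holds the complete name="..." value="..." string.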