Thursday, September 16, 2010

Apache Ant - How to search and replace regular expressions inside a text

As the documentation reports, Ant provides a task, <ReplaceRegExp>, "for replacing the occurrence of a regular expression with a substitution pattern in a selected file or set of files". Now the question is: does Ant also provide a task for doing the same on texts, that is, on plain strings rather than files? Well... the answer is no.
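For reference, a minimal sketch of the file-based task might look like this. The file name build.properties and the pattern are made up for illustration; only the file, match, replace and flags attributes are taken from the task itself.

```xml
<!-- Hypothetical example: rewrite each "version=..." entry of a file in place. -->
<replaceregexp file="build.properties"
               match="version=(.*)"
               replace="version=1.0.1"
               flags="g"/>
```

Note that the task edits the file on disk: there is no attribute for passing it a plain string.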

Anyway, this doesn't mean that it's not possible to achieve the same effect on texts. Actually there are at least two ways: the first is to use AntContrib's <PropertyRegex> task. The second (which works only from Ant 1.7.1 onward) is to combine some standard Ant tasks to do the same: just read on.
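As a rough sketch of the first way, the AntContrib task could be used as follows. This assumes that the ant-contrib jar is on Ant's classpath; the property name and the input string are invented for the example.

```xml
<!-- Assumes ant-contrib.jar is available on Ant's classpath (e.g. in ANT_HOME/lib). -->
<taskdef resource="net/sf/antcontrib/antcontrib.properties"/>

<!-- Replace every "." in the input string with "/" and store the result in a new property. -->
<propertyregex property="package.dir"
               input="com.example.app"
               regexp="\."
               replace="/"
               global="true"/>
<echo message="${package.dir}"/>
```

The drawback is the extra dependency, which is what the second way avoids.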

Let's start with the <concat> task. The documentation says: "Since Ant 1.7.1, this task can be used as a Resource Collection that will return exactly one resource". Moreover, "since Ant 1.6 it supports nested FilterChains". This is important because a FilterChain can contain TokenFilters, and a TokenFilter can in turn contain a ReplaceRegex string filter. Confused? Take a look at the following code:
<macrodef name="replaceStringWithRegExp">
    <attribute name="string"/>
    <attribute name="searchPattern"/>
    <attribute name="replacementPattern"/>
    <attribute name="property"/>
    <sequential>
        <tokens id="id">
            <concat>
                <string value="@{string}"/>
                <filterchain>
                    <tokenfilter>
                        <replaceregex pattern="@{searchPattern}"
                                      replace="@{replacementPattern}"
                                      flags="g"/>
                    </tokenfilter>
                </filterchain>
            </concat>
        </tokens>
        <property name="@{property}" value="${toString:id}"/>
    </sequential>
</macrodef>
Here I've defined a macro named "replaceStringWithRegExp". It takes four input parameters:
  1. string: text to match against a regular expression
  2. searchPattern: regular expression
  3. replacementPattern: substitution pattern
  4. property: name for a new property that will contain the result of the replacement
The input string is treated as a String resource (see <string>) and filtered through a FilterChain (see <filterchain>) that consists of a single TokenFilter (see <tokenfilter>). The TokenFilter in turn contains a ReplaceRegex string filter (see <replaceregex>), which performs the actual substitution.

To extract the result of the <concat> task, I've wrapped it in a <tokens> resource collection and assigned an id to this outer element. Finally, the expression ${toString:id} extracts the string value of the element referenced by that id: this is the result of the search and replacement. Pretty tricky, isn't it?
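The ${toString:...} lookup is not specific to this macro: it can be applied to any Ant type that has been declared with an id. A small sketch, along the lines of the example in the Ant manual (the src directory and the id sourcefiles are only placeholders):

```xml
<!-- Declare a resource collection with an id... -->
<fileset id="sourcefiles" dir="src" includes="**/*.java"/>

<!-- ...and read its string representation through the toString: prefix. -->
<echo message="sourcefiles = ${toString:sourcefiles}"/>
```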

Now let's try it.
<replaceStringWithRegExp string="James Bond"
                         searchPattern="(\w+)\s+(\w+)"
                         replacementPattern="My name is \2, \1 \2"
                         property="result"/>
<echo message="${result}"/>
Here we get: "My name is Bond, James Bond". Well, it works.
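A second call could look like the sketch below, which is meant to turn the dots of a version number into underscores. The version string and the property name version.tag are invented for the example. Since Ant properties are immutable once set, each call needs a property name that has not been used before.

```xml
<!-- Hypothetical second use of the macro: "\." matches a literal dot. -->
<replaceStringWithRegExp string="1.2.3"
                         searchPattern="\."
                         replacementPattern="_"
                         property="version.tag"/>
<echo message="${version.tag}"/>
```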

1 comment:

Steven said...

If you're using Maven and want regex text replacement, check out maven-replacer-plugin rather than using Ant inside of Maven.