CleanCode Perl Libraries

NAME

replace.pl - Modify multiple files, including recursive directory search.

SYNOPSIS

replace.pl [options] targetRegExp replaceRegExp {files or folders}...

replace.pl [options] {targetRegExp replaceRegExp}... FILES {files or folders}...

OPTIONS

 -T            do not restore timestamp
 -N            do not make .cmp backup files
 -l            list files which would be updated but don't change files
 -L            list matched text within each file but don't change files
 -q            quiet mode; do not report on files that did not change
 -e ext        process only files ending in ".ext"
 -m            regexp /m modifier (match ^ and $ at lines within block)
 -s            regexp /s modifier (allow . to match newlines within block)
 -b            create auto-block (-ms and extend targetRegExp to full lines)
 -h            help
 -H            longer help (manual page)

REQUIRES

File::Find, File::stat, Getopt::Std, Pod::Usage, Data::Handy, File::Handy

DESCRIPTION

Modify all files specified on the command line. Any named directory is examined recursively. Each file is searched for the regular expression targetRegExp. Where a match is found, the matched text is replaced with replaceRegExp and the original file is saved as filename.cmp (unless the -N option is specified). After the change, the timestamp of the file is reset to match the original unless the -T option is specified.
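The backup-and-timestamp behavior described above can be sketched as follows. This is only an illustration under stated assumptions, not the code of replace.pl: the helper name replace_in_file and its parameters are hypothetical, the replacement is treated as literal text, and the file is read in one piece for brevity.

```perl
use strict;
use warnings;
use File::Copy qw(copy);

# Hypothetical helper (not taken from replace.pl): replace every match of
# $target in $path, keep a .cmp backup, and restore the original
# timestamps unless $keep_new_time is true (the effect of -T).
sub replace_in_file {
    my ($path, $target, $replacement, $keep_new_time) = @_;

    # Remember access and modification times before touching the file.
    my ($atime, $mtime) = (stat $path)[8, 9];

    open my $in, '<', $path or die "Cannot read $path: $!";
    my $text = do { local $/; <$in> };
    close $in;

    # s///g returns the number of substitutions, or false if none occurred.
    my $count = ($text =~ s/$target/$replacement/g);
    return 0 unless $count;          # unchanged files are left alone

    copy($path, "$path.cmp") or die "Cannot back up $path: $!";

    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $text;
    close $out or die "Cannot close $path: $!";

    # Default behavior: put the old timestamps back on the changed file.
    utime $atime, $mtime, $path unless $keep_new_time;
    return $count;
}
```

Copying the original to filename.cmp before writing means the original content is still available if the write fails partway.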

Invocation Options

As the Synopsis above shows, the command syntax has two variants. You may specify a single target/replacement pair followed by any number of files or folders or, by adding the FILES marker as a separator, multiple target/replacement pairs followed by any number of files or folders.
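One way the two variants could be told apart is sketched below. The function split_arguments is hypothetical and is an assumption about how such parsing might work, not the script's actual code; it expects the arguments that remain after option processing.

```perl
use strict;
use warnings;

# Hypothetical parser (an assumption, not replace.pl's own code): split the
# non-option arguments into target/replacement pairs and a list of paths.
sub split_arguments {
    my @args = @_;

    # Index of the first FILES marker, if there is one.
    my ($marker) = grep { $args[$_] eq 'FILES' } 0 .. $#args;

    my (@pair_args, @paths);
    if (defined $marker) {
        # Second variant: any number of pairs, then FILES, then paths.
        @pair_args = @args[0 .. $marker - 1];
        @paths     = @args[$marker + 1 .. $#args];
    }
    else {
        # First variant: exactly one pair; everything else is a path.
        @pair_args = splice(@args, 0, 2);
        @paths     = @args;
    }
    die "Expected target/replacement pairs\n"
        if !@pair_args || @pair_args % 2;

    my @pairs;
    while (my ($target, $replacement) = splice(@pair_args, 0, 2)) {
        push @pairs, [$target, $replacement];
    }
    return (\@pairs, \@paths);
}

my ($pairs, $paths) = split_arguments('a', 'b', 'c', 'd', 'FILES', 'src', 'doc');
printf "%d pairs, %d paths\n", scalar @$pairs, scalar @$paths;
```

A consequence of any marker-based scheme like this one is that a pattern or path spelled exactly FILES would be read as the separator.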

Lines vs. Blocks

The standard mode is line-at-a-time for best use of resources. However, if you wish to operate on a block, use the -m (multiple-line) and/or -s (single-line) options. These activate the m and s modifiers, respectively, for the regular expression substitution. Using either -m or -s requires reading the entire file into memory at one time, so very large files may be a problem. If you had this text, for example...

        line before
        - - start superfluous block --
        a
        b
        c
        - - end superfluous block --
        line after

...use this command...

        replace.pl -ms "[^\n]*start superfluous.*end superfluous[^\n]*\n" "" file

...to reduce it to...

        line before
        line after

The leading [^\n]* and trailing [^\n]*\n give you a clean excision at line boundaries. The -b option is a shortcut for the combination of -ms with this prefix and suffix added to the pattern.
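The effect of the modifiers and of the auto-block prefix and suffix can be reproduced with a plain Perl substitution. This is a sketch of the regular expression behavior only; the variable names are hypothetical and nothing here is taken from replace.pl itself.

```perl
use strict;
use warnings;

# The sample text from above, held as one string (block mode).
my $text = join '',
    "line before\n",
    "- - start superfluous block --\n",
    "a\n", "b\n", "c\n",
    "- - end superfluous block --\n",
    "line after\n";

# The pattern a user might supply together with -b.
my $core = 'start superfluous.*end superfluous';

# Auto-block: extend the pattern to whole lines with the prefix and suffix
# named in the text.
my $auto = '[^\n]*' . $core . '[^\n]*\n';

# /m and /s are the modifiers that -m and -s switch on.
(my $result = $text) =~ s/$auto//ms;

print $result;   # prints "line before" and "line after" on separate lines
```

Without /s the .* could not cross the newlines between the start and end markers, so the pattern would not match at all.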

Arguments from Files

Both targetRegExp and replaceRegExp are string literals by default, but either may reference a file instead by using the notation file:filepath. In that event, the contents of the file are read as a single string and used as the argument.
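A minimal sketch of how the file:filepath notation could be resolved follows. The helper resolve_argument is hypothetical; in particular, whether the real script strips a trailing newline from the file contents is not stated here, so this sketch leaves the contents untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: return the argument itself, or the contents of the
# named file when the argument has the form file:filepath.
sub resolve_argument {
    my ($arg) = @_;
    return $arg unless $arg =~ /^file:(.+)$/s;

    my $path = $1;
    open my $fh, '<', $path or die "Cannot read $path: $!";
    local $/;                  # slurp mode: read the file as a single string
    my $content = <$fh>;
    close $fh;
    return $content;
}
```

Keeping a long or multi-line pattern in a file also avoids shell quoting problems on the command line.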

Backreferences

The operands may use standard backreferences to copy text matched by targetRegExp into the result text specified by replaceRegExp. For example, with a targetRegExp of "(stuff.*)=foo.*bar", a replaceRegExp of "$1=other" retains the text captured by the first group.
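Because the replacement arrives as an ordinary string, something has to expand $1, $2, and so on for each match. The sketch below shows one possible way to do that; substitute_with_backrefs is a hypothetical helper and is not claimed to be how replace.pl does it.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each match of $target in $text, expanding
# $1, $2, ... in the replacement string from that match's captured groups.
sub substitute_with_backrefs {
    my ($text, $target, $replacement) = @_;
    my ($out, $last) = ('', 0);

    while ($text =~ /$target/g) {
        my ($start, $end) = ($-[0], $+[0]);

        # Copy the captured groups now; the next regex overwrites @- and @+.
        my @group = map {
            defined $-[$_] ? substr($text, $-[$_], $+[$_] - $-[$_]) : ''
        } 0 .. $#+;

        # Expand $N references; unknown group numbers become empty strings.
        (my $expanded = $replacement) =~ s/\$(\d+)/$group[$1] \/\/ ''/ge;

        $out .= substr($text, $last, $start - $last) . $expanded;
        $last = $end;
    }
    return $out . substr($text, $last);
}

print substitute_with_backrefs(
    "stuff one=foo and bar\n", '(stuff.*)=foo.*bar', '$1=other');
# prints "stuff one=other"
```

On Unix shells, a double-quoted "$1" is expanded by the shell before the script sees it, so single quotes ('$1=other') are the safer way to pass such an operand there.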

Special Characters

The replaceRegExp string may also use "\n" and "\t"; these are replaced with the actual characters they represent (newline, tab) before being used.
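The expansion of these two sequences can be sketched in a few lines. The helper name expand_escapes is hypothetical, and the sketch assumes only \n and \t are handled, as the text states.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the two-character sequences \n and \t into the
# real newline and tab characters.
sub expand_escapes {
    my ($replacement) = @_;
    my %char_for = (n => "\n", t => "\t");
    $replacement =~ s/\\([nt])/$char_for{$1}/g;
    return $replacement;
}

# Single quotes keep the backslashes literal, as a command-line argument would.
my $expanded = expand_escapes('name\tvalue\n');
```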

Dynamic Data

replaceRegExp may use the embedded variables [FILENAME] or [PATHNAME] to substitute the appropriate string during processing. [PATHNAME] represents the complete file path, while [FILENAME] is the file name without the path. So if you are processing a multi-level directory and want to change the string "foobar" to "foobar in file", where file is the full path, use a command like this:

        replace.pl "foobar" "foobar in [PATHNAME]" .

The "." at the end of the command represents the current directory and all of its children since processing is recursive.
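The per-file substitution of these markers might look like the following sketch. The helper expand_markers is hypothetical, and using File::Basename to derive the file name is an assumption made for illustration.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: fill in [PATHNAME] and [FILENAME] for the file that
# is currently being processed.
sub expand_markers {
    my ($replacement, $path) = @_;
    my %value_for = (
        PATHNAME => $path,              # complete path as it was found
        FILENAME => basename($path),    # last path component only
    );
    # A single pass, so text inserted for one marker is never rescanned.
    $replacement =~ s/\[(PATHNAME|FILENAME)\]/$value_for{$1}/g;
    return $replacement;
}

print expand_markers('foobar in [PATHNAME]', 'api/perl/bin/a.pl'), "\n";
# prints "foobar in api/perl/bin/a.pl"
```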

Look Before You Leap

One should, of course, exercise care when updating a large number of files. The -l and -L options can assist quite a bit with this. Use -L to see exactly what is matched without making any changes. For example, in each CleanCode source file I began with a version of this line in a generic header:

        # @doc = xxx

where I was using "xxx" as a marker to be filled in with the path to this file's API. To change the "xxx" to the full path of the source file, I then started with:

        replace.pl -e pl -L "(@doc = )xxx" "$1[PATHNAME]" .

The result shows the lines that would be changed for each file, as in:

        api/perl/bin/a.pl...
        # @doc = api/perl/bin/a.pl
        api/perl/bin/b.pl...
        # @doc = api/perl/bin/b.pl

Since the [PATHNAME] marker is used, the value of that field is shown dynamically updated for each file.

Next, changing -L to -l lists the files that would be changed rather than the matches, to confirm that the set agrees with what I expect:

        api/perl/bin/a.pl...will be changed
        api/perl/bin/b.pl...will be changed

Now confident in the outcome, I remove the -l to proceed with the changes.

What I really want, however, is "@doc = /api/perl/bin/replace.html" rather than "@doc = /api/perl/bin/replace.pl". (That is, I now need to change the file extension from pl to html.) So I run this second invocation to complete the task:

        replace.pl -e pl "(@doc\s+=\s+/api/perl/bin/\w+)\.pl" "$1.html" .

Filtering

Use the -e option to process only files with a particular extension. The above example shows how this is done for files ending in pl. Note that you can never operate on .cmp files, since that extension is reserved for the copy of the original file made when changes are applied. Hence, whether you use -e or not, files ending in .cmp are never processed.
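The filtering rules can be sketched with File::Find, which the REQUIRES section lists. The helper candidate_files is hypothetical, and the exact matching rules of the real script may differ; this is only an illustration of the behavior described above.

```perl
use strict;
use warnings;
use File::Find qw(find);

# Hypothetical helper: collect the files under @roots that would be
# processed, optionally restricted to one extension (the -e value).
sub candidate_files {
    my ($ext, @roots) = @_;
    my @files;
    find({
        no_chdir => 1,                 # $_ then holds the full path
        wanted   => sub {
            return unless -f $_;
            return if /\.cmp\z/;       # backup copies are never processed
            return if defined $ext && !/\.\Q$ext\E\z/;   # -e filter
            push @files, $_;
        },
    }, @roots);
    @files = sort @files;
    return @files;
}
```

Calling candidate_files('pl', '.') would list every file ending in .pl below the current directory, while passing undef as the extension lists all files except the .cmp backups.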

BUGS

None

AUTHOR

Michael Sorens

VERSION

$Revision: 8 $ $Date: 2006-12-19 21:13:43 -0800 (Tue, 19 Dec 2006) $

SINCE

CleanCode 0.9
