CleanCode Perl Libraries
Multi-Lingual Library Maintainability


slice - Return a slice of one or more files specified by pattern and offset.


slice options files


--bodyTag | --nobodyTag

Adds html and body tag brackets around the extracted text, i.e. <html><body>...</body></html>.


--titleTag=string

String used to generate a title tag and an h1 tag. Requires the bodyTag option as well. Changes the bracketing tags to: <html><head><title>...</title></head><body><h1>...</h1>...</body></html>.


--startText=string

String (typically an opening HTML fragment) printed preceding each sliced file. See the note on interpolated text below.


--middleText=string

String (typically HTML row/cell tag fragments) printed between each pair of sliced files. Not printed if there is only one file to slice. See the note on interpolated text below.


--endText=string

String (typically a closing HTML fragment) printed following each sliced file. See the note on interpolated text below.


--startPat=pattern

Start extraction with the first occurrence of pattern.


--stopPat=pattern

Stop extraction with the first occurrence of pattern. If omitted, or not found, extracts through end of file.

--startAdj=[!]pattern | integer

If a pattern, adjusts starting line determined by startPat by searching forward (or backward with ! prefix). If a number, adjusts the starting line by the number (positive or negative).

--stopAdj=[!]pattern | integer

If a pattern, adjusts ending line determined by stopPat by searching forward (or backward with ! prefix). If a number, adjusts the ending line by the number (positive or negative).


After slicing by rows via the various start and stop options, you may additionally slice by columns by specifying a pattern to match within each line. If omitted, entire line is returned as part of the extraction. If included, you must include exactly one subexpression group (with parentheses) to grab a piece of text; otherwise, you'll just get a count of what was matched. If the pattern fails, the entire line is skipped (i.e. you do not get the original line, nor a blank line--you get no line!).

--verbose | --noverbose

If true, prints info about matched line numbers.


One or more files to slice. If no file specified, reads from STDIN.
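The column slicing described above (a pattern applied within each extracted line) can be illustrated with a short Python sketch. This is not the tool's Perl implementation; the function name slice_columns and its parameter col_pat are hypothetical, chosen only for this example. The sketch rejects a pattern that does not have exactly one group, whereas the tool itself is documented to return a match count in that case.

```python
import re

def slice_columns(lines, col_pat):
    """Keep only the text captured by the single group in col_pat.

    Lines that do not match are dropped entirely (no blank line is kept),
    mirroring the behavior described for the column pattern.
    """
    regex = re.compile(col_pat)
    if regex.groups != 1:
        # Assumption: this sketch refuses such patterns outright.
        raise ValueError("pattern must contain exactly one capture group")
    kept = []
    for line in lines:
        match = regex.search(line)
        if match:
            kept.append(match.group(1))
    return kept
```

For instance, applying the pattern id=(\d+) to three lines where only the first and third contain an id field yields two captured values and silently drops the middle line.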


Perl5.005, Getopt::Long, Data::Handy, Array::Slice


Slice extracts a piece of a text file (or a set of files). It was named after the analogous array slice concept in Perl. If you think of a text file as an array of lines, slice returns an array slice of that array, but rather than specifying by line number, you specify by pattern (i.e. regular expression).

startPat and stopPat are the main selection patterns that define a range within a file. Each matches the first occurrence of its pattern in the file. You may refine the range with startAdj and stopAdj, which offset a range boundary either forward or backward. startAdj and stopAdj may be patterns or signed integers. A pattern p moves the range boundary forward, while !p moves it backward (i.e. prefix the pattern with a "!"). Similarly, a positive integer moves the boundary forward; a negative integer moves it backward. (All of these movements are by line.)
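The range selection just described can be sketched in Python as follows. This is an illustration only, not the tool's Perl code. Several details are assumptions not stated in this document: the stop pattern is searched from the start line onward, the stop line is included in the result, a missing start pattern yields nothing, an adjustment pattern is searched beginning at the neighboring line, and an adjustment that finds nothing leaves the boundary unchanged. All function names are hypothetical.

```python
import re

def find_forward(lines, pattern, begin):
    """Index of the first matching line at or after begin, else None."""
    for i in range(max(begin, 0), len(lines)):
        if re.search(pattern, lines[i]):
            return i
    return None

def find_backward(lines, pattern, begin):
    """Index of the first matching line at or before begin, else None."""
    for i in range(min(begin, len(lines) - 1), -1, -1):
        if re.search(pattern, lines[i]):
            return i
    return None

def adjust(lines, index, adj):
    """Move a boundary by a signed integer, a pattern, or a !pattern."""
    if adj is None:
        return index
    if re.fullmatch(r"[+-]?\d+", adj):
        moved = index + int(adj)                           # signed line offset
    elif adj.startswith("!"):
        moved = find_backward(lines, adj[1:], index - 1)   # search backward
    else:
        moved = find_forward(lines, adj, index + 1)        # search forward
    if moved is None:
        return index  # assumption: a failed search leaves the boundary alone
    return max(0, min(moved, len(lines) - 1))              # clamp to the file

def slice_lines(lines, start_pat=None, stop_pat=None,
                start_adj=None, stop_adj=None):
    """Return the lines between the (adjusted) start and stop boundaries."""
    if not lines:
        return []
    start = 0
    if start_pat is not None:
        start = find_forward(lines, start_pat, 0)
        if start is None:
            return []  # assumption: no start match means nothing to extract
    stop = len(lines) - 1  # default: through end of file
    if stop_pat is not None:
        found = find_forward(lines, stop_pat, start)
        if found is not None:
            stop = found
    start = adjust(lines, start, start_adj)
    stop = adjust(lines, stop, stop_adj)
    return lines[start:stop + 1]
```

Adjustments are passed as strings, as they would arrive from a command line, so "-1" is an integer offset while "!tr" is a backward pattern search.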

When this program is used with a web page, one would generally lose the proper HTML structure by extracting a middle section. The command-line options bodyTag, titleTag, startText, middleText, and endText provide some correction for this.
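As a rough Python illustration of how these options might combine, the sketch below wraps each slice in the start and end text, joins slices with the middle text, and optionally adds the HTML brackets. The exact output order used by the tool is an assumption here, and wrap_html with its parameters is a hypothetical name for this example only.

```python
def wrap_html(slices, body_tag=False, title=None,
              start_text="", middle_text="", end_text=""):
    """Assemble sliced text from one or more files into one output string."""
    pieces = []
    for i, text in enumerate(slices):
        if i > 0:
            pieces.append(middle_text)  # only between pairs of files
        pieces.append(start_text + text + end_text)
    body = "".join(pieces)
    if not body_tag:
        return body  # the title is ignored unless body_tag is set
    if title is not None:
        return ("<html><head><title>" + title + "</title></head>"
                "<body><h1>" + title + "</h1>" + body + "</body></html>")
    return "<html><body>" + body + "</body></html>"
```

With a single slice the middle text never appears, which matches the note that it is not printed when only one file is sliced.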

Interpolated Text

The startText, middleText, and endText command-line options are subject to text interpolation as follows. Instances of \n and \t are converted to actual newlines and tabs, respectively.

<FILE_PATH> is replaced with the full current file specification.

<FILE_NAME> is replaced with the current file name (i.e. no path).

<FILE_BASE> is replaced with the base name (i.e. no path or extension).
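The substitutions listed above can be sketched in Python as follows. This is an illustration under assumptions, not the tool's own code: the function name interpolate is hypothetical, the "full file specification" is taken to be the path exactly as given, and the extension is taken to be the final dot-suffix of the file name.

```python
import os

def interpolate(text, file_path):
    """Expand \\n, \\t and the <FILE_...> placeholders in an option string."""
    name = os.path.basename(file_path)   # file name without directories
    base = os.path.splitext(name)[0]     # file name without extension
    # Convert the two-character escapes first, so that a backslash inside
    # a substituted path is never mistaken for an escape sequence.
    text = text.replace("\\n", "\n").replace("\\t", "\t")
    return (text.replace("<FILE_PATH>", file_path)
                .replace("<FILE_NAME>", name)
                .replace("<FILE_BASE>", base))
```

For a file given as /data/p01.htm, the three placeholders would expand to /data/p01.htm, p01.htm, and p01 respectively.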


Example for market guide screen:

 % --bodyTag \
        --startText="<table>\n" --endText="\t</td></tr>\n</table>\n" \
        --startPat="Total Match" --stopPat="colspan=10" \
        --startAdj=!tr --stopAdj="colspan=10" < input.htm

Example for a series of pages stored in files p01.htm through p14.htm:

 % perl -I/mydocu~1/ms/devel/perl
        --startText="<table>\n" --endText="</table>\n" --bodyTag
        --titleTag="Melbourne E-Book Listing"
        --startPat="search results" --stopPat=zone --startAdj=9 --stopAdj=-7 p*.htm




Michael Sorens


$Revision: 1178 $ $Date: 2011-10-31 14:26:51 -0700 (Mon, 31 Oct 2011) $


CleanCode 0.9



CleanCode Perl Libraries Copyright © 2001-2013 Michael Sorens - Revised 2013.06.30