CleanCode logo
sitemap
SEARCH:
NAVIGATION: first page in sectionup one levelnext pagefinal page in section

Guideline CD1: Code refinement

When you write code, don't try to make it perfect the first time around. Get something written and working, then go back and improve it as time and resources allow, and as performance requires. (If 10% of your code takes 90% of the processing time, worrying about some portion other than that 10% does not make a lot of sense for performance reasons.

Generated POD
  =item myFunc
  I<OBJ>->myFunc(I<any>, I<vars>, I<here>)
  Purpose TBD
  =over 4
  =item Parameters:
  C<any> - optional; string; TBD
  C<vars> - optional; string; TBD
  C<here> - optional; string; TBD
  =item Returns:
  Returns TBD
  =back
  =cut

But it may make sense for maintainability! Remember, we're after clean code. Let's take a look at an example. This is a real tool I wrote to help me automate the documentation of Perl files with pod. The most repetitive part of documenting a Perl file (and after all, that's what computers are for!) is documenting each function or method. I prefer a rigorous format, borrowed from Java, which specifies the calling interface, a description, each input parameter, and the return value for each external function. External functions are part of the API for the module, and thus it is vital to fully document these. (I do not apply the same rigor to internal (private) functions, though eventually...)

So in the following program, the main (callable) routine is the process method. This simply passes the contents of each file to the processFile method, which searches for each function definition immediately followed by local variable assignments which define the parameters. The tool, using regular expressions, can recognize forms such as my ($any, $vars, @here) = @_; or my $anyVar = shift; or my @anyList = @_ or my $anyVar = $_[0];. Once each function name and its arguments are collected, a pod skeleton is inserted just before the function's definition by the processFunc method. There's still work to be done by you, the developer, but just the meaty parts: (1) Wherever there's a TBD write something. (2) Check if parameters are optional; if not just delete that word. If so, add an extra invocation line near the top which shows the alternate form. (3) Check the parameter type. If not the default shown (string), change it.

EXAMPLEJavaScript
First pass:
process();

sub process {
	local $/ = undef;
	my $multipleFiles = ($#ARGV > 0);
	while (my $text = <>) {
		print STDERR "$ARGV...\n" if $multipleFiles;
		processFile(\$text);
	}
}

sub processFile {
	my $text = ${$_[0]};
	my $oof = $text =~ /sub\s+new\b/;
	my ($subline, $name);
	foreach (split("\n", $text)) {
		if ($name) {
			if (/^(\s*my\s*\((.*)\)\s*=\s*\@_\s*;.*)/) {
				print processFunc($name, $oof, split(/\s*,\s*/, $2));
				print "$subline\n$1";
			}
			elsif (/^(\s*my\s*(\S*)\s*=\s*shift\b.*)/) {
				print processFunc($name, $oof, $2);
				print "$subline\n$1";
			}
			else {
				print processFunc($name, $oof);
				print $subline;
			}
			($subline, $name) = (undef, undef); # clear for next round
		}
		elsif (/^(\s*sub\s+(\w+)\s*{.*)/) {
			($subline, $name) = ($1, $2);
		}
		else {
			print
		}
		print "\n";
	}
}

sub processFunc {
  my ($name, $oof, @args) = @_;
  my $ooffix = $oof ? "I<OBJ>->" : "";
  my $argList = join(", ", map { /.(.*)/ && "I<$1>" } @args);
  my $argList2 = join("\n\n", map { /.(.*)/ && "C<$1> - optional; string; TBD" } @args);
  my $paramSection = (@args) ?
    "=over 4\n\n=item Parameters:\n\n$argList2\n\n=item Returns:\n\nReturns TBD\n\n=back\n\n" : "";
  return "=item $name\n\n$ooffix$name($argList)\n\nPurpose TBD\n\n$paramSection=cut\n\n";
}

I draw your attention to the processFunc method. The other routines are reasonably clear... assuming you are on intimate terms with regular expressions, that is! More on that later. As mentioned earlier, processFunc is what creates the pod block. But it is not at all clear how it is doing it. The fact that the template it is using is part string, part mapping and other code chunks all intertwined, makes it quite difficult to make changes down the road. Heck, I wrote this a couple days ago and I would have to stare at it to remember how it works! So let's look at the next version.

EXAMPLEJavaScript
Second pass:
  use HTML::Template;
  my $tmpl = HTML::Template->new(filehandle => *DATA, loop_context_vars => 1);
  process();

  sub process {
	local $/ = undef;
	my $multipleFiles = ($#ARGV > 0);
	while (my $text = <>) {
		print STDERR "$ARGV...\n" if $multipleFiles;
		processFile(\$text);
	}
  }

  sub processFile {
	my $text = ${$_[0]};
	my $oof = $text =~ /sub\s+new\b/;
	my ($subline, $name);
	foreach (split("\n", $text)) {
		if ($name) { # my ($abc, @def) = @_;
			if (/^(\s*my\s*\((.*)\)\s*=\s*\@_\s*;)/) {
				print processFunc($name, $oof, split(/\s*,\s*/, $2));
				print "$subline\n$1";
			}
			elsif (/^(\s*my\s*(\S*)\s*=\s*(shift|\@_|\$_)\b)/) {
				# "my $abc = shift" or "my @abc = @_" or "my $abc = $_[0]"
				print processFunc($name, $oof, $2);
				print "$subline\n$1";
			}
			else { # no params
				print processFunc($name, $oof);
				print "$subline\n$_";
			}
			($subline, $name) = (undef, undef); # clear for next round
		}
		elsif (/^(\s*sub\s+(\w+)\s*{.*)/) {
			($subline, $name) = ($1, $2);
		}
		else {
			print
		}
		print "\n";
	}
  }

  sub processFunc {
	my ($name, $oof, @args) = @_;
	$tmpl->param(name => $name, oom => $oof);

	# Generate a list of parameters, but strip off the leading $, @, or %
	$tmpl->param(items => [ map { /.(.*)/ && { item => $1 } } @args ]);

	return $tmpl->output();
  }
  __DATA__
  =item <TMPL_VAR name>

  <TMPL_IF oom>I<OBJ>-></TMPL_IF><TMPL_VAR name>(<TMPL_LOOP 
 	 items>I<<TMPL_VAR item>><TMPL_UNLESS __LAST__>, </TMPL_UNLESS></TMPL_LOOP items>)

  Purpose TBD
  <TMPL_IF items>
  =over 4

  =item Parameters:

  <TMPL_LOOP items>C<<TMPL_VAR item>> - optional; string; TBD

  </TMPL_LOOP items>=item Returns:

  Returns TBD

  =back
  </TMPL_IF items>
  =cut

This version is quite a bit cleaner, and actually a bit shorter. Here we separate the template from the code. The highlighted line near the top instantiates a template from a template file. In this case, out of laziness I simply included the "file" at the bottom of the program. But it could almost as easily be an external file. Now, the processFunc method has just 3 statements: fill in the function name, fill in the parameter names and return the filled-in template. The template, then, has some processing instructions and place holders. Granted there's one line there that looks a bit messy, but it is easy to understand once you know the language. And the rest of the template is quite illuminating. I've highlighted the elements in the Perl code which are filling in the elements in the template. See the documentation for HTML::Template for further details of template building.

Now let's get back to regular expressions and state information. The power of Perl is that regular expressions are a primitive data type, and you can often use them to reduce your code size dramatically. Or, in our case, to reduce code complexity somewhat. The loop of processFile runs a simple state machine. Starting in state A, each line is scanned; if it contains a "sub" we remember the name and go to state B. In state B, we continue scanning, but now are looking for parameters. Whatever the result of the parameter search, we output the appropriate material, and go back to state A.

You'll see simple regular expressions used above, but with more expressive ones, we can eliminate the state machine, and in fact, eliminate the entire loop! The function becomes just the seven lines shown below.

EXAMPLEJavaScript
Third pass:
sub processFile {
  my $text = ${$_[0]};
  my $oof = $text =~ /sub\s+new\b/; # identify object-oriented-file or not

  # embellish sub xyz { my (vars) = @_; ... }
  $text =~ s[^(\s*)sub(\s+)(\w+)(\s*{\s*my\s*\((.*)\)\s*=\s*\@_\s*;)]
    ["\n".processFunc($3,$oof,split(/\s*,\s*/, $5))."$1XSUBX$2$3$4"]emg;

  # embellish sub xyz { my var = shift OR @_ or $_[n] }
  $text =~ s[^(\s*)sub(\s+)(\w+)(\s*{\s*my\s*(\S+)\s*=\s*(shift|\@_|\$_)\b)]
    ["\n".processFunc($3,$oof,$5)."$1XSUBX$2$3$4"]emg;

  # embellish sub xyz with no parameters
  $text =~ s!^(\s*)sub(\s+)(\w+)!"\n\n".processFunc($3,$oof)."$1XSUBX$2$3"!emg;

  $text =~ s/XSUBX/sub/g; # now put the 'sub' back everywhere
  print $text;
}

I say "just", yes, but there are those monster regular expressions! With some practice, they can be understood readily. For example, the first reads generally as, "Find 'sub' and a word (the function name), a brace, and a 'my (args) = @_'. Then, replace it with a newline, the pod block, and the original matched function header, retaining the spacing but changing 'sub' to 'XSUBX'." The business with the "XSUBX" prevents infinite recursion. At the end we change them all back to plain "sub" words.

Valid XHTML 1.0!Valid CSS!Get CleanCode at SourceForge.net. Fast, secure and Free Open Source software downloads
Copyright © 2001-2013 Michael Sorens • Contact usPrivacy Policy
Usage governed by Mozilla Public License 1.1 and CleanCode Courtesy License
CleanCode -- The Website for Clean DesignRevised 2013.06.30