CleanCode Perl Libraries

NAME

Data::InputOptions - Manages a collection of data inputs and configuration properties.

SYNOPSIS

  use Data::InputOptions;
  
  # use command-line and configuration file data
  $settings = Data::InputOptions->new(\@cmdLineData, \@configData);
  # use command-line and configuration file name
  $settings = Data::InputOptions->new(\@cmdLineData, "settings.conf");
  # casual use--pass the command line arguments directly
  $settings = Data::InputOptions->new(\@ARGV);
  # casual use--pass the CGI input
  $settings = Data::InputOptions->new(CGI->new());
  # also check for required data and legal config properties
  $settings = Data::InputOptions->new(
      $cmdLineData, "settings.conf", $requiredFields, $paramMap);
  
  # retrieve property; use supplied value as default if not defined
  my $prompt = $settings->getProperty("PROMPT", "% ");
  # retrieve property; no default
  my $prompt = $settings->getProperty("PROMPT");
  # retrieve data
  my $index = $settings->getData("indexValue");
  
  # other methods
  if ($settings->isProperty($itemName)) { ... }
  $settings->addProperty("dir", "/foo");
  $settings->addData("dir", "/bar");  # property and data are distinct!
  $configKeys = $settings->getKeySet();
  $dataKeys = $settings->getDataKeySet();

REQUIRES

Perl 5.005, Getopt::ArgvFile, Data::Handy, File::Handy, Data::Diagnostic, Data::DumperAbbrev

DESCRIPTION

Feature Summary

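This module provides a flexible approach to processing program inputs, separated into configuration properties (settings, options, and switches that specify how the target program should operate) and data inputs (values on which to operate), whether the inputs are from an interactive command, a configuration file, or a web server invocation. The main features are:

  Manages command-line and configuration file properties.
  Designed for interactive use as well as via web services.
  Supports multiple independent property sets.
  Separates input data from configuration properties.
  Supports indirect command-line properties via in-place expansion.
  Configuration file control to enable command-line overrides.
  Optional password access on command-line overrides.
  Provides convenient casual-mode if no configuration file specified.
  Checks for validity of specified property names.
  Checks that all required data items are provided.
  Reports on properties and data provided.

Each feature is described in the following text.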

Manages command-line and configuration file properties.

Properties are used to specify the configuration of your program. The Diagnostic module for example uses properties to, among other things, specify a diagnostic mask, to specify output channels, and to specify diagnostic levels for different events. Typically these settings do not change very often and are stored in a configuration file read by your program during startup. So you pass the name of this configuration file to the InputOptions constructor to process the settings it contains.

But sometimes you may want to vary certain properties under different conditions. Therefore InputOptions also supports properties from the command line. These properties may be ones already in the configuration file, or they may be new ones. The configuration file is processed before the command line, so you might think of it as establishing defaults for the properties it contains. Then any command-line configuration properties of the same name override those defaults (if allowed), so you can modify properties with each invocation if desired. Note that you may use either command-line properties or configuration file properties, or both, at your discretion.
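The precedence rule can be pictured with a short stand-alone sketch. It does not use InputOptions itself; the helpers parse_pairs and merge_settings are hypothetical names invented for illustration, and the sketch ignores the access control described later.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of "name=value" strings into a hash.
# A later definition of a name replaces an earlier one.
sub parse_pairs {
    my (@items) = @_;
    my %props;
    for my $item (@items) {
        my ($name, $value) = split /=/, $item, 2;
        $props{$name} = $value;
    }
    return %props;
}

# Configuration entries are applied first, so they act as defaults;
# command-line entries are applied second and win on a name clash.
sub merge_settings {
    my ($config, $cmd_line) = @_;
    return parse_pairs(@$config, @$cmd_line);
}

my %settings = merge_settings(
    [ "PROMPT=% ", "TIMEOUT=30" ],    # as if read from the configuration file
    [ "TIMEOUT=5", "VERBOSE=1" ],     # as if given on the command line
);
print "$_=$settings{$_}\n" for sort keys %settings;
```

Here TIMEOUT ends up with the command-line value, PROMPT keeps its configuration file value, and VERBOSE is a new property that exists only on the command line.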

Designed for interactive use as well as via web services.

As mentioned in the previous section, your command line may contain properties that will override any configuration file properties. Note that the term "command line" is used broadly here. It applies whether your program is used interactively or invoked by a web server, where the "command line" is really the list of form (or other) parameters passed from a web page. The InputOptions class also allows you to mix both properties and input data on the command line. From a web page, then, you may include form data that is entered by the user, and properties that you have stored on the form, perhaps as hidden parameters. These properties are distinguished from the user data by naming conventions, described in a later section.

An important use of this is for problem diagnosis. You could use a variety of mechanisms, but the CleanCode Diagnostic module is well-suited for this. Included in your configuration file you would typically specify no diagnostic output during the normal execution of your program. But then you run into some unexplained behavior. Just from the command line you may vary the diagnostic settings to display different information that may help you identify the problem. This will even work through a web browser, since the Diagnostic module can attach diagnostic output directly onto the target web page.

Supports multiple independent property sets.

A configuration file can store more than one value for each configuration property, selectable from the command line. Say, for example, that during the week you wish to use properties A=5, B=6, and C=xyz, while during the weekend you want to run the same program with A=24, B=all, and C="--". Each set of properties is assigned a label of your choice called a mode. You could use "weekday" and "weekend" to designate the two modes, for instance. On the command line you simply add a property MODE=weekday (or MODE=weekend) to access the property set you want. In the configuration file, you must then have qualified names for each of these properties, as in A:weekday=5 and A:weekend=24, etc.

A very handy use of this capability is for diagnosing problems. Let's say you have instrumented your program with diagnostic output perhaps using the CleanCode Diagnostic module. For regular users, you do not want any diagnostic output to be displayed, so you would have DIAG_LEVEL=0 in the configuration file. But you will also add DIAG_LEVEL:debug1=0x02 and DIAG_LEVEL:debug-all=0xfff. Then whenever you choose--without making any modifications to your program or configuration file, you may invoke your program with diagnostic output simply by specifying MODE=debug1 for a little bit of stuff, or MODE=debug-all for a lot of stuff. Note in this example that not all property names must be qualified; the normal-user case has an unqualified name so no MODE property is needed to activate it.

The unqualified name is the default or fallback value for a given property. If you specify a mode for which a property has no qualified entry, that property will inherit the unqualified property value (if any). So if you have A:testmode=5 and A=10 but you specify MODE=other, then A will be assigned the value 10. If, however, you have A:mode1=5 and A:mode2=10 but you specify MODE=other, then A will have no value, since there is no matching qualified property and there is no fallback or default property.

One caution: properties in the configuration file are used in the order listed in the file. So if you again have A:testmode=5 and A=10 -- in that order -- and you specify MODE=testmode, then A will be assigned 10, not 5 as you might have wanted. While 5 was assigned first when A:testmode=5 was processed, the next definition, A=10, was processed afterwards, and it also matches, being an unqualified name. Simply putting unqualified names first will alleviate this problem, ensuring that your qualified values will always be used. This is not a bug, but rather a feature. Sometimes you want to disallow mode switching, perhaps for security reasons. Simply putting the unqualified names last will accomplish this.
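The mode rules, including the ordering caution, can be sketched in a few lines of plain Perl. This is only an illustration of the behavior described above, not the module's actual code; resolve_mode is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical sketch: resolve "NAME=value" and "NAME:mode=value" entries
# for one selected mode. Entries are applied in file order, so a later
# matching entry replaces an earlier one.
sub resolve_mode {
    my ($mode, @entries) = @_;
    $mode = '' unless defined $mode;
    my %props;
    for my $entry (@entries) {
        my ($name, $value) = split /=/, $entry, 2;
        my ($base, $qualifier) = split /:/, $name, 2;
        # An unqualified name always matches; a qualified name matches
        # only when its qualifier equals the selected mode.
        next if defined $qualifier && $qualifier ne $mode;
        $props{$base} = $value;
    }
    return %props;
}

my %first = resolve_mode("testmode", "A=10", "A:testmode=5");
my %last  = resolve_mode("testmode", "A:testmode=5", "A=10");
print "unqualified name first: A=$first{A}\n";   # the qualified value survives
print "unqualified name last:  A=$last{A}\n";    # the fallback replaces it
```

With the unqualified entry listed first, MODE=testmode yields the qualified value; with it listed last, the unqualified value wins regardless of mode, which is the mode-locking effect mentioned above.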

Separates input data from configuration properties.

InputOptions processes inputs into two categories, configuration properties and input data. An item is categorized according to the form of its name. A property name must either be a standard identifier (letters, digits, underscore) that begins with a letter and has no lowercase characters, or begin with the POSIX-standard double-hyphen. Data is anything else. So properties might be DIAG_LEVEL=5, SETTING1=abc, or --foo=bar, while data might be Name=Fred or phone=555-1212.
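As a rough illustration, the naming rule can be written as a single regular expression. The function looks_like_property below is a hypothetical stand-in, not the module's isProperty method, and it covers only the plain naming rule just stated (no mode qualifiers, no negated forms).

```perl
use strict;
use warnings;

# Hypothetical sketch of the naming rule: an uppercase identifier that
# starts with a letter, or any name introduced by a double hyphen.
sub looks_like_property {
    my ($name) = @_;
    return $name =~ /^(?:[A-Z][A-Z0-9_]*|--.+)\z/ ? 1 : 0;
}

for my $item ("DIAG_LEVEL=5", "SETTING1=abc", "--foo=bar",
              "Name=Fred", "phone=555-1212") {
    my ($name) = split /=/, $item, 2;
    printf "%-15s %s\n", $item,
        looks_like_property($name) ? "property" : "data";
}
```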

Command-line options come in several flavors. According to the Perl Getopt::Long API, "Historically, they are preceded by a single dash -, and consist of a single letter." This single-hyphen style is not supported by InputOptions. Again from Getopt::Long, "Due to the very cryptic nature of these options, another style was developed that used long names. So instead of a cryptic -l one could use the more descriptive --long." This long style is supported by InputOptions, as indicated above.

The simplest of the long-style options is the boolean style, with no argument, as in --debug. This is equivalent to this explicit form with an argument: --debug=true [Java] or --debug=1 [Perl]. (Perl users will realize that any non-false value will do here.) You may also negate the boolean style, as in --noDebug. This is equivalent to --debug=false [Java] or --debug=0 [Perl]. This negation technique may also be applied to the other (non-POSIX) InputOptions property format, mentioned earlier, using no lowercase letters. So QUIET=false [Java] or QUIET=0 [Perl] can also be written as noQUIET. Pay particular attention to the capitalization change from --debug=false to --noDebug and the lack of a capitalization change from DEBUG=false to noDEBUG.
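The equivalences above (Perl flavor, using 1 and 0) can be summarized in a small normalizing function. This is a sketch under stated assumptions: normalize_switch is a hypothetical name, and the sketch treats "no" as a negation only when an uppercase letter follows it, which is an inference from the capitalization note above rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical sketch: rewrite boolean-style arguments into the explicit
# name=value form described above.
sub normalize_switch {
    my ($arg) = @_;
    return $arg if $arg =~ /=/;                   # explicit value: unchanged
    if ($arg =~ /^--no([A-Z])(.*)\z/) {           # --noDebug -> --debug=0
        return "--" . lc($1) . $2 . "=0";
    }
    if ($arg =~ /^no([A-Z][A-Z0-9_]*)\z/) {       # noQUIET   -> QUIET=0
        return "$1=0";
    }
    if ($arg =~ /^--.+\z/) {                      # --debug   -> --debug=1
        return "$arg=1";
    }
    return $arg;                                  # anything else: unchanged
}

print normalize_switch($_), "\n" for qw(--debug --noDebug noQUIET --x1=4);
```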

Note that InputOptions does not directly support multiple values for a property. However, you can certainly include whatever token separator you wish (e.g. a comma) and then split the argument on that token yourself.
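For instance, a comma-separated value could be split as follows. The helper name split_values and the sample value are made up for illustration; in a real program the string would come from getProperty.

```perl
use strict;
use warnings;

# Hypothetical helper: split one property value into a list on commas,
# discarding whitespace around each item.
sub split_values {
    my ($value) = @_;
    $value =~ s/^\s+//;
    $value =~ s/\s+\z//;
    return split /\s*,\s*/, $value;
}

# In practice the string would be retrieved with getProperty.
my @hosts = split_values("alpha, beta ,gamma");
print scalar(@hosts), " values: @hosts\n";
```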

Supports indirect command-line properties via in-place expansion.

Sometimes you may have a lot of properties you want to specify on the command line. Perhaps you are not using a configuration file at all, or you have several properties you want to override, or you have several new properties not in the configuration file. Whatever the reason, you may run into command line length limitations from whatever operating system shell you're using, or you may just not want to type a lot. InputOptions supports in-place expansion of command-line properties using the "@" prefix on an argument. So, for example, if you have --x1=2, --x2=4.3, and --x3=true in a file called inline.conf, then you could mix properties on the command line as in

program_name --x1=4 @inline.conf --x4=0.5

which would be interpreted just as if it had been written

program_name --x1=4 --x1=2 --x2=4.3 --x3=true --x4=0.5

Observe that the x1 property is then given twice; the last one will be used, so the value will be 2, not 4.

So this is an alternate type of configuration file, a pseudo-configuration file, if you will. With this pseudo-configuration file though, properties do not necessarily get processed first; it depends on the order of properties on your command line.
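The in-place expansion can be approximated by the following sketch. It is not how InputOptions does it internally; expand_args is a hypothetical function, and it assumes the file holds whitespace-separated arguments with optional full-line comments.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sketch: replace each "@filename" argument, in place, with
# the whitespace-separated arguments read from that file.
sub expand_args {
    my (@args) = @_;
    my @expanded;
    for my $arg (@args) {
        if (my ($file) = $arg =~ /^\@(.+)\z/) {
            open my $fh, '<', $file or die "Cannot open $file: $!";
            while (my $line = <$fh>) {
                next if $line =~ /^\s*#/;          # skip full-line comments
                push @expanded, split ' ', $line;
            }
            close $fh;
        }
        else {
            push @expanded, $arg;
        }
    }
    return @expanded;
}

# Demonstration with a temporary file standing in for inline.conf.
my ($out, $path) = tempfile(UNLINK => 1);
print $out "--x1=2 --x2=4.3 --x3=true\n";
close $out;

my @argv = expand_args("--x1=4", "\@$path", "--x4=0.5");
print "@argv\n";
```

Because the expanded arguments land exactly where the @ argument stood, the later --x1=2 from the file follows --x1=4 and therefore takes effect.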

Configuration file control to enable command-line overrides.

When using a true configuration file, you'll recall that configuration file properties will be processed first, then the command-line properties will be processed. If any of these have the same names as those from the configuration file, the command-line item will override the configuration item. On the positive side, this allows for dynamic control of your configuration. On the negative side, this allows any user the potential to change your runtime parameters. This is particularly insecure when your application serves up web pages, so InputOptions provides a simple access control mechanism to limit access to your runtime parameters.

Your configuration file must specify ALLOW_CMDLINE=1 to permit command-line parameters to be used; otherwise any command-line parameters will just be silently ignored. So if you wish to diagnose a problem by varying certain parameters from a web browser, turn the switch on in your configuration file, do your testing, then turn it back off.

Optional password access on command-line overrides.

For added security, you may also specify that a password is required for accepting parameters from the command line. In your configuration file, specify a value for CMDLINE_PASSWORD. Then command-line parameters will only be used if the command-line parameter CP is supplied with the same value. If you have required a password with this mechanism and the command-line CP does not match the config file CMDLINE_PASSWORD, a warning message will be emitted, indicating the password did not match, and again any command-line parameters will be ignored. The warning is emitted via the Diagnostic.warnPrint method, so it will be output to the web page or to the error log, depending on how you have configured the Diagnostic module.
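Taken together, the two access controls behave roughly like the gatekeeper sketched below. The function allowed_overrides is hypothetical, it uses Perl's built-in warn in place of the Diagnostic module's warnPrint, and dropping CP from the accepted list is an assumption rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical sketch: return the command-line items that may be applied,
# given a hash reference of configuration file properties.
sub allowed_overrides {
    my ($config, @cmd_line) = @_;

    # Without ALLOW_CMDLINE in the configuration, nothing is accepted.
    return () unless $config->{ALLOW_CMDLINE};

    if (defined $config->{CMDLINE_PASSWORD}) {
        # Pick up the value of the CP parameter, if one was supplied.
        my ($supplied) = map { /^CP=(.*)\z/s ? $1 : () } @cmd_line;
        if (!defined $supplied || $supplied ne $config->{CMDLINE_PASSWORD}) {
            warn "Command-line password did not match; overrides ignored\n";
            return ();
        }
    }

    # Assumption: CP itself is not passed on as a property.
    return grep { !/^CP=/ } @cmd_line;
}

my %config = (ALLOW_CMDLINE => 1, CMDLINE_PASSWORD => "sesame");
my @accepted = allowed_overrides(\%config, "DIAG_LEVEL=16", "CP=sesame");
print "accepted: @accepted\n";
```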

Provides convenient casual-mode if no configuration file specified.

A complex system typically would require a configuration file in conjunction with command-line inputs. But a more "casual-mode" program might not need to use a configuration file. So if you only want to use command-line inputs, then neither the ALLOW_CMDLINE nor the CMDLINE_PASSWORD properties are needed. You may just grab the command-line parameters and start using them without hindrance from the more advanced features.

Checks for validity of specified property names.

To assist in ensuring that your program is doing what you expect, this module has an optional facility to check that all properties specified in either your configuration file or on the command line are valid properties, i.e. that they are the properties you are expecting. This can be helpful in anything from identifying simple spelling errors to catching completely incorrect names. If an invalid property name is detected, a warning message is output through the warnPrint method of the Diagnostic module, which directs the final output to a channel or channels that you have specified. By default, this will be standard error. Note that this applies only to configuration properties, not to input data.

Checks that all required data items are provided.

To assist in confirming that the user has supplied the necessary data you need for processing (e.g. has the address or the phone number been supplied?), this module has an optional facility to check that all required data items have been supplied. If a data item is missing, a warning message is output through the warnPrint method of the Diagnostic module, which directs the final output to a channel or channels that you have specified. By default, this will be standard error.
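Both checks amount to comparing the received names against lists you supply. The sketch below shows the idea with plain array references; the actual structure of the requiredFields and paramMap constructor arguments is not described here, so check_inputs and its argument layout are purely hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: collect warnings for unknown property names and
# for required data items that were not supplied.
sub check_inputs {
    my ($props, $data, $valid_names, $required) = @_;
    my %valid = map { $_ => 1 } @$valid_names;
    my @warnings;
    push @warnings, "Unknown property: $_"
        for grep { !$valid{$_} } sort keys %$props;
    push @warnings, "Missing required data: $_"
        for grep { !defined $data->{$_} } @$required;
    return @warnings;
}

my @problems = check_inputs(
    { DIAG_LEVEL => 5, DAIG_MASK => 3 },     # properties received (one misspelled)
    { Name => "Fred" },                      # data received
    [ "DIAG_LEVEL", "DIAG_MASK" ],           # names the program expects
    [ "Name", "phone" ],                     # data the program requires
);
print "$_\n" for @problems;
```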

Reports on properties and data provided.

So that you don't have to write your own code to dump program inputs for diagnostic purposes for every program you write, this module has a convenience facility enabled via the Diagnostic module which will display all properties and input data received. Properties are displayed as name and value pairs; input data will display only names, since some of the data might be sensitive (like a credit card number). To get this output, you must explicitly enable the INPUTOPTIONS_DIAG level in the Diagnostic module.

Overview of Operation

The intent is to make it easy to make your program configurable. Thus, you can create a configuration file containing things like: the string used for a prompt, the prefix of an error message, the directory in which to store data files, settings of boolean switches, a time-out interval, default values, and so forth. You can--and should--define settings used by your own classes as well as CleanCode library classes, such as diagnostic levels used by the Data::Diagnostic class. Each library class which uses InputOptions properties will have in its documentation a table which enumerates each property, its type, its default value, and what it is used for.

Invoke the new method during your program initialization, specifying your configuration file and your command-line data. The configuration data may be either an array reference or a configuration file name. Either way, it ultimately points to a list of properties as described next. The command-line data may be either an array reference providing similarly structured data, or a single string for convenience.

Once initialized, simply use getProperty to retrieve a property by name, or getData to retrieve data. [Java only] This method usually returns a String, so there are convenience methods getIntProperty and getBooleanProperty as well. (getProperty could also return an integer or a boolean if you use the appropriate two-argument version).

Configuration File

The configuration file consists of a list of properties, one per line. Each property has the form:

name=value #comments

Whitespace is ignored, as are blank lines and comment lines. The value may be enclosed in double quotes, though the quotes are optional. Note that if you wish to include the comment designator itself (#) in a property value, you must disable comments with the directive .NOCOMMENT on a line by itself. Use the corresponding .COMMENT to re-enable comments afterward.
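To make the file format concrete, here is a minimal reader for lines of this shape. It is a sketch of the rules just described, not the module's parser; parse_config_lines is a hypothetical name, and it assumes that "whitespace is ignored" refers to whitespace around the name and around the value.

```perl
use strict;
use warnings;

# Hypothetical sketch: parse configuration lines into [name, value] pairs,
# honoring the .NOCOMMENT and .COMMENT directives.
sub parse_config_lines {
    my (@lines) = @_;
    my $comments_on = 1;
    my @pairs;
    for my $line (@lines) {
        if ($line =~ /^\s*\.(NO)?COMMENT\s*\z/) {
            $comments_on = defined $1 ? 0 : 1;
            next;
        }
        $line =~ s/#.*//s if $comments_on;      # strip partial or full-line comment
        next unless $line =~ /^\s*([^=\s]+)\s*=\s*(.*?)\s*\z/s;
        my ($name, $value) = ($1, $2);
        $value =~ s/^"(.*)"\z/$1/s;             # remove optional enclosing quotes
        push @pairs, [ $name, $value ];
    }
    return @pairs;
}

my @pairs = parse_config_lines(
    "PROMPT = \"% \"   # prompt string\n",
    "\n",
    ".NOCOMMENT\n",
    "CHANNEL=#ops\n",
    ".COMMENT\n",
    "TIMEOUT=30 # seconds\n",
);
print "$_->[0] => '$_->[1]'\n" for @pairs;
```

The CHANNEL line keeps its # character only because comments were switched off around it.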

The system allows flexibility in defining and redefining properties.

Command Line

The command line parameters must generally be in the same form as above:

name1=value1 name2=value2 ...

except, of course, that all are on a single line with appropriate shell quoting. If you don't need the command line argument list for anything else, you can simply pass the argument list of main to new. You may also specify a set of command-line inputs from a file using command-line expansion (@filename). You may freely intermix command-line parameters with command-line expansion. So, the command-line actually consists of any combination of name-i=value-i and @filename-i. A command-line expansion file is not the same as a configuration file, however. The former is subject to access control constraints described elsewhere, does not support the .COMMENT/.NOCOMMENT directives, and does not support partial-line comments. (You may use full line comments.) InputOptions also provides a convenient way to pass a configuration file to your program via the command-line with a predefined parameter: --file=filename. This will only work if you do not explicitly pass a configuration file to the InputOptions constructor. So this is more for command-line programs rather than server applications.

Using command line and configuration file parameters

There are three combinations to consider: command-line parameters only, configuration file parameters only, or both.

The typical scenario for using both command-line parameters and configuration file parameters would be with a web application. You have a web page which invokes a CGI program, sending along data which can be either data (form values) or control information used as command-line parameters to InputOptions. Upon startup, your CGI program creates an InputOptions object, passing a reference to its configuration file, plus a CGI object from which to extract the command-line parameters.

The above discussion focussed on using both command-line parameters and a configuration file. If you do not use a configuration file, then the access control mechanism is not used; hence, all supplied command-line parameters will be used. Finally, if you do not use any command-line parameters, then there is no access-control issue.

Other Considerations

Depending on your needs, you should consider other techniques as well: Java provides the System.getProperty method whereby you can access a limited (platform-independent) set of environment properties as well as command-line properties defined with the -D option. Perl, on the other hand, provides access to all environment properties via the %ENV hash. C operates similarly, with the getenv function. (In fact, Java also has a System.getenv method; it was deprecated as non-portable in early releases and reinstated in Java 5.) Perl provides the getopts function (in Getopt::Std) and the Getopt::Long module for parsing command-line options with great flexibility, and comparable getopt-style libraries are available for Java. Perl also provides assorted Config modules.

CLASS VARIABLES

$VERSION

Current version of this class.

CONSTRUCTOR

new

Package->new(cmdLineData, configData)

Package->new(cmdLineData)

Initializes the InputOptions property list from the specified configuration file or data with any command-line overrides.

The cmdLineData and configData lists may have either of two formats:

a list of string definitions

[ "name1=val1", "name2=val2", ... ]

a list of arrays of name-value pairs

[ [ "name1", "val1" ], [ "name2", "val2" ], ... ]
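Since both formats carry the same information, either can be reduced to a common list of name-value pairs. The following sketch shows one way to do that; to_pairs is a hypothetical helper and is not part of the InputOptions interface.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either list format and return a uniform
# list of [name, value] array references.
sub to_pairs {
    my ($list) = @_;
    return map { ref $_ eq 'ARRAY' ? [ @$_ ] : [ split /=/, $_, 2 ] } @$list;
}

my @from_strings = to_pairs([ "name1=val1", "name2=val2" ]);
my @from_arrays  = to_pairs([ [ "name1", "val1" ], [ "name2", "val2" ] ]);
print "$_->[0] -> $_->[1]\n" for @from_strings, @from_arrays;
```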

Parameters:

cmdLineData - a list of command-line overrides or a CGI object containing command-line overrides

configData - optional; a list of configuration settings, or a file name containing a list of configuration settings

METHODS

configFileLoaded

OBJ->configFileLoaded()

This indicates to other modules whether a config file is being used. This allows, for example, Diagnostic to know it's ok to open log files. Some applications, though, might be just casual users and not want a config file or a log.

Returns:

boolean indicating whether a config file has been loaded

isProperty

PACKAGE->isProperty(name)

Determines if a name qualifies as an InputOptions property. Input items may be either configuration properties (retrievable with the getProperty method) or input data (retrievable with the getData method). Names must conform to a specific pattern to be matched as a property, either an all-capital word, or a word beginning with two hyphens. In the latter case, one must pass in a string containing the two hyphens; otherwise, it could not be distinguished from a data item. Note that for getProperty, however, the two hyphens must not be included.

Parameters:

name - string; name to check

Returns:

boolean indicating whether name represents a property

getProperty

OBJ->getProperty(key, defaultVal)

OBJ->getProperty(key)

Returns a value for the specified property. If the property is not defined, the default value, if any, is used to define the property first.

Parameters:

key - string; name of property to retrieve

defaultVal - optional; string; default value to return if no property

Returns:

value for the specified property

getData

OBJ->getData(key, defaultVal)

OBJ->getData(key)

Returns a value for the specified datum. If the value is not defined, the default value, if any, is used to define it first.

Parameters:

key - string; name of datum to retrieve

defaultVal - optional; string; default value to return if no datum

Returns:

value for the specified datum

addProperty

OBJ->addProperty(key, val)

Adds a value for the specified configuration property.

Parameters:

key - string; name of property

val - string; value to set

addData

OBJ->addData(key, val)

Adds a value for the specified data item.

Parameters:

key - string; name of datum

val - string; value to set

getKeySet

OBJ->getKeySet()

Returns a set of configuration property names.

Returns:

set of property names

getDataKeySet

OBJ->getDataKeySet()

Returns a set of data item names.

Returns:

set of data item names

STANDALONE TESTING

You can work with the InputOptions class in isolation to get a feel for what it does, using the main function. Invoke the class as a main program, calling the function from the command-line as in:

        perl -mData::InputOptions -e "Data::InputOptions::Test::main()"

The argument choices are:

  main(?LIST) -- displays recommended test sequence
  main(name-test)  -- displays diagnostics for InputOptions::Test class
  main(name-inputoptions)  -- displays diagnostics for InputOptions class
  main(no-init) -- no diagnostics, as $settings not used by Data::Diagnostic
  main(validate) -- checks validation of parameters by paramMap
  main(all) -- displays diagnostics for all classes
  main(comment) -- checks for active/inactive comment char
  main(mode1 - mode4) -- 4 tests checking mode selection of params
  main(cmd-override) -- override config value with cmdline value
  main(cmd-override-ignored) -- ignore cmdline override due to no request flag 
  main(cmd-override-pwd) -- override config value with password required
  main(cmd-override-pwd-invalid) -- ignore override due to invalid password
  main("a=x","b=y",...) -- uses the specified command-line settings
  main("filename","a=x","b=y",...)
        -- reads config file, and overrides with the command-line settings

(Note that if you use a configuration file, to actually see any output you must at least turn on the bit for STDERR with OUTPUT_DIAG=2.)

Try this, for example: create a configuration file called settings.conf which defines INPUTOPTIONS_DIAG=16. Run this:

        perl -mData::InputOptions -e "Data::InputOptions::Test::main()" \
                settings.conf DIAG_LEVEL=16

to see the diagnostics for InputOptions, or use any other value for DIAG_LEVEL (which does not contain the bit for the value 16) to suppress them.

BUGS

None

AUTHOR

Michael Sorens

VERSION

$Revision: 380 $ $Date: 2008-08-07 07:02:51 -0700 (Thu, 07 Aug 2008) $

SINCE

CleanCode 0.9

SEE ALSO

Data::Diagnostic



CleanCode Perl Libraries Copyright © 2001-2013 Michael Sorens - Revised 2013.06.30