Processing Command-Line Options With Perl

Add support for command-line options to your Perl program.

Shell Games

If you've ever used a shell (UNIX or MS-DOS) before, you're probably already familiar with command-line options - they're the little arguments you pass to a program you're executing on the command line in order to modify its behaviour. For example, if you're deleting a directory, you could add a parameter to the command line to tell the system to delete all sub-directories under it as well or, if you're retrieving a directory listing with the "ls" command, you could add the "-l" command-line option to obtain a detailed listing instead of the abbreviated version.

Now, I'm not going to wax eloquent on the wonders of command-line options - the concept is a trivial one, and not so hard to understand that it needs more than a couple of lines of explanation. Rather, I'm going to get into something infinitely more interesting - showing you, the developer, how to add support for command-line options to your Perl program.

It's not as difficult as it might seem at first glance - and no, you don't need to stay up all night to get it done. All you really need is a copy of the Getopt::Long Perl module, and a little imagination. Come on in, and be prepared to be amazed...this is potent stuff!

The Long And Short Of It

First, we need to get the definitions down. In case you're wondering what a module is, don't - for the purpose of this article, just assume that it's a thingamajig that allows you to add new capabilities to your Perl program, or a series of pre-rolled functions which can be plugged in to your Perl program.

There are a number of such modules out there - CPAN, the Comprehensive Perl Archive Network, at http://www.cpan.org/ , has a complete list - and most are available free of charge, and are simple to import into your Perl program. If you're running a fairly recent version of Perl, you probably already have Getopt::Long installed as part of your distribution; if not, drop by CPAN and get yourself a copy.

You're probably wondering just what Getopt::Long brings to the Perl party. Let me enlighten you.

The Getopt::Long module provides a way for developers to read options passed to their program on the command line and act on them. Typically, such options provide a way for users to control the behavior of the program on an as-needed basis, by passing optional arguments to it; these options are usually preceded by a dash, as in the following example:

$ ls -l

In this case, "-l" is an optional argument passed to the "ls" program on the command line.

Many programs, following the GNU convention, also accept longer, more readable command-line options, preceded with a double dash, as in the following example:

$ ls --color

The Getopt::Long module provides an API for Perl developers to capture these long command-line options, and act on them within the business logic of the Perl script. This API is pretty advanced - it ignores case differences in option names, can resolve abbreviated option names to their longer counterparts (so long as they are unique), and recognizes both single- and double-dashes as option prefixes. For purists and those who are tasked with porting legacy applications, the module also supports the older, single-character form of command-line options (although only if they belong to the alphabetic set).
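To make those claims a little more concrete, here's a minimal sketch that feeds the same option to the module in four different spellings. It relies on the module's default configuration, and on GetOptionsFromArray(), which parses a supplied array instead of the real command line; the parse_flags() helper and the option names are purely illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# GetOptionsFromArray parses a supplied array instead of @ARGV
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: parse a list of arguments, return both flags
sub parse_flags {
    my @args = @_;
    my ($verbose, $version) = (0, 0);
    GetOptionsFromArray(\@args,
        "verbose" => \$verbose,
        "version" => \$version);
    return ($verbose, $version);
}

# the same option: double dash, single dash, upper case, abbreviated
foreach my $arg ("--verbose", "-verbose", "--VERBOSE", "--verb") {
    my ($verbose, $version) = parse_flags($arg);
    print "$arg: verbose=$verbose version=$version\n";
}
```

With the defaults in place, each of the four spellings should switch on "verbose" and leave "version" alone. Note that "--ver" would not work here, since it could mean either option.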

There is one primary function in the Getopt::Long module - GetOptions() - and it serves as the main control point for you to access the options passed to the program. You can use this to read command-line options into Perl scalars, arrays or hashes; create user-defined subroutines to handle specific options; separate option "bundles" into individual units; and configure the behaviour of the module. More on some of these as we proceed through this tutorial.
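Two of the items in that list, handler subroutines and option bundles, are easier to show than to describe. The following is only a sketch: the parse_args() helper and the "l", "v" and "tag" options are made up for illustration, and the arguments come from a fixed list rather than the real command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# allow single-character options to be bundled, as in "-lv"
# (this changes the module's configuration for the whole program)
Getopt::Long::Configure("bundling");

# hypothetical helper: parse a list of arguments, return what was seen
sub parse_args {
    my @args = @_;
    my ($long, $verbose) = (0, 0);
    my @seen;
    GetOptionsFromArray(\@args,
        "l" => \$long,
        "v" => \$verbose,
        # a handler subroutine receives the option name and its value
        "tag=s" => sub {
            my ($name, $value) = @_;
            push @seen, "$name=$value";
        },
    );
    return ($long, $verbose, @seen);
}

my ($long, $verbose, @seen) = parse_args("-lv", "--tag=red", "--tag=blue");
print "l=$long v=$verbose handler saw: @seen\n";
```

Once bundling is switched on, long option names need the double dash, since a single dash is now read as a bundle of one-letter options.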

Getopt::Long also comes with an object-oriented interface, in the best traditions of object-oriented programming, fondly known as OOP. If you're a fan of OOP, you can create a Getopt::Long::Parser object, which has its own methods and configuration, and use standard OO syntax to access its functions, extend it, sub-class it, derive new hybrids from it...all kinds of good stuff, basically. In case you don't know what OOP is, you're probably not impressed. Good for you!
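For the OOP fans, here's a rough sketch of what the object-oriented route looks like, using the Getopt::Long::Parser class that ships inside the module. The parse_with_parser() helper is hypothetical, and the "no_ignore_case" setting is chosen only to show that each parser object carries its own configuration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

# hypothetical helper: run a private parser over a list of arguments
sub parse_with_parser {
    # the parser reads @ARGV, so swap in our own list temporarily
    local @ARGV = @_;

    # this parser object is case-sensitive; the global defaults are untouched
    my $parser = Getopt::Long::Parser->new(config => ["no_ignore_case"]);

    my $debug = 0;
    my $ok = $parser->getoptions("debug" => \$debug);
    return ($ok ? 1 : 0, $debug);
}

my ($ok, $debug) = parse_with_parser("--debug");
print "Parsed OK: $ok, debug flag is $debug\n";
```

Since this particular parser is case-sensitive, calling the helper with "--DEBUG" should be rejected as an unknown option.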

Down To Work

Now, with the hard sell out of the way, let's get down to the nitty-gritty of how Getopt::Long works. Consider the following simple example:

#!/usr/bin/perl

# import module
use Getopt::Long;

# set default value for option
$debug = 0;

# get value of debug flag
$result = GetOptions ("debug" => \$debug);

# print value
print "Debug flag is $debug";

Now, try running this code as is:

$ ./script.pl
Debug flag is 0

And then try running it after adding a "--debug" command-line option:

$ ./script.pl --debug
Debug flag is 1

As you can see, the Perl code now recognizes the "--debug" flag on the command line, and sets a Boolean variable to true within the script.

Most of the magic here lies in the call to the GetOptions() function. This function accepts a series of option-variable pairs, demarcated using standard hash notation and separated with commas. When GetOptions() is called, it reads the program command line, looks for matching arguments, and if found, sets the corresponding option variable to true. Thus when you call the Perl script above with the "--debug" option, GetOptions() recognizes it and automatically sets the $debug variable to true.
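The script above stores the return value of GetOptions() in $result but never looks at it. That value is true when the command line was parsed cleanly and false when something was wrong with it, so it's worth checking. A minimal sketch, using a hypothetical parse_debug() helper and fixed argument lists in place of the real command line:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: returns (success flag, debug value)
sub parse_debug {
    my @args = @_;
    my $debug = 0;
    my $ok = GetOptionsFromArray(\@args, "debug" => \$debug);
    return ($ok ? 1 : 0, $debug);
}

# one good command line, and one with a misspelled option
foreach my $args (["--debug"], ["--debgu"]) {
    my ($ok, $debug) = parse_debug(@$args);
    if ($ok) {
        print "@$args: debug flag is $debug\n";
    } else {
        print "@$args: bad command line, try --debug\n";
    }
}
```

For the misspelled option, the module itself also prints a warning about the unknown option to standard error. In a real script, the usual shortcut is to follow the GetOptions() call with "or die" and a usage message.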

You can set more than one option variable at a time as well - consider the following example and its output, which demonstrates:

#!/usr/bin/perl

# import module
use Getopt::Long;

# read options
$result = GetOptions (  "red" => \$red,
                "blue" => \$blue,
                "green" => \$green );

# print options
print $red ? "Red is present\n" : "Red is absent\n";
print $blue ? "Blue is present\n" : "Blue is absent\n";
print $green ? "Green is present\n" : "Green is absent\n";

Here's an example of the output:

$ ./colors.pl --red --blue
Red is present
Blue is present
Green is absent

Half-Life

In addition to setting Booleans, Getopt::Long also supports processing string and numeric command-line arguments entered by the user. To illustrate how this works, consider the following simple example:

#!/usr/bin/perl

# import module
use Getopt::Long;

# read options
$result = GetOptions ("age=i" => \$age);

# print value
if ($age) { print "Input age is $age years"; }

Here's the output:

$ ./script.pl --age=89
Input age is 89 years

Here, the "i" data type specifier tells Getopt::Long to expect an integer value after the option name. You can also use "s" for strings, as in the following example:

#!/usr/bin/perl

# import module
use Getopt::Long;

# read options
$result = GetOptions ("name=s" => \$name);

# print value
print "Input name is $name";

Here's the output:

$ ./script.pl --name=John
Input name is John

It is interesting to note that the equality symbol (=) between the option name and the data type specifier tells Getopt::Long that a value must be provided for the option. Look what happens if, for example, you specify the "--age" option without a corresponding integer value:

$ ./script.pl --age
Option age requires an argument
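The type specifier does more than demand a value - it also checks that the value is of the right kind. Here's a small sketch that pushes a few different command lines through the "age=i" specification; the parse_age() helper is hypothetical, and the argument lists are hard-coded for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: returns (success flag, age or undef)
sub parse_age {
    my @args = @_;
    my $age;
    my $ok = GetOptionsFromArray(\@args, "age=i" => \$age);
    return ($ok ? 1 : 0, $age);
}

# value attached with "=", value as a separate word, and a non-integer
foreach my $args (["--age=89"], ["--age", "89"], ["--age=old"]) {
    my ($ok, $age) = parse_age(@$args);
    if ($ok) {
        print "@$args: input age is $age years\n";
    } else {
        print "@$args: rejected\n";
    }
}
```

The first two command lines should both yield 89, while the third should be rejected, with the module printing its own complaint about the invalid value to standard error.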

In order to make an option value optional (try saying that fast!), use a colon (:) instead of an equality symbol (=), as below:

#!/usr/bin/perl

# import module
use Getopt::Long;

# read options
$result = GetOptions ("age:i" => \$age);

# print value
print "Input age is $age years";

Here, in the absence of a value for the option, Getopt::Long will automatically assign 0 (for integer values) or an empty string (for string values) to the corresponding option variable. Take a look at the output of the script above to verify this:

$ ./script.pl --age
Input age is 0 years
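One consequence of this is that your script can tell three situations apart: the option was not given at all, it was given without a value, or it was given with one. A minimal sketch, assuming a hypothetical parse_optional_age() helper and hard-coded argument lists:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: returns the age, or undef if --age was never given
sub parse_optional_age {
    my @args = @_;
    my $age;
    GetOptionsFromArray(\@args, "age:i" => \$age);
    return $age;
}

# no option at all, option without a value, option with a value
foreach my $args ([], ["--age"], ["--age=42"]) {
    my $age = parse_optional_age(@$args);
    my $shown = defined $age ? $age : "not given";
    print "Arguments (@$args): age is $shown\n";
}
```

The defined() test is what separates "not given" from a value of 0, which a plain truth test like if ($age) cannot do.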

Opting In

Getopt::Long can also process command-line arguments containing multiple values, simply by storing all the values in a Perl array. Consider, for example, the following script, which is designed to add email addresses to a subscription list (maybe for an email newsletter?). Here, the user can send as many email addresses as (s)he likes to the script, simply by repeating the "--add" option with different values. Take a look:

#!/usr/bin/perl

# import module
use Getopt::Long;

# read options
$result = GetOptions ("add=s" => \@list);

# once all the values are in the array
# do something with them

# open file for appending, and stop if that fails
open(my $fh, '>>', 'subscribers.dat') or die "Cannot open subscribers.dat: $!";

# iterate over array
foreach $l (@list)
{
    # write array elements to file
    print $fh "$l\n";
    print "Added $l\n";
}

# close file when done
close($fh);

Here's what the output might look like:

$ ./editlist.pl --add=me@me.com --add=you@you.com --add=them@them.com
Added me@me.com
Added you@you.com
Added them@them.com
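If repeating "--add" for every address feels tedious, a common trick is to let users pack several values into one option, separated by commas, and split them apart after parsing. This is only a sketch of that idea; the parse_addresses() helper is hypothetical, and it assumes the values themselves never contain commas.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: collect every address from a list of arguments
sub parse_addresses {
    my @args = @_;
    my @list;
    GetOptionsFromArray(\@args, "add=s" => \@list);

    # glue all the values together, then split on commas, so that
    # "--add=a,b --add=c" ends up as three separate entries
    @list = split(/,/, join(',', @list));
    return @list;
}

# single quotes keep Perl from treating "@me" as an array
my @list = parse_addresses('--add=me@me.com,you@you.com', '--add=them@them.com');
foreach my $address (@list) {
    print "Added $address\n";
}
```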

What's In A Name?

Getopt::Long also supports aliases for options, allowing you to provide users with an alternative, sometimes shorter way of accessing the same option. This is accomplished by placing alternative option names after the first one and separating the various alternatives with pipes (|). Consider the following example, which shows you how:

#!/usr/bin/perl

# import module
use Getopt::Long;

# read options
$result = GetOptions ("color|colour|c" => \$color);

# print value
print $color ? "Colors are on" : "Colors are off";

Here, you can attach any of the options "--color", "--colour", "--c" or "-c" to the command line - Getopt::Long will treat them all as one and the same.

$ ./script.pl --color
Colors are on
$ ./script.pl --c
Colors are on
$ ./script.pl -c
Colors are on
$ ./script.pl --colour
Colors are on

Negative Reinforcement

An interesting feature in Getopt::Long is its ability to also support option negation - that is, disable an option by prefixing it with "no". For example, this means that while you could explicitly activate the "trace" option by sending the program "--trace" on the command line, you could also explicitly disable it by sending "--no-trace". However, it's only possible to do this with Boolean options, which can be set to either true or false; you can't do it with options that require numeric or string values.

In order to do this, simply add an exclamation (!) as negation symbol after the option name in the call to GetOptions() - as in the following example:

#!/usr/bin/perl

# import module
use Getopt::Long;

# read options
$result = GetOptions (  "eyes!" => \$sight,
                "nose!" => \$smell,
                "ears!" => \$hearing );

# print values
print $sight ? "Can see!\n" : "Cannot see!\n";
print $smell ? "Can smell!\n" : "Cannot smell!\n";
print $hearing ? "Can hear!\n" : "Cannot hear!\n";

A user could now do something like this:

$ ./script.pl --no-eyes --no-ears --no-nose
Cannot see!
Cannot smell!
Cannot hear!

Note that there's an important difference between explicitly disabling an option in this manner, and not passing the option to the program at all. In the former case, Getopt::Long sets the corresponding option variable to false or 0; in the latter, the option variable is left undefined (or keeps whatever default value it was assigned initially). Consider the following example, which illustrates this difference:

#!/usr/bin/perl

# import module
use Getopt::Long;

# default value for counter
$counter = -1;

# read options
$result = GetOptions ("counter!" => \$counter);

# print values
print "Counter is $counter";

Here's the output:

$ ./script.pl --counter
Counter is 1
$ ./script.pl --no-counter
Counter is 0
$ ./script.pl
Counter is -1
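Two further details are worth a quick sketch: the negated form can be written with or without the hyphen, and when an option shows up more than once, the last occurrence wins. The parse_trace() helper and the "trace" option below are illustrative only, and the argument lists are hard-coded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: returns 1, 0, or -1 if the option never appeared
sub parse_trace {
    my @args = @_;
    my $trace = -1;    # default, meaning "not mentioned at all"
    GetOptionsFromArray(\@args, "trace!" => \$trace);
    return $trace;
}

# absent, enabled, both negated spellings, and enabled-then-disabled
foreach my $args ([], ["--trace"], ["--notrace"], ["--no-trace"],
                  ["--trace", "--no-trace"]) {
    print "Arguments (@$args): trace is " . parse_trace(@$args) . "\n";
}
```

This makes it possible to put a default such as "--trace" in a shell alias or wrapper script, and still switch it off for a single run by adding "--no-trace" at the end.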

Hashing It Up

If you'd prefer, you can also store all command-line options directly in a hash, instead of creating separate scalars for each. The procedure is fairly simple - all you need to do is pass a reference to the hash variable as the first argument to GetOptions(), followed by the names of the options to be stored in it. Consider the following example, which demonstrates:

#!/usr/bin/perl

# import module
use Getopt::Long;

# read options into hash
$result = GetOptions (\%options, "base=i", "height=i");

# print hash values
print "Base = " . $options{'base'} . "\n";
print "Height = " . $options{'height'} . "\n";
print "Area = " .  0.5 * $options{'base'} * $options{'height'} . "\n";

Here's the output:

$ ./triangle.pl --base=10 --height=20
Base = 10
Height = 20
Area = 100

In this case, the values passed to the program on the command line are stored in a Perl hash called %options, from whence they can be retrieved using standard hash notation.
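The hash approach combines nicely with two things covered earlier: default values and aliases. In the sketch below, the hash is filled with defaults before parsing, and "b" is offered as an alias for "base"; the parse_triangle() helper, the defaults and the alias are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: returns a reference to the filled-in options hash
sub parse_triangle {
    my @args = @_;

    # defaults, used for any option the user leaves out
    my %options = (base => 1, height => 1);

    # values are stored under the first name, even when an alias is used
    GetOptionsFromArray(\@args, \%options, "base|b=i", "height=i");
    return \%options;
}

my $t = parse_triangle("-b", "10", "--height=20");
print "Base = $t->{base}\n";
print "Height = $t->{height}\n";
print "Area = " . 0.5 * $t->{base} * $t->{height} . "\n";
```

Whichever spelling the user picks, the rest of the script only ever has to look at $options{'base'}.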

Over And Out

And that's about all we have time for. Over the course of the last few pages, I introduced you to one of the more interesting modules in the Perl pantheon, the Getopt::Long module. This module provides a simple API to parse options passed to your Perl scripts at the command line and convert them into Perl scalars, arrays or hashes.

With some simple examples, I showed you how to use this API to detect Boolean options, as well as options taking string or numeric values. I also showed you how to create option aliases, and explicitly allow disabling of options with the negation symbol. Finally, I wrapped things up with a demonstration of how you could convert all the options passed on the command-line into a Perl hash, and also showed you how to customize the module to work as per your specific requirements.

If you'd like to read more about this module, consider visiting the following links:

CPAN, at http://www.cpan.org/

Getopt::Long pages on CPAN, at http://search.cpan.org/search?module=Getopt::Long

Till next time...stay healthy!

Note: Examples are illustrative only, and are not meant for a production environment. Melonfire provides no warranties or support for the source code described in this article. YMMV!

This article was first published on 27 Feb 2004.