The Problem with Prototypes

In How a Perl 5 Program Works and On Parsing Perl 5, I mentioned ways to manipulate the Perl 5 parser as it executes. The easiest way to do so is through the use of Perl 5 subroutine prototypes. This early draft excerpt from Modern Perl: the book explains the good and the bad of prototypes and when they're a good idea in modern Perl code.

Perl 5's prototypes serve two purposes. First, they're hints to the parser to change the way it parses subroutines and their arguments. Second, they change the way Perl 5 handles arguments to those subroutines when it executes them. A common novice mistake is to assume that they serve the same language purpose as subroutine signatures in other languages. This is not true.

To declare a subroutine prototype, add it after the name:

    sub foo        (&@);
    sub bar        ($$) { ... }
    my  $baz = sub (&&) { ... };

You may add prototypes to subroutine forward declarations. You may also omit them from forward declarations. If you use a forward declaration with a prototype, that prototype must be present in the full subroutine declaration; Perl will give a prototype mismatch warning if not. The converse is not true: you may omit the prototype from a forward declaration and include it for the full declaration.
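As a rough sketch of those two rules, the following snippet compiles each combination inside a string eval and counts the prototype mismatch warnings. The subroutine names pair_a and pair_b are made up for illustration.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so we can inspect them.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Case 1: the forward declaration has a prototype, the definition omits it.
eval q{
    sub pair_a ($$);
    sub pair_a { return "@_" }
    1;
} or die $@;
my @mismatch = grep { /Prototype mismatch/ } @warnings;

# Case 2: the forward declaration omits the prototype, the definition has it.
@warnings = ();
eval q{
    sub pair_b;
    sub pair_b ($$) { return "@_" }
    1;
} or die $@;
my @quiet = grep { /Prototype mismatch/ } @warnings;

printf "case 1 mismatch warnings: %d\n", scalar @mismatch;
printf "case 2 mismatch warnings: %d\n", scalar @quiet;
```

The string evals are only there so that the warnings appear at a point where the handler can catch them; in ordinary code the warning shows up at compile time.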

The original intent of prototypes was to allow users to define their own subroutines which behaved like (certain) built-in operators. For example, consider the behavior of the push operator, which takes an array and a list. While Perl 5 would normally flatten the array and list into a single list at the call site, the Perl 5 parser knows that a call to push must effectively pass the array as a single unit so that push can operate on the array in place.

The prototype operator takes the name of a function and returns a string representing its prototype, if any, and undef otherwise. To see the prototype of a built-in operator, use the CORE:: form:

    $ perl -E "say prototype 'CORE::push';"
    \@@
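The same operator works on your own subroutines, by name or by reference. A small sketch, using two hypothetical subroutines, which also shows the undef result for a built-in such as system whose arguments a prototype cannot express:

```perl
use strict;
use warnings;

# Hypothetical subroutines, for illustration only.
sub with_proto (\@$) { }
sub without_proto    { }

my $by_name = prototype 'with_proto';       # look up by name
my $by_ref  = prototype \&with_proto;       # look up by code reference
my $none    = prototype \&without_proto;    # undef: no prototype declared
my $system  = prototype 'CORE::system';     # undef: not expressible as a prototype

print "with_proto:    $by_name\n";
print "without_proto: ", ( defined $none   ? $none   : 'undef' ), "\n";
print "CORE::system:  ", ( defined $system ? $system : 'undef' ), "\n";
```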

As you might expect, the @ character represents a list. The backslash forces the corresponding argument to become a reference to that argument. Thus mypush might be:

    sub mypush (\@@)
    {
        my ($array, @rest) = @_;
        push @$array, @rest;
    }
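
A short usage sketch of mypush as defined above shows the effect at the call site: the caller writes a plain array, and the subroutine receives a reference to it. The variable names are invented for the example.

```perl
use strict;
use warnings;

sub mypush (\@@)
{
    my ($array, @rest) = @_;
    push @$array, @rest;
}

my @stack = (1, 2);

# Parsed like the built-in: @stack arrives in @_ as \@stack, not flattened.
my $new_size = mypush @stack, 3, 4;

print "stack is now: @stack\n";
print "new size: $new_size\n";
```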

Valid prototype characters include $ to force a scalar argument, % to mark a hash (most often used as a reference), and & to mark a code block. The fullest documentation is available in perldoc perlsub.

The Problem with Prototypes

The main problem with prototypes is that they behave differently from what most people expect when first encountering them. Prototypes can change the parsing of subsequent code and they can coerce the types of arguments. They don't serve as documentation of the number or types of arguments subroutines expect, nor do they map arguments to named parameters.

Prototype coercions work in subtle ways, such as enforcing scalar context on incoming arguments:

    sub numeric_equality($$)
    {
        my ($left, $right) = @_;
        return $left == $right;
    }

    my @nums = 1 .. 10;

    say "They're equal, whatever that means!" if numeric_equality @nums, 10;
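
To make the coercion visible, here is a sketch that calls the same subroutine twice: once normally, and once with a leading & sigil, which makes Perl ignore the prototype. The result variable names are made up for the example.

```perl
use strict;
use warnings;

sub numeric_equality ($$)
{
    my ($left, $right) = @_;
    return $left == $right;
}

my @nums = 1 .. 10;

# With the prototype, @nums is evaluated in scalar context: 10 == 10.
my $with_prototype = numeric_equality(@nums, 10) ? 'equal' : 'not equal';

# The & sigil bypasses the prototype, so @nums flattens into @_:
# $left is 1 and $right is 2.
my $without_prototype = &numeric_equality(@nums, 10) ? 'equal' : 'not equal';

print "with prototype:    $with_prototype\n";
print "without prototype: $without_prototype\n";
```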

... and not working on anything more complex than simple expressions:

    sub mypush(\@@);

    # XXX: prototype type mismatch syntax error
    mypush( my $elems = [], 1 .. 20 );
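
One way around this, sketched below, is to give the parser an argument that starts with @, for example by dereferencing an array reference declared beforehand. The snippet also compiles the failing call inside a string eval to capture the error rather than abort the program.

```perl
use strict;
use warnings;

sub mypush (\@@)
{
    my ($array, @rest) = @_;
    push @$array, @rest;
}

# The call from the text fails at compile time; eval q{} lets us capture that.
my $compiled = eval q{ mypush( my $tmp = [], 1 .. 20 ); 1 };
my $error    = $compiled ? '' : $@;
print "compile error: $error";

# Dereferencing gives the parser a real array to take a reference to.
my $elems = [];
mypush @$elems, 1 .. 20;
mypush @{ $elems }, 21;

print "element count: ", scalar @$elems, "\n";
```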

Those aren't even the subtler kinds of confusion you can get from prototypes; see Far More Than Everything You've Ever Wanted to Know About Prototypes in Perl for a dated but enlightening explanation of other problems.

Good Uses of Prototypes

As long as code maintainers do not confuse them for full subroutine signatures, prototypes have a few valid uses.

First, they are often necessary to emulate and override built-in operators with user-defined subroutines. Using the prototype operator shown earlier, first check that prototype does not return undef for the built-in operator you want to override. Once you know the prototype of the operator, use the subs pragma to declare that you want to override a core operator:

    use subs 'push';

    sub push (\@@) { ... }

Beware that the subs pragma is not lexically scoped: it affects the entire package in which it appears, regardless of any enclosing blocks.
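
A fuller sketch of such an override might count calls before delegating to the real operator through the CORE:: prefix. The $calls counter and @queue array are hypothetical, added only to show that the override runs.

```perl
use strict;
use warnings;
use subs 'push';

my $calls = 0;    # hypothetical counter, for illustration only

sub push (\@@)
{
    my ($array, @rest) = @_;
    $calls++;

    # CORE::push always means the built-in, never this override.
    return CORE::push @$array, @rest;
}

my @queue;
push @queue, 'a';
push @queue, 'b', 'c';

print "queue: @queue\n";
print "overridden push ran $calls times\n";
```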

The second reason to use prototypes is to define compile-time constants. A subroutine declared with an empty prototype (as opposed to an absent prototype) which evaluates to a single expression becomes a constant in the Perl 5 optree rather than a subroutine call:

    sub PI () { 4 * atan2(1, 1) }

The Perl 5 parser knows to substitute the calculated value of pi whenever it encounters a bareword or parenthesized call to PI in the rest of the source code (with respect to scoping and visibility).
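
The empty prototype also changes how the parser reads what follows the name. In this sketch, TWO and DEUX are hypothetical subroutines that differ only in whether the empty prototype is present:

```perl
use strict;
use warnings;

sub PI  () { 4 * atan2(1, 1) }
sub TWO () { 2 }    # empty prototype: takes no arguments
sub DEUX   { 2 }    # no prototype at all

my $three = TWO  + 1;    # parsed as TWO() + 1
my $oops  = DEUX + 1;    # parsed as DEUX(+1): the "+ 1" becomes an argument

my $circumference = 2 * PI * 10;

print "TWO  + 1 = $three\n";
print "DEUX + 1 = $oops\n";
print "circumference: $circumference\n";
```

An empty prototype is reported by the prototype operator as an empty string, which is defined, while an absent prototype is reported as undef.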

Rather than defining constants directly, the core constant pragma handles the details for you and may be clearer to read. If you want to interpolate constants into strings, the Readonly module from the CPAN may be more useful.
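
As a brief sketch of the constant pragma and the interpolation limitation: a constant is a subroutine call, so it does not interpolate in a double-quoted string unless you use concatenation or the @{[ ]} idiom. Readonly is not shown here because it is not a core module.

```perl
use strict;
use warnings;
use constant PI => 4 * atan2(1, 1);

my $plain  = "PI is PI";           # no interpolation: PI stays literal text
my $idiom  = "PI is @{[ PI ]}";    # interpolates the value through an anonymous array
my $concat = 'PI is ' . PI;        # plain concatenation

print "$plain\n";
print "$idiom\n";
print "$concat\n";
```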

The final reason to use a prototype is to extend Perl's syntax to operate on anonymous functions as blocks. The CPAN module Test::Exception uses this to good effect to provide a nice API with delayed computation. This sounds complex, but it's easy to explain. The throws_ok() subroutine takes three arguments: a block of code to run, a regular expression to match against the string of the exception, and an optional description of the test. Suppose that you want to test Perl 5's exception message when attempting to invoke a method on an undefined value:

    use Test::More tests => 1;
    use Test::Exception;

    throws_ok
        { my $not_an_object; $not_an_object->some_method() }
        qr/Can't call method "some_method" on an undefined value/,
        'Calling a method on an undefined invocant should throw exception';

The exported throws_ok() subroutine has a prototype of &$;$. Its first argument is a block, which Perl upgrades to a full-fledged anonymous function. The second argument is a scalar. The third argument, which follows the semicolon in the prototype, is optional.

The most careful readers may have spotted a syntax oddity notable in its absence: there is no trailing comma after the end of the anonymous function passed as the first argument to throws_ok(). This is a quirk of the Perl 5 parser. Adding the comma causes a syntax error. The parser expects whitespace, not the comma operator.
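
To see how a subroutine can accept a block this way, here is a minimal sketch of a similar helper written from scratch. The name dies_like and its output format are assumptions made for this example; this is not how Test::Exception is implemented.

```perl
use strict;
use warnings;

# dies_like() is a hypothetical helper, not part of Test::Exception.
sub dies_like (&$;$)
{
    my ($code, $pattern, $description) = @_;
    $description = 'exception matches' unless defined $description;

    # Run the block; eval returns 1 only if the block did not die.
    my $lived = eval { $code->(); 1 };
    my $error = $lived ? '' : $@;
    my $ok    = !$lived && $error =~ $pattern;

    print( ( $ok ? 'ok' : 'not ok' ), " - $description\n" );
    return $ok ? 1 : 0;
}

# No sub keyword and no comma after the block, thanks to the & in the prototype.
my $result = dies_like
    { my $not_an_object; $not_an_object->some_method() }
    qr/Can't call method "some_method" on an undefined value/,
    'method call on undef dies';

# The description is optional because it follows the semicolon.
my $short = dies_like { die "boom\n" } qr/boom/;
```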

You can use this API without the prototype. It's slightly less attractive:

    use Test::More tests => 1;
    use Test::Exception;

    throws_ok(
        sub { my $not_an_object; $not_an_object->some_method() },
        qr/Can't call method "some_method" on an undefined value/,
        'Calling a method on an undefined invocant should throw exception');

A sparing use of subroutine prototypes to remove the need for the sub keyword is reasonable. Few other uses of prototypes are compelling enough to overcome their drawbacks.

About this Entry

This page contains a single entry by chromatic published on August 20, 2009 1:13 AM.
