Chunking, Subtlety, and Whitespace

I delayed writing about references in Perl 5 in the Modern Perl book for a long time. References in Perl 5 are useful. They have their warts. They're not as difficult as most people believe, however. Novices have trouble learning how to use references effectively because most tutorials and introductions explain them poorly.

I had to think about explanations for a long time before I found a way to explain them well.

Of course, the syntax for dereferencing gets complex very quickly—but it's also an effective example of what I've been discussing this week. Perl has a handful of subtle design consistencies that, if you understand them, help you read and skim code very effectively. If you don't learn them, you'll get lost in a sea of punctuation soup.

Consider an array reference $monkeys_ref. You can get the number of monkeys by evaluating that reference as an array in scalar context in one of two ways:

# the short way
my $count = @$monkeys_ref;

# the disambiguatey way
my $count = @{ $monkeys_ref };

The former way is shorter and more idiomatic. Anyone familiar with Perl 5 references should understand what the additional sigil means ("I want a list from the following reference"). The latter syntax has the same effect, but it means instead "I want a list coerced from the expression evaluated within this block." The difference is subtle and you don't have to understand the subtleties for this example.
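To see that both spellings produce the same number, here is a minimal sketch. The contents of `$monkeys_ref` and the variable names `$short_count` and `$block_count` are invented for illustration; they are not from the original example.

```perl
use strict;
use warnings;

# Hypothetical data; the monkey names are made up for this example.
my $monkeys_ref = [ 'capuchin', 'howler', 'spider', 'tamarin' ];

# Assigning to a scalar evaluates the dereferenced array in scalar
# context, which yields its element count.
my $short_count = @$monkeys_ref;
my $block_count = @{ $monkeys_ref };

# scalar() forces the same context where a list would otherwise be expected.
print "short: $short_count, block: $block_count, explicit: ",
    scalar( @$monkeys_ref ), "\n";
```

Either form is fine here; the braces only start to earn their keep once the thing being dereferenced is more than a simple scalar.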

Trouble arrives when you deal with nested data structures or more complex expressions, such as slices:

# the short way
my $monkeys = join ',', @$monkeys_ref[@indices];

# the clearer way
my $monkeys = join ',', @{ $monkeys_ref }[@indices];

The first expression is somewhat more difficult to parse; which takes precedence, the indexing operation represented by the square brackets or the dereferencing operation indicated by the leading sigil? The second expression works because the intended order of operation is clear, at least to anyone who understands how curly-brace grouping works with complex references.
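As a rough sketch of how Perl resolves this: the sigil binds to the simple scalar first, so both spellings slice the same array. The data, the `%zoo` hash, and the result variable names below are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical data for illustration.
my $monkeys_ref = [ 'capuchin', 'howler', 'spider', 'tamarin' ];
my @indices     = ( 0, 2 );

# Both forms slice the array that $monkeys_ref refers to: the sigil
# applies to the simple scalar before the subscript does.
my $short   = join ',', @$monkeys_ref[@indices];
my $clearer = join ',', @{ $monkeys_ref }[@indices];

# When the reference comes from an expression rather than a simple
# scalar, the braces are no longer optional.
my %zoo    = ( monkeys => $monkeys_ref );
my $nested = join ',', @{ $zoo{monkeys} }[@indices];

print "$short\n$clearer\n$nested\n";
```

All three lines of output should be identical; the braces in the last case tell both the parser and the reader exactly which expression produces the reference.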

The whitespace is unnecessary, of course, but I find that it adds clarity.

A little bit of disambiguation isn't necessary to help the Perl 5 parser in this case, but it does help the reader. Students of compiler design might argue that nested expressions this complex belong on separate lines. I can imagine how this would read in a pseudo assembly language (I work on Parrot, after all). There's definitely a balance between the complexity of nested expressions and dereferencing... but this is a place where I consider the idiomatic use of Perl 5 sufficiently expressive that spreading the list slice out over multiple lines would obfuscate the intent of the code.

Certainly it's possible to perform even more complex dereferences of data structures, but when it's difficult to identify individual chunks of the desired behavior, it's time to simplify the code or the expression or the design. Even so, the readability of this code should not depend on the desire to avoid teaching novices about references.
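One way to simplify a deeper dereference, sketched under the assumption of a hypothetical nested `%zoo` structure (none of these names appear in the original), is to name the intermediate reference so the slice returns to a familiar shape:

```perl
use strict;
use warnings;

# Hypothetical nested structure, invented for this example.
my %zoo = (
    primates => {
        monkeys => [ 'capuchin', 'howler', 'spider', 'tamarin' ],
    },
);
my @indices = ( 1, 3 );

# Everything in one expression: correct, but harder to read in chunks.
my $dense = join ',', @{ $zoo{primates}{monkeys} }[@indices];

# Naming the intermediate reference makes each chunk easy to identify.
my $monkeys_ref = $zoo{primates}{monkeys};
my $chunked     = join ',', @{ $monkeys_ref }[@indices];

print "$dense\n$chunked\n";
```

Whether the extra variable is worth it is a judgment call; the point is only that the braced slice stays the same recognizable chunk in both versions.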

2 Comments

You don't necessarily have to use `@{$array_ref}[$index]`, you could just do `$array_ref->[$index]`

That's true for the scalar form, Ryan. You don't need the curly braces for the array slice either, but I think they clarify rather than obfuscate.

About this Entry

This page contains a single entry by chromatic published on February 12, 2010 4:00 PM.
