From Novice to Adept: Scalar Context and Arrays

The Perl community has a notion of "baby Perl". It's the subset of Perl in which you can write useful, one-off programs without learning how to program Perl. It's okay to write baby Perl, but if you come back to a baby Perl program you wrote six months ago and you can't maintain it, that's because you didn't learn how to write grownup Perl.

Now that's a divisive, polemic statement -- but assume it's true for the sake of a more interesting argument.

How can you tell if you've written baby Perl? One tell-tale characteristic is that baby Perl is almost entirely ignorant of the notion of context. Context is a linguistic component (important in tagmemics, for future language archaeologists) which means that the meaning of specific units of speech depends on the interpretation of surrounding units of speech.

There are deep parallels to draw with other computer programmy notions such as genericity and polymorphism, but the important point right now in Perl is that the language will do its best to treat expressions as you've treated them.

If you can parse that sentence, you can understand context in Perl. (If you can find the single errors in this sentence, you'll do just fine.)

There are many aspects to context, but one reliable indicator of baby Perl is how the code finds the number of elements in an array:

sub count_elements
{
    my $count = scalar @_;
    return "You passed $count elements!";
}

The applicable line is the one that assigns to $count.

One axis of context in Perl 5 is the distinction between one and many. You see something similar related to subject-verb number agreement in English: something seem wrong about this sentence, because the number of the verb (to seem) does not agree with the number of the noun (something).

The markers for scalar/list context in Perl 5 are different. In an assignment context, a list (or aggregate variable) on the left-hand side of the assignment induces a list context for the assignment. A scalar on the left-hand side induces a scalar context for the assignment.

The context of the assignment governs the type of evaluation of the right-hand side of the assignment.

That is, if you assign an array to a list, the list will contain elements of that array. If you assign an array to a scalar, the scalar will contain the count of the number of elements in the array.
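Here's a minimal sketch of that rule. The array @colors and the other variable names are assumptions made up for illustration.

```perl
use strict;
use warnings;

# @colors, @copy, and $count are illustrative names, not from any real program
my @colors = ('red', 'green', 'blue');

# Array on the left-hand side: list context, so the elements are copied
my @copy = @colors;

# Scalar on the left-hand side: scalar context, so the array
# evaluates to the number of elements it holds
my $count = @colors;

print "copied: @copy\n";    # prints "copied: red green blue"
print "count:  $count\n";   # prints "count:  3"
```

The right-hand side is the same expression in both assignments; only the left-hand side changes what it produces.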

There are other rules which govern how various Perl 5 constructs behave when evaluated in list versus scalar context, but I'm focusing on the specific case of counting the number of elements in an array.

The scalar operator in the example code can serve to disambiguate context, but it's unnecessary in this case -- clear evidence that someone copied and pasted code without necessarily understanding its purpose.
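As a rough sketch of the difference, the two hypothetical subs below (count_plain and count_explicit are invented names) return the same thing, because the scalar assignment already supplies the context. The last two lines show a place where scalar does change the result, since the right-hand side of a list assignment is evaluated in list context.

```perl
use strict;
use warnings;

# count_plain and count_explicit are hypothetical names for illustration
sub count_plain
{
    my $count = @_;          # scalar assignment already imposes scalar context
    return $count;
}

sub count_explicit
{
    my $count = scalar @_;   # same result; scalar adds nothing here
    return $count;
}

my @items = ('a', 'b', 'c');

# The right-hand side of a list assignment is in list context,
# so here the scalar operator does make a difference
my @flattened = (@items);          # three elements: a, b, c
my @counted   = (scalar @items);   # one element: the count, 3
```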

This code is more likely, however:

sub count_elements
{
    my ($count) = scalar @_;
    return "You passed $count elements!";
}

Note the additional parentheses around $count. In Perl 5, this turns the left-hand side of the assignment from a scalar into a one-element list, resulting in a list context for the assignment and evaluating the argument list in list context. The scalar is there to disambiguate the context -- but that uses two unnecessary constructs.

My guess is that, in this case, the original author thought that my is a function that requires parentheses around its arguments.
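To see what the parentheses change, here's a small sketch with three hypothetical subs (the names are assumptions for illustration only). The first shows what the parenthesized form does without scalar, the second is the plain scalar assignment, and the third is the doubled-up form from the example above.

```perl
use strict;
use warnings;

# All three sub names are hypothetical, chosen for illustration
sub first_argument
{
    my ($first) = @_;          # list assignment: $first gets the first element
    return $first;
}

sub count_arguments
{
    my $count = @_;            # scalar assignment: $count gets the element count
    return $count;
}

sub count_with_both
{
    my ($count) = scalar @_;   # list assignment of a one-element list holding the count
    return $count;
}

print first_argument('x', 'y', 'z'), "\n";    # prints "x"
print count_arguments('x', 'y', 'z'), "\n";   # prints "3"
print count_with_both('x', 'y', 'z'), "\n";   # prints "3"
```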

Occasionally this construct appears:

my $count = $#array + 1;

I'm not sure why there's never a scalar there. The $# sigil of an array produces the index of the final element of the array. As Perl 5 arrays start at 0 by default (don't correct this in the comments -- this isn't PerlMonks; there's a karmic penalty here for showing off how clevre you are), adding one to that gives the number of elements in the array.
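A quick sketch comparing the two approaches, using made-up arrays (@array and @empty are illustrative names) and the default starting index of 0:

```perl
use strict;
use warnings;

# @array and @empty are illustrative; any arrays would do
my @array = (10, 20, 30);
my @empty = ();

my $last_index = $#array;       # index of the final element: 2
my $via_index  = $#array + 1;   # the roundabout count: 3
my $via_scalar = @array;        # the array in scalar context: 3

my $empty_last  = $#empty;      # -1 for an empty array
my $empty_count = @empty;       # 0

print "$via_index $via_scalar\n";     # prints "3 3"
print "$empty_last $empty_count\n";   # prints "-1 0"
```

Both give the same count. The scalar context version just says so directly.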

Sometimes it is necessary to find the index of the last element (though it's rare), but I've seen books that recommend this idiom for counting elements. This is not a joke; one of them recommended this approach over the documented, clearer, and Perlish "evaluate the array in scalar context" approach, claiming that scalar context was "an apparent bug which will hopefully be fixed in a future version of Perl."

If that's not an argument for learning Perl before writing about Perl, I don't know what is.

3 Comments

While I understand this might be considered baby Perl:
    my $count = scalar @_;
it also indicates intent, which is perhaps more important than the seemingly unnecessary construct.

My eyes are trained to see $count = @_; as a bug because I don't understand the intention. Was I taking the count of an array, or did I mean one of these instead:

  • $count = @_[0];
  • @count = @_;
  • $count = \@_;
  • %count = @_;
(The last being the most unlikely, but not impossible, given fat fingering - $ and % being next to each other)

> my $count = scalar @_;

This is not baby Perl in any sense. It is clearly written Perl. Anyone can look at it and know instantly what it means.

"=" means "assign something on the right side to something of an equal type on the left side". Since @arrays aren't the same as $scalars Perl does it in the background, thus changing the meaning of = in this case. This can not only confuse newbies, but also slow down seasoned Perl developers by forcing them to change mental contexts when reading the code. Adding the scalar makes it clear and an absolute no-brainer when reading the code.


> I'm not sure why there's never a scalar there.

To me this seems a bit silly to ask. Consider the meaning of = I wrote above, then consider this again:

> my $count = $#array + 1;

You have a scalar on the left, a scalar on the right, so why would you even think about needing "scalar" here? There is nothing there which you would need to convert to scalar.


I really do agree with the middle point you make. That is a clear indicator of someone needing to be educated.


On a completely unrelated sidenote:
You might want to activate some sort of semi-anonymous commenting here. The login systems you have in place are quite annoying. I don't use either of the services you offer for anything other than the maybe 3 blogs I sometimes feel like commenting on. This forces me to look for the login every time and thus pushes me towards not commenting at all because it's a hassle. (Makes me think of your pessimization article actually, the login process here is such a pessimization.) I wouldn't be surprised if there were a not inconsiderable number of other people who would comment too, but simply can't be arsed to because it's a hassle. ;)

I disagree, @mithaldu. The vestigial scalar doesn't clarify anything. The context of the assignment is obviously scalar to anyone who knows Perl.

My contention is that being able to identify scalar context assignment is a prerequisite to writing idiomatic Perl, and that identifying scalar context in this sense is very easy. Any so-called seasoned Perl developer who has to switch contexts to understand this code is not a seasoned Perl developer, in my estimation. I find the vestigial scalar more confusing than anything else; it's not idiomatic.

Someone elsewhere pointed out that the naming of variables is more important for clarity than the presence or absence of the useless scalar operator in this case.

About this Entry

This page contains a single entry by chromatic published on October 12, 2009 1:21 PM.
