When Sugar and Semantics Collide

By chromatic on July 1, 2010 11:10 AM | 11 Comments

I use Moose to explain object orientation in Perl in the Modern Perl book. It's much easier to explain the what and why of OO with syntax like:

{
    package Cat;

    use Moose;

    has 'name', is => 'ro', isa => 'Str';
    has 'age',  is => 'ro', isa => 'Int';
    has 'diet', is => 'rw';
}

... than the corresponding code where you must write your own accessors, poke into a blessed hash directly (and bless it yourself), perform your own coercions and verifications, and the like.
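
For comparison, here is a rough sketch of what a hand-rolled equivalent might look like in plain Perl 5. It is an illustration only, not what Moose generates: the integer check on age is a simplified stand-in for Moose's Int type constraint, and the sample values (Tuxedo, kibble, tuna) are made up.

```perl
use strict;
use warnings;

{
    package Cat;

    use Carp 'croak';

    # Hand-written constructor: validate, build the hash, bless it yourself.
    sub new {
        my ( $class, %args ) = @_;

        # Simplified stand-in for Moose's 'Int' type constraint.
        croak "Attribute 'age' must be an integer"
            if defined $args{age} && $args{age} !~ /\A-?[0-9]+\z/;

        my $self = {
            name => $args{name},
            age  => $args{age},
            diet => $args{diet},
        };

        return bless $self, $class;
    }

    # Read-only accessors poke into the blessed hash directly.
    sub name { $_[0]{name} }
    sub age  { $_[0]{age} }

    # Read-write accessor: set the value when given an argument.
    sub diet {
        my $self = shift;
        $self->{diet} = shift if @_;
        return $self->{diet};
    }
}

my $cat = Cat->new( name => 'Tuxedo', age => 3, diet => 'kibble' );
$cat->diet('tuna');
print $cat->name, ' now eats ', $cat->diet, "\n";
```

That is roughly forty lines of plumbing for what the Moose version declares in three.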

Of course, the preferred syntax for doing this within the Moose documentation is different from how I've done things. Moose recommends:

{
    package Cat;

    use Moose;

    has 'name' => (is => 'ro', isa => 'Str');
    has 'age' =>  (is => 'ro', isa => 'Int');
    has 'diet' => (is => 'rw');
}

Sometimes you quote the name of the attribute and sometimes you don't.

Should I drop the parentheses? Should I drop the fat arrow between the name of the attribute and its specializers? I do both in my own Moose code, as a matter of preference, and I did in the book. Then I thought about it and realized why I write code this way.

First, a digression. Perrin Harkins mentioned the inability of the "Takes a block!" prototype to replicate builtin syntaxes as a reason to dislike syntax-bending modules such as Error. For example:


use Error ':try';

try
{
    ...
}
catch
{
    ...
};

... really needs that trailing semicolon. For similar reasons, many modules which use Devel::Declare magic go through contortions to add trailing commas and semicolons. Perl 5's syntax is malleable, but when the parser wants something from the lexer, it really really wants something from the lexer. (When it wants to know that a statement or a group of terms has ended, you don't get to lie.)

In other words, even though you have a lot of options for mangling Perl 5's syntax any way you like it, the semantics of the host language will shine through. A parenthesis is a parenthesis. A labeled block is a labeled block. A bare sub { ... } is never an expression on its own, and it can never terminate an outer expression.
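
A small core-Perl sketch of the point, using a hypothetical attempt function with a & prototype (it is not part of Error or any other module). The block looks like builtin syntax, but the call is still an ordinary expression, so the statement it lives in still needs its semicolon.

```perl
use strict;
use warnings;

# Hypothetical block-taking function; the (&) prototype lets callers
# pass a bare block instead of writing sub { ... }.
sub attempt(&) {
    my ($block) = @_;
    return eval { $block->(); 1 } ? 'ok' : 'failed';
}

# Looks like a builtin, but it is a function call and needs its semicolon.
my $status = attempt { die "oops\n" };
print "$status\n";

# Compile the same call with and without the terminating semicolon.
my $with    = eval q{ attempt { 1 }; my $x = 1; 1 };
my $without = eval q{ attempt { 1 }  my $x = 1; 1 };

print 'with semicolon: ',    ( $with    ? 'compiles' : 'compile error' ), "\n";
print 'without semicolon: ', ( $without ? 'compiles' : 'compile error' ), "\n";
```

Without the semicolon, the parser treats whatever follows the block as further arguments to attempt, which is exactly the kind of thing the sugar cannot hide.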

This is one of the downfalls of the so-called "embedded domain specific languages". If you haven't written your own parser, you'll have to take what you can get. This is even true if you do write a parser and generate and eval code, and it's especially true if your EDSL desugars to chained function or method calls.

I'm not suggesting a flaw with Moose's approach: it's clever and Perlish and doesn't succumb to the saccharine cutery of so many other so-called DSLs. (To my knowledge, no one in the Moose world has claimed it's anything other than Perl 5 syntax bent slightly into something which looks declarative enough.)

My concern, especially when explaining object orientation in Perl 5 to novices, is that any extra syntactic elements might confuse people into thinking that they mean more than they do. You and I might both understand that the grouping parentheses in the Cat attribute declaration are merely visual hints to the reader that the specializers are subordinate to the attribute itself and that the fat arrow between the name of the attribute and its grouped specializers confers the notion of pairing between the attribute and its specializers, but how do you explain that to someone who's still struggling to figure out what this encapsulation thing is all about?

I've attempted caution throughout the book such that the fat comma always signifies a pairish relationship, such as for hash keys or named arguments. Certainly you can always use it in place of the skinny comma (and, barring any quoting changes, vice versa), but is it clear to do so?

Likewise, you can wrap parentheses around almost any old rvalue (barring precedence changes) and not change the behavior of lists, yet this confuses novices all the time:

my @lololol = ( 1, 2, ( 3, 4, (5, 6) ) );
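
Both equivalences are easy to check in plain Perl. In this sketch (the variable names other than @lololol are made up for illustration), the fat comma and the ordinary comma build the same list, and nested parentheses flatten away:

```perl
use strict;
use warnings;

# The fat comma only auto-quotes a bareword on its left; otherwise it is a comma.
my @skinny = ( 'name', 'is', 'ro' );
my @fat    = ( name => is => 'ro' );

print "same list\n" if "@skinny" eq "@fat";

# Grouping parentheses inside a list do not create nested structure.
my @lololol = ( 1, 2, ( 3, 4, ( 5, 6 ) ) );
my @flat    = ( 1, 2, 3, 4, 5, 6 );

print scalar(@lololol), " elements\n";    # 6 elements
```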

I'm not criticizing the Moose documentation or the standard approaches to formatting Moose code. I'm not suggesting a change. I don't like deviating from community standards for declaring Moose attributes. Even so, sparing people for whom learning syntax is still a really big deal from having to learn the equivalencies of syntax is, to me, a big deal in itself.

11 Comments

afoolishmanifesto.com | July 1, 2010 1:25 PM

I think people should be able to understand syntax before they understand encapsulation.

chromatic replied to comment from afoolishmanifesto.com | July 1, 2010 1:35 PM

The logical progression of learning isn't quite that simple. Before I go too far down the TIMTOWTDI mines I want to make sure that people can be productive.

jnareb.openid.pl | July 1, 2010 3:44 PM

Why is Error's syntactic sugar considered bad, while the syntactic sugar of TryCatch and Try::Tiny isn't?

csjewell.comyr.com | July 1, 2010 3:45 PM

I tend to use the with-parenths style with lots of line breaks because I have declarations that would be too long to fit on one line, or would be too unclear otherwise. For consistency, I then break each portion into its own line even if the declaration is simple:

use Moose 0.90;
use MooseX::Types::Path::Class qw(File);
#...

has 'debug_stderr' => (
    is      => 'ro',
    isa     => File,
    lazy    => 1,
    coerce  => 1,
    default => sub {
        my $self = shift;
        return $self->output_dir()->file('debug.err');
    },
);

The equivalent, no-parenths style, would be:

has 'debug_stderr',
is      => 'ro',
isa     => File,
lazy    => 1,
coerce  => 1,
default => sub {
    my $self = shift;
    return $self->output_dir()->file('debug.err');
};

or even worse:

has 'debug_stderr', is => 'ro', isa => File, lazy => 1, coerce => 1, default => sub {
    my $self = shift;
    return $self->output_dir()->file('debug.err');
};

There's a readability penalty, I think, with no-parenths, once things get complicated to any significant degree.

I wouldn't use this for an example in your book - it takes 3 or 4 steps at once - but the with-parenths style may or may not be something you want to encourage, anyway. It's your book, you write it the way you think best educates people.

My personal style on the other issue is to quote the attribute name, by the way, even when it wouldn't be technically required. (I should fix the places in my code where I don't.)

https://me.yahoo.com/moleculasdevida#fea03 | July 1, 2010 3:51 PM

"has" seems to simply be a function that takes a list of arguments. However, after looking at the source, I'm not sure how it passes in the extra parameter that is assigned to $meta internally. Sounds like a function that is automatically converted into a method call.

Anyway, just knowing "has" is a function that takes a list of arguments helps me realize what syntax I can use with it.

I remember being confused by the syntax when I first learned Moose. Eventually I just learned it by repeatedly copying from examples. It was unlike anything else that I had seen in Perl. I kept wanting to write "has => {};" instead of "has => ();"

Thanks for the insight.

chromatic replied to comment from csjewell.comyr.com | July 1, 2010 4:09 PM

I like the use of grouping parens for multi-line declarations. That's a good argument to have the option to use them. For simple declarations, less so.

chromatic replied to comment from https://me.yahoo.com/moleculasdevida#fea03 | July 1, 2010 4:12 PM

I haven't looked at the relevant source in months, but if I were to write this, I'd export a closure which closes over the relevant class object as $meta. That's a useful technique.

I've considered explaining has as just what you describe: it's a function which takes a list of arguments. That might be clearest yet, and it gives the opportunity to explain that you can use indentation or alignment or parentheses to clarify whichever parts you prefer.
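
A minimal sketch of that idea in plain Perl, under stated assumptions: make_has and the hash standing in for the class object are hypothetical, and this is not how Moose implements has. It shows a closure capturing $meta, and that the function sees the same flat argument list whichever call style you pick.

```perl
use strict;
use warnings;

# Hypothetical factory: returns a has-like function closed over $meta.
# Here $meta is a plain hash reference standing in for a real class object.
sub make_has {
    my ($meta) = @_;

    return sub {
        my ( $name, %options ) = @_;
        $meta->{$name} = {%options};
        return;
    };
}

my %cat_meta;
my $has = make_has( \%cat_meta );

# Commas, fat commas, and grouping parentheses all yield the same arguments.
$has->( 'name', is => 'ro', isa => 'Str' );
$has->( 'age' => ( is => 'ro', isa => 'Int' ) );
$has->( 'diet' => is => 'rw' );

for my $name ( sort keys %cat_meta ) {
    my $spec = $cat_meta{$name};
    print "$name: is=$spec->{is}\n";
}
```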

Ashley Pond V | July 1, 2010 4:32 PM

I have fallen into this style:

has "selenium" =>
    isa => "Test::WWW::Selenium",
    is => "ro",
    builder => "_selenium_client",
    handles => selenium_proxy_methods(),
    required => 1,
    lazy => 1,
    ;

has "browser" =>
    isa => "Str",
    is => "ro",
    required => 1,
    lazy => 1,
    default => sub { $ENV{BROWSER} || "*firefox" },
    ;

It indents without parens fine that way and the semi-colon alone gives the same trailing comma advantage to edits/changes. The top fat-comma is a style consideration only; a real comma there looks quite strange (to me) at this point.

http://oid.fox.geek.nz/kent.fredric | July 2, 2010 7:29 AM

You can take the "Moar Sugar" approach if you really want to.

use MooseX::Has::Sugar; # Shameless own code :/
use MooseX::Types::Path::Class qw(File);

has 'simple_example'  => ( isa => File, ro, lazy_build, coerce );
has 'another_example' => ( isa => File, ro, lazy_build, coerce );
has 'debug_stderr'    => ( isa => File, ro, lazy,       coerce, default => sub {
    my $self = shift;
    return $self->output_dir()->file('debug.err');
});

It's about as nice as I can cook up without reaching for the Devel::Declare black magics.

I've toyed with other approaches to this, but can't find something nicer and less bug prone.

(By requiring subs to populate the array, you get a nice bit of compile-time typo-checking. Sure, Moose will check most of the values for you anyway.)
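
The typo-checking benefit can be sketched in plain Perl. The ro and lazy subs below are hypothetical stand-ins written for this example, not the actual MooseX::Has::Sugar implementation: each expands to a key/value pair, and under strict a misspelled name is rejected at compile time instead of silently becoming a string.

```perl
use strict;
use warnings;

# Hypothetical sugar: each sub expands to one key/value pair.
sub ro()   { return ( is   => 'ro' ) }
sub lazy() { return ( lazy => 1 ) }

my @options = ( isa => 'Str', ro, lazy );
print "@options\n";    # isa Str is ro lazy 1

# A misspelled sugar word is a bareword, which "strict subs" rejects
# when the code is compiled.
my $compiled = eval q{ my @typo = ( isa => 'Str', roo, lazy ); 1 };
print "typo rejected: $@" unless $compiled;
```

By contrast, a misspelled quoted string such as is => 'roo' compiles fine and can only be caught at run time.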

pepkaro.myopenid.com replied to comment from chromatic | July 8, 2010 10:15 AM

There's one major reason to use the parentheses everywhere.

Consistency.

It's in the documentation. It's in almost every example they will find online. It's extremely helpful for large declarations, like those shown here. If they always see it this one way, then they will get used to it. It will be more confusing for users who see it one way in the Modern Perl book, and then look at the Moose specs. Which way should they do it? The way Moose does it, or the way "Modern" Perl should do it?

A side point: for years, Charles Petzold used the Whitesmiths style in all of his Programming Windows books. (Eww.) What were programmers to do? Those who had never seen a lick of Windows code before had to assume that this was the style that should be used. Turns out, that wasn't what they used. It wasn't what most of the industry used. It wasn't even one of the two leading indent styles. But people were confused, because that's what this authoritative source of information said they should use.

By NOT using the parentheses you introduce a new style. That will have repercussions on future style wars for ever and ever if you do. We really don't need yet another point to debate about.

nxadm.wordpress.com | August 18, 2010 2:27 PM

I agree with csjewell (the how) and pepkaro (the why).
