Some of my biases are transparent. For example, I believe that many of the complaints of Perl's "unreadability" are from people who've never bothered to learn how to read the language. You often see this from people who say "Sigils? Pfft. They're useless—mere syntactic noise!"

Linguists may disagree.

One of the early inventions in written language was punctuation. Specifically, adding spaces between words (and even vowels, in some languages... yes, my history studies have come in useful while programming) makes documents easier to read. The same goes for other punctuation marks. It's easy enough to write sentences with ambiguous meanings, depending on where you put a comma to delineate logically separate clauses. (Languages with a greater wealth of declensions and tenses and numbers and other forms are more flexible in word order, but they do retain some degree of poetic license. It's not all meter and rhyme scheme, however.)

The basic idea behind all of these ancient inventions is that "Communicating is difficult enough without verbal and body language cues. Making different things look different helps."

To read source code, you have to be able to identify nouns and verbs. You have to be able to group related items and ideas while not grouping unrelated ideas. You need to be able to identify separate expressions as well as idioms.

One reason assembly language can be difficult to read is that its regularity (op arg1, arg2 or op arg1, arg2, arg3) precludes skimmability. That may sound odd (if you're reading code, why do you need to skim it?), but it's important. Programming encompasses so many small details that you must understand the code in the small in the context of the local component as a part of the system as a whole.

Uniformity of syntax means that you have to rely on cues external to the source code or patterns of repeated details within the source code to indicate structure.

I have the same problem reading Lisp code, with its homoiconicity; the shape of the code gives me few cues as to what's different between sections of code. Likewise, Python's use of indentation alone to end blocks means that my eyes slip off the end of logical blocks and I can't tell what happens where.

A lot of that is familiarity and personal preference (or quirks of the way my brain works). Some of that is the effect of deliberate design decisions.

If you embrace the idea, like Perl does, that different things should look different, you reach some interesting conclusions. I don't think you can learn Perl effectively without understanding those conclusions, at least at an intuitive level. I'll write about that next time.

Polemic: anyone who believes that any specific general purpose programming language is inherently unmaintainable has opinions on software development worth ignoring.

Many people claim that the design of Perl 5 has flaws so significant that they make it far too difficult to write and maintain useful programs. Many of the supporting arguments are syntactic preferences. "I don't like sigils!" "Context make no senses to my!" "Real men don't need your sissy curly braces to accompany our manly indentation!" "Isn't bless a little bit cutesy for our Serious Enterprise Business Application?"

Other arguments... well, you've heard them.

Perl 5 has some design flaws, but I believe that syntax is such a small part of maintainability that only the most facile discussions focus on syntax to the exclusion of more important concerns. The next time you have trouble maintaining a Perl 5 program, ask yourself:

  • Have I learned the language by reading documentation and working through tutorials, or am I fiddling with changing things by trial and error and guesswork and intuition based on experience in other languages?
  • Do I know how to use perldoc to look up builtins and language features?
  • Have I skimmed the Perl FAQ included in every Perl 5 distribution?
  • Have I used Perl::Tidy to unify the formatting into a consistent style?
  • Do I know the difference between void, scalar, and list context? Can I identify them?
  • Do I know how to use B::Deparse to explain the evaluation plan of complex constructs?
  • Does this program have a set of automated tests I can trust?
  • Did the original programmer understand the problem domain? Do I?
  • Did the original programmer "borrow" this code from elsewhere, change a few lines, and add a modified copyright statement?
  • Did this program grow from a throwaway idea into a critical business component without planning, design, or refactoring?
  • Is the original author available to answer questions, whether in person or through some sort of design notes?
  • Is the program well-factored?
  • Does the program include appropriate documentation for its purpose, its major systems, its APIs, and any surprising design decisions?
  • Do I have a clear understanding of what the program does and why?
  • Does the program have a modular design, with well-enforced encapsulation boundaries between components?
  • Can I configure and build the program on my local system?
  • Can I deploy it?
  • Does the code show examples of idiomatic programming from authors fluent in the language, or is it a pastiche of styles cribbed from documentation and witch-doctor experimentation?
  • Did the original author know how to program in any language?
  • Did the original author take advantage of obvious strengths of the host language in appropriate ways (or did he distrust arrays and continually write to and read from a temporary file instead—I have seen this with my own eyes, and the host language was not Perl)?
  • Does the program take advantage of well-known and trustworthy external libraries?
  • Does the build process spew compiler errors and warnings? Does the program spew warnings and errors when deployed?
  • Does the program contain obvious repetition and near repetition?
  • Would you still be proud, six months from now, of having written this program?

Note how few of these concerns have anything to do with Perl—and, of those that do, trivial rewording would make them appropriate for other languages.
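For the B::Deparse question in the list above, here is a minimal sketch of asking Perl how it parsed a compound expression. The anonymous subroutine and its variable names are made up for illustration; the `-p` option asks B::Deparse to add parentheses that make precedence explicit. The exact text it prints varies between Perl versions, so no output is shown.

```perl
use strict;
use warnings;
use B::Deparse;

# '-p' requests extra parentheses so that operator precedence is visible.
my $deparser = B::Deparse->new('-p');

# A made-up expression whose grouping is not obvious at a glance.
my $puzzle = sub {
    my ( $x, $y, $z ) = @_;
    return $x + $y * $z || 1;
};

# coderef2text returns Perl's own reading of the subroutine body.
my $text = $deparser->coderef2text($puzzle);
print $text, "\n";
```

The same module is usually run from the command line as `perl -MO=Deparse,-p program.pl`.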

In Essential Skills for Perl 5 Programmers I mentioned that no one can be an adept Perl programmer without understanding context. This trips up many, many people -- and you often hear (unfair) criticisms of Perl 5 based on misunderstandings and guesses about how context works.

Context is reasonably easy to explain. (The previous sentence is grammatically correct.) Contexts is not difficult to understands. (The previous sentence is grammatically incorrect, even if you speak the Queen's English.)

If you can find the errors in the previous paragraph, you can understand quantity context in Perl 5: like subject-verb agreement in terms of number, expressions in Perl 5 can behave differently in contexts that imply zero, one, or more results.

fetch_something_awesome();              # void   context
my $item  = fetch_something_awesome();  # scalar context
my @items = fetch_something_awesome();  # list   context
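A function can discover which of these contexts it was called in with the wantarray builtin. The following is only a sketch: this body for fetch_something_awesome() and the $last_context variable are assumptions made for illustration, not the original function.

```perl
use strict;
use warnings;

my $last_context;

# wantarray returns undef in void context, a false but defined value in
# scalar context, and a true value in list context.
sub fetch_something_awesome {
    my $want = wantarray;
    $last_context = !defined $want ? 'void'
                  : $want          ? 'list'
                  :                  'scalar';
    return;
}

fetch_something_awesome();
print "$last_context\n";    # void

my $item = fetch_something_awesome();
print "$last_context\n";    # scalar

my @items = fetch_something_awesome();
print "$last_context\n";    # list
```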

Context gets a little bit trickier when you need to coerce what would normally be one context into another:

my ($item) =        fetch_something_awesome(); # list   context
push @items, scalar fetch_something_awesome(); # scalar context

If you know the visual cues (if you don't randomly sprinkle punctuation about your program until it works), those are easy to understand as well.
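To see why the coercion matters, here is a small sketch in which fetch_something_awesome() returns an array. That body and the @awesome data are assumptions for illustration. An array in scalar context yields its element count, while in list context it yields its elements.

```perl
use strict;
use warnings;

my @awesome = qw( alpha beta gamma );

# Returning an array: its value depends on the caller's context.
sub fetch_something_awesome { return @awesome }

my $count   = fetch_something_awesome();    # scalar context: 3
my ($first) = fetch_something_awesome();    # list context: 'alpha'

my @items;
push @items, scalar fetch_something_awesome();    # pushes the count, 3
push @items, fetch_something_awesome();           # pushes all three elements

print "$count $first\n";    # 3 alpha
print "@items\n";           # 3 alpha beta gamma
```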

The subtlety comes when dealing with complex contexts, usually with nested expressions:

# list context, thanks to say
say reverse $name;

my %values =
(
    # list context, thanks to hash assignment
    name => get_name(),
    rank => get_rank(),
);

# list context (param flattening)
$screen->flip( $fleet->get_spaceships() );

This is often where more fair criticisms of Perl 5 suggest that context may not be worth it, because you have to understand what a line of code means and what it implies to read it correctly.
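The reverse example is a good illustration of what you have to know. In list context reverse reverses the order of a list (a one-element list comes back unchanged); in scalar context it reverses the characters of a string. A minimal sketch, with a made-up value for $name:

```perl
use strict;
use warnings;
use feature 'say';

my $name = 'Alice';

my @in_list   = reverse $name;    # list context: the one-element list ('Alice')
my $in_scalar = reverse $name;    # scalar context: the string 'ecilA'

say reverse $name;                # say supplies list context: prints Alice
say scalar reverse $name;         # scalar forces scalar context: prints ecilA
```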

There's a fair point there, but it's also silly in some ways. Skimming code which calls other functions may give you some idea of what those functions do, but you rely only on the names of those functions and not their documentation to tell you any other details. Do they modify global or thread-local variables? Do they have caching or performance characteristics? Do they block? Do they require special initialization or error handling? Do they return special values?

The valid point is that chaining multiple expressions into complex compound expressions can have interesting effects. (I see this in Haskell code often; invisible partial application means that I personally can't skim Haskell code without tracking down the arity of functions to figure out what happens where.)

That's no argument against language features. It's an argument against making expressions more complex than necessary. Note that the same argument applies against complex prefix-unless expressions. unless can be amazingly useful when used properly. If you abuse it, you make amazing problems. Don't make problems.
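As a sketch of the difference, here are two hypothetical functions (the names, the $fuel and $crew parameters, and the launch rule are all invented) that make the same decision. The first buries a negated compound condition in a prefix unless; the second uses one simple condition per line.

```perl
use strict;
use warnings;

# Hard to skim: the reader must negate a compound condition mentally.
sub status_confusing {
    my ( $fuel, $crew ) = @_;
    unless ( !$fuel || $crew < 1 ) {
        return 'launch';
    }
    return 'hold';
}

# Easier to skim: each guard states one simple fact.
sub status_clear {
    my ( $fuel, $crew ) = @_;
    return 'hold' unless $fuel;
    return 'hold' unless $crew >= 1;
    return 'launch';
}

for my $case ( [ 0, 0 ], [ 0, 3 ], [ 10, 0 ], [ 10, 3 ] ) {
    my ( $fuel, $crew ) = @$case;
    printf "fuel=%-2d crew=%d  %-6s %s\n",
        $fuel, $crew,
        status_confusing( $fuel, $crew ),
        status_clear( $fuel, $crew );
}
```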

I've been writing the Moose section of the Modern Perl book for the past week. Stevan (and other people) suggested that I explain how to create and use objects in Perl with Moose before explaining the bare-bones blessed reference approach. They were right.

I'm assuming that readers don't necessarily have the theoretical understanding of how objects work and why, of why Liskov substitutability is important, of what allomorphism means, and why polymorphism and encapsulation are much more interesting than inheritance. I don't even assume that readers know any of those words.

Yet I've noticed something far more interesting.

The standard approach to teaching Perl 5 OO (at least, the one approach I've seen that works) builds on the Perl 5 implementation ideas. That is to say, "a class is a package, a method is a subroutine, and an object is a blessed reference". If you know how to work with references in Perl 5, you can use Perl 5 objects.

That's true in the sense that a blessed hash reference is still a hash reference. That's false in the sense that treating an object as a struct with a vtable pointer is a terrible way to write robust OO. (I should know; I've been hip deep in multiple implementations.)

I like a lot of things about Moose, but what I appreciate most from a didactic standpoint is that I can explain object attributes:

{
    package Cat;

    use Moose;

    has 'name',        is => 'ro', isa => 'Str';
    has 'diet',        is => 'rw', isa => 'Str';
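    # wrap the default in a code reference so that Moose computes it for
    # each new object, not once when the class is compiled
    has 'birth_year',  is => 'ro', isa => 'Int',
                       default => sub { (localtime)[5] + 1900 };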

}

... and it doesn't occur to readers that they can poke into a Cat instance directly (even though they can). Moose encourages people to do the right thing by using accessors and respecting encapsulation and polymorphism and allomorphism and substitution by making something different--encapsulated access to instance data--look different from the well-understood mechanism of its implementation.

Objects may still be blessed hashes, but users treat them differently because they have different expectations.

In writing the examples for this chapter, I changed the implementation of the class to make correctness easier (and to discuss the value of immutable objects). The refactoring was trivial, thanks to Moose features, but the interface of the class could stay the same, thanks to Moose's subtle encouragement to program to an encapsulated interface.
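The same point can be sketched without Moose, using only core Perl so that it runs anywhere. The two classes below (HashCat and ArrayCat, both hypothetical) expose the same accessors over different storage. Code that uses the accessors keeps working across the change; code that pokes into the hash does not.

```perl
use strict;
use warnings;

# Version 1: a cat stored as a blessed hash reference.
{
    package HashCat;

    sub new {
        my ( $class, %args ) = @_;
        return bless { name => $args{name}, diet => $args{diet} }, $class;
    }

    sub name { my $self = shift; return $self->{name} }
    sub diet { my $self = shift; return $self->{diet} }
}

# Version 2: the same interface over a blessed array reference.
{
    package ArrayCat;

    sub new {
        my ( $class, %args ) = @_;
        return bless [ $args{name}, $args{diet} ], $class;
    }

    sub name { my $self = shift; return $self->[0] }
    sub diet { my $self = shift; return $self->[1] }
}

# Code that sticks to the accessors works with either implementation.
for my $class (qw( HashCat ArrayCat )) {
    my $cat = $class->new( name => 'Tuxie', diet => 'kibble' );
    print $class, ': ', $cat->name, ' eats ', $cat->diet, "\n";
}

# Poking into the reference works only while the object is a hash.
my $hash_cat = HashCat->new( name => 'Tuxie', diet => 'kibble' );
print $hash_cat->{name}, "\n";
```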

I always enjoy encountering such a serendipity in code, and I made sure to mention it in the book. The Perl world needs more such serendipities.

perl5i is back on the CPAN. perl5i is important because it may help shape the future of Perl 5. (Perl 5 experts and CPAN cognoscenti already know how to add dozens of pragmas and utility modules to every Perl 5 file they write, but that's annoying even for us and inaccessible to the other six and a half billion people on the planet.)

When people writing about Perl 5 in 2010 can find better answers in the sparse Ruby documentation than the Perl 5 documentation, something is wrong... but that's a far different story.

perl5i is again available and you should experiment with it. Why was it gone so long? The old nemesis of confusing version numbers.

The obvious use of perl5i is:

use perl5i;

... but what does that mean? If it's difficult to discern the version of a language used without explicit notation, how much more difficult to discern the version of a pragma or CPAN distribution intended? A CPAN author could upload multiple versions of a distribution in a day, each one with a subtly different API or semantics. How do you know which version you have? How do you know which version your code needs?
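Perl 5 does give partial answers to the first two questions. A minimal sketch, using the core module List::Util as a stand-in for any distribution: the use line demands a minimum version, and the universal VERSION method both reports the installed version and checks a requirement. The version numbers here are arbitrary.

```perl
use strict;
use warnings;

# Require at least this version at compile time; import nothing.
use List::Util 1.00 ();

# Which version is installed?
my $installed = List::Util->VERSION;
print "List::Util $installed\n";

# VERSION with an argument dies if the installed version is too old.
my $new_enough = eval { List::Util->VERSION(9999); 1 };
print $new_enough ? "at least 9999\n" : "older than 9999\n";
```

Neither mechanism says anything about which interface a given version provides, which is the harder half of the problem.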

How do you know when you should upgrade and when you should keep the existing version? How do you know when an upgrade will change the behavior of code in a positive or negative way?

How do you write reliable, redistributable software which depends on external components with their own ideas about stability of interface?

Schwern has posted some thoughts on perl5i version numbering, distribution, and use, but this all is an attempt to cram too much meaning into a single number.

I'm starting to believe that the best approach is to use a regular release cycle -- perhaps every three months -- and support only the most recent couple of releases. The interface may change with every quarterly release. Interim releases can fix bugs. Use a date modifier as the argument to import() for best reliability:

use perl5i as_of => '2010-01-25';

Stop fussing with the MAJOR.MINOR.PATCHLEVEL scheme and "What constitutes a major API change?" and "But I just incremented MAJOR last week, isn't it too sooooooon?" and "What if someone wants to use a really old version and reports a bug?" distractions. Let's stop trying to work around change. Instead, let's take advantage of change to produce improvements.
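Here is a rough sketch of how an import() might honor such a date. Nothing here is perl5i's actual interface: the Example::Dated package, its release dates, and its feature lists are all invented. The idea is only that ISO 8601 dates sort as strings, so choosing the newest release on or before the requested date takes one comparison.

```perl
use strict;
use warnings;

{
    package Example::Dated;    # hypothetical pragma-like module

    # Invented quarterly releases and the features each one enabled.
    my %releases = (
        '2009-10-01' => [qw( say )],
        '2010-01-01' => [qw( say autodie )],
        '2010-04-01' => [qw( say autodie signatures )],
    );

    our %chosen;    # calling package => release selected for it

    # Return the newest release dated on or before the requested date.
    sub release_as_of {
        my ( $class, $date ) = @_;
        die "as_of must look like YYYY-MM-DD\n"
            unless $date =~ /\A\d{4}-\d{2}-\d{2}\z/;

        my ($release) = grep { $_ le $date } reverse sort keys %releases;
        die "No release as early as $date\n" unless defined $release;
        return $release;
    }

    sub features_for { return @{ $releases{ $_[1] } } }

    # A real pragma would enable features in the caller here; this
    # sketch only records which release the caller asked for.
    sub import {
        my ( $class, %args ) = @_;
        my $date = $args{as_of} // ( sort keys %releases )[-1];
        $chosen{ scalar caller } = $class->release_as_of($date);
        return;
    }
}

# Equivalent to: use Example::Dated as_of => '2010-01-25';
Example::Dated->import( as_of => '2010-01-25' );

my $release = $Example::Dated::chosen{main};
print "$release\n";                                  # 2010-01-01
print join( ' ', Example::Dated->features_for($release) ), "\n";    # say autodie
```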
