December 2009 Archives

Is It, Can It, Does It, and Robust Perl 5 OO

By chromatic on December 30, 2009 10:50 AM

A recent thread on p5p about warnings from Pod::Abstract with Perl 5.11.3 brought up an old debate over the nature of the UNIVERSAL package in Perl 5.

A recent deprecation of importing isa and can as functions from UNIVERSAL is the source of the warnings. This has the potential to affect plenty of code, because plenty of existing Perl 5 code does the wrong thing in the wrong way.

That's not entirely the fault of the coders; there's almost no right way to ask the right question in the right way. This is a design flaw in Perl 5.

The problem is that you can't always be sure that a scalar you have supports the operations you want to perform on it. For example, suppose that you receive as a parameter an object that may or may not allow logging with the log() method. If you call log() on it, you may get an exception.

One approach is to call can() on the object to see if Perl 5 knows about a log() method defined on the object:

if ($obj->can( 'log' ))
{
    ...
}

That's all well and good (except that it is susceptible to the false cognate problem) until $obj may not in fact be a valid object at all. In that case, trying to call a method on a non-invocant throws an exception.

I don't know why you'd write an API where this is possible, but apparently it's a popular hobby.

Some people think that the proper approach is to avoid the method call altogether. I used to think this too. I was wrong. Here's the broken code they write:

if (UNIVERSAL::can( $obj, 'log' )) # broken code; do not use
{
    ...
}

The warning introduced in 5.11.3 by the linked patch doesn't catch this case. (You have to use the shouldn't-be-so-controversial UNIVERSAL::can module to get warnings in this case.) It only catches the case where someone has written:

use UNIVERSAL 'can'; # broken code; do not use

if (can( $obj, 'log' )) # broken code; do not use
{
    ...
}

The safest approach is to change the API so that you never have to guess if you have a valid invocant. If you can't do that, the safest check is:

if (eval { $obj->can( 'log' ) })
{
    ...
}

... to catch the exception (or use Try::Tiny).
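As a minimal sketch of that check, assuming a hypothetical My::Logger class and a hypothetical can_do() helper (neither comes from any real module), here is how the eval-guarded call behaves with a real object, an unblessed reference, and an undefined value:

```perl
use strict;
use warnings;

package My::Logger;    # hypothetical class, for illustration only

sub new { return bless {}, shift }
sub log { my ($self, $message) = @_; return "LOG: $message" }

package main;

# Hypothetical helper: true only when $thing is a valid invocant which
# reports a method named $method. The eval catches the exception Perl 5
# throws when $thing is not something you can call a method on.
sub can_do
{
    my ($thing, $method) = @_;
    return eval { $thing->can( $method ) } ? 1 : 0;
}

my $hashref    = { log => 1 };                          # unblessed reference
my $object_ok  = can_do( My::Logger->new, 'log' );      # real object
my $hashref_ok = can_do( $hashref,        'log' );      # would die without eval
my $undef_ok   = can_do( undef,           'log' );      # would die without eval

print "object: $object_ok, hashref: $hashref_ok, undef: $undef_ok\n";
# prints "object: 1, hashref: 0, undef: 0"
```

The method call form still lets a class answer for itself, which is the point; the eval only protects against things which aren't invocants at all.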

The same pattern applies to isa as well.

As mentioned earlier, this still leaves the code open to false cognate problems, where the mere presence of a method (or function) named log() is not sufficient to determine whether that method (or function) actually behaves as expected.

This is the purpose of the DOES() method added in Perl 5.10 — if you use roles instead of duck typing.
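Here's a rough sketch of the difference. The class names and the 'Role::Logging' role name are made up for illustration; the only real API in use is the DOES() method which UNIVERSAL provides from Perl 5.10 on. Both classes have a log() method, but only one claims to perform the logging role:

```perl
use 5.010;    # DOES() is available from Perl 5.10
use strict;
use warnings;

package Calculator;     # hypothetical: log() here means "logarithm"

sub new { return bless {}, shift }
sub log { my ($self, $n) = @_; return CORE::log( $n ) }

package App::Logger;    # hypothetical: log() here means "record a message"

sub new { return bless {}, shift }
sub log { my ($self, $message) = @_; return "LOG: $message" }

# Claim the (hypothetical) Role::Logging role and defer every other
# question to the default DOES() inherited from UNIVERSAL.
sub DOES
{
    my ($self, $role) = @_;
    return 1 if $role eq 'Role::Logging';
    return $self->SUPER::DOES( $role );
}

package main;

my $calc   = Calculator->new;
my $logger = App::Logger->new;

for my $obj ( $calc, $logger )
{
    printf "%s can log: %s, does Role::Logging: %s\n",
        ref( $obj ),
        ( $obj->can( 'log' )            ? 'yes' : 'no' ),
        ( $obj->DOES( 'Role::Logging' ) ? 'yes' : 'no' );
}
```

can() says yes for both objects. DOES() says yes only for the one whose log() means what the caller expects.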

Why is it unsafe to call can() as a function? AUTOLOAD(). The implementation of can() in UNIVERSAL has no idea if any class or object will generate its own methods. It only knows how to look in the appropriate namespace for CVs; it can't even distinguish between functions and methods. Of course, if you write your own AUTOLOAD for a class, you must also override the can() method to return the appropriate sub reference when queried. Code that neglects to do so — far too much code, in truth — is also broken. (Then you can have the fun experience of debugging other modules which call can() as a function and break your carefully-written objects.)
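To make that concrete, here's a small sketch (the Lazy::Logger class and its %generated table are hypothetical) of a class which provides log() through AUTOLOAD and overrides can() to match. Asking the object works; asking UNIVERSAL::can() as a function bypasses the override and gets the wrong answer:

```perl
use strict;
use warnings;

package Lazy::Logger;    # hypothetical class, for illustration only

our $AUTOLOAD;

# Methods this class provides on demand rather than as named subs.
my %generated =
(
    log => sub { my ($self, $message) = @_; return "LOG: $message" },
);

sub new { return bless {}, shift }

# Report generated methods as well as anything findable the normal way.
sub can
{
    my ($self, $method) = @_;
    return $self->SUPER::can( $method ) || $generated{$method};
}

sub AUTOLOAD
{
    my $self   = shift;
    (my $name  = $AUTOLOAD) =~ s/.*:://;
    return if $name eq 'DESTROY';
    my $code   = $generated{$name}
        or die "No method '$name' in " . ref( $self ) . "\n";
    return $self->$code( @_ );
}

package main;

my $obj         = Lazy::Logger->new;
my $as_method   = $obj->can( 'log' )            ? 'yes' : 'no';
my $as_function = UNIVERSAL::can( $obj, 'log' ) ? 'yes' : 'no';

print 'result:      ', $obj->log( 'hello' ), "\n";    # LOG: hello
print "as method:   $as_method\n";                    # yes
print "as function: $as_function\n";                  # no
```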

Things get trickier when dealing with isa(); more about that next time.

More Perl Packaging Possibilities

By chromatic on December 28, 2009 1:47 PM | 2 Comments

The discussion of Helping Perl Packagers Package Perl glossed over a couple of points I find incredibly important. Granted, I neglected to mention BSDPAN or Gentoo's g-cpan or projects such as GoboLinux's /System/Aliens.

Yet there's more.

Working with CPAN can prove difficult in a couple of ways. The first is initial configuration:

Do you want to install or upgrade distributions which may conflict with the system Perl 5 and any system utilities?
Do you have permission to install into system directories?
Which mirror repositories are best to use?
Which utilities do you have installed that the CPAN client may want to configure or to use?
Do you want a local installation location?
Do you want an application-specific installation location?

That's why a smarter CPAN client configuration system is useful, and that's why OS-specific or distribution-specific configurations would be helpful.

The other place where CPAN can be troublesome is installing distributions with non-Perl components. Consider, for example, bindings to libxml. A CPAN distribution needs to indicate somehow that it expects a specific version of a shared library called libxml.so or libxml.dylib or libxml.dll.

The CPAN client could avoid a lot of fragile, platform-specific guessing if it could ask the local packaging system for the appropriate information about that dependency. It could even invoke the local packaging system to install it.

There are dependency versioning concerns; this represents a lot of code to write and some complexity. Yet it also merges two separate systems which perform essentially the same function.

Integration with the packaging system also means that distributions which have components which require compilation can depend on packages including the header files for Perl 5 and any shared libraries, as well as a compiler and make utility and linker and....

(A lot of this problem goes away if a project such as a port of ctypes from Python to Perl 5 appears. Dependency resolution is still an issue, but the need for a compilation environment disappears in many cases.)

The CPAN infrastructure -- with local::lib configured -- has an advantage over most packaging systems I've used on free Unix-like operating systems, in that CPAN allows parallel installation of potentially conflicting libraries. Most packaging systems cannot do that.

Of course, none of this works on operating systems without packaging systems, so Windows and Mac OS X users again have trouble. At least the improved CPAN configuration (and better distributions of Perl 5) as well as a ctypes system will help them anyway.

If I could wave a magic wand, I'd love to see an easy way to package a Perl 5 application and its dependencies to create a package for a modern free Unix-like system....

Helping Perl Packagers Package Perl

By chromatic on December 23, 2009 12:58 PM | 7 Comments

I know I often shouldn't, but I use the Perl 5 installed through Ubuntu packages for most of my local development. I could maintain a parallel installation myself, but I have better things to do. (I do have bleadperl available if I need it.)

Every time I get a new machine, or perform an OS upgrade which changes the major version of Perl 5, I have to reconfigure the CPAN client to install distributions from the CPAN appropriately.

That's ridiculous, for two reasons.

First, the CPAN.pm configuration has traditionally asked too many questions. I understand that it's nice to have configurability and the ability to run on all sorts of platforms with odd behaviors and strange utilities and baffling constraints, but I also think it's plausible to assume that most new installations of Perl 5 on a modern Unix-like system can speak HTTP, for example.

The second problem is that the Perl 5 version has come from a system package — a .deb file, in my case — and the CPAN client prefers to install tarballs from the CPAN itself.

Even though Debian and Ubuntu packagers (especially the admirable Debian Perl Group) have made plenty of CPAN distributions available as .debs, I have to configure my CPAN client myself, and it does not work with the system package manager.

There's no reason it couldn't.

Imagine that the system Perl 5 included in the default package (or in an optional package) a CPAN client configured appropriately. It has selected an appropriate mirror (or uses the redirector). It knows about installation paths. It understands how to use LWP or wget or curl to download tarballs. It requires a make utility and the Perl 5 development headers.

Why stop there? There could be an alternate package (or an alternate Perl 5 installation or a program to switch paths) which set up local::lib for each user to install modules without overwriting the global installation.

Go another step further. A system such as Debian or Ubuntu or Fedora or one of the BSDs may include OS packages of CPAN distributions. If you want to install WWW::Mechanize, why can't a custom CPAN subclass translate that into a request to install libwww-mechanize-perl through the packaging system if it's available?
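The name translation itself is the easy part. As a sketch only, assuming the usual Debian convention of lowercasing the name, replacing :: with a hyphen, and wrapping the result in lib...-perl, a hypothetical debian_package_for() helper might look like this. (Real packages follow distribution names rather than module names and there are exceptions, so a real client would have to ask the packaging system whether the guessed package exists.)

```perl
use strict;
use warnings;

# Hypothetical helper: guess the Debian-style package name for a CPAN
# module. This is a naming heuristic, not a lookup.
sub debian_package_for
{
    my ($module) = @_;
    (my $name = lc $module) =~ s/::/-/g;
    return "lib${name}-perl";
}

my $package = debian_package_for( 'WWW::Mechanize' );
print "$package\n";    # libwww-mechanize-perl
```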

I realize that plenty of experienced Perl 5 developers dislike the idea of giving up control over every aspect of their own installations. That's fine. They can keep that control. Improving the defaults of the Perl 5 experience does not have to mean removing customization possibilities for experts.

The Perl 5 community has to produce at least three artifacts before this is possible:

An agreement that it's okay for distributions to customize their CPAN client configurations at installation time.
A set of guidelines for how to do so safely—probably backed up by code.
The will to improve the experience of installing, maintaining, and upgrading modules from CPAN distributions, especially for novices.

Each goal is achievable, though the last likely requires the active Perl 5 community to refuse to support specific vendor customizations in any official capacity.

The result—a Perl 5 that's easier to develop with, easier to begin with, and exactly as customizable as it is now—is better for everyone.

Safety in (Version) Numbers

By chromatic on December 16, 2009 4:29 PM | 2 Comments

If you believe what I wrote in The "Guess the Version" Game, there's no completely reliable way to determine the appropriate version of the Perl 5 language which applies to a given file. You can use a static analysis tool such as Perl::MinimumVersion to give you a best guess based on a set of heuristics... but that's possible to fool.

Unlike Perl 5 static parsing, where only a malicious developer would write the maintenance-nightmare code that foils a smart static analysis tool, it's reasonably easy to imagine that a well-meaning and maintenance-savvy developer might write code that fools the language version checker. After all, keywords like class and say and err make a lot of sense as keywords because they describe concepts effectively and efficiently. There's little difference between choosing good keywords and good symbols for identifiers.

There's only one good way to specify the intended version of the Perl 5 language in use in any given source code:

package SecretMonkey::Utilities;

use 5.010;

...

If you want to future-proof your code against future language changes, document your assumptions. (Note that well-packaged CPAN distributions have had this policy for years. It's not a new idea; it's a good idea unevenly distributed.)

If Perl 5 had adopted this policy a decade ago (or, better, even earlier!), we could avoid a lot of current problems with the debate between making new features available by default or being as compatible with Perl 5.000 as possible by default. perl can't assume anything about the features used or not used within a particular file -- especially new keywords -- because it doesn't know which version of the language to use to parse the code.

Any new Perl 5 code written after this point which lacks a language version specifier in every file carries a very preventable maintenance risk. You don't have to use Perl 5.10.x (though you should), but you should specify the version of the language you expect.
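As a small illustration (the next_id() function is made up for the example), a single use 5.010; line at the top of a file both documents the expected language version and enables the keywords introduced in that version:

```perl
use 5.010;    # requires Perl 5.10 or newer; enables say and state
use strict;
use warnings;

sub next_id
{
    state $id = 0;    # initialized once, retained between calls
    return ++$id;
}

my @ids = map { next_id() } 1 .. 3;
say "@ids";    # prints "1 2 3"
```

An older perl refuses to compile the file with a clear message about the required version instead of failing on a mysterious syntax error.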

The "Guess the Version" Game

By chromatic on December 14, 2009 3:38 PM

What version of Perl 5 is necessary to run this example?

say $message;

It depends. The obvious answer is Perl 5.10 or newer, at least if you use feature 'say'; or use 5.010;.

It can also be Perl 5.6 or 5.8, if you use Perl6::Say.

What version of Perl 5 is necessary to run this example?

sub foo
{
    state $foo = shift;
    say $foo;
}

foo( 100 );
foo( 200 );
foo( 400 );
foo( 800 );

Again, the obvious answer is Perl 5.10 or newer, but I can imagine having written code something like this for Perl 5.6 or 5.8:

my $foo;
my @lexicals;

sub state :lvalue
{
    push @lexicals, \(my $foo);
    ${ $lexicals[-1] } = shift;
    return $lexicals[-1];
}

The semantics aren't exactly the same, but that's the point. The syntax is similar enough that you can't reliably read the code and determine to which Perl 5 version the original author wrote!

This problem comes up over and over when someone proposes adding a new keyword to Perl 5. The likelihood that someone wrote Perl 5 code with this syntax is vanishingly low:

class Foo extends Bar {
    ...
}

... but it's not entirely impossible.

There is no general purpose heuristic by which the Perl 5 parser can determine the appropriate version of the Perl 5 syntax to use to parse any given piece of code. (That's not a problem specific to Perl 5, but I'm only talking about the Perl language family now.)

No one wants to break existing code. Yet if there's value to adding new features, there are only a few possibilities to find a balance between adding new features which may conflict with existing programs and not adding new features.

The current Perl 5 approach prefers to maintain existing programs unchanged. It's possible to take code written for Perl 1 (and last modified in 1987) and run it unmodified on Perl 5.11.2, released last month. I repeat, unmodified.

In other words, given a Perl 5 program which does not specifically declare the intended version of the Perl 5 syntax and (more important) keyword set, the Perl 5 parser prefers the oldest available keyword set. (Of course, any syntax which was impossible in older versions and is now available in newer versions may be available... but that doesn't require modifications to existing programs.)
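Here's a sketch of what that looks like in practice on Perl 5.10 or newer (the two package names are hypothetical). Code with no declaration can define and call its own say() as it always could. Only code which asks for the feature, within that lexical scope, gets the keyword:

```perl
use strict;
use warnings;

{
    package Legacy::Code;    # hypothetical; declares no version or features

    # Without the feature enabled, "say" is an ordinary identifier.
    sub say   { return "custom say: @_" }
    sub greet { return say( 'hello' ) }
}

{
    package Modern::Code;    # hypothetical; opts in within this block only

    use feature 'say';

    sub greet { say 'hello from the keyword'; return }
}

my $legacy = Legacy::Code::greet();
print "$legacy\n";           # custom say: hello
Modern::Code::greet();       # prints "hello from the keyword"
```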

That's not likely to change soon, if ever in Perl 5, and I can understand that policy especially given its two-decade history.

That does suggest changing how the Perl 5 community thinks about writing new code and modifying existing code under active development. I'll write more about that next time.

Perl and the Multiversion Grammar

By chromatic on December 11, 2009 10:59 AM

A programming language is syntax and semantics.

Spend some time designing a programming language, implementing a parser and compiler, and working on a virtual machine, and you'll discover that both syntax and semantics influence each other. Even so, you can separate them. Myriad syntax forms can produce the same semantics. There are differences in expressivity, abstraction, and concision, but within the same language, you can support multiple syntactic forms (even with differing semantics) from the same set of low-level semantic primitives.

Even though the most effective way to develop a software project over the long term is to get feedback about the problems that users face in their own work and iterate through potential solutions to find the best approach, users tend to hate this kind of churn in the syntax of their programming languages. Even if the semantics don't change (much or at all), especially at the low levels, rewriting for the sake of rewriting is an expensive exercise.

In the long term, better expressivity, maintainability, abstraction, and correctness may make change worth it, but who practices long-term thinking?

There's a lot of pressure to get things right the first time, even if what's right doesn't stay right for long, because there's even more pressure to stick with what's there because it's there and it's good enough.

Consider this in the context of The Replaceable Batteries of Your Standard Library and Replacing the Standard Library with Distributions. Those posts discussed a language's core library. They could very well describe the core language itself. The difference between a library and a language (given a sufficiently expressive language) is one of convention alone.

That convention is important; it's easier to convince people of library changes than language changes. Of course, people believe it's easier to change a library (or provide an old version or a wrapper for backwards compatibility) than to change the syntax of a programming language.

They may be right.

However, Perl 6 separates the grammar of the language from the rest of the language. This is somewhat new for Perl, but nothing new in the world of programming languages.

Perl 6 also lets you create your own local grammar modifications. The most important attribute of these modifications is lexical scoping. You must declare which modifications you use, and they take effect only in the innermost lexical scope of that declaration.

Perl 6's grammar is a self-hosting grammar. It's written in itself. Combine that with lexical scoping and a Perl 6 specification which includes the self-hosting grammar and puts a version number on it, and you have something special.

You can have multiple Perl 6 grammars co-existing in the same program. You can have components written in Perl 6.0.0, 6.0.1, 6.1.0, and 6.2.0. Yes, that could become a recipe for a maintenance disaster if you don't encapsulate your code well. It can also allow you to refactor your code to take advantage of new features and remove deprecated syntactic features component by component.

In other words, rather than Perl 6 defaulting only to those features which are supremely unlikely to conflict with features existing in the earliest releases of Perl 6 (as is the case with Perl 5), the language can evolve and change without leaving existing programs in the dust.

Add to that the policy that you must declare the specific version of the parser you wish to use -- and if you don't declare a version, you get the most recent version available -- and Perl 6 can avoid some of the backwards compatibility concerns that have influenced Perl 5 development.

Replacing the Standard Library with Distributions

By chromatic on December 10, 2009 5:47 PM | 2 Comments

Languages evolve.

Some people want to put closures in Java. (Set aside the fact that only 10% of the people I've seen talk about closures in Java know the difference between first-class functions and closures.)

C++ will eventually get a new specification.

COBOL has OO extensions.

Despite the fact that Microsoft can't seem to support C99 in Visual Studio (hey, give them a break -- they're still working on supporting PNG and CSS 1.x, two standards even older than C99), C# gets new features.

You don't have to use the new versions. You don't have to use the new features of the new versions. You don't have to like the new features of the new versions, and you don't even have to know they exist.

You have to delve deeply into delusional solipsism to deny that they exist... and that's merely language features. Consider all of the changes in libraries available for those languages.

In The Replaceable Batteries of Your Standard Library, I argued that change is even more constant when you consider the available libraries for a language. This is true of standard libraries blessed by the implementors (try using gets() in a modern C project and see the reactions you get) as well as libraries made popular by common use (remember Class::DBI?).

Change happens.

Language designers and developers and maintainers and patchers and porters and releasers ought to consider how to deal with the inevitability of change.

Should a new Perl project use CGI today? Plack seems like a better option. Should a new Perl project start with Perl 5's basic OO or use Moose? Is there any excuse to recommend the use of File::Find, when the basic API requires the use of features that 90% of Java programmers (in my unscientific and unrepresentative experience) do not understand?
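For anyone who hasn't seen it, here's a minimal File::Find sketch. The temporary directory and file names exist only to keep the example self-contained. The API takes a callback, calls it once per file with the name in $_, and leaves you to collect results in a variable the callback closes over:

```perl
use strict;
use warnings;

use File::Find;
use File::Spec;
use File::Temp 'tempdir';

# Build a throwaway directory with a few files to search.
my $dir = tempdir( CLEANUP => 1 );

for my $name (qw( Foo.pm Bar.pm notes.txt ))
{
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot create '$path': $!\n";
    close $fh;
}

# The anonymous sub is a closure over @modules; find() calls it for
# every entry it visits, with the entry's base name in $_.
my @modules;
find( sub { push @modules, $_ if -f && /\.pm\z/ }, $dir );

@modules = sort @modules;
print "@modules\n";    # Bar.pm Foo.pm
```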

The best argument for using CGI and bare-bones Perl OO and File::Find is that they're available on whichever version of Perl 5 you're likely to have installed (even if that's Perl 5.005, which has been obsolete for the whole of the 21st century and even if they have bugs which have been fixed in the past several years). The second best argument for using them is that there's plenty of example code which demonstrates how to use them (though if you write Perl 5 code based on examples you can find online... ouch).

The third best reason is that, once something is in a language's core library, only an act of the language designer can remove it. In theory, you'll get a lengthy period of deprecation warnings (if you pay attention to that sort of thing, but convince me that the people not using Moose and sticking with Perl 5.6.1 know how to get deprecation warnings from Perl 5). In practice, you're in trouble anyway, unless your language has a specification or a comprehensive test suite and you read it to figure out if specific behavior on which your code relies will always be present in the future.

In other words, change happens.

How can a language designer or release manager or maintainer or porter walk the thin line between providing a bare-bones language with few libraries and cramming everything into a core bursting with subtly-different alternatives and growing less relevant to real-world adept uses because of too-early blessings of approaches which time has proven inopportune for the common uses?

I've argued before that the right approach is a small core which contains only those libraries absolutely necessary to bootstrap an intelligent library loading system. This works for operating systems such as Gentoo GNU/Linux, Debian in its flavors, and many of the BSDs. Parrot is exploring a similar approach.

The Perl 5 core could do very well to include only a couple of libraries necessary to build and install Perl and to download (or otherwise build from a download or checkout) a bare-bones CPAN client. Perl 6 may do something very much like this.

That leaves one question unaddressed. What happens to those unfortunate novices who download the core distribution themselves and find out that they can't do much with it? What happens to people with cruel, cruel sysadmins who refuse to install anything other than the core?

The solution comes from realizing that the people who are best at maintaining and porting and releasing a new version of a language and its core are not necessarily the same people who are best at discovering what the language's community considers most useful, most usable, or most appropriate.

You can already see the solution in something like Strawberry Perl, which unabashedly includes a compiler and is on its way to including an IDE and a GUI toolkit -- not because you can't use Perl 5 effectively without any of those, but because Perl 5 is much more powerful and much more useful with them.

That line of reasoning is very similar to the one which includes any given library in the standard library.

Why not embrace it? Let distributions come up with their own lifecycles and deprecation and support policies. Let them identify which libraries people should have available by default for the next eighteen to twenty-four months. Let them perform the work of integration and testing and bug reporting and working with upstream authors.

Upstream would always remain the same. If you can access CPAN, you can ignore distributions or install your own additional packages or build your own distribution. (Strawberry makes its own tools available.)

A similar process works well for operating systems -- and, arguably, operating system users are less savvy than programmers. Many distributions already package Perl 5 this way.

Could you live with such a change?

The Replaceable Batteries of Your Standard Library

By chromatic on December 8, 2009 2:18 PM | 4 Comments

How much should a language's core library contain?

Perl 5 has included the CGI module as long as I can remember. (Module::CoreList suggests a first release in Perl 5.004, released in May 1997.) It has support for the API of a Perl 4 library called cgi-lib.pl, which has been around even longer.
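If you want to check for yourself, something like this sketch should do it, assuming Module::CoreList is installed (it ships with recent versions of Perl 5):

```perl
use strict;
use warnings;

use Module::CoreList;

# Ask which release of Perl 5 first included the CGI module.
my $first = Module::CoreList->first_release( 'CGI' );

print "CGI first shipped with Perl $first\n";
```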

You can count on having the CGI module installed on any complete Perl 5 installation deployed in the past decade.

(Even so, many novices asking questions on sites such as PerlMonks still show off their own copied and pasted CGI parsing routines which have somewhere between four and ten bugs, misfeatures, and security holes.)

You can argue that a CGI parameter and HTML generation module was essential throughout the late '90s and early 2000s, but look at the growth of PHP (with much easier deployment than Perl 5) and the popularity of frameworks such as Ruby on Rails (which lets you go from nothing to a database-backed CRUD application in ten minutes) and tell me that CGI.pm as a core library meets the needs of the dominant set of likely users.

I don't mean that people don't use CGI or shouldn't, but that it's worth considering if its presence in the core meets the current needs of the current Perl 5 users and likely Perl 5 users. You can make the argument that there's little cost to including this module for novices and expecting adepts to be able to install something else, but every additional core library adds a maintenance cost.

The thesis behind this site--behind the idea of "Modern Perl"--is that writing Perl today the same way you did in 1997 misses most of the power of the language. Yet if you look at the internals of the CGI.pm module, you see that it reads like the author doesn't know how to write modern Perl.

There's a good reason for this; it's stayed essentially the same in design and implementation since its earliest conception. Fifteen years ago, no one knew the best way to use objects in Perl 5. No one knew the right way to delay loading frequently-unused code to speed up compilation time. It wasn't staggeringly obvious that separating the concerns of HTML generation and argument processing was necessary or good. Perl 5 developers hadn't gone through the pain of implementing, understanding, using, and maintaining an API that wants to be stateful and procedural and OO while reusing almost all of the same code.

This is not a slight on Lincoln or Perl 5 developers in the '90s or anyone. We (and notice that this is a collective noun) didn't know what code we would wish we had written back then. We know better now, and hopefully code we write now will survive better the next decade.

This task of evolution could have been easier, but tradition impedes all but the most serious changes to APIs shipped with Perl 5. Backwards compatibility means never having to say your first attempt wasn't perfect.

It's a tricky situation. A library that was the obvious best option at the time makes sense in the core library. As time goes on, it becomes less obvious, but it's difficult to change because people are using it and it becomes difficult to remove because people are using it, and new people start using it because there's little friction to start using it, but the people in the know don't use it because it's not obviously the best choice... and you have a world in which people in the know can write really great Perl 5 code and people who start using Perl 5 write clunky Perl 5 code because the whole process makes it easiest for them to write clunky code.

Assume that decent programmers in the know will always be able to improve their ability to write great code. What if we focused on encouraging novices to write great code? (They won't write great code immediately, but we can encourage them to write less clunky code by discouraging them in ways they won't even notice from writing clunky code.)

How do you avoid the problem of effectively freezing good libraries forever by putting them in the core where they're easy for novices to use and too tempting for adepts to ignore?

I think you subvert the whole process and get rid of core libraries.

Next up, how this can work in practice. (For fun, ponder the existence of Perl 6 and its answer to this problem at the language level.)

DWIM and the Marketing Gap

By chromatic on December 4, 2009 11:57 AM

This is the last m-word post for a while, I promise.

In Lack of Ceremony and the Marketing Gap, I talked about how Perl 5's deliberate refusal to force people to arrange their problems in any particular style helps good programmers write programs effectively and lazy programmers write poorly-structured code. One Perl perception problem is that there are a lot more lazy programmers than disciplined programmers.

In Whipuptitude and the Marketing Gap, I discussed Perl's suitability as a glue language for projects great and small and how the same ability to arrange effective large programs quickly lets people write awful small programs quickly... and there are a lot more small programs available for people to see than large programs.

The third aspect of Perl which I believe contributes to the perception that Perl is difficult to manage is its DWIMminess -- its tendency to work very hard to intuit what the programmer meant to do and do it. Often this DWIMminess goes unnoticed; it's when Perl does the wrong thing that people realize that Perl's heuristics do not match their expectations.

Consider the polymorphic print statement in almost any language that's not C. Give it an integer and it prints an integer. Give it a string and it prints a string. Give it a floating-point value and it prints a floating-point value. Give it a reference object and -- well, DWIM suggests that it invoke some sort of .repr method on that object and produce some intelligible form of output. Whatever the case, you expect print to produce some meaningful output for every type of parameter you might possibly pass.

The same goes for simple arithmetic operators. Imagine the hassle of requiring different infix operators for adding an integer to a float versus a float to an integer versus two floats versus two integers. There are, admittedly, still complications regarding the result of such operations, but the potential combinations there make the problem worse.

It's much simpler for the compiler writer to lie a little bit and make these operators polymorphic even if the rest of the language does not allow such behavior.

Perl takes DWIM further.

Because Perl's type system cares more about context and container type than value type, it does provide separate operators to indicate the type of operation the programmer intended. In a sense, values aren't typed; operations are typed. (You can argue that this is the same behavior as forcing casting or conversions, with less boilerplate and ceremony. The real questions are how much caching you need to do to improve performance and how much type safety your type system can provide. C loses on both counts.)

In other words, it's no surprise when you want to compare two strings with the eq operator. It's little surprise when comparing two numbers with the eq operator works in many cases, but it's a big surprise when comparing two strings with the == operator doesn't work the way you expect.
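A short sketch of those cases, with made-up values. The operator, not the value, decides whether the comparison is a string comparison or a numeric one:

```perl
use strict;
use warnings;

my ($ten, $ten_point_oh) = ( '10', '10.0' );
my ($apple, $orange)     = ( 'apple', 'orange' );

# eq compares as strings: the text differs, so this is false.
my $as_strings = ( $ten eq $ten_point_oh ) ? 'equal' : 'different';

# == compares as numbers: both values are ten, so this is true.
my $as_numbers = ( $ten == $ten_point_oh ) ? 'equal' : 'different';

# == on non-numeric strings treats both sides as 0, so they compare as
# equal. The warnings pragma normally flags this; it is silenced here
# only to keep the output of the example clean.
my $fruit;
{
    no warnings 'numeric';
    $fruit = ( $apple == $orange ) ? 'equal' : 'different';
}

print "eq on numbers: $as_strings\n";    # different
print "== on numbers: $as_numbers\n";    # equal
print "== on strings: $fruit\n";         # equal
```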

The question is whether your expectations come from other languages which provide some DWIM (even if the implementation is inconsistent with the rest of the language) or from understanding how Perl works.

If you understand how operators and other grammatic components of Perl enforce context of number and type, you can take advantage of DWIM. It's obvious why 0 but true is true in a boolean context but zero in a numeric context.
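As a quick sketch of that last case: the string is non-empty and is not '0', so boolean context treats it as true, while numeric context reads the leading 0. Perl exempts this particular string from the usual "isn't numeric" warning, so the example makes numeric warnings fatal to show that none occurs:

```perl
use strict;
use warnings FATAL => 'numeric';    # any numeric warning would die here

my $value = '0 but true';

my $as_boolean = $value ? 'true' : 'false';    # boolean context
my $as_number  = $value + 5;                   # numeric context

print "boolean: $as_boolean\n";    # true
print "numeric: $as_number\n";     # 5
```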

If you don't understand operators and context but you're fortunate enough to enable strict and warnings or run Perl::Critic, you'll have the opportunity to learn what's happening when one of those tools identifies a situation where you might have done the wrong thing.

Sadly, far too much code exists without the benefits of either conceptual understanding of Perl's typing and contexts or the assistance of good tools which point out likely mistakes and recommend corrections.

For whatever reason, the Perl community hasn't done well enough explaining Perl's underlying concepts. People can still solve their problems with a minimum of ceremony and boilerplate by joining together multiple, interesting small pieces -- but until they understand Perl's philosophy and its strengths, they condemn themselves to writing verbose, clunky code that works against Perl's natural DWIMmery.

It's a good problem to have that novices to Perl and to programming can accomplish productive things without having to become Perl experts. Yet we also need to find ways to encourage them to greater understanding before they find themselves maintaining (or sharing or documenting or cursing) a big pile of spaghetti code.

No language can prevent that in and of itself. That leaves the community to fix technical concerns and these Perl marketing problems. How do we do it?

Whipuptitude and the Marketing Gap

By chromatic on December 2, 2009 2:03 PM

In Lack of Ceremony and the Marketing Gap, I argued that Perl's unwillingness to force any particular development style or structure has contributed to the misperception that it's difficult to write consistent, coherent, and maintainable code (as if microblogging makes sestinas impossible).

I also mentioned whipuptitude as a factor in Perl's power and perception. Adam Turoff gave the best definition:

Whipuptitude is the property where, starting from zero with a large library of easily combined tools, you quickly hack a solution to a moderately simple, but annoyingly tedious problem that occurs frequently.

— Adam Turoff, Manipulexity and Whipuptitude

Whipuptitude is a deliberate design decision. Some people call Perl a distillation of the Unix principle; read Usenet from the early '90s for long discussions of this concept and its implications. In short, the design goal is to give people the ability to combine small tools in ways unforeseen by the language designers to solve their problems.

Sometimes that means shelling out to grep and find. Sometimes that means writing Perl one-liners to stick in a long piped shell command. Sometimes that means writing a language agnostic testing protocol and communicating with software written in a dozen languages to produce a single good report.
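As an illustration of that kind of glue, here's a minimal sketch which tallies lines of output from some other tool. The trace format and the data are hypothetical, invented for this example; a real filter would read from STDIN or a pipe instead of an array.

```perl
use strict;
use warnings;

# Hypothetical trace lines, standing in for the output of another program.
my @lines = (
    "alloc 64\n",
    "free 64\n",
    "alloc 128\n",
    "alloc 32\n",
);

# Tally how often each event occurs and the total size per event.
my (%count, %bytes);
for my $line (@lines) {
    my ($event, $size) = split ' ', $line;
    $count{$event}++;
    $bytes{$event} += $size;
}

# Report the events in alphabetical order.
for my $event (sort keys %count) {
    printf "%-6s %3d events %5d bytes\n",
        $event, $count{$event}, $bytes{$event};
}
```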

Some people call this scripting. Other people call it glue.

Combine this whipuptitude with the lack of ceremony, and you have the ability to create nice abstractions as well as the ability to throw some spare parts together and hope it holds long enough for you to get your job done.

When you see ugly glue code written in Perl, you see scrappy programming, chicken wire and paper mâché, thrown together in a hurry by someone who didn't take the time and the effort to put things together the "right" way -- as if there were a single right way. Perl doesn't hold your hand and force you into a theoretical axis. Perl doesn't take away features it thinks you're not ready to use. Perl's like English; you can speak a pidgin if you like, if you think that'll communicate what you need to communicate effectively. Yet woe to you if you write the instructions for baking a cake in a pidgin that leaves out important information and you end up with a soggy brick that not even your cat will eat.

It's all up to you how maintainable and portable and efficient and effective your code needs to be. (Most programmers are lazy.)

There's a flip side to this whipuptitude, and that's how it scales to larger projects. Perl's not only good for throwaway ten-line programs that massage data from one text file to another (though it's shockingly good at that; yesterday I debugged a garbage collection tuning problem in Parrot by annotating the GC with simplistic fprintf() statements and massaging the output). Properly applied whipuptitude works at the macro level as well.

You see this at its best when you look at the CPAN. Consider it not as a collection of additional libraries, but as a set of easily-combined tools intended for quick assembly to help you solve modest but annoying problems. Moose? A better object system than the default. Try::Tiny? Effective and complete encapsulation of everything you need to know to handle exceptions safely. WWW::Mechanize? Simple automation of web requests. Modern::Perl? Scrapping the boilerplate to enable Perl 5's nice new features. autodie? Remove tedious return code checking without reducing the safety of your code.
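To make the last of those concrete, here's a minimal sketch of autodie, which has shipped with Perl since 5.10.1. The path is an assumption: I've picked one that presumably doesn't exist on your system so that the open fails.

```perl
use strict;
use warnings;
use autodie;    # open, close, and friends now throw exceptions on failure

# A path assumed not to exist, purely for illustration.
my $path = '/nonexistent/directory/for/illustration.txt';

my $error;
eval {
    open my $fh, '<', $path;    # no "or die ..." needed
    close $fh;
    1;
} or $error = $@;

print "open failed as expected: $error\n" if $error;
```

Without autodie, forgetting the `or die` would let the program continue with an unopened filehandle. With it, the failure can't pass silently.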

Perl programmers extract tools and make them reusable. Whipuptitude isn't just for small programs. It's how to make great medium and large Perl programs as well.

(There's a dark side to this, however. ExtUtils::MakeMaker is a glue wrapper around the make utility which attempts to enable portable shell programming around operating systems, shells, and make variants. Sometimes even core Perl gets things wrong.)

Next time: DWIMmery.
