June 2009 Archives

Imaginary, Integrated, and Ideal


My favorite project of the year may be Schwern's perl5i. His goal is to fix as much of Perl 5 as possible in one fell swoop. Unlike my Modern::Perl, he's not limiting himself to core modules and pragmata. Anything on the CPAN that fixes a deficiency or problem in the language is fair game.

That goal may remind you of the goals of modern Perl.

That means autobox and autobox::Core and signatures and Devel::Declare.
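
A hedged taste of what that combination feels like in practice. This sketch assumes the 2009-era use perl5i interface (later releases ask for an explicit version) and autobox::Core's documented method names:

use perl5i;

# strict, warnings, say, and autoboxed primitives, all from one line
my @modules = qw( autobox signatures Devel::Declare );
say @modules->join(', ');    # a method call on a plain array
say "perl5i"->uc();          # ... and on a string literal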

The i stands for imaginary in the same way that the square root of negative one is an imaginary number. (Yes, clever Reddit and OSNews kiddies. It can also stand for "irrational", thus Perl 5 is irrational. Your scathing wit withers discussion. Have a lollipop.)

This is important for several reasons.

First, it's a great, grand experiment to discover what Perl 5 could be right now. It's a great language and you can do amazing things with it, but it has its flaws. What if someone could correct them? Perl 5.10.1 won't fix them. Perl 5.12 probably won't. The DarkPAN is too big and scary to change Perl 5's defaults at all, even by the time of Perl 5.14 -- but if you choose to use perl5i, presumably you know what you're doing. What would the language look like and how would it work then?

Perhaps some features will make their way into the core.

Second, it's a great, grand opportunity to make sure that all of these new pragmata and features and syntaxes work together nicely. It's much easier to make changes now -- to discover incompatibilities and subtle infelicities -- when they all get used together. Better yet, the perl5i hackers can file bugs and work with individual distribution maintainers to seek out consensus and compatibility.

Third, it's a great, grand educational tool to demonstrate corners of CPAN that deserve far more popularity than they have. For example, today I suggested the use of the Want module, which fixes a poorly named core keyword and extends it to greater utility. This module may even help the included autobox::Core handle reference contexts appropriately.
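
The poorly named keyword is wantarray. Here's a minimal sketch of what Want offers beyond it, using Want's documented want() interface; the function and its data are my own illustration:

use Want;

my @results = ( 'alpha', 'beta' );

sub fetch_results
{
    # want() distinguishes contexts that wantarray conflates or ignores
    if    (want('LIST'))   { return @results }
    elsif (want('BOOL'))   { return scalar @results > 0 }
    elsif (want('SCALAR')) { return scalar @results }
    else                   { return }    # void context
}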

Right now it's not stable; it's experimental. It's worth downloading and testing in your own code, however. Write software. See what happens. File bugs. Suggest new features. See what's compatible and what isn't. Explore. Explain. Enjoy.

To help with development, please visit perl5i on GitHub or visit #perl5i on irc.perl.org.

The Value of a Pumpking

I like creating software.

In truth, I like solving problems. I like making annoying problems disappear, especially before people encounter them. I enjoy the realization that a little bit of work on my part will make the world a little bit cleaner, a little bit simpler, and a little bit nicer.

Sometimes I have to write code to do that.

I have no desire to be a pumpking, however.

What Your Pumpking Does

People like Nicholas Clark and Rafael Garcia-Suarez also love to write code. They love to clean up ugly problems and replace them with elegant solutions. They love efficient code, and tidy code, and bug-free, stable code.

They're a special class of person called a pumpking. In the Perl world, the pumpking has ultimate responsibility for leading and releasing stable versions of Perl. Think of them as Larry's lieutenants.

If you love writing code and solving problems, that sounds great. You won't catch me volunteering, however, because a pumpking must also:

  • Manage bugs, including triaging them and fixing the ugly ones no one else wants to fix.
  • Manage patches, rejecting them when appropriate, revising them when necessary, and applying them and watching for black smoke when unfortunate. They must also do this in a timely fashion so as not to drive away volunteers.
  • Manage volunteers, helping direct their energy and enthusiasm in productive ways while responding quickly enough to take advantage of their available time but not blocking the project on promises they can't quite fulfill.
  • Write documentation.
  • Manage test reports, identifying when and where a commit caused problems and which commit might fix them.
  • Understand various portability concerns, not just of Perl code but of core code, across operating systems of expanding lineages, hardware platforms, and varying compiler and library support.
  • Uphold the implicit support and deprecation policy, even in the face of often competing goals for features and bugfixes.
  • Communicate the direction of the project through development to stabilization and release and support even to handoff to another pumpking.
  • Understand the long term maintainability concerns of proposed features, existing code, and development policies.
  • Liaise with the maintainers of core modules, especially when a release is imminent and full release stability and behavior are important.
  • Keep Perl perlish.

Sometimes they get to write code.

Nothing but Praise

I've written critically about the existing Perl 5 development process here, and I expect to do so in the future. I want to be very clear about one fact, however.

I have nothing but respect for anyone who voluntarily does the job of a pumpking. It is a difficult, thankless position. It requires someone with an amazing attention to detail and a superhuman ability to manage a lot of fiddly, underappreciated work.

Past, present, and future pumpkings deserve our praise and support and thanks.

I'm critical of the process by which we develop Perl. I have no intention to criticize the people who develop Perl, and I offer my most sincere apologies and appreciation to everyone who's felt criticized or attacked or slighted.

Now how do we make the job of a pumpking easier?

I just gave a talk at YAPC::NA 2009 entitled Take Advantage of Modern Perl. I use an extemporaneous style with terse slides, but I mentioned several modules and links worth exploring. I plan to explain some of those modules in more detail in the future. Until then, here are terse explanations and recommendations.

  • Enlightened Perl is a great organization exploring similar issues. If you don't read or participate in Planet Perl Iron Man, please do.
  • The SUPER module works around a problem in method dispatch to overridden methods.
  • The UNIVERSAL::ref module makes the ref() builtin more reliable and allows objects to override what ref() reports for them (good for overloading).
  • The Time::y2038 module makes your lexical scope in Perl -- regardless of compilation settings for 32/64 bitness and IV size -- safe for calculating dates and times past 2038. If you don't know what that subordinate clause means, you need this module. The Time::y2038::Everywhere module makes these functions safe globally.
  • The CLASS module lets you replace ugly __PACKAGE__ with pretty CLASS.
  • The autobox and autobox::Core modules turn primitives into objects so that you can call methods on them. This is difficult to explain; play with it.
  • The Moose distribution provides a powerful, extensible, usable object system for Perl 5.
  • The signatures and Method::Signatures modules provide usable function and method signatures to Perl without source filtering (but with a bit of black magic).
  • The Exception::Class module lets you throw exceptions based on objects in a clear and maintainable way.
  • The Modern::Perl module lets you enable several important but optional features in Perl 5.10 and newer with a single command.
  • The Why of Perl Roles gives several links to explain what roles are and why they're important in OO programming.
  • Module::Build is a core module that's much easier to work with (and cleaner and better maintained) than ExtUtils::MakeMaker.
  • The autodie module replaces Fatal to provide better error messages, remove boilerplate code, and allow extensibility for error reporting on builtins; see the sketch after this list.
  • The Devel::Declare module provides the black magic necessary for wonderful modules such as Method::Signatures to extend Perl 5 without using Filter::Simple.
  • The MooseX::Declare module extends Perl's syntax to make Moose code prettier and easier and more declarative.
  • The Devel::NYTProf module is a working Perl profiler that doesn't randomly crash like the core profilers sometimes do.
  • The Perl::Critic module provides a framework for static code analysis of Perl 5 code. This means you can identify problems and poor style, tailoring your analysis to your local style.
  • The MooseX::MultiMethods module provides a nice syntax for multiply-dispatched methods. This is another feature easier to understand if you play with it than if I write about it.
  • The perl5i project pulls together these modules and more to work together nicely to provide better defaults for Perl 5.
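
To make one of those items concrete, here's a minimal autodie sketch -- the filename is illustrative, but the behavior (builtins throw descriptive exceptions instead of requiring or die checks) is the module's documented point:

use autodie;

open my $fh, '<', 'config.txt';    # dies with filename and OS error on failure

while (my $line = <$fh>)
{
    print $line;
}

close $fh;    # even close failures throw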

The Parsimonious Language

Rafael Garcia-Suarez's The Future of Perl 5 is worth your time to read. If you think I'm crazy, you might find him sane. If you think I'm sane, you should consider his arguments seriously. (I think Hegel refuted himself, but he may have had a point.)

Rafael is completely right on some points:

The big advantages of outsourcing syntax experiments to CPAN is that the community can run many experiments in parallel, and have those experiments reach a larger public (because installing a CPAN module is easier than compiling a new perl); that enables also to prune more quickly the failed approaches. This is a way to optimize and dynamize innovation.

The cost of the average Perl programmer experimenting with a new form of syntax is low. There's little risk to push that code to production servers. (It happens, though; my company had Ingy and Schwern on one project back in the Spiffy days.) Of course, this keeps feedback low, but the risk of backwards-incompatible changes for experimental code is minimal.

Given sufficient experimentation and testing, it's possible to revise and refine these experiments into something more stable and workable which addresses the necessary uses effectively. That doesn't happen enough, but I believe the CPAN ecosystem can and should encourage it. See also Schwern's perl5i.

I believe Rafael is absolutely wrong to write, however:

a patch to add a form of subroutine parameter declaration, or to add some syntactic sugar for declaring classes, are probably not going to be included in Perl 5 today. Those would extend the syntax, but not help extensibility -- actually they would hinder it, imposing and freezing a single core way to do things. And saying that there is only one way to do it is a bit unperlish. It's basically for the same reason that we don't add an XML parser or a template preprocessor in the core: no single, one-size-fits-all solution has emerged.

It's true that there's plenty of debate over advanced forms of subroutine signatures -- slurpy arguments, named versus positional arguments, default values, et cetera -- but when most existing Perl subroutines look something like:

sub foo
{
    my ($bar, $baz, $quux) = @_;

    ...
}

... it seems silly to suggest that supporting something as simple as:

sub foo ($bar, $baz, $quux)
{
    ...
}

... won't be common to every potential serious signature declaration form.

Likewise, can you imagine a class declared in Perl 5 that didn't immediately start something like:

class Foo
{
    ...
}

Given that Perl 6 has already blazed a (stable) trail through these syntactic weeds, and given that a goal of modern Perl 5 is to provide a smooth transition path between Perl 5 and Perl 6, I have trouble imagining the kind of seismic shift necessary to render this syntax unsupportable.

Perhaps there's no single syntax that provides every potential feature necessary -- but a default that works effectively for 90% of code and offers an improvement for 98% of existing programs seems like a good use of time.

A stronger argument is that the steadfast refusal of the core to provide anything other than the roll-your-own-object-system tinkertoys first conceived in 1994 is responsible for the explosion of modules in Class::* on the CPAN in the past several years -- and that's only from the top 2% of Perl programmers worldwide with the stubbornness, will, and permission to make their improvements over stubborn and parsimonious defaults available to the world at large.

I suspect that if the defaults were really as good as some people might like us to believe, the Class::* namespace wouldn't be so crowded.

I don't want to misrepresent Rafael's arguments. He's a thoughtful man and a good programmer; he can represent his own arguments. You should read them. You should think about them. He may be right.

However:

But large-scale experimentation on CPAN enabled the community to make Moose much better than whatever a handful of P5Pers could have designed by themselves.

... what prevents the Perl community from doing the same and improving core syntax? Certainly that's how Perl 6 works. Certainly that seems what pumpkings would like to see -- motivated volunteers doing some work, getting some feedback, and refining new ideas and enhancements to existing ideas.

Perhaps I don't understand why a modest improvement (such as parent over base, with the corresponding subtleties that many people will just ignore) is okay in the core in a module but a modest improvement that could clarify and simplify and shrink almost every existing program is not.

... unless the fragmentation in Class::* is somehow a good thing.

The Tom McCall Waterfront Park is attractive and popular. It's a great location. Portlanders consider it quintessential Portland (though I favor the Boise River Greenbelt, Washington Park, and the South Park Blocks as far as urban green spaces go).

Barry Johnson is a writer for The Oregonian, a newspaper. His Waterfront Park: A balancing plan discusses a common problem with the use of a lovely public space in downtown Portland: big festivals on the waterfront limit the usability of the park for the rest of the year.

You probably don't care about my opinion on parks (and perhaps less so for Barry Johnson's), but his column has a gem of a paragraph. Given that festivals bring in money, that Oregon has an unsustainable tax base, that schools are out of money again, that unemployment here is ... brisk, and given that the city council has plenty of other pressing concerns, not limited to a pending recall of the mayor and controversial decisions to rename streets and relocate a professional sports arena, why bother rethinking a park maintenance policy? From the column:

In the great order of things, this might seem like a small problem. But one way to look at a city is as a collection of thousands of small problems which are small opportunities to make things better. When we successfully apply our ingenuity to one small problem, it points the way toward other improvements. In this way, our city gradually gets better, and at some undefinable point it actually becomes great.

That's also true of software design and development:

  • The easier it is for users to do the right thing, the less likely they are to do the wrong thing.
  • The more artificial bottlenecks you remove through profiling and optimization, the more apparent are actual bottlenecks.
  • The fewer spurious warnings generated at compile time or run time, the easier it is to identify useful warnings.
  • The faster and more effective your test suite, the more frequently you can run it.
  • The fewer obstacles to getting feedback from users, the better you can design your software to meet their needs.
  • The cleaner your code and better its design, the fewer nasty hacks you need to work around infelicities.
  • The more expressive your language (whether in builtins or abstraction possibilities), the more precision you have available to solve specific problems.

Unlike software, many minor city improvements are obvious and immediately available. When the Park Department mows the grass in the park by my house, I can enjoy the park the same day. When they approve the budget to install automatic sprinklers, I can enjoy the improved flora within a week of installation. Not all improvements fall into this category -- but many do. (Every time I pick up a discarded wrapper and put it in the garbage can, the park is a little prettier for the next person.)

These are the kinds of little improvements that build on each other. I don't mind if I can't produce a big, earth-shaking change every time I release a new version of my software. I care that every time it's even a little bit better I can give it to users so that their software can be a little bit better and they can identify the next little thing that could be a little bit better.

Next time I'll talk about how frequent but minor improvements produce major benefits.

I write a lot about Perl 5 and its ecosystem, but I spend most of my hacking time helping with Rakudo and contributing to Parrot.

I realize this leaves me open to charges of hypocrisy (though what a tired claim; the only modern sin for which you cannot blame your upbringing, your parents, your diet, society, poverty, or your circumstances is when someone believes your words do not match your actions to a surgical degree) -- if I know what Perl 5 needs, why don't I donate my precious time to make it happen?

That's a good question, but I want to talk about something else first.

"Just Write it in C!" is a Terrible Long-Term Strategy

One of my long-term goals in Parrot is to remove as much C code as possible from the project to make it run faster.

This seems counterintuitive to people used to dynamic languages such as Perl, Python, and Ruby. "If it's too slow, you can always drop down to C!", or so the claim goes. Sometimes that's even true.

TraceMonkey shows that it's not true (at least when "fast" isn't enough and "really fast" is better). Dynamo, HotSpot, Strongtalk, Forth, and Smalltalk are also good sources of inspiration and information -- especially the latter two.

C's overhead for bit-twiddling, character-by-character banging, and dispatch in tight loops is minimal. If you can coerce your algorithm into C code that performs a lot of work before returning, your code can get faster with smart use of C.

Of course, if your language has a smart compiler, you've likely thrown away a lot of chances for optimization: in particular, escape analysis and inlining (specialized or not) are difficult, if not impossible. You also have to pay the penalty of converting between your language's calling conventions and C's calling conventions. (You have to run a lot of C code very fast to make this worthwhile sometimes.)

Parrot crosses this C boundary often.

Sometimes it's okay to write code in PIR (Parrot's native language) that operates a little bit more slowly than the corresponding C code because calling the PIR code is faster than calling C code from PIR code (or worse, calling back into PIR code from C code called from PIR code, and so on).

My goal of removing as much C code as possible from Parrot -- writing Parrot in itself -- is a Parrot 3.0 goal. Parrot 1.3 comes out next week.

I believe we can achieve all of these goals without stopping the world, scrapping large parts of Parrot, and leaving piles of steaming debris in our wake.

Rebuilding a Charging Locomotive on the Go

How's that?

One of the not-so-hidden themes of all of my writings is gradual and inexorable progress in software. I believe that all Parrot contributors help to improve Parrot with every commit. (Some commits are regressions and some need further commits, but we're heading in the right direction and fixing more bugs than we create.)

A lot of projects can say that.

One of the poorly hidden themes in my writings is that backwards compatibility, like a dairy product, has an expiration date beyond which your fridge -- and your project -- gets scary.

I've written some great code for Parrot and I've maintained some awful code. I believe the awful code is less awful for my work, but one of the greatest joys of software development for me is deleting bad code because it's unnecessary.

The strategy we've discussed for removing as much C code from Parrot as possible has several phases.

Our built-in data structures -- hashes, arrays, subroutines, et cetera -- live in special files written in a mixture of an unnamed language and C. A series of Perl libraries parses this code and generates a (much longer) plain C file for compilation and linking into Parrot itself.

Step one of the plan is to write a compiler using Parrot's compiler tools that understands this unnamed language and can emit C code equivalent to what the current parser/generator produces. This step is underway.

Step two of the plan is to develop a new language (or extend the unnamed language) that the compiler from the first step can transliterate to C code equivalent to what the current parser/generator produces. This means that the new language can represent the same operations that the C code in the current language uses.

Step three of the plan is to replace the current Perl 5 parser/generator with the PCT-based compiler.

Step four is to add an emitter to the PCT-based compiler to produce opcodes Parrot understands which perform the same functions that the existing C code performs.

Step five is to remove the C emitter and run everything through the native Parrot instructions.

Although this process involves several dramatic changes, only one is visible to users: the switch from writing PMCs in the current mixture of semi-parsed C to writing PMCs in a PCT-hosted language. Though we can deprecate the C version at step two, we only have to remove it at step five.

I believe that this is possible both technically and socially primarily because of the project's organization:

  • We have a defined interface with which to write code which interacts with Parrot. (Admittedly we haven't formalized this interface yet, but that's in progress.)
  • We have a documented support policy which notifies users as to our deprecation schedules and defines our backwards compatibility plans.
  • We have a regular release schedule which, when combined with our documented deprecation and backwards compatibility policy, allows us to refactor and modify Parrot on a predictable schedule at a pace which allows for frequent improvements.

I'm sure some people could argue -- some of them even successfully -- that these types of changes after a project has reached 1.0 would be unnecessary if we'd only tried to plan the project in more detail years ago. Perhaps they're even right. Yet when was the last project you know of that both tried to do something new and successfully predicted the future such that it needed no architectural changes?

I don't mind rewriting code or even writing code I'll throw away in six months if it progresses toward something better that's faster, cleaner, easier to understand, easier to maintain, simpler, better designed, and/or more featureful.

Sources of my Unmotivation to Contribute to Perl 5

... which brings me to the long-promised explanation of why I have so much trouble contributing to Perl 5.

When I posted the final version of the patch to add class { ... } to Perl 5, I had just finished wrestling with the Perl 5 parser to work around the Perl 4 apostrophe package separator superseded by double-colons and recommended against in Perl 5. You may recall the release of Perl 5 in 1994.
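
If you've never met that separator, it still parses today, which is exactly what makes wedging new syntax into the grammar so delicate. The names here are my own demonstration:

package Foo;
sub bar { return "Foo::bar reporting\n" }

package main;
print Foo'bar();    # the apostrophe means exactly Foo::bar()

my $name = "Alice";
print "$name's";    # interpolates $name::s, not $name . "'s"!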

I wrote that patch to Perl 5 because I felt guilty about complaining about Perl 5's flaws (despite my contributions to Perl 5, such as they are) without at least attempting to address them.

I had little hope the patch would be accepted, but I was open to the possibility that someone might say "You know, it doesn't break backwards compatibility, it does make boilerplate code easier to read, to write, and to explain to novice programmers, and it is compatible with Moose, MooseX::Declare, and Perl 6."

Maybe it'll be a part of Perl 5.12, if Perl 5.12 ever comes out. Maybe someone else'll pick it up and maintain it and argue for it and get it committed to the core.

Meanwhile, I can get the urge to fix a memory leak in Parrot or Rakudo, fire up Valgrind, and commit a fix today for people to use from the stable monthly release next Tuesday (for Parrot) and next Thursday (for Rakudo), or from the packaged release included in free software distributions in July -- and everyone's life is a little bit better in days or weeks. The same goes for improving performance or fixing another bug or adding a feature.

If it's a big feature or an architecture change or something otherwise disruptive, I still have to discuss it with other developers and figure out how it fits with our backwards compatibility and deprecation policies, but if we decide it's worthwhile, it won't languish for years between releases. The lag between producing something wonderful and other people enjoying it is months at most.

Perhaps people critical of my views are right in that it's inappropriate to compare Parrot to Perl 5. Perl 5 certainly has more users. Perl 5 has had more contributors. Several orders of magnitude more people and businesses rely on Perl 5 as it exists now.

Yet I wonder if they're perfectly happy with Perl 5 as it is -- or do they secretly wish that the fix for 5.10's argument passing performance problem, found and committed three weeks after the release of 5.10, were released in a form they could use, or that Perl 5 had working subroutine signatures, or that Perl 5 had declarative classes, or that strict and warnings weren't optional for all new code?

The Perl 5 committers are right in that the best way to ensure that there's a Perl 5.10.1 sometime this year is to fix bugs, to write documentation, and to perform other thankless maintenance tasks. I hope to write about how to do that and I may do some of it myself. (Goodness knows I've done plenty of that for Parrot.) That's noble work and it deserves far better praise and far more respect than it gets. Even though I criticize Perl 5's project management, I still believe that maintainers deserve much credit and appreciation for their work.

Still, the long-term viability of the project concerns me. If Perl 5 has a volunteer time and effort problem (and it's clear that it does), I wonder how many other potential contributors have declined to participate for reasons similar to mine. (After several people have told me in person that they agree with most of what I've written but don't want to say anything in public, I wonder how many other ideas for process improvements the community has lost.)

I take two lessons from this. First, sometimes it's easier to rebuild a train barrelling down the tracks than a train stopped in a weedy cow pasture. I wouldn't have believed that before now.

Second, any project management strategy which relies on a sense of guilt to recruit new and prodigal developers has problems no technical mechanism can solve.

What does "Stable" Mean?


By now it's clear that the people behind the Perl 5 release process consider "stability" one of their primary constraints. Support isn't easy to define for community-produced software; neither is "stable". That's no reason not to try.

Expectations of Stability

A reasonable first definition of "stable" is that an average user should be able to follow the documented process of obtaining, configuring, and building the software on a supported platform. After that, the user should expect that the automated tests pass.

There may be glitches in that process, but any software release for which the developers cannot make that guarantee is not stable.

Any disagreements? Good.

Does "stable" guarantee that all tests will pass on your system? No; your system may have unique characteristics that the developers did not anticipate. No plan survives first contact with the enemy (and feel free to quote me on this truism: no test plan survives first contact with entropy). Careful design and revision can ameliorate this, as can extended testing in heterogenous environments, but the only true test of whether the software works as intended is when real users use it for real purposes.

This is why we have bug trackers, patches, and subsequent releases.

Does "stable" guarantee that all previous uses of the software will continue to work as expected? No; your specific uses may rely on the presence of bugs, unspecified features, happy accidents, or other behaviors not captured in the test suite. Again, careful attention to detail can ameliorate this, but you can never guarantee it unless you vet all uses of your software.

This is why we have bug trackers, patches, subsequent releases, deprecation policies, and comprehensive documentation.

Does "stable" guarantee that the software will meet all of your needs? No; the software only does what it says it will do. You may be able to coax it to perform other functions beyond those documented and supported, but if you depart from what the developers expect, intend, and promise, you are on your own.

This is why we have roadmaps, overviews, and comprehensive documentation.

Does "stable" guarantee that there are no bugs? No; though whole-program analysis and computer-aided proofs can assist in verifying algorithms, this is infeasable for most software. There will be bugs of implementation, bugs of unclear specification, bugs of poor documentation, bugs of misunderstood requirements, bugs of portability, bugs of testing, and bugs of many other types.

This is why we have bug trackers, bug reports, subsequent releases, and you get the picture by now.

Does "stable" guarantee that you can replace the previous version of the software with the new version in your production environments on the day of release and sleep soundly that night? No; I cannot imagine anyone who would not call you a fool for doing so.

That's why you have comprehensive test suites, test plans, deployment environments, and access to the bug tracker, release notes, deprecation plans, support policies, and lines of communication to the developers.

That's why you have access to testing releases (not that anyone uses them).

That's why maintaining your software is your responsibility -- not because developers are jerks who hate you and would burn down your office if they could, but because the only effective way to prove that a piece of software is sufficient for your needs is for you to verify it against your needs.

Implications

Should developers get sloppy about their release processes? Of course not. Automated testing, exploratory testing, smoke testing, and other techniques to keep code quality high are amazingly important. If anything, most projects don't take them seriously enough.

Do developers get a free pass if a release has embarrassing bugs? Of course not. Mistakes are mistakes, and software that's not suitable for its intended uses needs immediate attention.

Are users part of the testing and verification process? They always have been, especially for general purpose tools such as a programming language. If software developers could get it all right the first time, we wouldn't need to argue over release processes or write automated test suites or include FAQs and disclaimers in our documentation. You wouldn't need to test software against your own purposes.

We don't live in that world, though.

Can regressions and embarrassing bugs be rare? Of course. The quality of the code and development process is exceedingly important. A well-maintained test suite verifying a well-factored and maintainable codebase updated regularly with thoughtful analysis of feedback, bug reports, and feature requests from active users can achieve very high quality. Mistakes will happen, but there are ways to reduce their risk and frequency.

Is it worth pretending that we can achieve perfect stability by delaying the release of software? In my opinion, no. It's more important to me to reduce risk by the flexibility and agility of being able to release software frequently.

I suppose that a development process that has to coordinate dozens of separate projects to converge upon a stable point on multiple platforms simultaneously while operating on all sorts of other constraints can eventually converge on that miraculous point of stability... but it'll still have bugs. There'll still be patches. Users will still need to test their own software against it.

Are users better off waiting months and months for fixes to problems they have right now than they are dealing with mythical bugs and regressions and problems they might have in a future so-called "stable" release? Given that you can't know those problems until users encounter them, and given that users don't test release candidates, they'll discover these problems sooner or later, and only after a release.

The question is whether it's better to address those problems sooner or later. My bias is toward fixing problems as people report them. Certainly released software with fixes for known bugs must be more stable than unreleased and untested software.

My recent entries here hit a nerve in fellow p5per Steffen Müller. In a comment on 's Guide to CPAN Needed, Steffen claims that my arguments are "pie-in-the-sky fantasies" from someone who "has done nothing but finger-pointing".

That's a dangerous line of thought when addressing criticism of a software project.

I can imagine a fair few people wondering when the fix for the Perl 5.10 performance regression in assigning a list from @_, patched in January 2008, will ever appear in a release version of Perl.

(It's doubly amusing, in a very sad way, when one of the excuses for avoiding regular releases of Perl 5 is that "We just can't cause regressions!" and when Red Hat gets pilloried for a somewhat rarer Perl 5 performance regression.)

Is it a problem to point out that a patch for a known performance regression has languished, unreleased, in bleadperl for 17 months? Is that criticism dismissible from everyone who isn't currently a Perl 5 committer or the maintainer of a core module?

Perl 5.10.1 could have come out on 07 January 2008 with only that patch and been an improvement over Perl 5.10. There is no shortage of minor version numbers. There are no API changes. There are no backwards compatibility concerns. (I don't know if the smokes were clean on all target release platforms, but given that the patch is very simple -- most of it moves code a few lines -- I believe that's a low risk.)

Yet Perl 5.10.1 still hasn't come out, seventeen months later -- and imagine the furor if a downstream Perl distributor applied the patch and something went wrong. In what universe is it a good thing that a trivially fixable performance regression affects more and more people every day when it could have been fixed three weeks after Perl 5.10's release -- before it had affected anyone besides early adopters?

Imagine applying the same line of reasoning ("Your criticism is wrongheaded and useless") to anyone who said "If the Switch module is dangerous and not recommended, why not remove it from the core?" (Fortunately, it is on the docket for deprecation and eventual removal.)

Imagine applying the same line of reasoning to everyone who says "You know, Perl 5 really could use optional function signatures, or Perl 5 could use a declarative syntax for classes."

That's what bothers me most about Steffen's comment; it elevates "plain old hackers" above people with ideas, questions, concerns, and the desire to improve the efficacy of other plain old hackers.

Yet if it takes a core hacker to make these arguments, permit me a biblical allusion to Philippians 3:5 to answer the silly charges of "Philosophy and project management and advocacy and education and writing documentation are all worthless; you should write code!"

  • As of last Friday, I owned more of bleadperl than Steffen. (Yes, this is a stupid comparison, because....)
  • I wrote a huge amount of tests for the core and standard library in the early 2000s when we quadrupled the number of tests between Perl 5.6 and 5.8. (A lot of that work doesn't show up in the silly line-number Internet measuring contest link in the previous item because I wrote those tests for many other people who have taken over maintenance -- this is exactly what you expect when you write tests.)
  • I wrote the original version of Test::Builder upon which nearly all modern Perl testing exists -- including the testing used to demonstrate the core's stability.
  • I co-wrote the book on Perl Testing.
  • As linked earlier, I grew tired of suggesting that someone add the class keyword to Perl 5 and added it myself (though sadly, the patch was rejected).
  • I backported the yadda-yadda-yadda operators from Perl 6 to Perl 5 and implemented DOES.
  • I can predict, to the day, the next 25 stable releases of Parrot, including the releases where we will have removed deprecated features and the releases we expect downstream distributions to package and distribute. (I believe -- and I believe most, if not all Parrot committers also believe -- that the policy of stable monthly releases saved the project from almost certain irrelevance.)

I don't mean to disparage anyone else's contributions -- far from that, I believe that everyone who's reported a bug, suggested a change to the wording of the documentation, or posted the results of a test run has just as much right to voice concerns about a project as a core developer.

That doesn't mean my criticisms or concerns are accurate or correct or useful. They could be wrong. I welcome corrections, if so.

Yet there's something deeply wrong with a release process that lets known and fixed problems that almost every Perl 5.10 program will encounter linger for years, especially when fixing them is trivial.

You shouldn't have to be a pumpking to be able to say that.

(Or "Why the solution has been obvious from the start.")

If my hypothesis is correct, that Perl 5's current development process is unsustainable, in part because synchronizing dual-lived Perl 5 modules with the core is a fool's game, there's one obvious way to get Perl 5 on a regular release schedule and add important features that just about every other serious programming language supports.

Most of the pieces are already in place.

The success of the CPAN and its ecosystem already demonstrates that this strategy can work, as does the presence of virtual dependencies in packaging systems for free operating system distributions.

The idea isn't new at all, but you may not have heard it expressed this way: to solve (some of) Perl 5's maintenance problems with coordinating the stable release schedule of dozens of modules and distributions maintained outside of the core, stop coordinating the stable release schedules of dozens of modules and distributions maintained outside of the core.

Remove dual-lived modules from the core. (Update: Leave those modules necessary to bootstrap a full CPAN installer, of course.)

What replaces them?

If the underlying premise of modern Perl and enlightened Perl is correct -- that the world's best and most effective Perl programmers take full advantage of the CPAN to make up for missing language features, to improve their productivity, and because solving a problem once and for all and sharing it is the ultimate expression of laziness, impatience, and hubris -- then the problem in the Perl world is that we don't do this enough.

To bring up a repeated example, releasing software regularly can be difficult. The best way I know to make it easy to release software every month is to release it every month. If it's difficult to release your software every month, you have a strong motivation to make it easy.

The Perl world hasn't made it sufficiently easy to find, install, maintain, and upgrade non-core code. Projects such as local::lib, CP5.6.2AN, and CPAN Testers go a long way. They don't provide everything, but the tools and knowledge are present to add the next few necessary features.

Please note that all of these features depend on CPAN remaining an uploading service and a mirroring/distribution service. CPAN itself doesn't need to change. CPAN works just fine for these purposes: especially as its simplicity has allowed a solar system of related services and tools to form around the shining star that is the CPAN distribution service.

The benefits of removing modules from the core are simple and compelling:

  • There is less code for p5p to maintain.
  • Releases of modules can occur on their own schedules; releases of the core can occur on its own schedule.
  • Bug reports, feature requests, and patches will not as easily get lost in the cracks between the maintainer's preferred communications medium and the p5p list.
  • Crufty old modules superseded long ago by something more decent will lose whatever patina they had from their "Of course they're core modules! This means we should use them!" status.
  • Places of work which refuse to vet or install non-core modules will be revealed as the soul-sucking portals of hate that they are and all right-minded people will ignore them.

(The final reason is conjecture, but very satisfying to imagine.)

For this to work, the Perl community must:

  • Produce a graph of modules and their dependencies in a form consumable by other services. David Cantrell's CPAN Deps (link not provided in deference to his bandwidth and CPU resources) is a great start.
  • Exploit the CPAN testing reports to identify potential changes in leaf nodes of the graph due to upedge changes.
  • Revise the testing strategy (and its reporting) to alert upedge authors of the effects of changes on downedge authors and vice versa.
  • Publish metadata on the best-possible recommendation of versions for modules along a graph edge; see the sketch after this list. That is, if you want to install Catalyst on FreeBSD 7.0, you may want Scalar::Util x.y, Test::Simple x.y, and Moose x.y.
  • Given this metadata, provide a single, installable (source) tarball which bundles all of the nodes of a graph edge.
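
That metadata needn't be complicated. Here's a purely hypothetical sketch -- the structure and keys are my own invention, with the versions elided just as above:

# hypothetical recommendation data for one graph edge
my %recommendation = (
    target   => 'Catalyst',
    platform => 'FreeBSD 7.0',
    versions => {
        'Scalar::Util' => 'x.y',
        'Test::Simple' => 'x.y',
        'Moose'        => 'x.y',
    },
);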

This is all feasible, if the Perl community has the will to do so.

Some may object that the final two steps may occasionally fail. That's right. You can't solve that problem -- not completely. You can try, but you will fail. That doesn't mean you shouldn't try. It only means that you're a fool for designing a process that can't cope with the occasional upgrade glitch.

Others may object that this does not address the education or advocacy problem of teaching Perl novices to look for these bundles, or of teaching system administrators that they're hateful little trolls unless they support these bundles on their systems. Nor does it address the technical problem that occasionally plagues installing CPAN modules. Those are solutions for a different post.

One final objection is "What's the point?" or "The Perl 5 release schedule works for me!" or "But you ought to support these bundles for eight to ten years." Those are fine opinions, if wrong, for one simple reason.

The right way to write better software is never to go more slowly.

A nice little piece of Perl advocacy made the rounds of programming sites the other day. I recommend Fast, concise and reliable code? Try Perl!; it's simple, it's accurate, and it's reasonably free of bad advocacy.

Now read the comments on the Reddit Perl advocacy story. Ignore the grandstanding and Internet-based ruler-wielding contests. Ignore all of the posts that repeat a 45 year old APL joke (that wasn't funny in the '60s). Ignore all of the posts that pretend that software maintainability has anything to do with the presence or absence of sigils or the ability to iterate over a list more than one way or the requirement to indent your code instead of using braces, because those are the dumbest comments about maintainability and their only value is in identifying people who know so little about what makes software maintainable that you can ignore everything they say on the subject.

Read just the posts that say things like "Why doesn't Perl have function signatures?" and "You know, modules can be hard to install." and "The default OO system is very minimal."

Then read some of the responses.

Read them again. This time take off your I Heart Perl Advocacy hat. Close your IRC window to #perl on freenode. Read them one more time.

Are you sick of them yet? Good.

Badvocacy

It's okay to say "Yes, all parameters to Perl functions get passed in a list you must explicitly unpack, but you can write your own validation code." It's not okay to say "You can write your own validation and unpacking code, so it's not a problem that you must write your own validation and unpacking code."

It's okay to say "Here are a couple of good CPAN modules that handle parameter unpacking and validation for you. They're efficient and they provide good syntax." It's not okay to say "The existence of those CPAN modules means that it's okay that after 15 years, Perl 5 still makes you unpack your arguments by yourself or install third party code to do so."

It's never okay to say "Why would you ever want to do anything other than unpack your arguments yourself?"

Perl is a great language. It has many great features. Its ecosystem is impressive, and I believe that many of its features compare very, very favorably to other languages and communities.

It's not perfect, however. The reference syntax is awful. The lack of function signatures is embarrassing. The minimal OO system is too minimal.

Ignoring these problems is bad advocacy. Dismissing these problems is bad advocacy. Telling people they're wrong for wanting good defaults (this being Perl, of course there are tuning knobs!) is hostile and malicious advocacy. If I were the sheriff of Perl Advocacy City, you'd end up wailing and gnashing your teeth in the outer darkness for lying to other people about Perl.

There's no Perl Advocacy City and I'm not the sheriff.

Even so, we should be honest about Perl.

It is wrong that you have to add two lines of boilerplate code to every module to ask Perl to help you write robust programs.
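
Those two lines, for the record:

use strict;
use warnings;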

It is wrong that there is no declarative syntax for parameter passing in the core, 15 years after the release of Perl 5 and several years after the release of the relevant Perl 6 specification.

It is wrong that the default metaprogramming model of Perl 5 classes requires poking in package global variables and symbol tables, and even then method dispatch to overridden parent methods is very, very wrong in certain circumstances.
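
A minimal sketch of that default model; the class and method names are my own invention:

package Dog;

our @ISA = ('Animal');    # inheritance lives in a package global

# adding a method at runtime means writing to the symbol table
{
    no strict 'refs';
    *{'Dog::speak'} = sub { return 'Woof!' };
}

# ... and SUPER:: resolves against the package where a method was
# compiled, not the invocant's class -- the misbehavior the SUPER
# module works around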

It is wrong that there is no way in the core to distinguish between methods and functions when performing reflection.

It is wrong that the syntax for defining a class in Perl 5 is to use the keyword package.
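
Compare a class { ... } block like the one shown earlier on this page with what stock Perl 5 actually requires:

package Point;    # the "class" declaration, such as it is

sub new
{
    my ($class, %args) = @_;
    return bless { x => $args{x}, y => $args{y} }, $class;
}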

Perl 5 gets a lot of things right, but it gets all of these things wrong. They get more wrong every year that passes.

The Perlian Knot

Why do they remain wrong? There's no single reason; a big jumble of reasons feed into each other.

  • Adding new keywords to Perl may be backwards incompatible in some cases, provided that someone actively upgrades to a new version of Perl, does not have adequate testing and change management strategies for existing code, does not read changelogs, is unlucky enough to suffer collisions of ambiguous barewords, and does not explicitly use core mechanisms to give the parser hints as to what's a keyword and what's not.
  • Modifying the Perl 5 core means wading through a big wad of poorly-encapsulated C code strewn throughout with concise and easily confused macros that aren't self-documenting in any meaningful sense.
  • The Perl 5 core wad of C macros has escaped to the DarkPAN because there's little enforced encapsulation between even the internal data structures used in the core and what embedders or extenders can manipulate.
  • There's a backwards compatibility layer for this big wad of C code intended to run on at least three major versions of Perl 5, spanning the gap between a release in 2000 and the latest stable major release. (If the thought of maintaining that doesn't make you queasy, you might be the Buddha. More likely you need psychiatric assistance.)
  • Despite a specification for the same features in Perl 6 (and existing implementations for both Perl 5 and Perl 6), no one has taken the time to create a plan to port those features to Perl 5, or write test cases for them, or identify limitations in Perl 5 as it exists to make certain features difficult, or even suggest a timeframe or milestone plan to implement them.
  • Some people in the Perl 5 world don't even want these features.
  • Despite the fact that refactoring the internals of Perl 5 has been a sore necessity since at least 1998, if only to provide a stable and well-encapsulated API behind which the guts can actually change, no one has provided a plan by which to do so.
  • It's supremely difficult to tell whether a change causes test failures on other platforms because Perl 5 must run on several platforms where people have access to test results once a week (if not less frequently), where getting results is a manual process.
  • Only a handful of people in the world know the core well enough to start to address some of those problems, and those people have day jobs and do not wish to work on the core full-time even for money.
  • There's not enough money to entice anyone who already knows how to work on the core to do the cleanup work necessary to enable other people to work on the core.
  • Even if there were, there's little obvious interest in other people in working on the core because there's little guarantee that any effort put into these areas would be useful. (Yes, I'm talking about myself here in part.)
  • Even if there were hordes of volunteers willing to code new features or fix bugs, it's not clear that hordes of people are willing to help maintain these features or to stick around to ensure that the bugs stay fixed, especially on all of the platforms they need to stay fixed, especially as people find more and more corner cases that require you to revise code.
  • Even if there were maintainers, it's not clear that anyone has the good taste or the time to refactor the system regularly to ensure that it's clean (especially as so much of the code is of the "Dear sweet Cthulhu, please don't touch this!" variety).
  • Even if there were ready volunteers willing to fix bugs and run tests and identify changes which passed several different compilers on several different platforms, there's still no roadmap. No one really guides the direction of development.
  • Even if there were a roadmap, there's no schedule, so you can't even tell when you'll be able to use a feature or rely on a bugfix in a stable release.
  • Even if there were a stable release schedule, the push toward backward compatibility in new code means that you can rarely rely on users having the most recent stable version of Perl available, so you must persist in your backwards compatibility workarounds and hacks until some point in the glorious future when you may be able to remove them and write clean, concise code again....
  • ... which persists cruft and workarounds and brittle, fragile code that's difficult to maintain and difficult to understand.

I'm as guilty as anyone of contributing to this problem. I've written a few (thousand) tests, wrote a core library, fixed a few bugs, added a couple of features, and had a feature or two shot down... but the process is hugely demotivating.

I can point out flaws in every programming language I've ever used (and that's a lot) and design flaws in every programming language I've ever helped implement (and that's a few). That's not the point.

The point is that Perl 5 is missing some simple features present in almost every other programming language of its class, and the reasons why are not primarily technical.

I'm not sure how to cut this knot, except to say that Perl 6 suffers far different problems, namely that inventing the future would be difficult even if we weren't working primarily with volunteer labor. That's why I spend precious development time working on Perl 6 and Parrot -- they're software where I can make a difference.

I wish I could say I could make a difference on Perl 5, but I'm not sure I can.

(In which your author mocks a pervasive and wrong belief about the future of the Internet.)

Many people believe that the Age of the Internet necessitates the passing of the Age of the Desktop Application, in the same way that the Age of the Desktop Computer heralded the end of the Age of the Mainframe.

This brings up three questions: why, what, and how?

The first question is the easiest. A desktop application requires that you find the software, obtain the software, install the software, and then learn how to use the software. There may be licensing fees or a difficult installation process or compilation or cursing when you discover that the binary doesn't work on your operating system. If you can get past all of that, you may discover that the brilliant User Interface with bells, whistles, widgets, and themes may not only be ugly as sin but also impossible to use correctly.

Then try sharing work you create in that application with other people, or using it on your smart phone, or accessing it from a different computer.

Given that the Internet is a global, loosely-connected, fault-tolerant, distributed network of more computing power than you can imagine, it makes sense to move at least some of that processing (and certainly the data storage) to somewhere you can access it from any Internet-connected machine.

The magic is figuring out which kind of interface makes the most sense and can scale from desktop to laptop to kiosk to tablet to smart phone.

That right kind of interface is simple e-mail.

E-mail relies on all of the good parts of SMTP: it's widely distributed, it's federated, it has a store-and-forward system, it's an open standard, and there are countless clients (for humans and machines) capable of understanding it.

Not convinced? Imagine commenting on this article via e-mail. You read it in your favorite mail client. You hit Reply. You type your comments and hit Send. That's it.

You don't have to muck around with logging in or creating an account (you already have one). You don't have to worry if my server configuration or a web gateway in between us will silently drop your message. You don't have to wonder if you need to re-enable JavaScript to solve a CAPTCHA and hope that the mandatory refresh won't destroy your carefully typed submission. If you decide you want to save your comment for later, hit Save as Draft and come back to it an hour, a day, a week, a year later.

Some might say that a plain-text medium is insufficient for all of the graphical goodness necessary in a modern application. That's bunkum; plain-text e-mail supports text areas natively, as well as text boxes, combo boxes, and buttons:

Name: ______________ (textbox)

Medical history:




(textarea)

Age (choose one):
    ( ) Under 18
    ( ) 19 - 25
    ( ) 26 - 39
    ( ) 40 and older (combobox)

Preferred pie flavors (choose many):
    [ ] Berry
    [ ] Peach
    [ ] Apple
    [ ] Pudding
    [x] Strawberry/Rhubarb

Mac users may appreciate that it's possible to manipulate graphics even in plain text. Including a complete demonstration of PhotoShop-quality output would bloat this essay, but for a hint please see Programming Sprites - Another Look.

Privacy- and identity-minded individuals will be glad to note that well-configured e-mail systems support both modes of operation. SMTP does not require strong identification of senders, but it can support strong identification. In particular, if your outgoing mail server requires authentication, it can attest with some authority that a message purporting to be from you is indeed from you (thus solving the identity management problem inherent in other protocols, such as HTTP). If you wish to use the Internet more privately, simply disable this identification system in your outgoing server. (Perhaps a clever e-mail client could offer a mechanism to toggle secure/anonymous mode.)

I could wax poetic about all of the other benefits of this system, but two of the most compelling are the composability of client and server-side operations as well as offline mode. The latter is easiest to understand: you can batch client-side operations when you do not have easy access to the Internet at large and submit them only when bandwidth is available. At the same time, you may receive batched responses from previous requests. You do not have to be online with an active connection to continue to perform useful background and foreground work!

Composability sounds trickier, but you likely already perform it. I have access to my incoming mail server, where I use Email::Filter to categorize all of the Internet applications I use through SMTP through the use of organizational tags, based on message content. For example, all of my work programming the Parrot virtual machine is available under the folksonomic tag "p6i". Please note that this occurs via analysis of incoming communications based on characteristics I've noticed; this tag is of my own invention and is neither a feature of the application on the server side, nor normative behavior of other users of the application. I have the freedom and flexibility to manage my own tags as I see fit.
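
A hedged sketch of that kind of Email::Filter rule, modeled on the module's documented synopsis; the folder layout and matching heuristic are my own illustration:

use Email::Filter;

# reads the incoming message from STDIN; emergency is the fallback mbox
my $mail = Email::Filter->new( emergency => "$ENV{HOME}/Mail/emergency" );

# file Parrot traffic under my private folksonomic tag
$mail->accept("$ENV{HOME}/Mail/p6i")
    if $mail->subject =~ /\bparrot\b/i;

$mail->accept();    # everything else takes the default delivery path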

Another clever feature on my incoming server is the ability to combine several incoming messages into a single message, thus filtering out duplicates.

Of course, it's easy to write server-side applications as well. I wrote the underused Mail::Action software several years ago and have used it productively ever since.

I should note one potential (though solvable!) drawback with the use of pervasive plain-text e-mail as the replacement for desktop applications: it does often require (at least for now) the use of a client-side desktop application which can send and receive e-mail. In the future, e-mail-only devices such as the Blackberry may grow more common (I shouldn't have to mention how much easier it is to archive the textual transcript of a conversation than an unrecorded phone call!).

Some users may require persuasion to leave webmail systems for the more powerful and future-proof desktop clients suitable for unlocking the full range of applications made possible by pervasive SMTP. This may be difficult, but the result is inevitable. No one sane could possibly believe that a web browser would make an effective universal desktop client.

