August 2010 Archives

Backporting Features

One current discussion on p5p is the namespace of fixed versions of reftype(), refaddr(), and blessed(), recently added to bleadperl.

These functions are part of Scalar::Util right now. Unfortunately, they return undef for non-references. This leads to code like:

my $ref_type = reftype( $maybe_ref );
do_something_with_ref( $ref_type ) if defined $ref_type;

(If you don't follow all of the reasons why you have to do that, that's good; you haven't imagined all of the odd and strange ways people might name their classes in various parts of the Perl 5 internals and DarkPAN modules. Safety dictates finding these corner cases.)
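One such corner case can be sketched with the existing Scalar::Util functions. The class name '0' below is an assumption chosen for illustration: it is a legal package name, yet it is false in boolean context, so only a defined test finds it.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed reftype );

my $plain  = 'not a reference';
my $array  = [ 1, 2, 3 ];
my $object = bless {}, '0';    # hypothetical class with a false name

# reftype() returns undef for a non-reference
my $plain_type = reftype( $plain );
print "plain scalar: no reftype\n" unless defined $plain_type;

# ... and the underlying type name for a real reference
my $array_type = reftype( $array );
print "array reference: $array_type\n" if defined $array_type;

# blessed() returns the class name, which may itself be false
my $class = blessed( $object );
print "a truth test misses this object\n"        unless $class;
print "a defined test finds class '$class'\n"    if defined $class;
```

A plain truth test would treat that object as unblessed, which is why the defined check appears in careful code.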

Anyhow.

Making these fixed functions available in the core without having to load Scalar::Util offers some advantages but also some disadvantages. It's the "You can't add new keywords to Perl 5" problem. What if someone else defined functions of that name?

Worse, what if someone loads Scalar::Util and imports its functions and expects those semantics instead of the new semantics?

(Apparently it's impossible to detect the declaration or importation of symbols with conflicting names so as to produce warnings or exceptions in this case, though I don't see why. If we were in the habit of declaring the version of Perl 5 our code uses, we wouldn't have these problems.)

The discussion inevitably circled back to the ineffable question of "If this new feature is going in the core, how do people use it in previous releases of Perl 5?" That is, "You've added a new feature to Perl 5 which will be in Perl 5.14 released next year. Is there any way to backport that feature to Perl 5.12? How about Perl 5.10? You know, RHEL 3 is still around. Why do you hate Perl 5.8?"

At some point the weight of ensuring that code written for a future version of Perl 5 can run correctly (if you cram in enough flying buttresses of CPAN shims) on obsolete versions of Perl 5 will pull down the entire edifice. Allowing code written for previous versions of Perl 5 to run on modern versions unmodified (or with a simple use 5.010; at the top; I know a great combination of find and sed which can perform this kind of textual manipulation {and if you're on Windows, use PPT to get portable versions of these tools}) is often good and useful.

Going the other way... well, I don't understand it. If it's possible and someone wants to do it, fine. Why should it block improving the next release of Perl 5 though?

Mechanism versus Policy

First, read Tyler Curtis's Age discrimination in Perl 6 using subsets and multiple dispatch.

You can do everything in that post manually in Perl 5. (You're better off using a module such as MooseX::MultiMethods to handle the plumbing for you, but the point stands.) That's the fallacy of the Turing tarpit and the Blub programmer: everything is possible, but language support makes certain constructs easy and obvious, and easy and obvious code tends to exist.

If you think about Tyler's post that way, you can squint and see a general language design principle at work. Let me belabor the point somewhat more. Consider how to declare a new class in Perl 5. You have multiple possibilities. A class is merely a namespace, so:

package My::Class { ... }

That's well and good, and it works. Now do it at runtime:

eval "package $classname { ... }";

Again, that works. Yet if you consider the Class::MOP approach, you can start to see the flaw:

my $class = Class::MOP::Class->create( $classname );

I admit the flaw isn't staggeringly obvious until you need to add methods to the class. Again, it's not awful in standard Perl 5:

sub My::Class::method { ... }

... or at runtime:

{
    no strict 'refs';
    *{ $classname . '::' . $methname } = $method_ref;
}

... but how ugly when compared to:

$class->add_method( $methname, $method_ref );

The difference is the design principle at work: specify policy, not mechanism. In other words, the problem with the standard Perl 5 approach is that the mechanics of storing a subroutine in a package overshadow the intent of the code. What's important is not symbolic references to typeglob slots in packages which represent classes. What's important is adding a method to a class.
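Even without Class::MOP, the same idea can be sketched in core Perl 5 by hiding the typeglob assignment behind a named function. The add_method() helper and the My::Robot class here are hypothetical names for illustration; this is not how Class::MOP implements its method of the same name.

```perl
use strict;
use warnings;

# Hypothetical helper: the only place that knows about typeglobs
sub add_method {
    my ($classname, $methname, $method_ref) = @_;

    no strict 'refs';
    *{ $classname . '::' . $methname } = $method_ref;
    return;
}

# Callers state their intent and never see a symbolic reference
add_method( 'My::Robot', 'greet', sub {
    my $self = shift;
    return 'Hello from ' . ref $self;
} );

my $robot    = bless {}, 'My::Robot';
my $greeting = $robot->greet;
print "$greeting\n";
```

The mechanism hasn't gone away, but it lives in one place and the calling code reads as policy.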

The same rule applies when specifying type constraints on function parameters or multidispatch candidates or even declaring classes themselves:

class My::Class
{
    method awesome_method { ... }
}

... because even as much as I know how to implement a language on top of a virtual machine (or how to compile a language to native code), at the point where I read this code, much less write it, I don't care about the mechanism. I care about defining the policy.

Putting It All Together

The Perl community gets a lot of things right right now. Consider the CPAN: we expect a few standards of compatibility and kwalitee, but as long as you adhere to rough consensus, your work is useful and usable to a million other Perl programmers.

We have a good understanding of how to package your work and how to mark dependencies and how to document it, and that rough consensus allows sites such as search.cpan.org to provide infrastructure that allows me to read cross-linked documentation of the 84,944 modules on the CPAN (as of this afternoon).

Even the standard documentation mechanism for Perl modules (POD, adhering to a loosely-agreed organizational scheme) contributes to the utility and usability of the Perl ecosystem. You expect perldoc Your::Distro::Name to display something useful for the distro, even if it's a table of contents to other documentation. Larger projects or frameworks have introductions and tutorials and guided recipes (you know, like a "book" for "cooking", except without infringing on a trademark). If you know what you're doing and need a mere syntax refresher—or if your problem is small enough that a handful of lines of code can solve it—the SYNOPSIS section of the documentation is often enough.

Other times it isn't. Explain Plack in a paragraph, or DBIx::Class, or Perl::Critic, or Bread::Board (or explain Bread::Board at all; sometimes I think Stevan and Yuval are the only people who understand it).

Many of the wonderful new tools I want to use in Perl right now have a steep learning curve. I understand that some of them (Devel::Declare) have essential complexity that users must master before using the tool effectively. Not everything does—look at Test::Tutorial, which takes a subject which seemed complex and confusing in 2000 and 2001 and is commonplace and expected in 2010. Writing effective tests well is still an art of experience and good taste, but half an hour with that documentation has been enough to explain the basics to thousands and thousands of people.

I recognize that a SYNOPSIS won't suffice for demonstrating even a simple Catalyst application, with user authentication and logging and route dispatching and a model which is more complex than a Blog with Posts and Comments, but perhaps there's something in between a "Hello, world!" tutorial and the gory documentation of individual components and their methods. (Catalyst does this better than almost every other CPAN project.) Maybe larger projects should consider guided walkthroughs of real and modest-sized applications, especially if they include discussions about design goals and tradeoffs.

We're good at documenting reusable pieces of larger systems—CPAN encourages us to build applications that way. Can we improve further?

On Deployment

Claudio Ramirez raised a perceptive question about the Modern Perl book, specifically How do you deploy a modern Perl application?

I can think of several approaches:

  • As a simple program which uses only core modules and relies on the system Perl.
  • As a distribution on the CPAN itself.
  • As a distribution on a custom, private CPAN.
  • As a custom CPAN repository.
  • Through the platform native packaging system.
  • As a tarball of all of the dependencies installed already.
  • Through the use of PAR.
  • Through the use of a proprietary tool such as perl2exe.
  • Installed manually with a custom build of Perl (whether with or without perlbrew).
  • Through the use of another dependency management and bundling system such as Shipwright.
  • As a service, not an installable application.

Have I missed any?

Under which circumstances would you choose one over another?

Is this subject appropriate to discuss in the book?

What I Did Wrong (Test::MockObject)

I admit it, I avoid languages such as Java because they make it so difficult to write useful libraries such as Test::MockObject. When you're writing lots of test cases and flipping back and forth between code and test code and refactoring every time you make a test pass, the more ceremony and boilerplate you have to move around, the more friction you feel and the less work you can get done easily.

I wrote T::MO several years ago when I noticed a pattern in test code I'd written. I wanted to isolate a couple of cases of behavior to test all of the code paths within specific functions and methods. Invariably I took advantage of Perl's dynamism to construct objects which adhered to a specific protocol of behavior but which were under my direct control.

Think of it this way. You want to test the error handling code for a database abstraction layer. If the network connection goes away or if the remote process crashes, you want to retain the data and perform the appropriate logging and recovery.

How do you test that? You force a failure. No, you don't hire an intern to yank the network cable during your tests. You mock up the code which raises the "Wow, it's suddenly quiet in here!" exception.

This is where I wrote T::MO wrong. Instead of mocking just one piece of the underlying plumbing, T::MO encourages people to mock the entire system, from the faucets down to the sewer system. You do get more control over the process, but you can lose yourself in a sea of sweaty little details.

T::MO can be an invaluable tool. Sometimes it's exactly what you need, especially when you have to clean up a pile of legacy code written without good design taste. Even so, I wish I'd written Test::MockObject::Extends first to encourage people to mock only part of the behavior of an object. If your code is anywhere close to well-factored, you should be able to splice in the specific behavior you need, scalpel precise, and have confidence that the rest of the code, the real code you run when you run the code for real, behaves appropriately.

The more you need to mock, the more coupled (and, generally, the poorer) your design. The less you do mock, the better your tests.
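The idea of mocking only one piece of plumbing can be sketched in core Perl 5 without any mocking library. This is not the Test::MockObject::Extends API; it temporarily replaces a single method with local, and the My::DB class and its methods are hypothetical.

```perl
use strict;
use warnings;

package My::DB;    # hypothetical database layer

sub new { return bless { log => [] }, shift }

# The real plumbing; in this sketch it always succeeds
sub send_query { return 'rows' }

# The code under test: error handling around the plumbing
sub fetch {
    my ($self, $sql) = @_;

    my $rows = eval { $self->send_query( $sql ) };
    return $rows if defined $rows;

    push @{ $self->{log} }, "query failed: $@";
    return;
}

package main;

my $db = My::DB->new;
my $failed;

{
    # Replace one method for the duration of this block only
    no warnings 'redefine';
    local *My::DB::send_query = sub { die "connection lost\n" };

    $failed = $db->fetch( 'SELECT 1' );
}

# Outside the block, the real method is back in place
my $ok = $db->fetch( 'SELECT 1' );
```

Everything except send_query() is the real code, so the test exercises the actual logging and recovery path.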

Modern Perl: The Book Seeks Comments

Q3 2010 is a good time for Perl books. Modern Perl: The Book is almost ready to go to the printer. If you're interested in understanding Perl 5, I'm happy to recommend the book to you.

(I also heartily recommend Effective Perl Programming 2E. I haven't finished reading it yet, but what I've read has been very good.)

You can help make this book even better. I've gone through hundreds of comments on the Modern Perl draft book, and the text is much better for it. Thank you to everyone who's contributed. I'd love to get more comments, especially on later chapters (regular expressions, objects, style) from anyone interested in Perl, whether the freshest novice or Larry's brother-in-law (Perl user #1). The easiest way for me to take comments is if you fork the Modern Perl Book GitHub repository and send me a pull request, but you can file issues there or mail me directly if you prefer.

We'll keep updating the book even after we publish paper versions, and we'll keep updating electronic versions, and certainly we'll happily revise the paper versions, but the more feedback we get in the next two weeks, the better the book will be for people who aren't as well connected in the Perl world as you are. (You're reading this; you're way ahead of a lot of people we'd like very much to reach.)

Thanks for all of your help. It's a good time to be programming Perl again.

Editor's note: an earlier version of this post referred to another book which has since been cancelled. We apologize for the inconvenience.

As I wrote in The Stringceptional Difficulty of Changing Error Messages, using strings in place of what should be structured data when reporting errors from Perl 5 makes improving Perl 5 more difficult than it has to be.

This is fixable.

From the conceptual side, all someone has to do is to change what Perl 5 throws for its core exceptions and warnings from a string to an object. That object can overload stringification so that all Perl code which treats it as a string will continue to get the string value. All code which treats it as an object will continue to work correctly even if the string value changes. (I haven't thought about how this might break XS code which pokes into SV guts with macros....)

Someone could even provide numeric overloading so that you can compare exception types numerically without having to call methods to figure out exception information.
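In pure Perl, the concept might look like the following minimal sketch. My::Exception, its fields, and the exception number 42 are all assumptions for illustration; nothing here exists in the core.

```perl
use strict;
use warnings;

package My::Exception;    # hypothetical; not a core class

use overload
    '""'     => sub { $_[0]{message} },    # string-based code keeps working
    '0+'     => sub { $_[0]{code} },       # allows numeric comparison of types
    fallback => 1;

sub throw {
    my ($class, %args) = @_;
    my $self = bless { %args }, $class;
    die $self;
}

package main;

use constant NON_INVOCANT => 42;    # made-up exception number

eval {
    My::Exception->throw(
        code    => NON_INVOCANT,
        message => "Can't invoke non-invocant",
    );
};
my $error = $@;

# Old-style code matches text; new-style code asks the object
my $matched_text = ( $error =~ /non-invocant/ )       ? 1 : 0;
my $matched_code = ( $error == NON_INVOCANT )         ? 1 : 0;
my $is_object    = $error->isa( 'My::Exception' )     ? 1 : 0;
```

All three checks succeed against the same thrown value, which is what lets old and new code coexist.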

Reconfiguring Perl 5's guts to make this possible is fairly simple, at least at the point of the API which actually throws the errors. A few functions in util.c such as Perl_croak() need to build an object instead of a string, but that's a modest amount of code. It's much more difficult to find every place in the Perl 5 core which calls Perl_croak and friends to change them to use the new API...

... because the right way to make this API work better is not to pass C strings as the text of error messages but instead to pass symbolic constants which represent error messages. For example, instead of calling Perl_croak( "Can't invoke non-invocant" );, the calling code should instead use something like Perl_croak( PERL5_NON_INVOCANT_EXCEPTION );. This allows a quick lookup of the right exception information as well as another benefit: localization of exception messages.
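The lookup itself would live in C inside the core, but the shape of the idea fits in a few lines of Perl. The constant, the catalog layout, the message_for() function, and the French wording are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical symbolic identifier for one exception type
use constant NON_INVOCANT_EXCEPTION => 1;

# Hypothetical message catalogs, one per language
my %catalog = (
    en => { NON_INVOCANT_EXCEPTION() => "Can't invoke non-invocant" },
    fr => { NON_INVOCANT_EXCEPTION() => "Impossible d'invoquer un non-invocant" },
);

# Look up the text for an exception, falling back to English
sub message_for {
    my ($exception_id, $language) = @_;

    my $messages = $catalog{$language} || $catalog{en};
    return $messages->{$exception_id};
}

my $english  = message_for( NON_INVOCANT_EXCEPTION, 'en' );
my $french   = message_for( NON_INVOCANT_EXCEPTION, 'fr' );
my $fallback = message_for( NON_INVOCANT_EXCEPTION, 'tlh' );    # no such catalog
```

Because callers pass the identifier rather than the text, the wording can change or be translated without touching any call site.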

Even so, that's still a lot of code to change (590 uses of Perl_croak* in the .c files of bleadperl alone, not to mention everything on the CPAN)—and this code won't be available for wide use until 5.14 next spring at the earliest. In a few years, maybe enough people will use exception objects by default that it's possible to clean up error messages throughout the core without worrying about breaking fragile old code. Then again, fragile code tends to do the lazy thing, not the correct thing.

As an intermediary step, perhaps it's possible to refactor the core to use exception type concepts instead of literal (or sprintf-style) strings within the bleadperl source code. That would allow for localization, if that's desirable, and it gets Perl 5.13.x closer to making real exception objects useful.

The real question is whether real exception objects in the core are sufficiently worthwhile to justify this change. If the best rationale for this work is "Someday, we may be able to fix old, misleading, crufty, or wrong diagnostic messages!" then ... well, how soon will be too late? This is why you should plan for making (and correcting) mistakes in your language design.

Perl 5.10 modified a warning message in a wonderful and useful way. If you've used Perl 5 much at all, you've accidentally stringified an undefined value. If you enable warnings, you've seen the message Use of uninitialized value....

In Perl 5.10, that message changed to include the name of the uninitialized variable when it's available. This was a huge usability improvement and it's one of my favorite features of modern Perl. Sometimes changing a warning or error message for user clarity is the kindest improvement ever.
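A short sketch shows the improved message. It captures the warning with a __WARN__ handler so the text can be inspected; the variable names are arbitrary.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $name;                          # never assigned
my $greeting = "Hello, $name!";    # stringifies an undefined value

# On Perl 5.10 or newer, the captured text names $name
print $warnings[0];
```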

(Having an interactive shell react to the user typing exit with the message Type Ctrl-D to exit! is a response to user confusion in the wrong direction.)

Unfortunately, changing error and warning messages in Perl 5 is a quagmire because parsing unstructured data is a mess. Want to catch an exception in Perl 5? You can do it safely with a module such as Try::Tiny, but to find out what the exception is, you have to perform string manipulations on $@.

If you're performing an exact match, a substring match, or a regular expression, any change to the text of that message in a subsequent release of Perl 5 could change the way your code behaves. Thus p5p must be very, very, very careful about even improving the text of internal exceptions and warnings, lest they break the DarkPAN.
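The fragility is easy to demonstrate with a plain eval. In this sketch the undefined $handle and the prepare() call are stand-ins for a failed database connection; the regular expression is tied to the exact wording of one core diagnostic.

```perl
use strict;
use warnings;

my $handle;    # pretend a failed connection left this undefined

my $ok    = eval { $handle->prepare( 'SELECT 1' ); 1 };
my $error = $@;

# Fragile: this works only while the core keeps this exact wording
my $is_undef_invocant =
    ( !$ok && $error =~ /^Can't call method "prepare" on an undefined value/ )
    ? 1
    : 0;

print "caught: $error" if $is_undef_invocant;
```

Any rewording of that diagnostic in a later release would silently send this code down the wrong branch.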

The only real solution in Perl 5 is never to attempt to parse the text of exceptions and to consider throwing exception objects with something like Exception::Class. Adding true exception object support to the Perl 5 core in a backwards-compatible fashion is also very possible, though that's a subject for another post.

I made a silly mistake the other day.

A client had asked me to review the memory usage of a suite of programs running on a shared host. His account had a low hard memory limit, and when multiple programs ran simultaneously, the Linux kernel's out of memory killer killed one of the processes.

I ran the program on my machine and saw about half the memory usage. "Ah," I thought to myself. "I have a 32-bit Perl 5 installed and he has a 64-bit Perl 5 installed. There's lots of little data in the program, so the double-sized pointers on his installation compared to mine account for the memory difference."

This is where I should have realized that profiling such an assumption would have taken only a few minutes and could have saved an afternoon.

Installing a 32-bit Perl 5 on a 64-bit machine takes some work. You have to convince your C compiler to compile in 32-bit mode, and this must take place during the point at which you configure Perl 5. After a few failed attempts and a couple of patches to perlbrew, I had a new custom installation of Perl 5.

I installed the dependent modules and ran the application... and the memory use was only slightly better.

Then I thought for a bit. Then I instrumented some code for a quick profile and realized the problem. The application fetches a list of tasks out of a data store it uses as a queue. This queue has an iterator interface, as the list of tasks can be arbitrarily large. Within the code which performs the fetch, I'd assumed that the dataset would be reasonably short at any given time, so I'd flattened the iterator into an anonymous array and kept the whole thing in memory.

The program now processed more data than I'd ever planned, and the problem had only grown worse as more and more tasks piled up, as the consumer couldn't keep up with the producer thanks to OOM killing.

Five minutes later, I'd switched back to the iterator interface, and the process used a fraction of the memory it had used before. If I'd asked myself "What within the program grows over time?", I'd have figured out my mistake before I recompiled Perl 5.
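The difference between the two approaches looks something like this sketch. The make_task_iterator() function is a hypothetical stand-in for the real queue, which the post does not show.

```perl
use strict;
use warnings;

# Hypothetical queue: returns a closure which yields one task per call
sub make_task_iterator {
    my ($count) = @_;
    my $next = 1;

    return sub {
        return if $next > $count;
        return { id => $next++ };
    };
}

# The mistake: flatten the iterator, holding every task in memory at once
my $iter = make_task_iterator( 5 );
my @all_tasks;
while ( my $task = $iter->() ) {
    push @all_tasks, $task;
}

# The fix: handle one task at a time and let each go out of scope
my $stream    = make_task_iterator( 5 );
my $processed = 0;
while ( my $task = $stream->() ) {
    $processed++;    # real work on $task would happen here
}
```

With five tasks the difference is invisible; with an arbitrarily long queue, only the second loop keeps memory use flat.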

So it goes sometimes.

About this Archive

This page is an archive of entries from August 2010 listed from newest to oldest.
