February 2010 Archives

Perl 5, Support, and Bugfixes

I wrote what I understand to be the strategy behind releasing new minor releases of Perl 5. Though the development branch of Perl 5 follows a monthly release cycle, the maintenance branch currently does not. If it's difficult to predict what changes volunteer developers will make in the future, it's doubly difficult to predict which bugs they will fix (or will need to fix).

Thus any support document must explain the responsibilities of users who encounter bugs and what they should expect from developers.

A bug you discover in a new major release of Perl 5 is a candidate for a new minor release if it is:

  • A security or dataloss bug
  • A regression introduced in the new major release
  • A failure to build on a supported platform combination
  • A failing core test on a supported platform

I've organized that list in rough order of decreasing severity. The most likely candidate for a fix is the second; it indicates an undertested aspect of the system. Behavior should not change between Perl 5 releases accidentally. If a patch modified behavior on which your code depends and if that change did not occur as part of a deliberate, communicated plan, a fix is likely.

Of course, any fix in a minor release needs to maintain binary compatibility within the release family.

The easiest way (at least for developers) to find and fix such bugs is before the release of a new major version of Perl 5. That's one goal of the monthly releases: to encourage you to test all of the code you care about with versions of Perl 5 in development. That's also one reason for the Perl 5 release candidates (though it's likely too late for big fixes by then).

If you can't do that, the next step to reporting a bug is to reproduce it in the smallest example possible. 10-15 lines of Perl 5 is good. 2-3 is ideal. More than 20 is usually too big. If you can provide a test case suitable for adding to the core test suite, so much the better. From there, test your code with multiple releases of Perl 5. It helps to browse the perldelta documentation, but the amount of detail between even minor releases can be daunting. A post on PerlMonks is a good step.
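A reduced test case might look like the following sketch. The behavior under test (zero-padding with sprintf) is only a hypothetical stand-in for whatever you believe has regressed; the point is the shape: a few lines, no external dependencies, and one assertion per expectation.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the behavior you believe has changed.
my $got = sprintf '%03d', 7;

is( $got, '007', 'sprintf zero-pads a small integer to three digits' );

done_testing();
```

Running the same file under several installed versions of Perl 5 shows quickly whether the behavior differs between releases.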

If you've gone through all of that, see perldoc perlbug.

This is all no guarantee that your bug will get fixed in a minor version—you should prepare for the possibility that, given enough time since the release of the corresponding major version, the best approach for p5p is not to backport a fix to a new minor version. Even so, you will likely get one of several options:

  • An explanation of why it's not a bug
  • A suggested workaround
  • A fix in the current version

In the last case, you will have the option of applying the relevant patch yourself or asking someone to backport it to your own custom Perl 5 if you wish. That may not seem like the ideal situation (it isn't!), but at least with free software such as Perl, you always have that option.

When a new major version of Perl 5 comes out, history suggests a new minor release will follow.

Some of this reasoning is pragmatic. For all of the requests of the Perl 5 Porters for people to test development snapshots (5.11.0 through 5.11.5) and the inevitable release candidates (in this case, 5.12.0 RC1 through... hopefully not RC2 and RC3), nothing gets more testing or bug reports than new major releases. Bugs get reported. Changes get requested. Changes occur.

The traditional view of new major releases is that they're somewhat unstable. "Wait for the patch release," people say. This was true of 5.6.0. (The ancient Camel third edition describes an unreleased version of Perl 5 somewhere between 5.6.0 and 5.6.1). This was true of 5.8.0. This was even more true of 5.10.0, where the CPAN itself suggested that 5.10.0 was a "testing" release for bleeding edge users.

Given the size of the Perl 5 test suite and the daily and weekly test reports produced from the bleeding edge of Perl 5 itself as well as the monthly releases, most of the obvious bugs appear and get corrected quickly. Even so, bugs happen. It's software. Changes occur and people notice only in odd or complex situations. Sometimes a new compiler warning appears, or underlying libraries change. Sometimes a few updates help get Perl 5 building on a platform which itself has changed.

A patch release is inevitable.

However, the Perl 5 Porters make no promises about when a point release will occur. Nor do they promise how many point releases will occur in a family.

The Perl 5.8.0 family had nine point releases between July 2002 and December 2008. If Perl 5.10 hadn't taken five and a half years, Perl 5.8.0 might have had fewer point releases.

22 months passed between the release of Perl 5.10.0 and Perl 5.10.1. There may never be a 5.10.2.

A point release needs two things: a steady stream of bugs fixed in the core without breaking source compatibility, and someone to identify the appropriate patches and to make the release. In other words, one or more people need to be able to cherry-pick patches from the development track of the next release of Perl 5 to the branch which will become the new point release. The weight of history and expectations is sufficient to assume that p5p will be able to find or herd enough volunteer effort to make a Perl 5.12.1 and a Perl 5.14.1 and so on, but if monthly Perl 5 releases continue and produce a new major release every 12-18 months, the likelihood of a Perl 5.12.2 and a Perl 5.14.2 decreases.

The single unpredictable factor is the presence of a major bug discovered in that release family; a major security bug or a data loss bug is one possibility. In that case, a single-patch minor release is likely. Beyond that, minor releases have diminishing returns.

As mentioned in What Perl 5's Version Numbers Mean, the written Perl 5 support policy must explain several guidelines and their implications.

If you've ever upgraded between major versions of Perl 5 on the same machine, you've likely noticed that you have to install new versions of modules. Various resources spread across the Internet suggest the use of CPAN autobundles, but even that's likely enough to make you curse a little bit as you babysit a CPAN shell for an hour or two to get back to where you started.

This is due to the binary compatibility guidelines to which the Perl 5 Porters adhere. While there's a strong desire to ensure that programs and modules written in the past several years will continue to work unmodified on new major versions of Perl 5, it's almost impossible to ensure that compiled XS code remains compatible across major versions of Perl 5.

Certainly the porters attempt to maintain source code compatibility of XS wherever possible, but ensuring that an XS module compiled in 2000 for Perl 5.6.0 will continue to work unchanged on Perl 5.12 in 2010 requires a great deal of foresight, plenty of tolerance for workarounds in the core, and no small amount of luck. There's a limit to what's practical to provide for how long, and the price of reinstalling extensions (and recompiling a few) is worthwhile.

While this choice may seem odd, considering the degree to which Perl 5 retains backwards compatibility, it reflects a philosophy from the earliest days of Unix: source compatibility is important, but everyone should have access to the source code to recompile as necessary.

Within a release family—Perl 5.8.0 through 5.8.9, for example—the porters attempt to maintain binary compatibility. If you installed DBI on a fresh Perl 5.8.0, it should continue to work even if you install 5.8.9 and remove 5.8.0. That's the intent.

Note that the Perl 5 configuration and installation process enforces this by default; minor releases within a major release family reuse the same module installation paths. Major releases have new installation paths. You can reconfigure Perl 5 to use paths of your choosing, but you do so at your own risk.
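If you want to see which paths your own installation uses, the core Config module exposes the values chosen at configuration time. This is a minimal sketch; the five keys shown are standard Config entries, but what they contain depends entirely on how your perl was built.

```perl
use strict;
use warnings;
use Config;

# %Config records the choices made when this perl was configured.
my %paths = map { $_ => $Config{$_} }
    qw( version archname privlib sitelib sitearch );

printf "%-9s %s\n", $_, $paths{$_} // '(not set)' for sort keys %paths;
```

Comparing the output of two installed perls is a quick way to see whether they share module directories.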

What Perl 5's Version Numbers Mean

Perl 5.11.5 comes out tomorrow and Perl 5.12 should be out soon. (Much credit goes to people such as Jesse Vincent and David Golden, to name two, for getting Perl 5 on a regular release cycle.) I've long promised to write about the Perl 5 support and deprecation policy and how that affects users.

Perl 5.10.1 was, by definition, a minor revision. Perl 5.12 is a major revision. The nominal difference is which component of the version number increases. By intent, users of 5.10 (actually 5.10.0, but often abbreviated) should be able to upgrade that installation in place to any subsequent minor release in the 5.10 family. The upgrade isn't always completely transparent, but the intent is that, modulo bugfixes, it should be.

When 5.10.0 came out, work started on a new Perl 5 release family called 5.11 (that's not entirely true, but it's sufficiently true for this explanation). This is the unstable series intended for development and testing which will become 5.12 in the next couple of months. You are welcome to download, configure, build, test, and even install 5.11, but you should be comfortable without support from p5p for upgrades and changes.

The monthly releases in the 5.11 (and soon, the 5.13) series represent points of stability and review so that the Perl 5 developers can concentrate on the quality of what will become 5.12.0.
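The numbering convention is mechanical enough to sketch in code. The helper name release_series is hypothetical, and the sketch assumes a plain three-part version string; it applies the rule described above, where an odd middle number (5.11, 5.13) marks a development series and an even one (5.10, 5.12) a stable family.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a Perl 5 version string such as '5.11.5'.
sub release_series {
    my ($version) = @_;
    my ( $revision, $major, $minor ) = split /\./, $version;

    my %series = (
        family      => "$revision.$major",
        minor       => $minor // 0,
        development => $major % 2 ? 1 : 0,
    );

    return \%series;
}

for my $version (qw( 5.10.1 5.11.5 5.12.0 )) {
    my $series = release_series($version);
    printf "%-7s family %-5s %s\n", $version, $series->{family},
        $series->{development} ? 'development' : 'stable';
}
```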

When 5.12.0 comes out, you will notice changes from 5.10.0 in terms of new features, removed features, and upgrades to the standard library. While most code should work unmodified with 5.12.0 as it did with 5.10.0, some modules will need updates. You will likely also have to recompile any modules with XS components.

In subsequent entries, I'll write more about the implications of all of this, when you should upgrade, how deprecations and changes work, and the binary compatibility policies of Perl 5.

Why SDL Perl Matters

I read a book proposal years ago on the subject of teaching kids to program with C++. "After a week," it said, "children will know enough to create their own simple text games and animations."

I was perhaps six years old when I saw my first minicomputer. I flipped open the first page of the manual and typed in the lines verbatim—except I left off the line numbers, likely thinking that they were merely a convenience for readers. Perhaps I've had good taste from the beginning.

My typing skills were, as you might expect, abysmal. Even so, I had feedback from the computer within fifteen minutes. If I'd had to spend a week learning things to move characters around on the screen, I'd have given up.

I like games. I enjoy thinking about how they work. I like writing stories. I play games. The mechanics of rules and balance and design and enjoyment and player participation and perception are fascinating. Even more important is the idea that games can have a didactic purpose.

I spent a lot of time in my childhood years playing games but also breaking games. A bit of work with a hex editor could give my party more experience points so that two or three well-placed fireball spells would clean out the kobold lair. (Any role-playing system which starts magic users with four hit points won't have them surviving the tetanus shot before they get their passports.)

Because I could only get time on computers at school if they had an educational purpose, I taught myself how to write programs so I could write games. I don't suggest my experience is representative of all children, but it's not so far different from that of many of my friends.

A few years ago, I tried to help revive SDL Perl when the maintainer retired. The experience was difficult; it's a big wad of XS code that needs plenty of probing and configuration for a handful of somewhat-optional libraries. I don't even want to think about everything required to detect which version of OpenGL you have installed and available in a cross-platform fashion.

Fortunately, Kartik Thakore is everyone's hero (and plenty of other people are helping too).

I've heard the arguments that "Kids these days are too busy texting each other!" or "It's okay that kids make YouTube mashups of pop songs and clips of their favorite anime characters, that's creativity!" and "You can teach a kid PHP and HTML and call him a programmer, and that's super fun!" I don't believe any of them.

I think instead that you can plop your smart seven year old in front of a real computer with a real keyboard and show her that typing something makes a picture appear and typing something else makes it move and give her a few other commands and boom she'll play with that for a while. Not everyone's suited to the deep, dark logic of understanding the bindings from a high level language to a shared library and memory management techniques thereof, but what a privilege to teach a younger generation that a computer isn't merely an appliance to read Wikipedia and text their friends, but a general purpose device they can control.

Show a few of them how to make pretty graphics move around on screen per their command—per textual instructions they have to reason about and maintain themselves—and you just might have something. Sure, Pygame and Pyglet are great. I've used them productively. Even so, more options for free software and free environments can only help.

A Decade of Lexical Filehandles

Perl 5.6.0 is almost a decade old; perldoc perlhist gives a release date of 22 March 2000.

My favorite feature of Perl 5.6.0 is lexical filehandles. Instead of having to access the IO slot of package global typeglobs, I could use lexical variables to contain filehandles -- without having to muck about localizing symbol tables or worrying about action at a distance or lifetimes of global symbols.

Yet to this day, almost a decade later, I still see the old way with all of its disadvantages in new code. (Tell the truth: do you understand every word of "the IO slot of package global typeglobs"? Do you want to explain that to novice programmers?)

Perl 5.6.2 is long dead. Perl 5.8.9 is the last of its release series too. The argument for running new code on old installations of Perl 5 is awfully thin, in that light.

Likewise I can't make a simplicity argument for the old approach. Making old-style filehandles work like people might expect is anything but simple. Throw in a local here or there and the typeglob sigil and maybe a gensym() call for good measure. Fun!
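For comparison, here is a small sketch of both styles reading the same file. The scratch file and variable names are only illustrative; the lexical version also uses three-argument open, which is my choice for the example and not something the old style requires.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

# Scratch file so the example is self-contained.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "first line\nsecond line\n";
close $tmp_fh or die "Cannot close '$path': $!";

# Old style: a package global filehandle, visible to any code in the package.
open( LOG, "< $path" ) or die "Cannot read '$path': $!";
my $old_first = <LOG>;
close LOG;

# Lexical filehandle: scoped like any other variable and closed
# automatically when it goes out of scope.
my $new_first;
{
    open my $fh, '<', $path or die "Cannot read '$path': $!";
    $new_first = <$fh>;
}

print "old: $old_first", "new: $new_first";
```

The lexical version needs no local, no typeglob sigil, and no explicit close to avoid leaking the handle into the rest of the package.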

Reasonable people differ on style and technique, but I wonder what makes a feature such as pseudohashes or 5.005-style threads so hated that it eventually gets deleted, while difficult-to-use-correctly features superseded by better replacements stick around far longer than necessary. My guess is that the Perl 5 world suffers here, as usual, from a questionable abundance of old code, old tutorials, old books, and copy and paste coding from ancient sources of dubious wisdom. (This probably means I should submit patches to perldoc perluniintro and other offenders in the core documentation.)

Perhaps it's time to consider a gradual, intentional, well-tested and well-reviewed campaign to update tutorials and example code with somewhat more modern examples of maintainable Perl.

(For fun, imagine a world where the canonical printed Perl 5 reference covered a version of Perl 5 released this millennium. Then again, Perl.com thinks that 5.6.2 is "the previous version of Perl" 5.)

Chunking, Subtlety, and Whitespace

I delayed writing about references in Perl 5 in the Modern Perl book for a long time. References in Perl 5 are useful. They have their warts. They're not as difficult as most people believe, however. Novices have trouble learning how to use references effectively because most tutorials and introductions explain them poorly.

I had to think about explanations for a long time before I found a way to explain them well.

Of course, the syntax for dereferencing gets complex very quickly—but it's also an effective example of what I've been discussing this week. Perl has a handful of subtle design consistencies that, if you understand them, help you read and skim code very effectively. If you don't learn them, you'll get lost in a sea of punctuation soup.

Consider an array reference $monkeys_ref. You can get the number of monkeys by evaluating that reference as an array in scalar context in one of two ways:

# the short way
my $count = @$monkeys_ref;

# the disambiguatey way
my $count = @{ $monkeys_ref };

The former way is shorter and more idiomatic. Anyone familiar with Perl 5 references should understand what the additional sigil means ("I want a list from the following reference"). The latter syntax has the same effect, but it means instead "I want a list coerced from the expression evaluated within this block." The difference is subtle and you don't have to understand the subtleties for this example.
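With some sample data (the monkey names are made up for illustration), both forms produce the same count:

```perl
use strict;
use warnings;

my $monkeys_ref = [qw( capuchin howler spider tamarin )];

# Both assignments evaluate the dereferenced array in scalar context.
my $short_count    = @$monkeys_ref;
my $explicit_count = @{ $monkeys_ref };

print "$short_count $explicit_count\n";    # prints "4 4"
```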

Trouble arrives when you deal with nested data structures or more complex expressions, such as slices:

# the short way
my $monkeys = join ',', @$monkeys_ref[@indices];

# the clearer way
my $monkeys = join ',', @{ $monkeys_ref }[@indices];

The first expression is somewhat more difficult to parse; which takes precedence, the indexing operation represented by the square brackets or the dereferencing operation indicated by the leading sigil? The second expression works because the intended order of operation is clear, at least to anyone who understands how curly-brace grouping works with complex references.

The whitespace is unnecessary, of course, but I find that it adds clarity.
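A runnable sketch of the two slices, again with made-up data, shows that Perl 5 parses both the same way: the sigil dereferences first, and the square brackets then slice the resulting array.

```perl
use strict;
use warnings;

my $monkeys_ref = [qw( capuchin howler spider tamarin )];
my @indices     = ( 0, 2 );

# Both expressions slice the array that $monkeys_ref refers to.
my $short   = join ',', @$monkeys_ref[@indices];
my $clearer = join ',', @{ $monkeys_ref }[@indices];

print "$short\n$clearer\n";    # both lines read "capuchin,spider"
```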

A little bit of disambiguation isn't necessary to help the Perl 5 parser in this case, but it does help the reader. Students of compiler design might argue that nested expressions this complex belong on separate lines. I can imagine how this would read in a pseudo assembly language (I work on Parrot, after all). There's definitely a balance between the complexity of nested expressions and dereferencing... but this is a place where I consider the idiomatic use of Perl 5 sufficiently expressive that spreading the list slice out over multiple lines would obfuscate the intent of the code.

Certainly it's possible to perform even more complex dereferences of data structures, but when it's difficult to identify individual chunks of the desired behavior, it's time to simplify the code or the expression or the design. Even so, the readability of this code should not depend on the desire to avoid teaching novices about references.

Chunks and Syntax Highlighting

If I'm right—if reading source code requires identifying parts of speech—then familiarity with syntax and grammar is important to programming as an adept.

Consider Damian Conway's SelfGOL. As an experienced Perl programmer, I can pick out various pieces of the code at a glance. There's an assignment. There's quoting. That's a variable. That's a list slice.

If you've never encountered Perl before (or programming in general), you might recognize some English words, such as print and die, and that's all.

One of Perl's design ideas borrowed from linguistics is that "different things should look different". To novices, everything looks different. $name isn't obviously a single chunk. It's an English identifier and one of several punctuation symbols apparently sprinkled at random throughout the program.

Good use of whitespace helps. So does the good use of parentheses as grouping constructs (though as in prose, they often get overused by novices).

One of the most subtle mechanisms to identify individual chunks floating in a sea of code is with syntax highlighting. I can't prove this. I haven't studied it in repeatable situations. Even so, I hypothesize that (modulo color choice concerns) merely highlighting different types of terms in the grammar in different ways will help novices understand how to pick out individual chunks in code.

This requires training. This demands practice. Unless you spend time reading code, you won't understand how expressions fit together, and you have little hope of understanding code. I believe it's impossible to skip this step, and thus I don't care if someone who's used C or ML has trouble reading Perl 5 code. Of course people have trouble reading when they don't know the grammar.

(Don't worry, Lisp fans. Homoiconicity—apart from additional complexity of quoting forms and reader macros—means that novices have to spend their time learning to recognize idioms and abstractions at a level higher than tokens and chunks without the benefit of patterns of chunk types as mnemonics to idioms. Then again, I think in patterns, rarely words.)

Chunking and Programming Languages

Some of my biases are transparent. For example, I believe that many of the complaints of Perl's "unreadability" are from people who've never bothered to learn how to read the language. You often see this from people who say "Sigils? Pfft. They're useless—mere syntactic noise!"

Linguists may disagree.

One of the early inventions in written language was punctuation. Specifically, adding spaces between words (and even vowels, in some languages... yes, my history studies have come in useful while programming) makes documents easier to read. The same goes for punctuation. It's easy enough to write sentences with ambiguous meanings, depending on where you put a comma to delineate logically separate clauses. (Languages with greater riches of declensions and tenses and numbers and other forms are more flexible in word order, but they do retain some degree of poetic license. It's not all meter and rhyme scheme however.)

The basic idea behind all of these ancient inventions is that "Communicating is difficult enough without verbal and body language cues. Making different things look different helps."

To read source code, you have to be able to identify nouns and verbs. You have to be able to group related items and ideas while not grouping unrelated ideas. You need to be able to identify separate expressions as well as idioms.

One reason assembly language can be difficult to read is that its regularity (op arg1, arg2 or op arg1, arg2, arg3) precludes skimmability. That may sound odd (if you're reading code, why do you need to skim it?), but it's important. Programming encompasses so many small details that you must understand the code in the small in the context of the local component as a part of the system as a whole.

Uniformity of syntax means that you have to rely on cues external to the source code or patterns of repeated details within the source code to indicate structure.

I have the same problem reading Lisp code, with its homoiconicity; the shape of the code gives me few cues as to what's different between sections of code. As well, Python's use of vertical whitespace to end blocks means that my eyes slip off of the end of logical blocks and I can't tell what happens where.

A lot of that is familiarity and personal preference (or quirks of the way my brain works). Some of that is the effect of deliberate design decisions.

If you embrace the idea, like Perl does, that different things should look different, you reach some interesting conclusions. I don't think you can learn Perl effectively without understanding those conclusions, at least at an intuitive level. I'll write about that next time.

Polemic: anyone who believes that any specific general purpose programming language is inherently unmaintainable has opinions on software development worth ignoring.

Many people claim that the design of Perl 5 has flaws so significant that they make it far too difficult to write and maintain useful programs. Many of the supporting arguments are syntactic preferences. "I don't like sigils!" "Context make no senses to my!" "Real men don't need your sissy curly braces to accompany our manly indentation!" "Isn't bless a little bit cutesy for our Serious Enterprise Business Application?"

Other arguments... well, you've heard them.

Perl 5 has some design flaws, but I believe that syntax is such a small part of maintainability that only the most facile discussions focus on syntax to the exclusion of more important concerns. The next time you have trouble maintaining a Perl 5 program, ask yourself:

  • Have I learned the language by reading documentation and working through tutorials, or am I fiddling with changing things by trial and error and guesswork and intuition based on experience in other languages?
  • Do I know how to use perldoc to look up builtins and language features?
  • Have I skimmed the Perl FAQ included in every Perl 5 distribution?
  • Have I used Perl::Tidy to unify the formatting into a consistent style?
  • Do I know the difference between void, scalar, and list context? Can I identify them?
  • Do I know how to use B::Deparse to explain the evaluation plan of complex constructs?
  • Does this program have a set of automated tests I can trust?
  • Did the original programmer understand the problem domain? Do I?
  • Did the original programmer "borrow" this code from elsewhere, change a few lines, and add a modified copyright statement?
  • Did this program grow from a throwaway idea into a critical business component without planning, design, or refactoring?
  • Is the original author available to answer questions, whether in person or through some sort of design notes?
  • Is the program well-factored?
  • Does the program include appropriate documentation for its purpose, its major systems, its APIs, and any surprising design decisions?
  • Do I have a clear understanding of what the program does and why?
  • Does the program have a modular design, with well-enforced encapsulation boundaries between components?
  • Can I configure and build the program on my local system?
  • Can I deploy it?
  • Does the code show examples of idiomatic programming from authors fluent in the language, or is it a pastiche of styles cribbed from documentation and witch-doctor experimentation?
  • Did the original author know how to program in any language?
  • Did the original author take advantage of obvious strengths of the host language in appropriate ways (or did he distrust arrays and continually write to and read from a temporary file instead—I have seen this with my own eyes, and the host language was not Perl)?
  • Does the program take advantage of well-known and trustworthy external libraries?
  • Does the build process spew compiler errors and warnings? Does the program spew warnings and errors when deployed?
  • Does the program contain obvious repetition and near repetition?
  • Would you be proud of writing the program in six months?

Note how few of these concerns have anything to do with Perl—and, of those that do, trivial rewording would make them appropriate for other languages.
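For the B::Deparse item in that list, a minimal sketch of calling it from within a program follows. The subroutine being deparsed is a made-up example; the -p option asks B::Deparse to add parentheses showing how perl grouped the expression. The exact text it prints can vary between Perl versions.

```perl
use strict;
use warnings;
use B::Deparse;

# -p adds parentheses that make operator precedence explicit.
my $deparse = B::Deparse->new('-p');

# Made-up code to inspect: does the addition or the multiplication bind tighter?
my $code = sub {
    my ( $x, $y, $z ) = @_;
    return $x + $y * $z;
};

print $deparse->coderef2text($code), "\n";
```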

In Essential Skills for Perl 5 Programmers I mentioned that no one can be an adept Perl programmer without understanding context. This trips up many, many people -- and you often hear (unfair) criticisms of Perl 5 based on misunderstandings and guesses about how context works.

Context is reasonably easy to explain. (The previous sentence is grammatically correct.) Contexts is not difficult to understands. (The previous sentence is grammatically incorrect, even if you speak the Queen's English.)

If you can find the errors in the previous paragraph, you can understand quantity context in Perl 5: like subject-verb agreement in terms of number, expressions in Perl 5 can behave differently in contexts that imply zero, one, or more results.

fetch_something_awesome();              # void   context
my $item  = fetch_something_awesome();  # scalar context
my @items = fetch_something_awesome();  # list   context

Context gets a little bit trickier when you need to coerce what would normally be one context into another:

my ($item) =        fetch_something_awesome(); # list   context
push @items, scalar fetch_something_awesome(); # scalar context

If you know the visual cues (if you don't randomly sprinkle punctuation about your program until it works), those are easy to understand as well.
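To watch the rules in action, here is a sketch in which fetch_something_awesome() is a hypothetical implementation that does nothing but report the context of its caller, using the wantarray builtin:

```perl
use strict;
use warnings;

# Hypothetical implementation that reports the context of its caller.
sub fetch_something_awesome {
    return 'list'   if wantarray;
    return 'scalar' if defined wantarray;
    return;    # void context: nobody looks at the result
}

my $item   = fetch_something_awesome();    # scalar assignment
my @items  = fetch_something_awesome();    # array assignment
my ($only) = fetch_something_awesome();    # parentheses force list context
push @items, scalar fetch_something_awesome();

print "$item | $only | @items\n";    # prints "scalar | list | list scalar"
```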

The subtlety comes when dealing with complex contexts, usually with nested expressions:

# list context, thanks to say
say reverse $name;

my %values =
(
    # list context, thanks to hash assignment
    name => get_name(),
    rank => get_rank(),
);

# list context (param flattening)
$screen->flip( $fleet->get_spaceships() );

This is often where more fair criticisms of Perl 5 suggest that context may not be worth it, because you have to understand what a line of code means and what it implies to read it correctly.
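Two of those cases are easy to demonstrate. In this sketch, get_rank() is a hypothetical accessor that returns an empty list when it has nothing to report; the arrays stand in for the list that a hash assignment would receive.

```perl
use strict;
use warnings;

my $name = 'monkey';

# List context: reverse sees a one-element list and returns it unchanged.
my $in_list = join '', reverse $name;

# Scalar context: reverse reverses the characters of the string.
my $in_scalar = scalar reverse $name;

# Hypothetical accessor with nothing to report.
sub get_rank { return }

# In list context the empty return value vanishes from the list, so the
# key 'rank' is left without a value; scalar() forces a single undef.
my @pairs = ( name => 'Bob', rank => get_rank() );
my @fixed = ( name => 'Bob', rank => scalar get_rank() );

print "$in_list $in_scalar\n";                       # prints "monkey yeknom"
print scalar(@pairs), ' ', scalar(@fixed), "\n";     # prints "3 4"
```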

There's a fair point there, but it's also silly in some ways. Skimming code which calls other functions may give you some idea of what those functions do, but you rely only on the names of those functions and not their documentation to tell you any other details. Do they modify global or thread-local variables? Do they have caching or performance characteristics? Do they block? Do they require special initialization or error handling? Do they return special values?

The valid point is that chaining multiple expressions into complex compound expressions can have interesting effects. (I see this in Haskell code often; invisible partial application means that I personally can't skim Haskell code without tracking down the arity of functions to figure out what happens where.)

That's no argument against language features. It's an argument against making expressions more complex than necessary. Note that the same argument applies against complex prefix-unless expressions. unless can be amazingly useful when used properly. If you abuse it, you make amazing problems. Don't make problems.

About this Archive

This page is an archive of entries from February 2010 listed from newest to oldest.