How to Change a Running System

I write a lot about Perl 5 and its ecosystem, but I spend most of my hacking time helping with Rakudo and contributing to Parrot.

I realize this leaves me open to charges of hypocrisy (though what a tired claim; the only modern sin for which you cannot blame your upbringing, your parents, your diet, society, poverty, or your circumstances is when someone believes your words do not match your actions to a surgical degree) -- if I know what Perl 5 needs, why don't I donate my precious time to make it happen?

That's a good question, but I want to talk about something else first.

"Just Write it in C!" is a Terrible Long-Term Strategy

One of my long-term goals in Parrot is to remove as much C code as possible from the project to make it run faster.

This seems counterintuitive to people used to dynamic languages such as Perl, Python, and Ruby. "If it's too slow, you can always drop down to C!", or so the claim goes. Sometimes that's even true.

TraceMonkey shows that it's not always true (at least if "fast" is enough and "really fast" is better). Dynamo, HotSpot, Strongtalk, Forth, and Smalltalk are also good sources of inspiration and information -- especially the last two.

C's overhead for bit-twiddling, character-by-character banging, and dispatch in tight loops is minimal. If you can coerce your algorithm into C code that performs a lot of work before returning, smart use of C can make your code faster.

Of course, if your language has a smart compiler, you've likely thrown away a lot of chances for optimization: in particular, escape analysis and inlining (specialized or not) are difficult, if not impossible. You also have to pay the penalty of converting between your language's calling conventions and C's calling conventions. (You have to run a lot of C code very fast to make this worthwhile sometimes.)

Parrot crosses this C boundary often.

Sometimes it's okay to write code in PIR (Parrot's native language) that operates a little bit more slowly than the corresponding C code because calling the PIR code is faster than calling C code from PIR code (or worse, calling back into PIR code from C code called from PIR code, and so on).
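A back-of-the-envelope model makes that trade-off concrete. Every number below is invented for illustration (arbitrary time units, not measurements of Parrot), and the function names are hypothetical. The sketch only shows the shape of the argument: a fixed cost per boundary crossing can outweigh a faster per-item cost when each call does little work.

```python
def total_cost(calls, items_per_call, per_item_cost, boundary_cost):
    """Total time for `calls` calls, each processing `items_per_call` items."""
    return calls * (boundary_cost + items_per_call * per_item_cost)

# Hypothetical costs in arbitrary time units (assumptions, not benchmarks).
PIR_PER_ITEM = 5      # PIR does each unit of work more slowly...
C_PER_ITEM = 1        # ...than C does,
PIR_TO_PIR_CALL = 2   # but calling PIR from PIR is cheap,
PIR_TO_C_CALL = 50    # while calling C means converting calling conventions.

def cheaper_in_pir(items_per_call, calls=1000):
    """True when the all-PIR version beats the version that calls into C."""
    pir = total_cost(calls, items_per_call, PIR_PER_ITEM, PIR_TO_PIR_CALL)
    c = total_cost(calls, items_per_call, C_PER_ITEM, PIR_TO_C_CALL)
    return pir < c

small_work = cheaper_in_pir(3)    # little work per call
large_work = cheaper_in_pir(100)  # lots of work per call
```

With these made-up numbers the break-even point is twelve items per call: below that, staying in PIR wins, and above it the trip into C pays for itself.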

My goal of removing as much C code as possible from Parrot -- writing Parrot in itself -- is a Parrot 3.0 goal. Parrot 1.3 comes out next week.

I believe we can achieve all of these goals without stopping the world, scrapping large parts of Parrot, and leaving piles of steaming debris in our wake.

Rebuilding a Charging Locomotive on the Go

How's that?

One of the not-so-hidden themes of all of my writings is gradual and inexorable progress in software. I believe that all Parrot contributors help to improve Parrot with every commit. (Some commits are regressions and some need further commits, but we're heading in the right direction and fixing more bugs than we create.)

A lot of projects can say that.

One of the poorly hidden themes in my writings is that backwards compatibility, like a dairy product, has an expiration date beyond which your fridge -- and your project -- gets scary.

I've written some great code for Parrot and I've maintained some awful code. I believe the awful code is less awful for my work, but one of the greatest joys of software development for me is deleting bad code because it's unnecessary.

The strategy we've discussed for removing as much C code from Parrot as possible has several phases.

Our built-in data structures (PMCs) -- hashes, arrays, subroutines, et cetera -- live in special files written in a mixture of an unnamed language and C. A series of Perl libraries parses this code and generates a (much longer) plain C file for compilation and linking into Parrot itself.

Step one of the plan is to write a compiler using Parrot's compiler tools (PCT) that understands this unnamed language and can emit C code equivalent to what the current parser/generator produces. This step is underway.

Step two of the plan is to develop a new language (or extend the unnamed language) that the compiler from the first step can transliterate to C code equivalent to what the current parser/generator produces. This means that the new language can represent the same operations that the C code in the current language uses.

Step three of the plan is to replace the current Perl 5 parser/generator with the PCT-based compiler.

Step four is to add an emitter to the PCT-based compiler that produces opcodes Parrot understands and that perform the same functions the existing C code performs.

Step five is to remove the C emitter and run everything through the native Parrot instructions.

Although this process has several dramatic changes, only one is visible to users: the switch from writing PMCs in the current mixture of semi-parsed C to writing PMCs in a PCT-hosted language. Though we can deprecate the C version at step two, we only have to remove it at step five.
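The shape of that plan, one front end with interchangeable emitters, can be sketched in a few lines. This is a toy under loud assumptions: the one-line "name -> field" mini-language, the Method node, the generated C signature, and the opcode names are all made up for illustration. None of it is PCT's API, real PMC syntax, or real Parrot opcodes.

```python
from dataclasses import dataclass

@dataclass
class Method:
    """One parsed method of a hypothetical data-structure definition."""
    name: str
    returns_field: str  # the field whose value the method returns

def parse(source):
    """Toy front end: each line 'name -> field' becomes a Method node."""
    methods = []
    for line in source.strip().splitlines():
        name, field = (part.strip() for part in line.split("->"))
        methods.append(Method(name, field))
    return methods

def emit_c(methods):
    """Steps one through three: emit C text from the tree."""
    return "\n".join(
        f"INTVAL {m.name}(Obj *self) {{ return self->{m.returns_field}; }}"
        for m in methods
    )

def emit_ops(methods):
    """Step four: emit (made-up) opcodes from the same tree instead of C."""
    ops = []
    for m in methods:
        ops.append((".sub", m.name))
        ops.append(("getfield", m.returns_field))
        ops.append(("return",))
    return ops

tree = parse("elements -> size\ncapacity -> allocated")
c_text = emit_c(tree)     # what steps one through three would generate
op_list = emit_ops(tree)  # what step four adds; step five drops emit_c
```

Because both emitters consume the same tree, the source files need not change when the back end does, which is why only the switch of authoring language is visible to users.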

I believe that this is possible both technically and socially primarily because of the project's organization:

We have a defined interface with which to write code which interacts with Parrot. (Admittedly we haven't formalized this interface yet, but that's in progress.)
We have a documented support policy which notifies users as to our deprecation schedules and defines our backwards compatibility plans.
We have a regular release schedule which, when combined with our documented deprecation and backwards compatibility policy, allows us to refactor and modify Parrot on a predictable schedule at a pace which allows for frequent improvements.

I'm sure some people could argue -- some of them even successfully -- that these types of changes after a project has reached 1.0 would be unnecessary if we'd only tried to plan the project in more detail years ago. Perhaps they're even right. Yet when was the last project you know of that both tried to do something new and successfully predicted the future such that it needed no architectural changes?

I don't mind rewriting code or even writing code I'll throw away in six months if it progresses toward something better that's faster, cleaner, easier to understand, easier to maintain, simpler, better designed, and/or more featureful.

Sources of my Unmotivation to Contribute to Perl 5

... which brings me to the long-promised explanation of why I have so much trouble contributing to Perl 5.

When I posted the final version of the patch to add class { ... } to Perl 5, I had just finished wrestling with the Perl 5 parser to work around the Perl 4 apostrophe package separator superseded by double-colons and recommended against in Perl 5. You may recall the release of Perl 5 in 1994.

I wrote that patch to Perl 5 because I felt guilty about complaining about Perl 5's flaws (despite my contributions to Perl 5, such as they are) without at least attempting to address them.

I had little hope the patch would be accepted, but I was open to the possibility that someone might say "You know, it doesn't break backwards compatibility, it does make boilerplate code easier to read, to write, and to explain to novice programmers, and it is compatible with Moose, MooseX::Declare, and Perl 6."

Maybe it'll be a part of Perl 5.12, if Perl 5.12 ever comes out. Maybe someone else'll pick it up and maintain it and argue for it and get it committed to the core.

Meanwhile I can get the urge to fix a memory leak in Parrot or Rakudo, fire up Valgrind, and commit the fix today. People can use it from the stable monthly release next Tuesday (for Parrot) and next Thursday (for Rakudo), or from the packaged release included in free software distributions in July, and everyone's life is a little bit better in days or weeks. The same goes for improving performance or fixing another bug or adding a feature.

If it's a big feature or an architecture change or something otherwise disruptive I still have to discuss it with other developers and figure out how it fits with our backwards compatibility and deprecation policies, but if we decide it's worthwhile, it won't languish for years between releases. The lag between producing something wonderful and other people getting to enjoy it is months at most.

Perhaps people critical of my views are right in that it's inappropriate to compare Parrot to Perl 5. Perl 5 certainly has more users. Perl 5 has had more contributors. Several orders of magnitude more people and businesses rely on Perl 5 as it exists now.

Yet I wonder if they're perfectly happy with Perl 5 as it is -- or do they secretly wish that 5.10's argument passing performance problem found and fixed in source control three weeks after the release of 5.10 were released in a form they could use, or that Perl 5 had working subroutine signatures, or that Perl 5 had declarative classes, or that strict and warnings weren't optional for all new code?

The Perl 5 committers are right in that the best way to ensure that there's a Perl 5.10.1 sometime this year is to fix bugs, to write documentation, and to perform other thankless maintenance tasks. I hope to write about how to do that and I may do some of it myself. (Goodness knows I've done plenty of that for Parrot.) That's noble work and it deserves far better praise and far more respect than it gets. Even though I criticize Perl 5's project management, I still believe that maintainers deserve much credit and appreciation for their work.

Still, the long-term viability of the project concerns me. If Perl 5 has a volunteer time and effort problem (and it's clear that it does), I wonder how many other potential contributors have declined to participate for reasons similar to mine. (After several people have told me in person that they agree with most of what I've written but don't want to say anything in public, I wonder how many other ideas for process improvements the community has lost.)

I take two lessons from this. First, sometimes it's easier to rebuild a train barrelling down the tracks than a train stopped in a weedy cow pasture. I wouldn't have believed that either before now.

Second, any project management strategy which relies on a sense of guilt to recruit new and prodigal developers has problems no technical mechanism can solve.

6 Comments

autarch.myopenid.com | June 11, 2009 8:24 PM

FWIW, I agree with much of what you've written, and I'm willing to say it publicly.

Of course, I don't think that's worth much. My contributions to the Perl 5 core itself are pretty minimal.

iamjaph | June 12, 2009 12:53 AM

One question.
First PIR was translated into PASM, then into PBC.
Now PIR is translated directly into PBC.
Will PASM be removed from Parrot entirely in the future?

Ovid | June 12, 2009 7:34 AM

(The following rambles a bit. My apologies.)

I have nothing but the utmost respect for Rafael and Jonathan's positions on the class keyword debate, but I have to agree with you on this.

Rafael is in the terribly unenviable position that if a release of Perl breaks a lot of people's code in hard-to-untangle ways, he's going to be remembered as "the guy who broke Perl" and having an incautious pumpking would be far, far worse than having an overly cautious one. I don't think, however, that we should let fear hold us back. Perl 5 needs to be able to take risks to help defeat a lot of the FUD going around. And to be fair, not all of that FUD is misplaced.

Jonathan, on the other hand, seems to like the status quo because it allows top-notch people like him to do amazing things. LISP does this too and look how widespread its user-base is. Given that you want to make Perl 5 more modern and easier to read and Jonathan wants to maintain flexibility, I'm hard-pressed to see why these two goals can't both be met. They're hardly at odds with one another.

At the end of the day, core Perl provides a set of ingredients rather than a set of box dinners. Frankly, I'm a good cook, but I want to shop somewhere which has both. The CPAN is fantastic, but there are too many obstacles to saying that this is where you find all of your box dinners (you should see our list of locally patched CPAN modules at some point). Some of those box dinners need to go into the core to help turn Perl 5 into a modern language.

One of the issues we're dealing with at the BBC is trying to bring many of our projects onto a common platform. Java, for example, had some pretty standard technologies for making this happen but it seemed every Perl project insisted that TIMTOWTDI is a requirement, not a suggestion.

There is some truth to this at times, but finding out that everybody has a different object-system to work around Perl's low-level system is a bit of a nightmare. And frankly, I get the willies diving into some new code bases, even if they're written by competent programmers, because Perl itself is too low-level and with so many companies getting silly and saying "no CPAN", it's a nightmare. Heck, just having standard OO (MooseX::Declare?), logging (Log::Log4perl?) and exception handling (Exception::Class?) in the core would make some of our (BBC) internal issues go away.

I also want to upgrade our Perl to 5.10, but the list unpacking performance regression means that I can't, in good conscience, recommend 5.10. There are soooo many good features there that I definitely want to just slap use Modern::Perl at the top of my code, but I won't feel comfortable until the regression is fixed.

Perl 5 is releasing too slowly. Perl 5 is not aggressive enough in its feature development. Too many experienced programmers with little appreciation of the needs of newer programmers are acting as if those needs are not significant. But at the end of the day, those syntactic sops to the newer programmers will help me as well.

You keep writing, I'll keep reading. We don't always agree on the details, but I'm with you on the big picture.

mst | June 12, 2009 12:49 PM

Actually, quite a few of us were in support of your class keyword proposal until the point at which you made it not compatible with MooseX::Declare by lifting the operation to BEGIN time - for consistency, MooseX::Declare doesn't do this so it can provide 'class $foo {' and anon-class construction capabilities.

In fact, I even mention this in the thread you link to above - an email that went unreplied to, so far as my memory and that web interface show.

If the compatibility problems had been fixed I'd've happily argued for its inclusion in blead, but in the absence of that I didn't feel like helping perl 5.12 have two subtly different class keywords depending on if you were using a CPAN module or a built in feature. Didn't smell like part of the solution to me, on the whole.

chromatic | June 12, 2009 1:01 PM

@mst, I don't remember that discussion about BEGIN (and I believe we corresponded after that and I discovered I'd never received that message from you). I certainly regard compatibility with MooseX::Declare as more important than the BEGIN feature (which wasn't even my idea at the start).

I intend neither to maintain the patch nor to plead with p5p for its inclusion, but that particular feature should be easy to remove from the patch. In particular, the code in perly.y that creates a newATTRSUB and ends that block needs to change.

chromatic replied to comment from iamjaph | June 16, 2009 1:54 PM

@iamjaph, there was never really a translation from PIR to PASM. PASM won't go away -- it's a textual representation of PBC -- but it's a lot more hassle to write than PIR or anything in NQP.