"Developer versus Distributor" is Not Even Wrong

To start: the title is a false dichotomy. I develop some code and I distribute other code. I happily let Debian and Ubuntu package software such as Vim and X.org for me, not to mention Kmail and glibc, but I also happily use CPAN.pm to manage Perl 5 modules.

A rather silly debate stirs in the Perl 5 community now and then. Someone claims "Your project has too many dependencies!" One response is "Don't reinvent the wheel."

If I could strike words and phrases from polite debate, "bloat" and "reinvent the wheel" would disappear shortly after "utilize" and the verb "to task".

You May or May Not Like Forest Creatures

This time, the debate is about Moose startup time -- in particular, whether the benefits of using Moose for Padre outweigh the disadvantages.

The advantages are:

More declarative class declarations (especially through MooseX::Declare)
Better object and class flexibility than Perl 5 provides by default
Access to a wide range of Moose design patterns and plugins
Generally less code to maintain to achieve the same feature set

The disadvantages are:

Moose and Class::MOP add a few (12?) extra dependencies to a modern version of Perl 5
Creating Moose objects increases startup time and memory usage by a measurable amount. (Note that I wrote "measurable", not "large". You can measure this amount; whether it is trivial or significant depends on your problem domain.)
Rewriting existing, working code may not prove beneficial at this point in the project.

Keep that in mind for a moment.
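
The startup cost in that list is easy to measure for yourself. What follows is only a rough sketch, not a benchmark from the Padre or Moose projects: load_time is a hypothetical helper, and Data::Dumper stands in for whatever module you care about (substitute Moose if you have it installed). It loads the module in a fresh perl process so that modules already in memory don't skew the result, and it says nothing about memory usage.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: wall-clock seconds needed to start a fresh perl
# (the same binary running this script) and load one module.
sub load_time {
    my ($module) = @_;
    my $start = time;
    system($^X, "-M$module", '-e', '1') == 0
        or die "could not load $module\n";
    return time - $start;
}

# Loading only 'strict' approximates the cost of starting perl itself.
my $baseline = load_time('strict');

# Assumption: Data::Dumper is just a stand-in that every perl has.
for my $module (qw(Data::Dumper)) {
    my $elapsed = load_time($module);
    printf "%-14s %.3fs (bare perl: %.3fs)\n", $module, $elapsed, $baseline;
}
```

Run it several times before trusting the numbers; a single run mostly measures your disk cache.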

Truth in Advertising Distributed Software

LWN covered a debate on p5p and Fedora mailing lists about the Red Hat practice of distributing Perl 5. If you install the perl package, you don't get what most people reading this would consider "Perl". In particular, you can't use CPAN.pm because it's not installed.

If you want to use CPAN.pm, you have to install the perl-core package. Some might say that the perl installed from the perl package is broken. Certainly it doesn't do what I might expect.

One difficulty that distributors such as Red Hat, Sun (with Solaris), FreeBSD, and Apple (with Mac OS X) discover is that users' uses of Perl 5 vary. A couple of megabytes of perl and a few core libraries may suffice to run basic system administration programs necessary to the installation and ongoing maintenance of a system, but I want all of the Perl documentation, the Unicode tables, and even the shared library for Scalar::Util installed correctly before I consider that the preinstalled Perl 5 is usable and complete.
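
One quick way to see what a vendor's perl actually includes is to ask it. This is a minimal sketch: module_available is a hypothetical helper, and the checklist is my own guess at a sanity check based on the modules named above. It checks only that modules load, not that the documentation or Unicode tables are present.

```perl
use strict;
use warnings;

# Hypothetical helper: true if this perl can load the named module.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Scalar::Util -> Scalar/Util.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Assumed checklist: the CPAN client, an XS module, and the perldoc backend.
for my $module (qw(CPAN Scalar::Util Pod::Perldoc)) {
    printf "%-14s %s\n", $module, module_available($module) ? 'ok' : 'MISSING';
}
```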

I can understand Red Hat's choice; mostly I consider their nomenclature buggy.

In some ways, that poor taste grates more than a mistaken technical decision. I have no love lost for the Perl 5 distribution YAML::Tiny, written deliberately not to parse YAML, yet attached to the name like some alien parasite determined to suck the precious bodily fluids from its host.

Yet I also understand why the ::Tiny distributions exist: to do a job quickly, using as few resources (runtime and dependency-wise) as possible, solving 80% of the problems without fuss. That's good for developers and good for distributors, at least to a point.

The Debate is Not Even Wrong

Sometimes I hate long dependency chains, usually when I have to chase them down during a long installation.

Sometimes I'm happy to reuse code, like a garbage collector or Unicode library or binary-coded decimal or date and time handling I don't have to code or debug or even understand myself.

Sometimes I'm even happy to remove a dependency if it means that more people can use software to which I've contributed, or if it makes the software easier to maintain or easier to install or faster or simpler.

The difficulty is that we value different criteria differently at different times for different tasks. I don't mind if Padre takes two seconds to start, if I use it for two hours a day and it doubles my productivity. Contrarily, I'd like Callgrind to run faster, but it's valuable enough as it is (and some of the software I profile has its own flaws) that I don't mind the speed hit it takes for the crazy job it does.

The problem is that the "Your software is bloated!" and "You reinvent the wheel, badly, and you lie about its name!" debate is also a false dilemma. We have a wealth of other options to attempt to make people happier.

Imagine if you could get a single bundle of all dependencies for any CPAN distribution. Obviously there are complications: can you compile XS code, do you have alien library dependencies, are the licenses compatible? Yet improving the distribution of code -- especially with regard to dependency graph version compatibilities and test reports -- could help.
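
The core of such a bundle is a walk over the dependency graph. The sketch below assumes a hypothetical %requires table with made-up distribution names; real data would have to come from each distribution's metadata. It ignores versions, XS compilation, alien libraries, and licenses, which are exactly the hard parts.

```perl
use strict;
use warnings;

# Hypothetical data: distribution => its direct prerequisites.
my %requires = (
    'App'        => [ 'Framework', 'Util' ],
    'Framework'  => [ 'MetaObject', 'Util' ],
    'MetaObject' => [ 'Util' ],
    'Util'       => [],
);

# Return every distribution needed to install $dist, prerequisites first.
# %$seen keeps shared prerequisites from appearing twice and keeps a
# circular dependency from recursing forever.
sub bundle_for {
    my ($dist, $deps, $seen, $order) = @_;
    $seen  ||= {};
    $order ||= [];
    return $order if $seen->{$dist}++;
    bundle_for($_, $deps, $seen, $order) for @{ $deps->{$dist} || [] };
    push @$order, $dist;
    return $order;
}

print join(' ', @{ bundle_for('App', \%requires) }), "\n";
# prints: Util MetaObject Framework App
```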

Imagine if Perl 5 borrowed just enough of Moose's declarative class/attribute syntax to make the easy things easy, remove 75% of the boilerplate, and leave Moose and Class::MOP for the other difficult things where it's obvious that you need that full power.

I wouldn't call it Moose::Tiny, but I suspect that a handful of features (declarative class, attribute, and method declarations, auto constructors and accessors) could banish blessed references in new code at almost no startup time cost.
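
To give a sense of how little code the "auto constructors and accessors" part needs, here is a minimal sketch in plain Perl 5. Tiny::Class::Sketch is a hypothetical name, not a CPAN module and not Moose's API. It still uses a blessed hash underneath; the point is only that the class author never has to write one by hand.

```perl
use strict;
use warnings;

package Tiny::Class::Sketch;    # hypothetical, for illustration only

# Install a constructor and one read/write accessor per attribute
# into whichever package calls import().
sub import {
    my ($class, @attributes) = @_;
    my $target = caller;

    no strict 'refs';
    *{"${target}::new"} = sub {
        my ($invocant, %args) = @_;
        my %self = map { $_ => $args{$_} } grep { exists $args{$_} } @attributes;
        return bless \%self, ref($invocant) || $invocant;
    };

    for my $name (@attributes) {
        *{"${target}::$name"} = sub {
            my $self = shift;
            $self->{$name} = shift if @_;
            return $self->{$name};
        };
    }
}

package Rect;
# Stands in for "use Tiny::Class::Sketch qw(width height);" in a real file.
BEGIN { Tiny::Class::Sketch->import(qw(width height)) }

package main;

my $rect = Rect->new(width => 3, height => 4);
$rect->width(10);
printf "%d\n", $rect->width() * $rect->height();    # prints 40
```

There are no types, defaults, roles, or method modifiers here. That is the other difficult territory where Moose and Class::MOP would still earn their keep.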

Then again, imagine if Perl 5 could see an order of magnitude improvement in performance. Could that render many of these discussions moot? Certainly these are design goals for Perl 6.

3 Comments

http://oid.fox.geek.nz/kent.fredric | September 3, 2009 10:44 PM

I have no opposition, from the distribution standpoint, to Padre taking on extra deps. On Gentoo, Padre is kept reasonably well up to date in the perl-overlay¹ and MooseX::Declare is already accessible in the same repository²; Moose and Class-MOP are already in the main tree.

It's not much work to get a given distribution into the overlay; asking pleasantly or being willing to produce a working ebuild will probably get you far. If you can hang around and help maintain the other dists, that's appreciated too (not hard, all you need to understand is basic Bash scripting).

These days I've been seeing turn-around times from a new release on CPAN to a matching build in the overlay of as little as 12 minutes :) (that is, it's not even listed on the CPAN site itself yet, and it's in the overlay ;)

¹ git.overlays.gentoo.org....perl-overlay...app-editors/padre

² git.overlays.gentoo.org....perl-overlay...dev-perl/MooseX-Declare

zby | September 4, 2009 1:39 AM

The problem here is rather complex; to have a rational debate we need to split it up. There are at least the following sub-threads:

How a CPAN module that has other modules depending on it can change. This is very much like the API deprecation thing, just a little bit broader: there are other characteristics of a module that have an impact on up-stream code. One of them is efficiency. The difference from API deprecation is that sometimes it is impossible to keep compatibility with even one version back. We need a debate here. Maybe an explicit declaration in the module docs about the policies for changes would solve the problem?
How we can install big dependency trees without hassle. One solution can be the bundling you mention here. After my recent experiences I would propose another one: let's start declaring external (alien) dependencies (in Makefile.PL?). This is the first step; then we could patch CPAN(PLUS) to scan the dependency tree and inform the user upfront about the missing external deps.
Efficiency of Moose.

jnareb.openid.pl | September 4, 2009 3:06 AM

Isn't Moose::Tiny called Mouse? (There is also Any::Moose.)