Avoiding The Vendor Perl Fad Diet


Here we go again.

It looks like Red Hat is distributing Perl without the core library ExtUtils::MakeMaker. If you're not familiar with the details of the Perl 5 build chain, all you need to know is this: without MakeMaker, you're not installing anything from the CPAN.
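If you want to know whether a given perl can build CPAN distributions, ask it to load the module. This is a minimal sketch: the helper name `has_perl_module` is made up for illustration, and the path `/usr/bin/perl` assumes the vendor perl lives where it usually does.

```shell
#!/bin/sh
# has_perl_module PERL MODULE  (hypothetical helper name)
# Exit status is 0 if PERL can load MODULE, non-zero otherwise
# (including when PERL itself does not exist).
has_perl_module() {
    "$1" -M"$2" -e 1 >/dev/null 2>&1
}

# Assumption: the vendor perl is installed at /usr/bin/perl.
if has_perl_module /usr/bin/perl ExtUtils::MakeMaker; then
    echo "ExtUtils::MakeMaker is available"
else
    echo "ExtUtils::MakeMaker is missing (or there is no /usr/bin/perl)"
fi
```

On Red Hat-family systems the missing piece is typically shipped as a separate package named perl-ExtUtils-MakeMaker; check your distribution's package list rather than taking that name on faith.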

Ostensibly Red Hat and other OS distribution vendors split up Perl 5 into separate packages to save room on installation media. Core Perl 5 is large and includes many, many things that not everyone uses all the time... but the obvious reaction to defining a core subset of Perl 5 that a vendor can call "perl" is another of those recurring discussions which never quite goes anywhere.

For example, who needs the documentation just to run code? (Except that the diagnostics pragma relies on the existence of perldiag.pod to run.) Who needs the huge Unicode encoding tables for ideographic languages such as you might find in Japan, China, Korea, and other Asian locales? (Answer: Asia.) Who needs the ability to install code from the CPAN? (Answer: users.)

While there's a lot of stuff in the core that probably doesn't need to be in the core, or at least installed by default (a LaTeX formatter for POD, the deprecated Switch module, Perl 5.005 Thread emulation), one thing is both clear and almost never said.

I'll give you a moment to think about it.

Here's a hint: you're usually better off compiling and installing your own Perl 5 under your complete control, so that you can compile in the options you want (64-bit integers, for example), leave out the ones you don't (threading imposes a 15% performance penalty even in the single-threaded case), and manage your own library paths without changing the behavior of the system. perlbrew changes the game. Learn it, like it, love it.
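A rough sketch of what that looks like with perlbrew, assuming perlbrew is already installed and on your PATH. The version number is only an example, and the `run` wrapper, `build_own_perl`, and the `DRY_RUN` variable are illustrative names, not part of perlbrew. By default the script only prints the commands so you can review them first.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Example version only; choose one that `perlbrew available` lists.
PERL_VERSION=${PERL_VERSION:-perl-5.14.2}

build_own_perl() {
    # --64int asks for 64-bit integers; threads stay off unless you pass --thread.
    run perlbrew install "$PERL_VERSION" --64int
    # Ask the freshly built perl how it was configured.
    run perlbrew exec --with "$PERL_VERSION" perl -V:use64bitint -V:usethreads
}

build_own_perl
```

Everything lands under your own perlbrew root, so /usr/bin/perl and its modules stay exactly as the vendor left them. In an interactive shell with perlbrew's shell integration loaded, `perlbrew switch` should then make the new perl your default.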

The perpetual discussion misses one important point:

The vendor perl—especially on installation media—is not for general purpose Perl programming. It's there only to support basic administrative programs provided with the system as a whole. That's why you don't replace the system Perl. That's why you don't mess with the system CPAN modules. That's why you fence off whatever's in /usr/bin/perl like it's Yucca Mountain and you're stuck with a '50s reactor design instead of something safe and clean.

Vendors can tune and tweak that Perl to their satisfaction to provide just what they need to install and configure a working system. They can keep it as crufty and out of date as they like. When it breaks, they get to keep all of the pieces and sew them back together like some sort of Fedorastein's monster. They just can't let it out of the lab.

This of course means that they need to provide packages of Perl 5 Actual for users and developers such that it's the full core of Perl 5. (It'd be nice if they called not-a-perl as such, but one thing at a time.)

You can't predict what users will and won't do. That's why you code defensively. The moment distributions started carving up Perl to install just the little bits they needed in the hopes that their guesses as to what users wanted were right, they put everyone in a bind.

Certainly Perl 5 could benefit from a thorough review of what's in core and why, but I suspect that even if p5p came up with packaging guidelines for all of the imaginable use cases and combinations of distributor needs and user wants, it still wouldn't solve the real problem.

(Credit Allison Randal for pointing out the real problem years ago. We've discussed several times the idea of a stripped-down VM for a real language—something with better abstraction and reuse than Bash—with easy access to libraries and a very small footprint, but it's a bigger job than either of us could accomplish. It's still a righter approach than bowdlerizing an upstream distribution.)

12 Comments

So what about Python, PHP and Ruby? Do developers in those languages also need to install from source or can they use the one that was supplied by the vendor?

What if you wanted to use mod_perl? Do you need to also compile Apache from source? What else do you need to compile from source?
Do you then need to keep an eye on new releases of all these libraries and applications to apply security patches?

I don't really have a problem with vendors carving up Perl into separate packages as long as *everything* goes into one of the parts and ideally there should be an easy way to install all of those packages.

I've been developing and deploying Perl code on Debian for years and have had no need for perlbrew. I'm not saying there's no place for it, merely that I have found the Debian Perl ecosystem to be 'good enough' that I haven't needed it. Installing CPAN modules via .debs certainly makes for easy deployments.

szabgab: In my experience Ruby is a nightmare to deploy and most people use RVM (perlbrew for Ruby) and bundle all the versioned module dependencies into the app. In the PHP world it is common for vendor packages to lack features or fixes that would be available to someone building from source - however the typical response is to do without rather than build from source. I don't know much about Python but have heard murmurs that vendor tools written for Python 2.x are a barrier to deployment of 3.x.

As I wrote, this is why (vendors) need to provide packages of Perl Actual.

Don't get caught up in the fact that people in the know can compile their own Perls. The entire point is that the balkanized and bowdlerized installation media Perls are only suitable for the other programs on the installation media.

szabgab: Vendors often provide a -devel package containing the bits needed to build XS-based packages against system libraries. Sometimes it's not enough, though -- Subversion for example really wants you to build its swig-based bindings from the same sources as the client, which necessitated the Alien-SVN distribution on CPAN.

This is one of the things Sun got right in Solaris. The perl they needed for system stuff was elsewhere and not in the user's path.
Admittedly there were issues with the perl for users as well, such as it being built with the Sun compilers, so it was always better to use your own perl anyway if you couldn't afford the Sun compilers.

@szabgab.com: On Debuntu Ruby is nearly unusable. Especially in regard to gems.
We develop heavily in PHP and run the latest, often a beta.

There's hardly any vendor who does not cripple a package. Some go the extra length to cripple all of them.

You've made it clear in past discussions you're not really focused on administration, which is fine, because you're a developer. But I'll say it anyway:

This was a good move by Red Hat. local::lib and perlbrew (et al.) are great for development, embedded deployment, and so on, but when it comes to doing real-world systems administration, you need to be able to manage perl, and it doesn't like to be managed. We tried putting site_perl on an NFS share from an Isilon system; that resulted in a massive network storm any time anything on the HPC grids fired off... which then slowed our engineers' workstations down to a crawl. We've tried a million things, and what we finally wound up doing was handcrafting packages for ExtUtils::MakeMaker and things like it, which basically dropped into site_perl (instead of vendor_perl, where packaged goods normally fall).

The thing is, ExtUtils::MakeMaker is really important for building modules. What it's not important for is running perl programs. Furthermore, that specific module (among others) is not necessarily tied to the version and patchlevel of the perl binary. Perl 5.8.8 works just fine with versions of MakeMaker from far off in the distant Jetsons-like future... but because RHEL5 has MakeMaker in the perl package, you have to jump through fantastically annoying hoops to upgrade it on a 500-node grid.

Now, I agree that when you run 'yum install perl' you should get the whole distribution... but the 'perl' package should be a metapackage which weakly depends on modules like MakeMaker. That way I can use standard tools to support building newer versions of MakeMaker when some CPAN module suddenly decides it needs a newer version of that module (which often happens when CPAN dependencies are carelessly managed).

Red Hat's only real mistake here was calling the package 'perl' instead of 'perl-runtime' or whatever the bikeshedders decide is an appropriate name.

Your advice, as I mentioned at the beginning of this comment, is sound for a development platform. There is almost nothing more annoying than finding out during a complex analysis deployment that the developer's code doesn't work because he was too clever by half -- and started dicking with perl build options to the point where his code won't run on our existing perl deployment.

We have something like 2500 things installed from CPAN (no matter what the language faddists say, tons of computational science is done with perl). Having to replicate the frequently-massive dependency trees that analytical software can have -- to the extent of including entire other languages like Fortran -- every time for every program is simply ridiculous. A web application? Fine, push your perlbrew to EC2 and move on. But for large installations, you can't just maverick off into your own little development walled garden. You have to coordinate with your systems administrators, because they generally don't have the manpower to shove your entire development stack into the configuration management... and then do it again for the next fifty programs.

When you're trying to do performance-critical work across hundreds or thousands of nodes, your software needs to be packageable. Distributions have tools to make software manageable, and ignoring them makes life harder for everyone but you.

Have you ever deployed a Java application? I have. See also WAR files and Maven and Subversion repositories full of JARs and....

I'd like what you suggest--that you can manage application dependencies in a system-wide, one installation fits all fashion--to be true, but I've only ever seen it work in the most trivial situations.

I use Perl, amongst other things, because it is usually already there no matter whether the operating system is Linux, BSD, Solaris, HP-UX or AIX. Sometimes system administrators refuse to install other scripting languages, or indeed any other package. If Perl is no longer provided complete, with a set of modules one can depend on, the advantage of being already installed will be gone. Granted, working with the pre-installed Perl is often unpleasant, and the chances that Python and Ruby are also available are much higher nowadays anyway, but Perl still has an advantage.

I don't understand. How does Java being a world of suck make it acceptable for Perl to be a world of suck? What the hell does Java's horrendous binary-distribution addiction have to do with this discussion? I'm talking about packaging cpan, not downloading random precompiled XS modules.

The reason you've only seen this work in trivial situations is that not many organizations have the resources to devote enough operations time to solving the problem correctly. We do, and so we have. The first thing we do is break out all of the modules from the vanilla distribution, for exactly the reasons I specified, and then put everything into RPMs for distribution.

We also have a pretty comprehensive testing platform. The procedure, simplified for this comment, is: update local cpan mirror, build all the modules we use (and their dependencies), fix build problems, rerun packaging scripts, upload to channel for distribution. It's taken us a while to get it robust, and Red Hat's new packaging decision would enable us to skip the tedious breakout of the core packages, etc. Periodically (I shoot for about once a month, or when it's requested by the devs, whichever comes first) we update the local cpan mirror and test the new cpan stuff, the software we use, and the software we write, and make sure it all works before we distribute it to the grids and the clusters.

It's nice to be able to upgrade the interpreter without rebuilding every single module it comes with. It's nice to be able to upgrade a leaf module all the way back to root perl without having to monkey with idiotic dependency listings. And our developers like having a perl installation they can rely on and expand.

Semirelated: BSDPan is one of the greatest tools for module distribution I've ever worked with, and I hope to god that one day my HPC work and BSDPan have a chance to come together. I could probably cut back to a four-day work week.

"How does Java being a world of suck make it acceptable for Perl to be a world of suck?"

Please don't put words in my mouth.

Deploying software with dependencies takes effort. You can do that work by forbidding dependencies, by managing dependencies for each application individually, or by managing dependencies for all applications centrally. I prefer the second, and you prefer the third.

I've worked in both situations in multiple languages: the most recent time the central approach bit me was a large Java project, where the Java ecosystem prefers application-as-big-blob-in-WAR approach but the system administrators required the build process to pull vetted dependencies and versions from a central JAR repository.

I grant you that this is not an example of the best way to manage centralized dependencies, but it was an act of root to get information on how to use the repository, let alone get a new version of a library in there. Don't even ask about how to get a new library installed.

The worst part was when one application relied on a library version incompatible with a newer version which contained important bugfixes our application needed.

I admit it's possible to create and maintain a centralized repository while minimizing these problems, but I've experienced its drawbacks enough that I'm leery.

This page contains a single entry by chromatic published on January 25, 2012 11:46 AM.