July 2011 Archives

Merit and Entrance Requirements


Yaakov is right: sometimes we define merit in technical communities too narrowly.

I want more programmers in the Perl world who don't look like me (middle class, Caucasian, male, stubborn, autodidact, far too well versed in nerdy pursuits like math, Unix, comic books, video games, British humor, Japanese cartoons, and science fiction) not because there's anything wrong with those things but because I like being around people who aren't exactly like me sometimes. I like helping people with problems different from my own. Sometimes writing software for people who write software to help them write software isn't nearly as satisfying as helping a free speech attorney organize her cases or an eight year old boy catalog his collectable card game cards or a working mother download and print recipes for a son with special dietary needs.

I spent the past several days as a camp counselor with my eight year old nephew instead of attending yet another technology conference where rich old white men tell an audience of $1200+ per ticket attendees to spend their time "working on things that matter" because helping a bunch of eight year old kids spend time outdoors and realize that some adults really do care about them and want to help them grow up safe and happy and healthy is awfully satisfying, especially when compared to arguing over whether it's a sin from a technical sense to use XML in a templating system.

Take the ever-present idea of tutorials, for example. If you want to write a tutorial for an audience of purely system administrators with several years of experience managing Unix systems from the command line and really target that tutorial for people who'll never write a program of more than a couple of dozen lines where it only has to work once and no one will ever read it again and getting something mostly working is more important than everything else, go right ahead. Working system administrators need good tutorials. Blessings on you for helping such an audience.

Me, I have different people in mind—not better, not worse, not even necessarily more deserving, just different.

There can be many ways into the world of programming, and there are many places within that world: casual, professional, dilettante, perpetual novice, dabbler, whichever. I'd like to remove some of the barriers that keep people who don't look like me out.

How do we do that?

Another day, another argument over how to document TIMTOWTDI.

Some people prefer to write tutorials for novices that start from "the simplest thing imaginable" and then gradually introduce newer features. I took a different approach in the Modern Perl book. In the context of the open tutorial discussion, the book very deliberately recommends against the use of bareword filehandles (except in named exceptions of STDIN, STDOUT, STDERR, and DATA) and always recommends three-argument open over the one- and two-argument alternatives.
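
As a minimal sketch of that recommendation (the scratch file and its contents are invented for illustration), the three-argument form with a lexical filehandle looks something like this:

```perl
use strict;
use warnings;
use File::Temp 'tempdir';

# Hypothetical scratch file, created only so the example is self-contained.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/notes.txt";

# Three-argument open: the mode ('>') is separate from the file name,
# and the filehandle is a lexical variable rather than a bareword.
open my $out_fh, '>', $path or die "Cannot write '$path': $!";
print {$out_fh} "first line\n";
close $out_fh or die "Cannot close '$path': $!";

# Reading uses the same shape with a different, equally explicit mode.
open my $in_fh, '<', $path or die "Cannot read '$path': $!";
my $line = <$in_fh>;
close $in_fh;

print $line;
```

The lexical handles go away when they fall out of scope, so no other part of the program can reach them by name.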

This is a matter of philosophy on which reasonable people differ.

For the book, I wanted every example to run without error under strict and warnings. Most examples use Perl 5.10 features such as say. That's part of the reason I wrote the Modern::Perl metapragma—I wanted to explain once a single line of magic that helps novices get as much assistance as perl itself can provide to write good code.

That does mean that any discussion of opening files depends on the reader already knowing how to declare a variable. That also means that the discussion of declaring variables has to take place at or before the discussion of assigning to variables.

Of course, someone who's never used Perl 5 before will, from the start, have strictures available to help catch undeclared variables and typos in variable names. The first lines of code this novice writes will be strict-safe.
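
Here is a small sketch of what that help looks like. The variable names are made up, and the snippet is compiled with a string eval only so that the example can print the error instead of refusing to run:

```perl
use strict;
use warnings;

# A snippet with a typo in a variable name ($totl instead of $total).
my $snippet = <<'END_CODE';
use strict;
my $total = 10;
$totl = $total + 1;    # typo: $totl was never declared
END_CODE

# Compile and run the snippet; a compile-time error lands in $@.
my $ok    = eval "$snippet; 1";
my $error = $@;

print "strict caught the typo: $error" unless $ok;
```

Without strict, the misspelled name would quietly become a new package global and the mistake would surface much later, if at all.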

(I spent ten minutes the other day wondering why Devel::Cover couldn't see the documentation for two methods generated during BEGIN time as closures, and was almost ready to file a bug until I noticed that my documentation had typos in the method names. Typos happen.)

This strategy also means that the open code novices write won't use package global variables, so they'll avoid spooky action at a distance if they keep their lexical scopes sane. Even better, they don't have to understand how local works and what dynamic scope is in order to write safe code.
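
To make the spooky action concrete, here is a sketch under invented names (the scratch files and both helper functions are hypothetical). A helper which reuses a bareword handle closes the caller's handle, while the lexical version cannot:

```perl
use strict;
use warnings;
use File::Temp 'tempdir';

# Two scratch files, created only to make the example self-contained.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw( first second )) {
    open my $fh, '>', "$dir/$name.txt" or die "Cannot write: $!";
    print {$fh} "$name file, line 1\n$name file, line 2\n";
    close $fh;
}

# A helper that happens to reuse the bareword (package global) handle FH.
sub peek_bareword {
    my ($path) = @_;
    open FH, '<', $path or die "Cannot read '$path': $!";
    my $line = <FH>;
    close FH;
    return $line;
}

# The same helper with a lexical handle, invisible outside the function.
sub peek_lexical {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot read '$path': $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}

open FH, '<', "$dir/first.txt" or die "Cannot read: $!";
my $line_1 = <FH>;
peek_lexical("$dir/second.txt");      # harmless: FH is untouched
my $line_2 = <FH>;
peek_bareword("$dir/second.txt");     # reopens, then closes, the caller's FH
my $still_open = defined fileno(FH);  # false: the helper closed our handle

print "caller's handle still open? ", ( $still_open ? 'yes' : 'no' ), "\n";
```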

In other words, Perl 5 novices who read Modern Perl learn from the first page to write code which:

  • Allows perl to give better warnings and error messages about dubious constructs
  • Limits the possibilities of using global variables
  • Avoids subtle security problems from magical filemode detection
  • Scales up to larger programs, mostly thanks to the first three items in this list.

In addition, these readers do not have to unlearn habits the book has previously helped them adopt. Plenty of people use Perl 5 to write little programs of a couple of dozen lines where the risk of a typo is minimal or the spooky action of an unexpected global change is almost irrelevant, but there are plenty of other people using Perl 5 to build bigger programs or programs which will become big. Why deny them perl's full abilities to help them write correct code?

All it takes is the philosophy of "teach people to write code that can scale to larger programs" and the willingness to rearrange the order of concepts presented so that variable declarations come before opening files. (It's easy to write real programs which only ever print to standard output until readers understand variable declarations.)

It's not that package globals or barewords are intrinsically and always bad, and so you should never use them. (Even if I wanted that to be true, it wouldn't be true.) Instead, the book always takes the approach of showing a better (easier to verify, safer, more secure, easier to understand, fewer loopholes, simpler) approach over the alternatives. That's why it teaches OO with Moose, why it recommends Try::Tiny over eval, and why autodie gets prominent mention.
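
As one small illustration of that approach, here is a sketch of autodie (a core pragma) removing the need to check every open by hand. The path is hypothetical and chosen only because it should not exist:

```perl
use strict;
use warnings;
use autodie;    # open, close and friends now throw exceptions on failure

# Hypothetical path, chosen because it should not exist.
my $missing = '/no/such/directory/file.txt';

my $opened = eval {
    open my $fh, '<', $missing;    # no "or die" needed
    1;
};
my $error = $@;

print "open failed as expected: $error\n" unless $opened;
```

The exception is an autodie::exception object, so code which cares about the details can inspect it instead of parsing a string.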

If experienced Perl 5 programmers use all of these great tools to take care of unsafe, complex, messy, fragile details, why not help novices use them as soon as possible? Certainly, some of them only need to know a few details to write a couple of dozen lines of code, and that's perfectly acceptable—but plenty of others could be great programmers solving real problems with more ease if we smoothed out the way for them a little bit more.

... the IT policies which forbid the installation of CPAN distributions would become construction policies which forbade the use of lumber, joists, drywall, shingles, nails, screws, rebar, concrete, wiring, insulation, and probably doorbells.

Is it drafty in here to you?

The Principle of Gentle Tutorials


Sometimes I bless function references.

I've blessed scalars and array references. Someone help me, I've even blessed typeglobs. Sometimes I've had good reasons to do so, and sometimes I did it merely because the feature was available. I've monkeypatched (and actually patched) UNIVERSAL, the base class from which all Perl 5 classes inherit. I've built my own object system. I've abused AUTOLOAD and worked around bugs in Perl 5's SUPER.

Yet the chapter introducing object orientation in Perl 5 in the Modern Perl book introduces objects in terms of plain vanilla Moose. Only after readers are happy and productive with Moose and attributes and roles does the book pull back the Moose curtain to reveal blessed references.

Why?

It's been a few months since the previous debate on the Perl 5 Porters mailing list over what tutorials should cover (or whether tutorials should exist). The current sound and fury nominally centers on a patch to improve perlopentut.

One side of the debate decries replacing examples of the two-argument open and the use of bareword filehandles (package globals) as bowdlerization and whitewashing and even an attempt to excise two-plus decades of working Perl code as clunky and out of favor. Another side of the debate suggests that encouraging novices to write code in a style that minimizes accidental flaws (I didn't write eliminate) is more important.

Other positions and nuances exist, yet the schism seems to be over what's in favor.

Certainly some novices will eventually encounter code which uses the two-argument open for a good reason, whether it's working and safe code that hasn't needed maintenance and thus has no compelling reason to revise it or whether there's a very specific and documented reason why the newer approach will not work. The core documentation must somewhere document all three variants of open (the one-, two-, and three-argument forms).

Yet why should that primary source of documentation be in a tutorial, aimed at novices who lack, by definition, the experience necessary to choose one approach over another in terms of safety, correctness, maintainability, and the ineffable bits of magic that only experience can provide?

After all, it's possible to overwrite files when you intend to open them for reading if you're not explicitly careful about how you use the two-argument open in Perl 5. Yes, I know that if you can't trust your filesystem or the other users using your filesystem you're open to all sorts of security holes, but I make a habit of not trusting myself because I make all sorts of stupid errors.
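
A minimal sketch of that failure mode, using a scratch file invented for the example:

```perl
use strict;
use warnings;
use File::Temp 'tempdir';

my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/important.txt";

open my $setup, '>', $path or die "Cannot create '$path': $!";
print {$setup} "irreplaceable data\n";
close $setup;

# Imagine this string arrived from user input or a directory listing.
my $name = ">$path";

# Two-argument open: the leading '>' is parsed as a mode, so this
# truncates the file even though the intent was to read it.
open my $oops, $name or die "Cannot open '$name': $!";
close $oops;
my $size_after_two_arg = ( stat $path )[7];

# Three-argument open: the mode is explicit and the name is taken
# literally, so the open simply fails (no file has that literal name).
my $three_arg_ok = open my $fh, '<', $name;

print "size after two-arg open: $size_after_two_arg\n";
print "three-arg open succeeded? ", ( $three_arg_ok ? 'yes' : 'no' ), "\n";
```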

I'm sure it's possible to use the two-argument open in a safe, sane, and secure form, but I avoid it whenever possible because I don't trust myself to get it right every time. The same goes for my uses of global variables or monkeypatching or tie or indirect method calls or most uses of prototypes. I might feel slightly better if I lose important data due to my own stupidity and not a remote exploit, but I've still lost important data.

I'm glad that other people are smarter or more disciplined or more careful than I am, even after my thirteen years of experience with Perl, countless articles written and edited, a handful of books, several projects, and dozens of CPAN distributions. Yet I look at, for example, the documentation on php.net for the simplest function and I see buggy, incomplete, insecure, and confused code and I think sometimes that encouraging novices to write what some might consider to be unnecessarily strict and simple code that doesn't force them to understand everything that might possibly go wrong might be an act of compassion and kindness.

(Also interpreting part of the name of a file as an access mechanism for a file with a different name—RT #64504—is an example of why the magic parsing of unstructured data is a sin.)

M0 Test Tasks: Stealing from Perl 6

Update: M0 is dead, Parrot is effectively doomed, and the author believes that Rakudo is irrelevant. This post is now a historical curiosity.

Less Magic, Less C, A Faster Parrot promised a few small tasks for the interested to help the Parrot optimization project reach its goals.

M0 is one of those goals. This small set of opcodes represents the minimum necessary platform-specific code required to implement the rest of Parrot. M0 is a specification. As with Perl 6, that means there can be multiple implementations. The demo implementation is a Perl program, and my C implementation is the second implementation (and hopefully a likely candidate to become part of Parrot itself).

Like Perl 6, M0 is a specification and a test suite.

Unlike Perl 6, the M0 test suite is tied to the original implementation. That is to say, when you run the M0 tests, you exercise the Perl demo M0 implementation.

A small task for the interested is to decouple the M0 test suite from the Perl M0 implementation and make those tests available for multiple implementations. This probably means representing those tests in a form with expected inputs and outputs, perhaps running M0 source code through an assembler to create M0 bytecode, and running the tests against an arbitrary implementation.
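
One possible shape for such a harness, sketched under heavy assumptions: the "source" strings below are placeholders rather than real M0 code, and the two implementations are stand-in functions where a real harness would probably run an external program and capture its output.

```perl
use strict;
use warnings;

# Test cases as plain data: a name, a source program, the expected output.
# The "source" strings are placeholders, not real M0 code.
my @cases = (
    { name => 'hello',  source => 'print hello', expect => "hello\n" },
    { name => 'answer', source => 'print 42',    expect => "42\n"    },
);

# Hypothetical stand-ins for two independent implementations.
my %implementations = (
    demo_perl => sub { my ($src) = @_; $src =~ /^print (.+)$/ ? "$1\n" : '' },
    demo_c    => sub { my ($src) = @_; my (undef, $arg) = split ' ', $src, 2; "$arg\n" },
);

# Run every case against every implementation and collect failures.
my %failures;
for my $impl ( sort keys %implementations ) {
    for my $case (@cases) {
        my $got = $implementations{$impl}->( $case->{source} );
        push @{ $failures{$impl} }, $case->{name} if $got ne $case->{expect};
    }
    my $failed = @{ $failures{$impl} || [] };
    print "$impl: ", ( $failed ? "$failed failed" : 'all passed' ), "\n";
}
```

The point of the design is that the cases know nothing about who runs them, so adding a third implementation means adding one entry to the table.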

Perl 6's test suite has an interesting capability whereby the list of tests contains metadata on which known implementations should skip arbitrary test files, or should try to run them without expecting them to pass. The #perl6 channel on irc.freenode.net can give you more details about this fudging process.

This project does easily parallelize across multiple interested people. Coordinating in #parrot on irc.perl.org will help.

As a bonus test, the C implementation should allow optional testing with Valgrind, if installed. All tests should run without memory leaks or invalid memory accesses. (I tested the C implementation by hand, but automating this is essential for further development.) This task is slightly more complex, but should be within reach for any Perl programmer of modest experience.

Inside-Out Failure Injection


While it's still true that lexical scope is the fundamental unit of encapsulation in Perl 5, dynamic scope is a powerful tool.

Consider this snippet from a Catalyst application controller:

sub activate :Chained('superget') :PathPart('activate') :Args(0)
{
    my ($self, $c) = @_;
    my $proj       = $c->stash->{project};

    try
    {
        $proj->activate_project( $c->stash->{user} );
        $c->add_status( 'Activated project ' . $proj->project_name );
    }
    catch { $c->add_error( $_ ) };

    $self->redirect_to_action( $c, 'view', '', [ $proj->id ] );
}

Ignore almost everything but the Try::Tiny code (the try/catch blocks). The dynamic scope of this exception handling code means that any exception thrown from either method called in the try block, or from any other code those methods call, will result in the activation of the catch block. Outside of those two blocks, any exceptions are the responsibility of something else.
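
A dependency-free sketch of the same idea, using core eval in place of Try::Tiny only to keep the example self-contained (the function names are invented). The handler covers code that is nowhere near it in the source:

```perl
use strict;
use warnings;

# Neither function contains any exception handling of its own.
sub innermost { die "failure deep in the call stack\n" }
sub middle    { innermost(); return 'never reached' }

my @log;

# The handler covers everything called while the block is running,
# no matter where in the source that code lives.
my $ok = eval { middle(); 1 };
push @log, "caught: $@" unless $ok;

print @log;
```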

That's easy to understand, but take it a step further and think of this dynamic scope as some sort of multiverse behavior. (If this metaphor doesn't work, that's fine. It's how I think of it.) Within a delimited scope representing program flow, not source code, the universe has changed.

Exceptions aren't the only use for dynamic scope, and even dynamic scopes offer the possibility for encapsulation.

I wrote some interesting code the other day while refactoring out accumulated duplication and improving the test coverage of the controller. My goal was to test the error handling if the activation failed. This presented a design opportunity: how could I force a failure of the activation? As you can see from the controller, the project object is already in the stash. I could override Catalyst's dispatch mechanism or internals to create a dummy project object which dies on activation. I could extract these controller tests into tests which don't go through the web interface. I could even use an environment variable to change the behavior of $proj->activate_project to throw an error.

Instead, I used an injected monkeypatch. The resulting test looks like:

    test_result_override
    {
        $ua->get_ok( 'http://localhost/projects/2/activate',
            'failed activation' );
        $ua->content_contains( 'Activation failed',
            '... should contain error message' );
    } Project => activate_project => sub { die 'Activation failed' };

$ua is an instance of Test::WWW::Mechanize::Catalyst.

You can probably see where this is going. Within the block passed to test_result_override, the Project object's activate_project method is the function passed as the final argument—a function which throws an exception.

The monkeypatch injector was likewise relatively easy to write:

sub test_result_override(&@)
{
    my ($test, $class, $subname, $sub) = @_;

    no strict 'refs';

    my $ref = "Appname::Schema::Result::${class}::${subname}";
    local *{ $ref };
    *{ $ref } = $sub;

    $test->();
}

The function prototype of &@ tells the Perl 5 parser to treat the block as a function reference. The rest of the code merely performs the monkeypatching with local (so that the monkeypatch will go away when this function returns, however it returns).
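
To see the technique work outside of a Catalyst application, here is a self-contained sketch. Demo::Project and with_override are hypothetical names, and this version takes the full package name as an argument rather than prepending a hard-coded application prefix:

```perl
use strict;
use warnings;

{
    package Demo::Project;
    sub new              { return bless {}, shift }
    sub activate_project { return 'activated' }
}

# Same shape as test_result_override, minus the hard-coded prefix.
sub with_override(&@) {
    my ($test, $class, $subname, $sub) = @_;

    no strict 'refs';
    my $ref = "${class}::${subname}";
    local *{ $ref } = $sub;    # restored when this scope exits, even via die

    return $test->();
}

my $proj = Demo::Project->new;

# Inside the block, the method is the injected function.
my $during = with_override {
    eval { $proj->activate_project; 1 } ? 'no error' : "error: $@";
} 'Demo::Project', activate_project => sub { die "Activation failed\n" };

# Outside the block, the original method is back.
my $after = $proj->activate_project;

print "during: $during", "after: $after\n";
```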

Once I knew what this code should do (I wrote the code which used it first), it took two minutes to write, and maybe five minutes to use in the other places in this controller I needed it. This isn't always the best approach, but this sufficiently encapsulated monkeypatch injection is sufficiently powerful and useful for helping ensure that my code behaves as I intended.

Free Tools for Free Books


Several people have asked if they could use the tools we used to make Modern Perl: the book. I've always said yes. Like the book's credits say, we used many pieces of free software to make the book, and there's no reason not to share the resulting code with others.

Unfortunately, using those tools meant extracting them manually from the Modern Perl book repository and editing several hard-coded values to get the necessary flexibility.

I'm happy to announce that a new project will render that work unnecessary.

Pod::PseudoPod::Book is a CPAN distribution in progress which is to writing a book what Dist::Zilla is to managing a CPAN distribution. It uses the App::Cmd model to make the ppbook command.

To create a new book:

$ ppbook create my_awesome_book
Please edit 'my_awesome_book/book.conf' to configure your book
$ vi my_awesome_book/book.conf

This creates the skeleton of a book and a basic configuration file. From there, if you follow the default layout of the book (tutorial coming), you can use several ppbook commands to render your book into the appropriate output format:

  • buildcredits turns a CREDITS file in the book's root directory into an alphabetized and Unicode-safe sections/CREDITS.pod file
  • buildchapters weaves all of the sections and chapters in sections/ together into a series of PseudoPod files
  • buildhtml converts woven chapters into XHTML
  • buildepub converts XHTML into a valid ePub file

Each command is idempotent and has strict dependencies, so that if you run ppbook buildepub, the output will reflect any changes in the dependencies.

The code is reasonably good and works for my use cases, but it needs polishing before it's ready for the CPAN. Patches and pull requests are very welcome. As well, I don't have the LaTeX to PDF chain enabled yet, because that requires some specific non-Perl dependencies, and I don't want to get into maintaining my own Alien:: bundles.

Regardless of those caveats, I believe these tools can lower the barriers to writing better standalone documentation in Perl. I welcome everyone interested in such things to join me in continuing to improve the state of the publishing art.

The Polite Fiction of Numbering


(or burying the lede on a Saturday morning)

David Golden commented on internal perceptions and evolution of Perl:

If Perl 5 is a language (and not just an interpreter), how fast should that language evolve and what "versioning" signifies significant change.

One reason people debate so hotly the naming of "Perl 6" is the magic tied to a version number. I've written many times that "Perl can never break backwards compatibility in a radical way because it's never broken backwards compatibility before." That's a common belief. It's also a common belief that it's only okay to correct some of the flaws of Perl (especially missing defaults) by breaking backwards compatibility and signifying that change by incrementing a magical version number.

Of course a number is just a number and a version is just a version and a belief, no matter how widespread, is just a belief. It may even be a myth. Certainly it's falsifiable.

One little change to Perl would instantly make all of these things possible without requiring Larry to invoke rule #2 on the naming of Perl 6: if Perl 5.16 (or 5.18) required the explicit declaration of the version of Perl any particular file or lexical scope used and defaulted to "whatever the most current version of the perl binary used to run the program provides".
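
Part of the mechanism already exists. The sketch below shows today's optional use VERSION declaration, not the proposal itself (which would make the declaration mandatory and change the default); the variable names are invented:

```perl
use v5.12;      # this file targets the Perl 5.12 feature set
use warnings;

# Because "use v5.12" implies "use strict", this typo is a compile-time
# error inside the string eval rather than a silent new global.
my $compiled = eval q{ my $count = 1; $cuont++; 1 };
my $error    = $@;

say 'say() is available without a separate "use feature" line';
say "strict is on: $error" unless $compiled;
```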

That simple act of declaration (and the not-quite-heroic work for backwards compatibility, pushed in the direction it belongs instead of externalized from the enterprisey to volunteers) could solve so many problems. All it takes is a slight shift in culture and character and, yes, belief.

Update: M0 is dead, Parrot is effectively doomed, and the author believes that Rakudo is irrelevant. This post is now a historical curiosity.

If you catch me in person on a day I think about the implementation of programming languages, you'll probably hear me lament the persistent myth that "Oh, just write an extension in C and things will run faster." I've written a fair amount of XS and I've extended a couple of other languages in C. I have some scars.

One of the pernicious misfeatures of the Parrot VM as originally designed was that it copied the Perl approach of "Let's write big wads of C code, because C is fast but allow users to write in a higher level language because that's easier to write than C!" That's a decent approach, but you can run into a mess when the boundary between those two languages blurs.

For example, what if you have built-in types written in C and user-defined types written in an HLL, and the user-defined type extends and partially overrides behavior of the built-in type and the rest of the system needs to interact with the user-defined type? Unless you're exceedingly careful about what you can and can't override and how, you'll end up with code that calls in and out of both C and your HLL.

At its heart, a basic bytecode engine or interpreter for an HLL running on a VM looks something like:

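sub runloop
{
    my ($self, $instruction) = @_;

    # each op receives the current instruction and returns the next one;
    # a false return value ends the loop
    while ($instruction)
    {
        $instruction = $optable[ $instruction->op ]->( $self, $instruction );
    }
}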

... where @optable is an array of the fundamental operations your VM supports and $instruction is some structure which represents a given instruction in the program. Each operation should return the next instruction to execute (think about branching instructions, such as a goto or a loop or a function invocation).
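
For something you can actually run, here is a toy version. Everything in it is invented for illustration (these are not M0 or Parrot ops), and a program counter stands in for the instruction structure, so each op returns the position of the next instruction:

```perl
use strict;
use warnings;

use constant { OP_PUSH => 0, OP_ADD => 1, OP_LOOP_WHILE_LT => 2, OP_HALT => 3 };

my @stack;

# Every op receives the current position and its operands, and returns
# the position of the next instruction (undef stops the runloop).
my @optable;
$optable[OP_PUSH] = sub { my ($pc, $n) = @_; push @stack, $n;   return $pc + 1 };
$optable[OP_ADD]  = sub { my ($pc, $n) = @_; $stack[-1] += $n;  return $pc + 1 };
$optable[OP_LOOP_WHILE_LT] = sub {
    my ($pc, $limit, $target) = @_;
    return $stack[-1] < $limit ? $target : $pc + 1;    # a branch picks the next pc
};
$optable[OP_HALT] = sub { return };

my @program = (
    [ OP_PUSH, 0 ],                 # 0: start the running total at 0
    [ OP_ADD,  5 ],                 # 1: add 5 to the total
    [ OP_LOOP_WHILE_LT, 20, 1 ],    # 2: jump back to 1 while total < 20
    [ OP_HALT ],                    # 3: stop
);

sub runloop {
    my ($pc) = @_;
    while ( defined $pc ) {
        my ($op, @args) = @{ $program[$pc] };
        $pc = $optable[$op]->( $pc, @args );
    }
}

runloop(0);
print "total: $stack[-1]\n";
```

The loop itself never needs to know what a branch is. It only asks each operation where to go next, which is why handing control to code outside the loop is the awkward case.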

The problem comes when you want to do something like call a method on an object. If the method is written in the HLL, the runloop can handle it normally. After all, all HLL code is just a sequence of instructions. If the method is written in C, something in the system must know how to call into C. The runloop really can't do that, because it has to hand control over to something else: it can't dispatch C operations.

That's all well and good until the method written in C needs to call back into code written in the HLL, at which point the system needs to start another HLL runloop to handle that code. (You can't easily suspend the C code and return to the runloop unless you've written your C code in the system in a very non-idiomatic and careful fashion, and even then I'm not sure it's possible to do in an efficient and portable way.)

Now consider if code in the HLL throws an exception.

In short, the more interleaving of C and HLL code in the system, the more complexity and the worse performance can be.

Parrot's Lorito project is a series of design changes to reduce the reliance on C. The main goal is to write as much of Parrot in not C as possible. That is to say, if the Parrot of 2010 had 100,000 lines of C, the Parrot of 2012 should have 40,000 lines of C and the Parrot of 2013 should have 12,000 lines of C. The rest should be higher level code running on top of Lorito.

The current stage of Lorito is M0, the "zero magic" layer of implementing a handful of operations which provide the language semantics of C without dragging along the C execution model. In other words, it's a language powerful enough to do everything we use C for without actually being C. It offers access to raw memory, basic mathematical operations, and Turing-complete branching while not relying on the C stack and C calling conventions.

The Parrot M0 C prototype is a work in progress. It's already reached the milestone of reading M0 bytecode files and running the all-important "Hello, world!" program. (I was on three airplanes and in four airports for a significant portion of the code.)

We could use your help. You don't have to understand anything I've already written to help. You don't have to know C. If you know enough Perl to work with a mentor to write some tests or add a little bit of framework around the existing code or if you know Make or if you're willing to review the code against the M0 spec, we can find something for you to do.

All you need is the willingness to show up in #parrot on irc.perl.org and the ability to download and compile Parrot.

Want a Job? Learn Perl.


Employers lament the fact that they can't hire enough Perl programmers to expand their businesses. People often ask me "How can we make more programmers?" at which point I often give a cliché answer of "Hire great people who fit with your team and expect to train them in your problem domain as well as the way you write good code." Then I tell them to download Modern Perl: the book and pass it around as a starting point.

That helps, but the Perl community could do a much better job of expanding the pool of programmers who know Perl by expanding the pool of programmers, period. I have much to write about this—thanks to a great discussion during my advocacy talk at YAPC—but we have a place to start.

Given everything that happened at YAPC::NA 2011, there's far too much to write about and far too much to consider before it's even possible to write about all of the goodness in the Perl community this year. Even so, one repeated refrain which gives me great hope is "Hi, I'm ____ and I work for ____ doing very cool things, and we're hiring."

I have some arm twisting to do to convince a couple of people to write up an article for Perl.com on the subject, but here's a definite hook to use on people new to programming:

A community full of bright, fun, clever toolsmiths and problem solvers wants to help you become a productive, employed, experienced developer.

Forget arguing with blub programmers who've already made their minds up based on grotty Perl 4-style code they saw somewhere once. Let's get some mentoring going, inside and outside of companies. There's a need. We have ways to meet that need.

(Due credit to events such as the free Perl training in London a couple of years ago. Let's do that again in more places! If your organization needs more developers, let's share some mentoring resources and ideas.)


About this Archive

This page is an archive of entries from July 2011 listed from newest to oldest.

June 2011 is the previous archive.

August 2011 is the next archive.
