Less Magic, Less C, A Faster Parrot


Update: M0 is dead, Parrot is effectively doomed, and the author believes that Rakudo is irrelevant. This post is now a historical curiosity.

If you catch me in person on a day I think about the implementation of programming languages, you'll probably hear me lament the persistent myth that "Oh, just write an extension in C and things will run faster." I've written a fair amount of XS and I've extended a couple of other languages in C. I have some scars.

One of the pernicious misfeatures of the Parrot VM as originally designed was that it copied the Perl approach of "Let's write big wads of C code, because C is fast, but allow users to write in a higher-level language, because that's easier to write than C!" That's a decent approach, but you can run into a mess when the boundary between those two languages blurs.

For example, what if you have built-in types written in C and user-defined types written in an HLL, and the user-defined type extends and partially overrides the behavior of the built-in type, and the rest of the system needs to interact with the user-defined type? Unless you're exceedingly careful about what you can and can't override and how, you'll end up with code that calls in and out of both C and your HLL.

At its heart, a basic bytecode engine or interpreter for an HLL running on a VM looks something like:

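sub runloop
{
    my ($self, $instruction) = @_;

    # each op receives the interpreter and the current instruction
    # and returns the next instruction (or a false value to stop)
    while ($instruction)
    {
        $instruction = $optable[ $instruction->op ]->( $self, $instruction );
    }
}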

... where @optable is an array of the fundamental operations your VM supports and $instruction is some structure which represents a given instruction in the program. Each operation should return the next instruction to execute (think about branching instructions, such as a goto or a loop or a function invocation).
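To make the loop above concrete, here is a minimal runnable sketch. The opcode names (OP_INC, OP_BRANCH_LT, OP_HALT), the hash-based instruction layout, and the register hash are all invented for illustration; they are not Parrot's real op set or data structures.

```perl
use strict;
use warnings;

# Hypothetical opcode numbers; each one indexes into @optable.
use constant { OP_INC => 0, OP_BRANCH_LT => 1, OP_HALT => 2 };

my @optable;

# INC: add 1 to a named register, then fall through to the next instruction.
$optable[OP_INC] = sub {
    my ($vm, $ins) = @_;
    $vm->{reg}{ $ins->{arg}[0] }++;
    return $ins->{next};
};

# BRANCH_LT: jump to a target instruction while the register is below a limit.
$optable[OP_BRANCH_LT] = sub {
    my ($vm, $ins) = @_;
    my ($reg, $limit, $target) = @{ $ins->{arg} };
    return $vm->{reg}{$reg} < $limit ? $target : $ins->{next};
};

# HALT: returning nothing (undef) ends the loop.
$optable[OP_HALT] = sub { return };

sub runloop {
    my ($vm, $ins) = @_;
    while ($ins) {
        $ins = $optable[ $ins->{op} ]->($vm, $ins);
    }
    return $vm;
}

# Program: count register "i" up to 5 using a backward branch.
my $halt   = { op => OP_HALT };
my $inc    = { op => OP_INC, arg => ['i'] };
my $branch = { op => OP_BRANCH_LT, arg => [ 'i', 5, $inc ], next => $halt };
$inc->{next} = $branch;

my $vm = runloop({ reg => { i => 0 } }, $inc);
print "i = $vm->{reg}{i}\n";
```

The branch op is the reason every operation returns the next instruction: the loop itself never decides where control goes.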

The problem comes when you want to do something like call a method on an object. If the method is written in the HLL, the runloop can handle it normally. After all, all HLL code is just a sequence of instructions. If the method is written in C, something in the system must know how to call into C. The runloop really can't do that, because it has to hand control over to something else: it can't dispatch C operations.

That's all well and good until the method written in C needs to call back into code written in the HLL, at which point the system needs to start another HLL runloop to handle that code. (You can't easily suspend the C code and return to the runloop unless you've written your C code in the system in a very non-idiomatic and careful fashion, and even then I'm not sure it's possible to do in an efficient and portable way.)
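The nesting described above can be simulated in pure Perl. In this sketch an ordinary Perl sub stands in for a method written in C, and the only way it can run the HLL callback is to start a second runloop on top of its own frame. The opcode names, the log, and the depth counter are assumptions made for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical opcodes for this sketch.
use constant { OP_SAY => 0, OP_CALL_NATIVE => 1 };

my $depth     = 0;    # number of runloops currently active
my $max_depth = 0;
my @optable;

sub runloop {
    my ($vm, $ins) = @_;
    $depth++;
    $max_depth = $depth if $depth > $max_depth;
    while ($ins) {
        $ins = $optable[ $ins->{op} ]->($vm, $ins);
    }
    $depth--;
    return;
}

# SAY: record a message along with the current runloop depth.
$optable[OP_SAY] = sub {
    my ($vm, $ins) = @_;
    push @{ $vm->{log} }, "depth $depth: $ins->{arg}";
    return $ins->{next};
};

# CALL_NATIVE: hand control to a plain sub (our stand-in for C code).
$optable[OP_CALL_NATIVE] = sub {
    my ($vm, $ins) = @_;
    $ins->{arg}->($vm);
    return $ins->{next};
};

# The "native" method needs to run HLL code, so it starts a nested runloop.
my $hll_callback = { op => OP_SAY, arg => 'HLL callback' };
sub native_method {
    my ($vm) = @_;
    runloop($vm, $hll_callback);
    return;
}

my $end   = { op => OP_SAY, arg => 'back in outer code' };
my $call  = { op => OP_CALL_NATIVE, arg => \&native_method, next => $end };
my $start = { op => OP_SAY, arg => 'outer code', next => $call };

my $vm = { log => [] };
runloop($vm, $start);
print "$_\n" for @{ $vm->{log} };
print "max runloop depth: $max_depth\n";
```

Every additional crossing from native code back into HLL code adds another runloop and another native frame underneath it.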

Now consider what happens if code in the HLL throws an exception while several of those nested runloops and C functions are still active.
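One way to see the problem, again as a pure-Perl simulation: Perl's die stands in for whatever unwinding mechanism a real VM would use, and a plain sub stands in for C. The opcode names and counters are hypothetical. The exception skips the bookkeeping in both runloops and the cleanup in the "native" frame unless every layer guards against it explicitly.

```perl
use strict;
use warnings;

# Hypothetical opcodes for this sketch.
use constant { OP_THROW => 0, OP_CALL_NATIVE => 1 };

my $depth = 0;    # number of runloops the VM believes are active
my @optable;

sub runloop {
    my ($ins) = @_;
    $depth++;
    while ($ins) {
        $ins = $optable[ $ins->{op} ]->($ins);
    }
    $depth--;     # skipped whenever an op throws
    return;
}

# THROW: HLL code raising an exception.
$optable[OP_THROW] = sub { die "HLL exception: $_[0]{arg}\n" };

# CALL_NATIVE: hand control to a plain sub (our stand-in for C code).
$optable[OP_CALL_NATIVE] = sub {
    my ($ins) = @_;
    $ins->{arg}->();
    return $ins->{next};
};

my $native_cleanup_ran = 0;
sub native_method {
    runloop({ op => OP_THROW, arg => 'oops' });    # nested runloop throws
    $native_cleanup_ran = 1;                       # never reached
    return;
}

# The outermost caller catches the exception, two runloops later.
my $ok    = eval { runloop({ op => OP_CALL_NATIVE, arg => \&native_method }); 1 };
my $error = $@;

print "caught: $error";
print "runloop depth after unwinding: $depth\n";
print "native cleanup ran: ", ($native_cleanup_ran ? 'yes' : 'no'), "\n";
```

In this toy the stale counter is harmless. In a real VM the skipped steps would be things like releasing resources or restoring interpreter state, and each boundary between C and HLL code needs its own handling.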

In short, the more C and HLL code interleave in the system, the more complex the system becomes and the worse its performance can be.

Parrot's Lorito project is a series of design changes to reduce the reliance on C. The main goal is to write as much of Parrot in not C as possible. That is to say, if the Parrot of 2010 had 100,000 lines of C, the Parrot of 2012 should have 40,000 lines of C and the Parrot of 2013 should have 12,000 lines of C. The rest should be higher level code running on top of Lorito.

The current stage of Lorito is M0, the "zero magic" layer of implementing a handful of operations which provide the language semantics of C without dragging along the C execution model. In other words, it's a language powerful enough to do everything we use C for without actually being C. It offers access to raw memory, basic mathematical operations, and Turing-complete branching while not relying on the C stack and C calling conventions.
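As a rough illustration of what "not relying on the C stack" can mean, the sketch below runs a function call inside a single flat runloop by keeping call frames in an ordinary array instead of recursing in the host language. The four ops and the frame layout are invented for this example; they are not the M0 op set, and the M0 spec remains the authority on what M0 actually provides.

```perl
use strict;
use warnings;

# Hypothetical opcodes; NOT the real M0 operations.
use constant { OP_SET => 0, OP_ADD => 1, OP_CALL => 2, OP_RETURN => 3 };

my @optable;

# SET reg, value: store a constant in the current frame.
$optable[OP_SET] = sub {
    my ($vm, $ins) = @_;
    my ($reg, $value) = @{ $ins->{arg} };
    $vm->{frames}[-1]{reg}{$reg} = $value;
    return $ins->{next};
};

# ADD dest, src: dest += src within the current frame.
$optable[OP_ADD] = sub {
    my ($vm, $ins) = @_;
    my ($dest, $src) = @{ $ins->{arg} };
    my $reg = $vm->{frames}[-1]{reg};
    $reg->{$dest} += $reg->{$src};
    return $ins->{next};
};

# CALL target, arg_reg: push a heap-allocated frame and jump to the target.
# The callee sees its argument in register "x". No host recursion happens.
$optable[OP_CALL] = sub {
    my ($vm, $ins) = @_;
    my ($target, $arg_reg) = @{ $ins->{arg} };
    my $caller = $vm->{frames}[-1];
    push @{ $vm->{frames} },
        { reg => { x => $caller->{reg}{$arg_reg} }, return_to => $ins->{next} };
    return $target;
};

# RETURN reg: pop the frame, copy the value into the caller's "result"
# register, and resume at the saved instruction.
$optable[OP_RETURN] = sub {
    my ($vm, $ins) = @_;
    my $frame = pop @{ $vm->{frames} };
    $vm->{frames}[-1]{reg}{result} = $frame->{reg}{ $ins->{arg}[0] };
    return $frame->{return_to};
};

# Function body: x = x + x; return x
my $fn_ret = { op => OP_RETURN, arg => ['x'] };
my $fn     = { op => OP_ADD, arg => [ 'x', 'x' ], next => $fn_ret };

# Main: a = 20; result = fn(a); result = result + a
my $after = { op => OP_ADD,  arg => [ 'result', 'a' ] };
my $call  = { op => OP_CALL, arg => [ $fn, 'a' ], next => $after };
my $main  = { op => OP_SET,  arg => [ 'a', 20 ], next => $call };

my $vm    = { frames => [ { reg => {} } ] };
my $steps = 0;
my $ins   = $main;
while ($ins) {
    $ins = $optable[ $ins->{op} ]->($vm, $ins);
    $steps++;
}

print "result = $vm->{frames}[0]{reg}{result} after $steps steps\n";
```

Because the call and the return are just instructions that push and pop frames, one runloop handles the whole program, and an exception or callback never has to cross a host-language stack frame.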

The Parrot M0 C prototype is a work in progress. It's already reached the milestone of reading M0 bytecode files and running the all-important "Hello, world!" program. (I was on three airplanes and in four airports for a significant portion of the code.)

We could use your help. You don't have to understand anything I've already written to help. You don't have to know C. If you know enough Perl to work with a mentor to write some tests or add a little bit of framework around the existing code, or if you know Make, or if you're willing to review the code against the M0 spec, we can find something for you to do.

All you need is the willingness to show up in #parrot on irc.perl.org and the ability to download and compile Parrot.

1 Comment

I really don't buy that argument generally. It makes sense with parrot, but only with parrot and nowhere else.
Any decent language which is implemented with performance in mind uses a c-style ABI, and the HLL follows that. That holds even for high-level languages with much more advanced features, such as common lisp or scheme, with features similar to parrot. Look at Go for example.

Not the other way round as in parrot where the HLL dictates the ABI and c callouts and callbacks are slow, or worst of all perl5, where there's no c stack, where the args and lexicals are in an artificial array on the heap.
c callouts should be fast, and should not require extensive protection or locks, otherwise you have to limit yourself to your own language in your standard library. You have to re-implement everything from scratch.

About this Entry

This page contains a single entry by chromatic published on July 6, 2011 11:08 AM.
