November 2009 Archives

Lack of Ceremony and the Marketing Gap

Lest anyone think I hate Perl 5 for any perceived and real flaws mentioned in Perl Drawbacks and the Marketing Gap, I like almost everything in Perl 5. Its annoyances are particularly vexing because they stick out from an otherwise pleasant experience.

(They're less annoying than they might be otherwise because I know how to use the power of the CPAN to ameliorate them.)

Even still, Perl 5 has good and useful features which hurt the Perl marketing effort. They're useful. They're practical. They're pragmatic. They're consistent with Perl's goals and vision. Some of them are designed and implemented as effectively as anyone can imagine. Yet they hurt Perl's image occasionally -- and I believe that's because people don't understand them.

I can think of three such features. In broad terms, they're lack of ceremony, whipuptitude, and DWIMmery.

Lack of Ceremony

Perl enforces little ceremony on how you do what you do. You don't have to write a lot of boilerplate code to begin a Perl 5 program. You don't have to match filename to class or package name. You don't have to break your code into modules or classes or even subroutines.

You don't have to declare variables. You don't have to pass variables to subroutines. You don't have to declare functions. You don't have to match file headers with declarations. You don't have to run a compiler or linker.

You don't even have to write your code in a file.

You don't have to indent your code in one particular way, or even consistently. You can leave off parentheses. You can leave off semicolons in certain cases. (You can add extras as much as you like.)

You don't have to write tests. You don't have to know about types. You don't have to use built-in operators when shelling out to the operating system would do.

In short, Perl lets you add as much or as little ceremony to your programming as you desire.

This is good, in that every programmer and every programmer team may want something a little bit different. (That desire may change for every program as well.)

This sometimes hurts Perl's image because programmers are lazy. Perhaps it's axiomatic that novice Perl code tends toward the natural idioms of the novice's dominant programming language experience. Perl code in the late '80s and early '90s resembled C and awk and shell because of the Unix system administrator heritage of its users.

Perl code in the late '90s resembled inside-out PHP because of the CGI programmer heritage of its second wave of adopters.

Perl code in the 2000s resembles a curious mishmash of the two (even despite the Perl Renaissance) because novices tend to write code in a pastiche of styles stolen from random examples and tweaked until it appears to work.

These examples, as inconsistent and incoherent and unidiomatic as they may be, tend to work well enough because Perl 5 will happily do a lot of extra work to get the right answer. It's happy to speak baby Perl. It's almost always as happy to speak baby Perl circa 1994 as it is circa 2009.

In other words, the awful old examples of Perl 5 code won't go away and novices will still copy and paste and modify them and get their work done, if ugly and inconsistent and anything but modern.

I can't see that as entirely a bad thing, even if that means we need to work much harder crowding out bad examples with good examples.

Next up, whipuptitude.

Perl Drawbacks and the Marketing Gap

(Note: if you hate the M-word in relation to Perl and the title didn't warn you off, you have only yourself to blame for reading beyond this paragraph.)

Some people hate Perl. Some of them have good reasons. More than you might think don't, but you can't argue with these people. (If they haven't used Perl in 15 years and still hate the language, the only two appropriate responses are ignoring and mocking.)

Some people hate or dislike or don't care for Perl for good--that is, measurable and technical--reasons. Arguing that they should like Perl is a futile activity, but discussing those reasons can be enlightening.

Some people don't care for Perl for aesthetic reasons. I have trouble reading Python code because my eyes slide off over the horizontal whitespace used to delimit blocks. Some people cannot read or disambiguate leading sigils no matter how much they work. Aren't brains interesting?

Like all languages, Perl 5 has its drawbacks. Some of those drawbacks contribute to negative perceptions of Perl.

Note that I'm not suggesting that Perl 5 should or must change nor that any patches or designs are forthcoming or have any chance of acceptance in the Perl 5 core. I merely document opinions I've seen and heard when discussing Perl with people who do and don't like the language.

Perl 5 Drawbacks Contributing to the Marketing Gap

It's easy to write messy code in Perl 5 because the language and compiler do not encourage you to write clean code by default.

In this case, "messy" means "global variables", automagic globals springing into existence upon first use, and action at a distance governed by global variables.

For example, all variables are global by default, unless you declare them as lexical. I know Python and Ruby fans hate the idea of declaring variables (though both languages still have the automagic vivification problem, even if limited to lexicals, not globals), but it solves the scoping problem. Too rarely do people wonder if changing the default scope of undeclared variables to lexical from global would have changed the way people wrote code in Perl 1 through 4.
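A minimal sketch of the difference, assuming nothing beyond core Perl 5 (the helper try_snippet and the variable names are hypothetical). Without strict, a misspelled variable quietly becomes a new package global; with strict, the same typo refuses to compile:

```perl
use strict;
use warnings;

# Compile and run a snippet; return '' on success or the error message.
sub try_snippet
{
    my ($code) = @_;
    my $ok = eval "$code\n; 1";
    return $ok ? '' : $@;
}

# Without strict: the typo $totl silently creates the global $main::totl.
my $loose = try_snippet(q{
    no strict;
    no warnings;
    $total = 10;
    $totl  = $total + 1;    # typo: meant $total
});

# With strict: the same typo is a compile-time error.
my $strict = try_snippet(q{
    use strict;
    my $total = 10;
    $totl = $total + 1;     # typo: meant $total
});

print "without strict: ", ( $loose  ? "error" : "no complaint" ), "\n";
print "with strict:    ", ( $strict ? "error" : "no complaint" ), "\n";
```

The first snippet runs without complaint and the second fails, which is the whole argument for asking Perl to check declarations.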

The package global action at a distance problem is obvious when you consider the many ways to inherit from a class in Perl 5. Where modifying @Some::Class::ISA directly can pass for a metaobject system, it's an ugly metaobject system.
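As an illustration only (Animal, Dog, and Cat are hypothetical classes, all defined in one file), here is a sketch of two of the ways to inherit, plus the action at a distance that a package global such as @ISA allows:

```perl
use strict;
use warnings;

package Animal;
sub new   { my ($class) = @_; return bless {}, $class }
sub speak { return 'generic noise' }

package Dog;
# Direct manipulation of the package global @Dog::ISA, at run time.
our @ISA = ('Animal');

package Cat;
# The parent pragma sets @Cat::ISA at compile time; -norequire says that
# Animal lives in this file rather than in Animal.pm.
use parent -norequire, 'Animal';

package main;

print Dog->new->speak, "\n";    # generic noise
print Cat->new->speak, "\n";    # generic noise

# Action at a distance: any code anywhere can rewrite the hierarchy.
@Dog::ISA = ();
my $still_works = eval { Dog->new; 1 };
print "Dog->new after clearing \@Dog::ISA: ",
    ( $still_works ? 'still works' : 'fails' ), "\n";
```

Once the last assignment runs, Dog has no constructor to inherit, even though nothing in the Dog package itself changed.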

I like Perl::Critic and Perl::Tidy for cleaning up bad code, but preventing people from writing bad code by encouraging them to write good code from the start seems like a better approach.

Similarly, I have only good things to say about Moose -- but it's an additional library that people have to install before they can use it. (Kudos to the first extended Perl distribution which includes Moose and other Task::Kensho modules by default.)

CPAN is indeed the best part of Perl 5, but until CPAN distributions are as easy to install as the average PHP application is to deploy, I believe people will still see CPAN distributions as, at best, second class extensions with no guarantee of working together effectively or correctly.

Backwards compatibility is generally a good thing, but the fact that the Perl 5 parser can (and has to continue to) accept, silently, random garbage written fifteen years ago by someone with no business writing maintainable code isn't necessarily a benefit. Yes, it's good that Perl 5 doesn't wantonly break working code, but when it costs 3% of the Perl 5 parser to support a feature obsoleted by Perl 5 in 1995, and when the replacement feature makes code simpler, more concise, and easier to read, it seems like a credible time to consider the cost of supporting old, obsolete features.

Perl has no Big Name Corporate Sponsor. That can be a good thing--Perl doesn't depend on the whims and fortunes of any single company, but it also relies on the kindness and interests and free time of a collection of individuals for advocacy, development, testing, experimenting, polishing, and support.

Then again, volunteer Perl web sites have lately far surpassed the utility, attractiveness, and freshness of supposedly well-sponsored corporate sites.

I don't even have to write about the abominable book, tutorial, and training material situation. A handful of us scream into that void, but it'll take some time before we can fill in that gap and erase the collective memory of Matt's Script Archive from the Internet.

I know, I know -- mature, experienced Perl programmers know how to work around all of these problems. They're not problems if you've written effective Perl for five years.

Put yourself in the shoes of a novice for a moment. Forget that you know how to configure the CPAN client. Forget that you know the difference between @ISA = 'Parent'; and use base 'Parent'; and use parent 'Parent';. Pretend that no one has told you that Perl 5 can catch your typos for you (if you ask it) or warn you about deprecated constructs (if you ask it) or recommend alternate approaches that are safer, more secure, more concise, more readable, and better supported (if you install additional software and ask it).

In short, pretend that you want to write good, modern Perl 5 -- not Perl 4 -- and don't know how. How much time will you have to spend fighting the obvious battles before you realize what to do differently?

(Before you start angrily telling me that PHP and Python and Ruby and Haskell and C# and who cares what else have similar problems, I don't care, because I'm not talking about that. Feel free to have that discussion elsewhere.)

Next time: Perl benefits.

Tim Bray went to RubyConf 2009. Ignore the provincial "The Ruby community knows how to build, test, distribute, package, and reuse software better than almost any other language in the world" comment (c'mon, Tim -- the CPAN has been around for a while and the CPAN Testers service has over six million test results for CPAN distributions -- and the current rate of reporting is four million a year). There's a better quote buried deeper in the report:

... it turns out that there's almost nobody who's actually getting paid to work on actual core Ruby (less than the number who are getting paid to work on JRuby and IronRuby and MagLev and so on). Ruby really needs to find a sugar daddy -- in my opinion, a deep-pocketed Japanese corporate sugar daddy -- and find it soon.

Do I recall correctly that Tim helped convince Sun to hire a handful of JRuby developers in 2007? (They've since left the Oracle-eclipsed Sun.) Hmm. Hirer's regret?

Ian Bicking criticized Sun for the JRuby hire at the time; apparently I'm not the only one who detects a whiff of provincialism here. It's odd; John Ousterhout worked at Sun for a while, putting resources behind dynamic languages in the form of Tcl. (Anyone who says their favorite dynamic language is unsurpassed in automation abilities, easy embedding, Unicode, event handling, and GUI integration probably knows nothing about Tcl. Those who do not know programming language theory condemn the rest of us to hear their incessant chest beating.)

I don't really want to talk about Ruby, though. I want to talk about Perl.

I wrote How Perl Happens as a subtle reminder that Perl has little corporate sponsorship. To my knowledge, there's no IronPerl nor JPerl. There's no Enterprise Perl fork of the Perl 5 interpreter. There's no Perl on the Smalltalk VM, nor on V8, nor on SpiderMonkey.

Booking.com did make a large donation of $50,000 to contribute to Perl 5 development -- and they deserve credit and thanks and acknowledgement for that, as do all of the contributors to TPF... but try to find someone (anyone!) employed anywhere, as an agent of TPF or otherwise, with the full-time job to implement, support, maintain, or manage any implementation of Perl.

TPF hasn't spent (to my knowledge) any of that $50,000 -- though not because TPF is unwilling, but because no one has made a plausible, workable plan for spending that money.

Ian Hague made a generous donation to help with Perl 6 development, and that's sponsored several grants to people such as Patrick Michaud, Jonathan Worthington, and Jerry Gay. As well, Vienna.pm has sponsored a portion of Jonathan Worthington's time for several months -- but again, these are modest grants, not resembling anything close to full-time employment.

The thought of a business benefactor is interesting, especially in comparison to other language communities. Does the close-knit Japanese community primarily developing MRI form a barrier to sponsorship from outside of Japan? Does Python gain an edge from Google's deep pockets? Can anyone argue that Zend's control over PHP has helped the language more than it has hurt the language? Would Lua have worked anywhere outside a research university?

I don't mean to criticize or condemn alternate Ruby implementations for their own sake. RubySpec is the most important project to come out of competing implementations; it can turn them into complementary implementations. (I do choke back chuckles every time I read a starry-eyed programmer who's only ever learned Ruby say that the Ruby community has the best testing of any programming language ever invented in heaven or on earth, because they figured out how to write test descriptions and invented a new acronym for it, but I'm a history-aware jerk that way.) There's nothing wrong with multiple implementations of a programming language built around a common specification (hey, Python has a pretty good one!) and a comprehensive test suite (fill in your own parenthetical statement here).

The problem comes, as Tim Bray is just starting to discover, when the reason for those competing implementations is to divide up a pie of consulting and support fees by capturing market segments tapped, untapped, unrelated but tappable, and not invented yet but looking very tapperific without replenishing the commons.

In other words, while I have no doubt of the good intentions of someone porting Perl 5 to the CLR or the JVM to run in businesses where standard Perl 5 might not be allowed or appropriate or to give access to libraries not available in Perl 5 yet, I suspect that the corporate interest in sponsoring such development is to expand the walled garden around the CLR or the JVM, not to make a better Perl 5 or to help programmers get their work done with less ceremony and mess and frustration.

This leaves me with two questions:

  • Why doesn't Perl 5 have sponsored development beyond a few donations here and there? (Again, those businesses and individuals who have donated to support Perl development in any form deserve praise. By no means do I intend any slight or anything other than gratitude toward them.)
  • Is the Perl community fortunate that it hasn't fragmented by implementation on various platforms, or unfortunate that it couldn't perform corporate aikido to turn walled-garden sharecropper sponsorship into dollars to improve the Perl core for everyone?

Writing good software isn't about writing software. While learning a programming language, it's easy to believe that typing or stringing together syntactic constructs or figuring out how to please a picky compiler is the single most important task you can perform. It's not. Some people even believe that programming is primarily an act of typing, once you have a problem description or a design document or a specification or a series of interconnected diagrams.

Sadly, that's not true.

Software is "done" in the sense of "usable for other people" when it has an appropriate amount of testing, when it's appropriately maintainable, when it has appropriate internal and external documentation, when it has appropriate packaging, and....

"Appropriate" here depends on a lot of circumstances, but there's the risk mentioned in the title. You have to identify what's appropriate. Will anyone else ever install the software? If so, then you need to make sure that they can. Will anyone ever have to maintain the software? Then you need to make sure that that's possible too.

It's easy to confuse "I've written the code, I just have to test it, package it, document it, clean it up, and review it, and then it'll work appropriately where you want it" for "I'm done". I like what my colleague James Shore says about Done Done; it's a good reminder that the act of writing software alone is far more than transcribing design ideas into code, or convincing a picky compiler to give you the right error message to help you fix a typo, or getting a web page to load on your development machine.

That's as important when you ask yourself if you've finished the current task as it is when someone else asks you when you'll finish.

I'm not suggesting that you lie -- far from it. I suggest instead that any estimate or prognostication or description of the completion of a project take into account the fact that you have more work to do when you save a file, even if you expect never to type another character in that file. There is no "done, except for...". There's only done.

I can't define what's "appropriate" for your project. Nor can the Perl community. Even so, the Perl community has converged on a handful of practices and mechanisms to measure and to encourage appropriateness.

You can see this principle at work when you look at the organic community standards enforced only by convention on the CPAN. CPANTS Kwalitee measures the packaging and installability of a distribution. CPAN Testers gathers information about the measurable correctness of code. Every distribution gets its own bug and request queue on rt.cpan.org.

You could do a lot worse than to adopt similar guidelines.

Another source of risk in software development is the combination of the likelihood that something will have to change and the effort required to make that change. I use tests to drive the design of my software, which has the nice features of building a comprehensive test suite for the intention of the code and of ensuring that I design my code in a testable way.

That risk may be low, and that's fine. Remember, though, that code has a way of sticking around far longer than you anticipate.

If you want to be able to maintain your code later, you have to be able, right now, to look beyond the mechanics of how you tell the computer what to do. Source code does that, yes, but that's not its primary purpose. The primary purpose is communicating what you intend to do to other programmers.

For design in the large, increasing testability usually means improving genericity, writing to well-defined interfaces, and removing coupling between unrelated systems. Yet if you, as a novice, have to worry on your own about design in the large, something is very wrong with your development process and you have problems beyond which tips for novices can help.

You can manage these risks in the small, too.

The risk of maintenance is twofold. First, you can write messy code with false trails and vestigial code and unclear logic and poor factoring. No language prevents this. You need to develop good taste for organizing, designing, and implementing code. Second, you can write ill-thought code which crams together clever tricks and side effects and strange idioms but serves to obscure the intent of the code.

The latter happens less often than the former, despite so much of the software development culture brandishing pitchforks, torches, and languages with enforced indentation against it. One wonders at their priorities.

In general, less code is better than more code. I saw some code today which copied arrays around, pushed onto arrays, pulled values from a hash, and then finally inserted a reference to that array into a hash. The corresponding idiomatic Perl 5 reduced six lines of code into one line of code -- and not a cryptic one-liner, either. Yes, you have to understand references and dereferencing and autovivification to understand that line of code, but anyone dealing with this code that didn't understand references or dereferences is already in trouble.
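The code in question isn't reproduced here, so the following is only a hypothetical reconstruction of that kind of rewrite, with made-up data (%long, %members_of, @new_names). The long way copies the array out and back in; the idiomatic way dereferences in place and lets autovivification create anything that doesn't exist yet:

```perl
use strict;
use warnings;

my @new_names = ( 'bob', 'carol' );

# The long way: copy the array out, extend the copy, store a reference back.
my %long = ( admins => ['alice'] );
my @existing;
if ( exists $long{admins} )
{
    @existing = @{ $long{admins} };
}
push @existing, @new_names;
$long{admins} = \@existing;

# The idiomatic way: one line, dereferencing in place.
my %members_of = ( admins => ['alice'] );
push @{ $members_of{admins} }, @new_names;

# If the key is missing, autovivification creates the anonymous array.
push @{ $members_of{guests} }, 'dave';

print join( ', ', @{ $long{admins} } ),       "\n";    # alice, bob, carol
print join( ', ', @{ $members_of{admins} } ), "\n";    # alice, bob, carol
print join( ', ', @{ $members_of{guests} } ), "\n";    # dave
```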

In general, you can successfully replace structural code with functional code, with no loss of clarity. While understanding the mechanics of what Perl should do seems more important to novice programmers, the intent of why that behavior is necessary is more important to maintenance.

In general, you can safely assume that the first question people will ask when maintaining a piece of code (a function, a module, a class, a program) is "What does this need to do?" Your tests can answer that with regard to details. How you write the code (and comments and inline documentation) answers that with regard to purpose and intent.

In general, there's no reason to use clever tricks when simpler code will do. There's one reason to use a clever trick, and that's when there's really no other way to accomplish what you need to accomplish and when you've abstracted away the clever trick into a well-encapsulated black box with appropriate documentation and warnings and comprehensive tests. This is less often necessary than you think. (As a quick example, Moose has obviated almost every need I've ever had for even knowing how some of these dirty tricks work in Perl 5, and I'm glad of it.)

If you want to be able to maintain your software, you have to write maintainable software. This is a subtle point, bordering on tautology, but it's a necessary point that novices all too often overlook.

In my past five years of experience developing software, I've come to the conclusion that a primary component of writing great software is managing risk effectively.

Risk?

Suppose I want to bake a cake to take to a dinner party tomorrow night. I could pick a recipe based on its deliciousness, but I review the ingredient list and instructions and equipment beforehand so that I won't have to stop in the middle and rush out and buy something I forgot, potentially ruining the cake. I could base it on an ingredient I love, but Dave can't eat nuts. I could choose a recipe beyond my skill level, but I might fail and have to choose between arriving empty handed or with a prepackaged dessert from the supermarket. Yick.

Everyone makes similar judgments when writing software, even if we do so unconsciously. Consider the risk of failure. How do you know your program works?

Me, I test software that matters. If it's a short program intended to explore an idea or a program I expect never to grow more than 30 lines, I might add debugging output or run a few manual tests. Otherwise, I blatantly use testing to drive development, in design, implementation, and maintenance.

Testing (and TDD) not only allows me to build a suite of automated tests corresponding to expected behavior, such that I can run those tests and verify whether any change altered that behavior, but the discipline of using testing to drive my development and design in the small also changes the way I think. Instead of thinking merely "This function has to return a particular type of value when it succeeds," I think "What else can it return?" Because I write a test first, I think about how other people might call the function. I think about how other people might abuse the function, on purpose or by accident.

Where possible, I make the API difficult to misuse.

When that's not possible, I evaluate the risk of misuse -- its likelihood and consequences -- and consider the cost of preventing that misuse.
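A small sketch of that habit, using the core Test::More module. The function average() is hypothetical and stands in for whatever you are designing. The tests cover the expected behavior and then the "what else can it return?" case, and the function dies loudly on misuse instead of quietly returning something odd:

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test: the average of a list of numbers.
# It dies on an empty list rather than dividing by zero.
sub average
{
    my @numbers = @_;
    die "average() needs at least one number\n" unless @numbers;

    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;
}

# The expected behavior ...
is( average( 2, 4, 6 ), 4, 'average of several numbers' );
is( average(5),         5, 'average of one number' );

# ... and the misuse case.
my $survived = eval { average(); 1 };
my $error    = $@;
ok( !$survived, 'an empty list is an error, not a silent undef' );
like( $error, qr/at least one number/, '... with a useful message' );

done_testing();
```

Whether dying is the right response depends on the cost of misuse in your project; the point is that writing the test first forces the question.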

Perhaps TDD isn't the only way to do this. It's the most effective way I've found, but it may not be the only way. Even so, I believe that identifying and understanding the risk of failure will help novice programmers improve their skills.

The primacy of using an API over writing an API changes my state of mind. It's easier for me to consider these risks when I start from the question "How will people use this API?" instead of "How do I implement this code?"

How Perl Happens

Jane is your average programmer. Jane works with Perl in her job. She's fortunate; her company knows how to use CPAN effectively. Whenever she needs to do something new, she searches CPAN for an appropriate distribution, evaluates it, and -- more often than not -- uses it to solve her problem.

Today Jane has another problem. There's a bug somewhere. She runs her tests and narrows it down to some new code she's written in the past week.

Her debugging leads her to a CPAN module she added a few days ago. With a little more work, she discovers that one particular method doesn't handle an edge case that her code provides. Fortunately, that module's page on search.cpan.org has a link to a public source code repository. She can check out the latest version and see if that solves her problem.

Today, it doesn't -- but that's okay. The CPAN page also explains where the author prefers to receive bug reports. Now Jane has options. She can write a test with her example and attach it as a patch to apply to the repository. She can write a general explanation and see if that triggers something in the author's mind.

Jane's motivated, today. She writes a test. She peeks inside the module's implementation and realizes that a single line of code will make the test pass. She writes that code -- and all of the tests pass.

She submits the bug report and her patches to CPAN's bug tracker. The author applies the patch and releases a new version of the distribution on the CPAN and the world is a little bit better for a little bit of cooperation between a couple of motivated people.

This is how Perl happens.

The Perl 5 core? Thousands of people have sent one or more little patches to correct a typo or add a new feature or update a core module or improve the documentation or fix a bug or speed up an existing feature or to make it work on a new platform. That's thousands of volunteers, motivated by the desire to make the world a little bit better -- sometimes for themselves or their employers and sometimes for other people.

The CPAN? Thousands of developers have donated code written for whatever purpose so that other people can use it freely.

Perl web sites? Perl mailing lists? Perl documentation and tutorials? The same.

None of this is by accident. Improvements happen because they're necessary and useful, but mostly because someone decides to take the Perl motto of JFDI (Just Do It, Friend!) to heart.

That goes for Perl 5, Perl 1, Perl 6, Parrot, CPAN, Catalyst, RT, Moose, CPAN Testers, and whatever other project in the Perl world you care to mention. With people willing to invest an hour or two to make even a tiny improvement, we make big progress.

Just do it, friend.

From Novice to Adept: Pronouns in Perl

One of the persistent criticisms of Perl is that it has too many implicit variables. These are variables on which certain built in operations work if you don't provide explicit targets.

The funny thing about that criticism is that anyone who can read and understand the previous two sentences appropriately can understand Perl 5's two pronouns.

It in Perl 5

The first -- and most used -- pronoun in Perl 5 is $_. This is the default variable. You may also hear it called the topic variable. A few operations set it. Several operations operate on it. Whenever you see $_, read it as "it".

For example, when you iterate over a list without an explicit iterator, Perl internally aliases $_ to the current iteration value:

for (1 .. 10)
{
    say;
}

This example shows off the implicit use of $_ as well; say with no arguments operates on the current value of $_. You can write this explicitly if you want (say $_;), but idiomatic Perl eschews explicit mention of the topic unless necessary.
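"Aliases" is the important word. As a brief sketch (the @prices array is made up for illustration), $_ is not a copy of the current element but another name for it, so changing "it" changes the array:

```perl
use strict;
use warnings;
use feature 'say';

my @prices = ( 10, 20, 30 );

# $_ is an alias, not a copy: modifying it modifies the array element.
$_ *= 2 for @prices;

say "@prices";    # 20 40 60

# A named lexical iterator aliases in exactly the same way.
for my $price (@prices)
{
    $price += 1;
}

say "@prices";    # 21 41 61
```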

An interesting technique is deliberately aliasing $_ to a variable, with a for loop over that single variable, to enable implicit reference:

for ($some_var)
{
    /foo/ and do { ... };
    /bar/ and do { ... };
    /baz/ and do { ... };
}

Perl 5.10's given/when construct cleaned that up:

use Modern::Perl;

given ($some_var)
{
    when (/foo/) { ... }
    when (/bar/) { ... }
    when (/baz/) { ... }
}

A little bit of extra syntax in the form of new keywords clarifies the intent of the code.

These in Perl 5

The other pronoun in Perl 5 is @_, for these or them. Most of the uses of this pronoun in modern Perl are in handling function parameters. You can get at the invocant of a method with:

sub my_method
{
    my $self = shift;
    ...
}

... though it's often more succinct to handle multiple parameters with list assignment:

sub my_other_method
{
    my ($self, $name, @args) = @_;
    ...
}

Unlike $_, few built in operators operate on or set @_ implicitly. You can refer to it implicitly if you use the goto idiom for tail call elimination or if you use Perl 4-style function calls... but don't do either one.
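To see why the Perl 4-style call deserves the warning, here is a short sketch with hypothetical function names. Calling a function with a leading & and no parentheses hands the callee the caller's own @_, which is rarely what a reader expects:

```perl
use strict;
use warnings;

sub show_args { return join ',', @_ }

sub modern
{
    return show_args();    # explicit empty argument list
}

sub perl4_style
{
    return &show_args;     # no parentheses: the callee sees the caller's @_
}

print "modern:      '", modern( 1, 2, 3 ),      "'\n";    # ''
print "perl4_style: '", perl4_style( 1, 2, 3 ), "'\n";    # '1,2,3'
```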

A Language's Tools (CPAN versus IDEs)

Some people claim that dynamic languages are too difficult to manage without using powerful IDEs. One theory is that a lack of static typing means that static analysis is insufficient to write and to manage great gobs of code.

I look at the problem differently. I don't want to manage great gobs of code. I want to use small, sharp, flexible, well-designed tools that I can fit together in a coherent form.

I wrote about recursion and tail call elimination in the Modern Perl book yesterday. As I was doing so, I realized that I should at least look at Yuval Kogman's new Sub::Call::Tail and Sub::Call::Recur.

Granted, I do have root privileges on my local machine (though with local::lib that doesn't matter anymore) and I have the suitable development tools installed (gcc, make, the Perl headers, system headers), and I have my CPAN client installed -- but installing these distributions to test was almost trivial:

$ cpan Sub::Call::Tail

In less than the time it took to read the documentation on search.cpan.org, the CPAN client prompted me for a sudo password. I changed three lines in my example code and ran the tests and everything passed.

You read that right. It took fewer than five minutes from deciding to install a language extension to using that extension productively. Yes, my example was small, but my confidence in the code and the ecosystem is high because CPAN has been a fundamental part of the professional, modern Perl experience for over a decade.

I don't have to care about installation paths or mirror selection. I don't have to care about having the right dependencies. I don't have to care about losing out on the latest version or wondering if the code works right on my machine. The tools are great.

(Caveat: sometimes I do have to worry about those things, but those are the exceptions which demonstrate how amazing it is that a system begun 14 years ago with tens of thousands of modules written by thousands of authors works together so well.)

Part of this is a testament to the Unix philosophy. Sub::Call::Tail does one thing and does it well. There's almost no interface. There's little to learn. It's a very simple language extension that, if it behaves properly, should only make your code clearer and somewhat more efficient (if you even need to measure it).

Part of this is that Yuval is a good, careful coder with good design sense. See also the first point.

Part of this is also that I have more than a decade of practical experience with Perl 5, so I know that I can trust Yuval and I know how to wrangle CPAN to do what I want. Then again, the CPAN ecosystem has improved so much even in the past three years that that ease doesn't come only from practical experience.

Could an IDE replicate the CPAN experience? I suppose there could be a graphical IDE to browse and install CPAN distributions in the same way that some people find Synaptic easier to use than aptitude. I'd like to see such a thing, personally -- but make no mistake, that's just an interface over what makes CPAN work: the CPAN distribution system, CPAN metadata, CPAN Testers, CPANTS, community standards, and the realization among effective Perl programmers that programming modern Perl effectively means making the most of what's available on the CPAN.

My productivity doesn't depend on an IDE writing code for me or performing automatic refactorings. (I have no problem with the latter.) My productivity depends on reusing great code, most of it code I didn't have to write. If you overlook the CPAN tools because they don't come with a splash screen or a point and drag installer, you're overlooking one of the most powerful benefits of modern Perl.

My recommendation against explicit scalar in Embracing Idioms gathered some attention. "It's clearer to read," said some people. "It's more obvious what I mean," said others.

The best advice I can give any fledgling Perl 5 programmer is to read anything Mark Jason Dominus writes, especially his Higher Order Perl. That book may be daunting, so for now start with Ovid's journal entry about synthetic classes. He quotes MJD, who responds himself:

I realized after a couple of years that architects have ... "structural" and "functional" elements.

One of the characteristics of novice code is that it focuses on structural elements. When someone says "You write Perl like a C programmer", it often means that you're doing too much work in your loops:

for (my $i = 0; $i < @elems; ++$i)
{
    my $elem = $elems[$i];
    say "$elem";
}

Compare that to the Perlish use of iteration:

for my $elem (@elems)
{
    say $elem;
}

... or the postfix iteration, implicit topic form:

say for @elems;

Sometimes you can even get away with the version which exploits list context:

say @elems;

(Though note that the current output field separator, the $, variable, has a greater effect on the functional equivalence of the final version: by default, say @elems runs every element together on a single line.)
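A minimal sketch of that difference, assuming Perl 5.10 or newer for say. The capture() helper and the sample data are hypothetical and exist only to make the output easy to compare:

```perl
use strict;
use warnings;
use feature 'say';

my @elems = qw( alpha beta gamma );

# Hypothetical helper: run a code reference and return whatever it
# printed to the currently selected filehandle.
sub capture
{
    my $code   = shift;
    my $output = '';

    open my $fh, '>', \$output
        or die "Cannot open in-memory handle: $!";

    my $old_fh = select $fh;
    $code->();
    select $old_fh;
    close $fh;

    return $output;
}

# One element per line.
my $per_element = capture( sub { say for @elems } );

# All elements run together, then a single newline.
my $list_form   = capture( sub { say @elems } );

# Setting the output field separator restores the per-line output.
my $with_sep    = capture( sub { local $, = "\n"; say @elems } );
```

Here $per_element and $with_sep hold the same text, while $list_form holds every element on one line.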

In this case, the structure of the iteration is less important than the function of printing every element of the array. To people learning Perl, the multiple seemingly equivalent options may seem overwhelming, even a source of fragmentation. Experienced Perl programmers can choose the form which best expresses the intended function while minimizing synthetic code.

Perhaps it's clearer to explain why map and grep are important.

Perl 5's map iterates over a list and produces a list. That's it. You can produce a list of the first ten square numbers with the code:

my @squares = map { $_ * $_ } 1 .. 10;

The equivalent iteration version might be:

my @squares;

for my $i ( 1 .. 10 )
{
    push @squares, $i * $i;
}

If you're not familiar with map, the first version may seem inscrutable and the second comforting. Yet compare the amount of structural code in the second example. The iteration is explicit. Creating list elements is explicit. Creating the resulting list is explicit. Extending the resulting list is explicit.

Similarly, grep removes synthetic code:

my @primes = grep { is_prime( $_ ) } @maybe_primes;

... versus:

my @primes;

for my $maybe_prime (@maybe_primes)
{
    next unless is_prime( $maybe_prime );
    push @primes, $maybe_prime;
}
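To run both versions side by side you need some is_prime(). The trial-division implementation below is an assumption for illustration only, as is the sample input; any predicate would do:

```perl
use strict;
use warnings;

# Hypothetical helper: trial division, adequate for small integers.
sub is_prime
{
    my $n = shift;
    return 0 if $n < 2;

    for my $divisor ( 2 .. int( sqrt( $n ) ) )
    {
        return 0 if $n % $divisor == 0;
    }

    return 1;
}

my @maybe_primes = 1 .. 20;

# The functional form: only the predicate is visible.
my @primes = grep { is_prime( $_ ) } @maybe_primes;

# The structural form: iteration, test, and accumulation are all explicit.
my @loop_primes;

for my $maybe_prime (@maybe_primes)
{
    next unless is_prime( $maybe_prime );
    push @loop_primes, $maybe_prime;
}

# map and grep compose without any intermediate bookkeeping.
my @prime_squares = map { $_ * $_ } grep { is_prime( $_ ) } @maybe_primes;
```

Both @primes and @loop_primes end up with the same elements. The composed version shows where the functional style pays off: adding a step means adding one expression, not another loop and another temporary array.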

Perl 6 takes this further by introducing a concept of metaoperators (see Perl 6 Synopsis 3), which allow the use of any existing operator -- built in or user-defined -- to reduce a list, to distribute over one (in parallel, even), and more.
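Perl 5 has no metaoperators, but the core List::Util module's reduce gives a flavor of the reduction half of the idea. This is only a rough Perl 5 analogue, not how Perl 6 implements it. Where Perl 6 would write [+] @numbers, Perl 5 can write:

```perl
use strict;
use warnings;
use List::Util qw( reduce );

my @numbers = 1 .. 5;

# reduce folds the list pairwise; $a is the running value, $b the next element.
my $sum     = reduce { $a + $b } @numbers;
my $product = reduce { $a * $b } @numbers;

# Any two-argument behavior works, not only arithmetic operators.
my $longest = reduce { length( $a ) >= length( $b ) ? $a : $b }
    qw( map grep reduce );
```

The function, "combine these with this operation", stays in the foreground. The accumulator variable and the loop are gone.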

This is not to say that it's bad or wrong to use structural code, especially as a novice. Learning to program takes work. Learning syntax and design and the interaction of symbols in symbolic computation is complex. Even so, one of your goals should be to reach the point where you can evaluate syntax and idioms and choose the ones which best clarify the intent and function of your code, rather than the ones which emphasize the structure of how you accomplish that goal.

I write a lot of little conversion programs. They take a command-line argument or two, loop over a series of files, read them, convert them, manipulate them, mangle them, then write them out elsewhere. It's the Unix filter pattern (and, one might argue, the functional programming pattern).

These programs tend to be, at most, 100 lines of code, with generous whitespace.

I often start by writing a couple of helper functions, one to find the names of all of the interesting files, one to read a file, one to process the input names into output names, and one to write a file. I should abstract this whole process into a reusable framework, but I haven't figured out the appropriate genericity yet.

The important point is that I start with names:

my $scenes = get_scene_list();

for my $chapter (get_chapter_list())
{
    my $text = process_chapter( $chapter, $scenes );
    write_chapter( $chapter, $text );
}

die( "Scenes missing from chapters:", join "\n\t", '', keys %$scenes )
    if keys %$scenes;

exit;

sub get_chapter_list { ... }

sub get_scene_list { ... }

sub process_chapter { ... }

sub read_scene { ... }

sub write_chapter { ... }

This particular program is 88 lines of code, with copious whitespace and BSD-style brace placement. There's no functional reason to write it with functions instead of straight-line code that operates on global variables. There's only aesthetic practicality.

Names matter.

I don't have to show you the contents of any of these functions because their names describe what they do. They don't tell you how they do what they do, but you can get a sense of the organization and intent of the code by reading the simple control flow here.
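A minimal sketch of the same shape, small enough to show whole. Every helper name here is hypothetical, and an in-memory hash stands in for files on disk. It is not the program described above, only an illustration of the read, convert, write pattern with named steps:

```perl
use strict;
use warnings;

# Hypothetical input: file names mapped to their contents.
my %files = (
    'intro.txt'   => "hello world\n",
    'outline.txt' => "names matter\n",
    'notes.bak'   => "ignore me\n",
);

# Hypothetical output store, standing in for files written elsewhere.
my %written;

for my $name (get_interesting_names( \%files ))
{
    my $text = convert_text( $files{$name} );
    write_output( \%written, output_name_for( $name ), $text );
}

sub get_interesting_names
{
    my $files = shift;
    return sort grep { /\.txt\z/ } keys %$files;
}

sub convert_text
{
    my $text = shift;
    return uc $text;
}

sub output_name_for
{
    my $name = shift;
    ( my $output = $name ) =~ s/\.txt\z/.out/;
    return $output;
}

sub write_output
{
    my ($store, $name, $text) = @_;
    $store->{$name} = $text;
}
```

The loop at the top reads the same whether convert_text() upper-cases its input or does something far more involved. The names carry the intent.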

Very little novice code I've seen makes the attempt at organized, named structure. I can appreciate that design and abstraction and factoring are all skills learned through practice just as much as is programming effectively in a language. Even still, these are important skills to learn.

I've heard a lot of people try to explain subroutines as "Pieces of reusable code". That's wrong; I think the Forth and Lisp and domain-driven design people have it right here. A subroutine is a way to name a set of behavior. It's an abstraction. Being able to identify and name individual sets of behavior is essential to being able to solve problems well.

Thinking in terms of sets of behavior -- individual units of behavior -- is essential to programming well.

About this Archive

This page is an archive of entries from November 2009 listed from newest to oldest.
