Are Objects Black Boxes or Toolkits?


I'm working on a small project today. Part of that project requires fetching syndication feeds and enqueueing further work if those feeds have new items. That means detecting whether those feeds have new items, and it also means polling the sites with those feeds frequently.

These are simple, well-understood problems, with well-understood solutions.

I don't want to poll sites more frequently than they allow, so I'm happy to use LWP::RobotUA to fetch the feeds, as it respects the robots.txt protocol for well-behaved spiders.

I also want to skip processing if the feeds remain the same between fetches, so I want to use LWP::UserAgent::WithCache, which checks HTTP headers such as Last-Modified/If-Modified-Since and ETag/If-None-Match for modifications.

Unfortunately, both are subclasses of LWP::UserAgent, and both expect to occupy the same level of the complex inheritance hierarchy that makes up LWP in Perl, so I can't simply combine them.

Here is the object lesson for people designing software. If you intend other people to reuse your software as components, in ways you can't predict, remove as many unnecessarily hard-coded dependencies as possible.

If I were to redesign this part of LWP, I'd make the caching behavior and the robots.txt-respecting behavior into separate behaviors, perhaps runtime roles. I'd rewrite the LWP::UserAgent constructor to use a plugin system, where instantiators could provide an optional (and ordered) list of behaviors with which to decorate the $ua object. Obviously the behavior I need is to check the cached copy first, then check the robots.txt rules, and then use normal HTTP access, but why hard-code these behaviors?
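A minimal sketch of that idea, in Python rather than Perl and with no connection to LWP's real API: every name below (with_cache, with_robot_rules, build_agent, fake_transport) is hypothetical. The cache is a plain dictionary standing in for real Last-Modified/ETag checks, and the robot rules are a list of disallowed URL prefixes standing in for a parsed robots.txt. The point is only the shape: the caller supplies an ordered list of behaviors, and each one decorates the fetcher beneath it.

```python
from typing import Callable, Dict, List, Optional

# A "fetcher" maps a URL to a response body, or None when the request was
# skipped. All names in this sketch are hypothetical, not part of any library.
Fetcher = Callable[[str], Optional[str]]
Behavior = Callable[[Fetcher], Fetcher]


def with_cache(store: Dict[str, str]) -> Behavior:
    """Behavior: answer from `store` when possible, else delegate and remember."""
    def decorate(inner: Fetcher) -> Fetcher:
        def fetch(url: str) -> Optional[str]:
            if url in store:
                return store[url]
            body = inner(url)
            if body is not None:
                store[url] = body
            return body
        return fetch
    return decorate


def with_robot_rules(disallowed_prefixes: List[str]) -> Behavior:
    """Behavior: refuse URLs under a disallowed prefix (stand-in for robots.txt)."""
    def decorate(inner: Fetcher) -> Fetcher:
        def fetch(url: str) -> Optional[str]:
            if any(url.startswith(p) for p in disallowed_prefixes):
                return None
            return inner(url)
        return fetch
    return decorate


def build_agent(transport: Fetcher, behaviors: List[Behavior]) -> Fetcher:
    """Wrap `transport` so that behaviors[0] runs first (outermost)."""
    agent = transport
    for behavior in reversed(behaviors):
        agent = behavior(agent)
    return agent


# A fake transport so the sketch runs without network access.
calls: List[str] = []


def fake_transport(url: str) -> Optional[str]:
    calls.append(url)
    return f"<feed from {url}>"


cache: Dict[str, str] = {}
agent = build_agent(
    fake_transport,
    [with_cache(cache), with_robot_rules(["http://example.com/private/"])],
)

first = agent("http://example.com/feed.xml")            # reaches the transport
second = agent("http://example.com/feed.xml")           # served from the cache
blocked = agent("http://example.com/private/feed.xml")  # refused by the rules
```

Because the order lives in the list passed to build_agent and not in a class hierarchy, swapping the two behaviors or adding a third is a change at the call site rather than a new subclass.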

There are plenty of mechanisms (CLOS method modifiers, the Decorator pattern, dependency injection) to work around this problem, but for now my solution is to subclass LWP::UserAgent::WithCache, override its constructor, and manually inherit from LWP::RobotUA.

(For all of the faults of Java's IO model, it handles this problem well. Its defaults are awful, and it exposes too much complexity, but the Decorator pattern works effectively. PerlIO works in a similar fashion with much better defaults. This HTTP fetching problem is in the same category; note how a similar model could handle proxying, compressed output, anonymizers, and filtering with ease.)


Your blog really needs to get OpenID working for both Google and Blogger.

I've been thinking maybe someone should write a Moose wrapper for LWP.

And maybe some of the parts should be rewritten to do what you're suggesting? Would that be hard, or for some reason impossible?

I'd also like to say that I'm not sure how this is a 'blackbox' thing...

Well said. Sadly, this is an all-too-common problem in OO design.

I ran headlong into this exact problem with caching modules and ended up writing Cache::CacheFactory as a result.

Inheritance simply isn't a good model for extending functionality most of the time.

I think Ovid's talk about inheritance vs roles at the BBC covered this extremely well. It was almost enough to make me want to use Moose. (That's an unfair comment, I know...)

I think the link was:

Isn't working for me at the moment though.

The intent of a black box is that you never, ever look inside.




About this Entry

This page contains a single entry by chromatic published on May 10, 2010 3:12 PM.
