I'm working on a small project today. Part of that project requires fetching syndication feeds and enqueueing further work if those feeds have new items. That means detecting whether those feeds have new items, and it also means polling the sites with those feeds frequently.
These are simple, well-understood problems, with well-understood solutions.
I also want to skip processing if the feeds remain the same between fetches, so I want to use LWP::UserAgent::WithCache, which sends conditional request headers such as If-None-Match to check for modifications.
Unfortunately, the caching behavior and the robots.txt-respecting behavior both live in subclasses of LWP::UserAgent, and each subclass expects to sit at the same level of the complex inheritance hierarchy that forms all of LWP in Perl.
Here is the object lesson for people designing software: if you intend other people to reuse your software as components, in ways you can't predict, remove as many unnecessarily hard-coded dependencies as possible.
If I were to redesign this part of LWP, I'd make the caching behavior and the robots.txt-respecting behavior into separate behaviors, perhaps runtime roles. I'd rewrite the LWP::UserAgent constructor to use a plugin system, where instantiators could provide an optional (and ordered) list of behaviors with which to decorate the $ua object. Obviously the behavior I need is first to check the cached copy and then check the robots.txt rules and then use normal HTTP access, but why hard-code these behaviors?
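To make the idea concrete, here is a rough sketch of what an ordered list of behaviors might look like. None of this is LWP's actual API: `plain_fetch`, `with_cache`, `with_robot_rules`, and `build_ua` are hypothetical names, responses are plain hashes rather than HTTP::Response objects, and the "HTTP access" is a stub so that the example runs without a network. The only point is that each behavior wraps the next one, and the caller chooses the order.

```perl
use strict;
use warnings;

# Hypothetical stand-in for normal HTTP access (no network involved).
sub plain_fetch {
    my ($url) = @_;
    return { url => $url, status => 200, via => ['http'] };
}

# Behavior: answer from the cache when possible, otherwise delegate.
sub with_cache {
    my ($cache) = @_;
    return sub {
        my ($next) = @_;
        return sub {
            my ($url) = @_;
            if (my $hit = $cache->{$url}) {
                return { %$hit, via => ['cache'] };
            }
            my $res = $next->($url);
            $cache->{$url} = $res if $res->{status} == 200;
            return $res;
        };
    };
}

# Behavior: refuse URLs under disallowed prefixes (a toy robots.txt check).
sub with_robot_rules {
    my (@disallowed) = @_;
    return sub {
        my ($next) = @_;
        return sub {
            my ($url) = @_;
            for my $prefix (@disallowed) {
                return { url => $url, status => 403, via => ['robots'] }
                    if index($url, $prefix) == 0;
            }
            my $res = $next->($url);
            unshift @{ $res->{via} }, 'robots';
            return $res;
        };
    };
}

# Wrap the base fetcher in each behavior; the first one listed is outermost.
sub build_ua {
    my (@behaviors) = @_;
    my $handler = \&plain_fetch;
    $handler = $_->($handler) for reverse @behaviors;
    return $handler;
}

my %cache;
my $ua = build_ua(
    with_cache(\%cache),
    with_robot_rules('http://example.com/private/'),
);

my $first  = $ua->('http://example.com/feed.xml');
my $second = $ua->('http://example.com/feed.xml');
my $denied = $ua->('http://example.com/private/feed.xml');

print join(',', @{ $first->{via} }),  "\n";   # robots,http
print join(',', @{ $second->{via} }), "\n";   # cache
print $denied->{status}, "\n";                # 403
```

Swapping the two arguments to `build_ua` would check the robots rules before the cache instead, with no change to either behavior. That is the flexibility a fixed inheritance hierarchy takes away.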
There are plenty of mechanisms (CLOS method modifiers, the Decorator pattern, dependency injection) to work around this problem, but for now my solution is to subclass
LWP::UserAgent::WithCache, override its constructor, and manually inherit from
(For all of the faults of Java's IO model, it handles this problem well. Its defaults are awful, and it exposes too much complexity, but the Decorator pattern works effectively. PerlIO works in a similar fashion with much better defaults. This HTTP fetching problem is in the same category; note how a similar model could handle proxying, compressed output, anonymizers, and filtering with ease.)
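As a small illustration of the PerlIO comparison, the sketch below decorates an already-open filehandle with an encoding layer at runtime. It writes to an in-memory buffer so that it runs anywhere; the variable names are only for this example.

```perl
use strict;
use warnings;

# Open a plain, undecorated handle that writes into a scalar.
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";

# Push an encoding layer onto the existing handle; the code that
# prints to $fh does not need to know the layer is there.
binmode $fh, ':encoding(UTF-8)';

print {$fh} "caf\x{e9}\n";    # five characters, one of them non-ASCII
close $fh or die "Cannot close in-memory handle: $!";

# The buffer holds encoded bytes: e-acute takes two bytes in UTF-8.
printf "%d bytes\n", length $buffer;    # 6 bytes
```

The handle's consumer and the layer are independent of each other, which is the property the caching and robots.txt behaviors lack.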