Blinded By Our Own Experiences


Programmers are optimists.

Programmers are notoriously bad at estimates. (Programmers are optimists.)

No program survives first contact with end users. (Programmers are optimists.)

No matter how much we try to predict the future or how many edge cases we can imagine, there's no substitute for the cold, hard crash of reality against our carefully constructed edifices. Two examples come to mind.

If you've written Perl before, spot the bug without running this program:

my %extensions_to_names =
(
    001 => 'Rodney',
    004 => 'Lucky',
    020 => 'Daisy',
    080 => 'Petunia',
    108 => 'Tuxie',
);

(Why do the animals have phone extensions? In a world of software telephony through Asterisk, the question becomes why not?)

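Sure, the code aligns prettily and uses three-digit phone extensions where it makes sense, but the four numbers with leading zeros are octal literals. One (080) is invalid, because 8 is not an octal digit, so the program won't even compile. One (020) isn't what you expect: it's decimal 16. The presence of the autoquoting fat comma doesn't fix this, because it only quotes barewords that look like identifiers. Oops.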

This is the kind of mistake you make without thinking about it because it never comes up. Then it comes up again. (Yes, the righter approach is to require an explicit base when writing literals in anything other than base 10, like 0b or 0x.)
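One way around the problem, sketched here as an illustration rather than a prescription, is to quote the extensions so that Perl treats them as strings instead of numeric literals. The `$surprise` variable is only there to show what the bare literal 020 turns into.

```perl
use strict;
use warnings;

# Quote the extensions so Perl stores them as strings, leading zeros and all.
my %extensions_to_names = (
    '001' => 'Rodney',
    '004' => 'Lucky',
    '020' => 'Daisy',
    '080' => 'Petunia',
    '108' => 'Tuxie',
);

# Unquoted, a leading zero makes the literal octal: 020 is decimal 16.
my $surprise = 020;
print "020 as a bare literal is $surprise\n";    # prints 16

for my $extension ( sort keys %extensions_to_names ) {
    print "$extension => $extensions_to_names{$extension}\n";
}
```

Phone extensions, like zipcodes, are identifiers that happen to be made of digits. Storing them as strings keeps the leading zeros and avoids the question of numeric base altogether.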

We make these mistakes because of the assumptions we bring to our data, and we bring those assumptions from our experiences. As Tom Christiansen says, "You don't come from somewhere whose zipcode begins with a 0."

One of the reasons I use a single-word nom de plume and never capitalize it is to see what breaks. (Answer: lots of stuff.)

What happens if you're Icelandic and your surname depends on the first name of your father and not his surname? (Happened to a friend of mine.)

What happens if you don't have a middle name?

What happens if you're from a culture with a different order of names? What do you put in the little boxes in some startup's web form?

It's not just names. It's not just Unicode. It's not just the one true way to express street addresses or telephone numbers. It's every expectation we make.

How do you calculate the growth rate of free cash of a company when it's negative for the first few years, then turns positive halfway through? (What percentage is positive one million dollars greater than negative seven million dollars?)
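To see why that question has no comfortable answer, here is a minimal sketch using the usual percentage change formula, (new - old) / old. Both helper names are hypothetical, and refusing to answer when the starting value isn't positive is just one possible policy, not the accepted way to handle it.

```perl
use strict;
use warnings;

# Hypothetical helper: the textbook percentage change, applied blindly.
sub naive_percent_change {
    my ( $old, $new ) = @_;
    return ( $new - $old ) / $old * 100;
}

# Hypothetical helper: returns undef when the starting value is not positive,
# because the formula's result has no sensible interpretation there.
sub percent_change_or_undef {
    my ( $old, $new ) = @_;
    return undef if $old <= 0;
    return ( $new - $old ) / $old * 100;
}

# Going from negative seven million to positive one million is an improvement,
# yet the naive formula reports a decline of about 114%.
printf "naive: %.1f%%\n", naive_percent_change( -7_000_000, 1_000_000 );

my $guarded = percent_change_or_undef( -7_000_000, 1_000_000 );
print "guarded: ", ( defined($guarded) ? $guarded : "not meaningful" ), "\n";
```

The naive version prints -114.3%, a negative growth rate for a company that went from losing money to making it. The assumption that the starting value is positive is baked into the formula, and nobody notices until the data violates it.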

As I see it, we have three overlapping pieces of solutions.

First, experience helps. As we do things wrong, find out why they're wrong, and fix them, we have the opportunity to learn things and do them less wrong in the future. Better yet, it doesn't even have to be your own experience.

Second, knowledge of the world outside of programming helps, if you pay attention. The more you know about the world and its complexities, the more opportunities you have to learn about what can go right and what can go wrong, or at least the kind of information you will eventually have to scrub and munge and mangle into correctness.

Third, the sooner you deliver software to real users to break things, the sooner you can revise and fix your faulty assumptions. Keep a notebook. (I deployed software to friends and family over the past couple of weeks. Even though I thought I made it easy to use, they're full of suggestions to make it easier. Also they have a knack for finding weird bugs.)

If nothing else, move somewhere with a weird postal code and change the representation of your name for a while. I suggest Unicode symbols with weird casing and at least one punctuation character. If nothing else, you'll get your text editor settings right.

6 Comments

I like this post. It reflects how our brains behave.

The octal issue is in bash too...

this_month=$(date +%m)
next_month=$(( (this_month == 12) ? 1 : this_month+1 ))
bash: 09: value too great for base (error token is "09")

Whoa! You test your program in the month of March, and you hit your bug a year later... (well, bash unit tests are not the common case).

The human race is thus limited. Our perception is very relative.

"Perception (from the Latin perceptio, percipio) is the process of attaining awareness or understanding of the environment by organizing and interpreting sensory information." -- wikipedia

"Relativism is the concept that points of view have no absolute truth or validity, having only relative, subjective value according to differences in perception and consideration." --wikipedia too

On February 29 this year (2012), the MS Azure cloud went down just because of leap years and date calculations. Who wants Y2K effects? We already have humans with Visual Studio.

We're all totally "Blinded By Our Own Experiences". Can't agree more.

Nothing much to say, just liked your post. Congrats.

Where I said March, I meant to say "September". Otherwise the bug is reached in less than a year.

“What happens if you don’t have a middle name?”

This is, btw, almost the same question as asking “what happens if you are European?”

The W3C has a nice writeup about personal names around the world; there are some strange (for us) conventions and cases there. Moreover, there is even a proposal for how to handle them.

I'm European and have a middle name. My wife has two middle names. Each of my children has a middle name.

The trick to storing names in a database is to not try to split them into multiple fields. After all, most organisations have very little actual need to know which part of my name was chosen by my parents, and which part has been inherited from ancestors.

There are essentially two reasons developers are tempted to split names into multiple fields:

1. They think they "should" - that it's the proper thing to do. They've seen it done in so many SQL tutorials that the idea has become ingrained; or

2. They hope to reassemble the name in different ways. I might be "Dear Toby" on their weekly newsletter, but "Mr Inkster" when they send me an invoice.

Actually, come to think of it, there's a third reason:

3. They think they're smart enough to do it properly. They are not. There will always be edge cases.

Just don't do it. Don't have "title", "given_name", "surname" fields. Just have a single "name" field.

Reason #2 is a valid use case for splitting names into fields. But we can cover reason #2 using another technique. Store multiple full names for each person. Perhaps have "formal_name" and "informal_name" fields. Depending on the complexity of your needs, you might even need more.

Of course customers might baulk at seeing "formal name"/"informal name" fields when they are signing up on your website. But a little creative client-side scripting to default the informal name from the formal name can actually result in a quite usable interface.

This benefits not just people from cultures other than your own, who might have different naming systems, but also probably people from your own culture. (Elizabeth Jones who prefers to be known as Libby.)
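A minimal sketch of that approach, assuming a hypothetical `new_person` constructor and the "formal_name" and "informal_name" fields suggested above. The names are stored whole, and the informal name defaults to the formal one, mirroring what the client-side scripting would do in a signup form.

```perl
use strict;
use warnings;

# Hypothetical constructor: stores whole names and never splits them into parts.
sub new_person {
    my (%args) = @_;

    my $formal = $args{formal_name};
    die "formal_name is required\n" unless defined $formal && length $formal;

    # Default the informal name from the formal name unless one was supplied.
    my $informal = $args{informal_name};
    $informal = $formal unless defined $informal && length $informal;

    return { formal_name => $formal, informal_name => $informal };
}

my $libby = new_person(
    formal_name   => 'Elizabeth Jones',
    informal_name => 'Libby',
);

# A single-word, lowercase name needs no special handling.
my $mononym = new_person( formal_name => 'chromatic' );

print "Dear $libby->{informal_name},\n";          # newsletter greeting
print "Invoice for $libby->{formal_name}\n";      # formal correspondence
print "Dear $mononym->{informal_name},\n";
```

Nothing in this code needs to know how many parts a name has or what order they come in, which is the point of the technique.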

"After all, most organisations have very little actual need to know which part of my name was chosen by my parents, and which part has been inherited from ancestors."

I agree with that in principle, but there is one place where I think many organizations would disagree: sorting. If I have a list of employees, or a list of customers, or a list of enrollees, I'm typically asked to sort them, and nearly always to sort them by surname. Now, whether that's actually a valid business need or not is debatable, but that's irrelevant, because enough business users think they want it that you're not going to get anywhere arguing with them about it.

So that presents a problem if you allow only a single field for name. You could have two fields ("full name" and "sorting name" or somesuch), but that involves duplicate data entry, unless you make a reasonable guess to prefill the "sorting name" and let the user correct it if it's wrong. (This is pretty much what the W3C link provided by jnareb proposes.)
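A rough sketch of that prefilling idea, with a hypothetical `guess_sorting_name` helper. The guess (move the last word to the front) is deliberately naive and will be wrong for plenty of names, which is exactly why the stored sorting name has to be something the user can override.

```perl
use strict;
use warnings;

# Hypothetical guess: move the last whitespace-separated word to the front.
# This is only a starting point for the user to correct, not a parser of names.
sub guess_sorting_name {
    my ($full_name) = @_;
    my @words = split ' ', $full_name;
    return $full_name if @words < 2;
    my $last = pop @words;
    return "$last, @words";
}

# Each entry is a full name plus an optional user-supplied correction.
my @entries = (
    [ 'Toby Inkster' ],
    [ 'chromatic' ],
    [ 'Elizabeth Jones' ],
    [ 'Mao Zedong', 'Mao Zedong' ],    # family name already comes first
);

my @people;
for my $entry (@entries) {
    my ( $full_name, $correction ) = @$entry;
    my $sorting_name = defined $correction ? $correction : guess_sorting_name($full_name);
    push @people, { full_name => $full_name, sorting_name => $sorting_name };
}

# Sort on the stored sorting name, ignoring case; display the full name.
my @sorted = sort { lc( $a->{sorting_name} ) cmp lc( $b->{sorting_name} ) } @people;
print "$_->{full_name}\n" for @sorted;
```

The list still sorts the way business users ask for, while the full name stays in a single field and is displayed exactly as the person entered it.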

About this Entry

This page contains a single entry by chromatic published on March 1, 2012 12:57 PM.
