The Overhead of a Class

The only problem you can't solve by adding another layer of abstraction is the problem of too many layers of abstraction.

When I write code, I try to write simple code. I don't mean baby code (though it's okay if you're just learning to program). I mean code that does what it needs to do, is easy to understand if you know what it needs to do, and is easy to maintain if you understand the problem.

Everything should be in the right place. Everything should have a meaningful name. The organization should make sense and should suggest how to make meaningful and necessary changes.

(I usually have to let the architecture of a system emerge through guided trial and error and a couple of rounds of refactoring before I'm satisfied.)

I write a lot of Perl. Perl's very effective at allowing my projects to evolve in almost every way.

Almost.

Like most programmers I know, I struggle with the idea of primitive obsession. This might be more prevalent in languages with dynamic typing than in languages with good static type systems. (Jim's example uses Java, one of the worst of all possible worlds, as you'll see. C is worse in this example.)

In simple terms, primitive obsession is what happens when we say "Oh, someone's name is just a string of two words" instead of representing a name with something that understands all of the information a name can contain (do you have a family name? a formal name? a title? a middle name? a multi-word first or last name? no last name? a cultural name distinct from a legal last name? a cultural or political organization of names in non-Western order?) and all of the operations you can perform on a name (casing, changing, sorting, searching, normalizing).

Primitive obsession is what happens when we say "I need to store a date, and as an American it's obvious that dates are always of the form MM/DD/YYYY" or "Sure, they'll have computers in 2000, but those extra two digits are super expensive right now."
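
To make the date case concrete, here's a minimal sketch using the core Time::Piece module. The sample string and variable names are my own illustrative assumptions; the only point is that a bare string carries no record of which convention produced it, while a parsed object does.

```perl
use strict;
use warnings;
use Time::Piece;

# The primitive: nothing in the string says which field is the month.
my $raw = '03/04/2012';

# Two readers with different assumptions get two different dates.
my $american = Time::Piece->strptime( $raw, '%m/%d/%Y' );
my $european = Time::Piece->strptime( $raw, '%d/%m/%Y' );

print $american->ymd, "\n";    # 2012-03-04
print $european->ymd, "\n";    # 2012-04-03

# Once parsed, the object knows its own fields and supports real operations.
my $next_day = $american + 24 * 60 * 60;    # add one day's worth of seconds
print $next_day->ymd, "\n";    # 2012-03-05
```

The string version can't answer "what's the next day?" without first guessing at a format. The object can.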

Perl exacerbates this with syntax. (Bet you're surprised to see me write that.)

For better or worse, as with many modern languages, the best way to create abstractions over data and behavior is to create a class, and this is where Perl occasionally gets in my way:

package Some::Class
{
    use Moose;

    has [qw( some_attribute another_attribute )], is => 'ro', lazy_build => 1;

    sub _build_some_attribute    { ... }
    sub _build_another_attribute { ... }

    sub some_method              { ... }

    __PACKAGE__->meta->make_immutable;
};

Given all of that code necessary to create a new class (and thanks to Moose it's much less than it could be and much better than I would normally write by hand), I far too often say "It's simpler to use the primitive here because I can refactor it to a class later". Remember also that adding a new class means adding a new file and loading it, or dealing with the order of compilation (did I mention the advantages of declarative syntax yet?) when adding a new class inline. Yes, some of those reasons are semantic and not syntactic, but don't overlook the syntax.
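
For comparison, here's a rough sketch of what a similar class might look like written by hand in core Perl. The package name, the placeholder builder values, and the body of some_method are assumptions made up for illustration. It also leaves out things Moose provides, such as the predicate and clearer methods that lazy_build generates.

```perl
use strict;
use warnings;

package Some::Class::ByHand;

# Constructor: accept named arguments and store them in a blessed hash.
sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Read-only accessors with lazy builders, each written out by hand.
sub some_attribute {
    my $self = shift;
    $self->{some_attribute} = $self->_build_some_attribute
        unless exists $self->{some_attribute};
    return $self->{some_attribute};
}

sub another_attribute {
    my $self = shift;
    $self->{another_attribute} = $self->_build_another_attribute
        unless exists $self->{another_attribute};
    return $self->{another_attribute};
}

# Placeholder builders; real ones would compute something useful.
sub _build_some_attribute    { return 'default value' }
sub _build_another_attribute { return 42 }

# Placeholder method that uses both attributes.
sub some_method {
    my $self = shift;
    return join ' / ', $self->some_attribute, $self->another_attribute;
}

package main;

my $object = Some::Class::ByHand->new( another_attribute => 7 );
print $object->some_method, "\n";    # default value / 7
```

Every new attribute costs another near-identical accessor, which is exactly the kind of friction that makes the primitive look attractive.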

That decision doesn't always cause problems in the future, but it causes enough problems that it's a risk. (DBIx::Class deserves tremendous credit for including DBIx::Class::InflateColumn::DateTime as a core module. Only heroes get date and time calculations right, and Dave Rolsky is a hero for that and countless other thankless reasons.)

I conclude from this a few lessons:

  • The overhead of declaring a class is still higher than I would like
  • I am tremendously lazy and bad about predicting the future
  • Abstraction is costly in terms of design, but it often is a good investment

(I idly wonder how someone might design a functional approach to the same problem, and then I get lost in the question of how to declare closures and anonymous functions with an attractive syntax.)
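
One possible shape for that functional approach, offered only as a sketch: a constructor function that returns a hash of closures over private data. The make_date function and its keys are hypothetical names invented for this example, not an established idiom or library API.

```perl
use strict;
use warnings;

# Hypothetical constructor: returns a hash of closures over private data.
sub make_date {
    my ( $year, $month, $day ) = @_;

    return {
        year   => sub { $year },
        month  => sub { $month },
        day    => sub { $day },
        as_iso => sub { sprintf '%04d-%02d-%02d', $year, $month, $day },

        # "Modify" by building a new value rather than mutating this one.
        with_day => sub { make_date( $year, $month, shift ) },
    };
}

my $today    = make_date( 2012, 10, 1 );
my $tomorrow = $today->{with_day}->(2);

print $today->{as_iso}->(),    "\n";    # 2012-10-01
print $tomorrow->{as_iso}->(), "\n";    # 2012-10-02
```

The declaration is short, but the call syntax (`$today->{as_iso}->()`) shows why the attractive-syntax question is hard to escape.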

2 Comments

A recent project of mine (MooX-Struct) aims to eliminate some of that overhead. Classes are declared very concisely, are anonymous (thus mostly eliminating namespace management concerns) and have some vaguely sensible defaults (including some built-in methods and overloading).

e.g.

use MooX::Struct Date => [qw( $year $month $day )];

my $today = Date->new(year => 2012, month => 10, day => 9);
my $yesterday = Date[ 2012, 10, 8 ]; # even more concise
my $tomorrow = $today->CLONE(day => 10);

Being based on Moo, it should interoperate with Moose (fairly) seamlessly - e.g. you can apply Moose roles to objects.

In addition, using a single string to represent a person's name is actually just about the best way of doing things. Attempts to break it down into components like first name, middle name, surname invariably end up failing when you encounter people with traditional Chinese names (surnames first), mononyms, or even no name at all (e.g. very young infants).

There are very few technical reasons to even try to split up names into different fields. If you need to be able to sort on names, or address people by informal and formal names in different circumstances, these are reasons to have "formal_name", "informal_name", "sortable_name" fields, but not a reason to attempt to split a person's name up.
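
A minimal sketch of that purpose-specific layout, using plain hashes: each record stores the name whole in three forms instead of splitting it into parts. The field names come from the paragraph above; the sample people and the greet helper are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical records: each name is kept whole, once per purpose.
my @people = (
    {
        formal_name   => 'Ms. Ada Okafor',
        informal_name => 'Ada',
        sortable_name => 'okafor, ada',
    },
    {
        formal_name   => 'Li Wei',         # family name first
        informal_name => 'Wei',
        sortable_name => 'li wei',
    },
    {
        formal_name   => 'chromatic',      # a mononym needs no special case
        informal_name => 'chromatic',
        sortable_name => 'chromatic',
    },
);

# Sorting never has to guess which word is the surname.
my @sorted = sort { $a->{sortable_name} cmp $b->{sortable_name} } @people;
print "$_->{formal_name}\n" for @sorted;

# Hypothetical helper: pick the form that suits the circumstance.
sub greet {
    my ($person) = @_;
    return "Hi, $person->{informal_name}!";
}

print greet( $sorted[1] ), "\n";    # Hi, Wei!
```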

I would have thought that you, who go by the mononym "chromatic", would know all this. :-)

See also Falsehoods Programmers Believe About Names.

About this Entry

This page contains a single entry by chromatic published on October 1, 2012 11:23 AM.

Mentor-to-Hire for Perl Programmers was the previous entry in this blog.

Code Injection with eval require is the next entry in this blog.
