blog.rupamsunyata.org

Decklin’s excuse for some blogging software. Est. 2006.

Tap that class

And now, something entirely impractical.

I picked up Beginning Ruby by Peter Cooper the other day to look for some teaching material. Flipping through, I came across a basic feature that had heretofore escaped my notice: the Struct class. If you want to use a class for nothing more than bundling together a few values, you can create a Struct instead of writing out initialize and attr_accessor yourself. The idiom is like this:

class Server < Struct.new(:host, :port, :password)
  def to_s
    port == 6600 ? host : "#{host}:#{port}"
  end
end

That particular one's from Njiiri (which may clue you in that it took me several months to get around to writing this post). For educational purposes, it's nice: you can point out how we give a name to a class (I've also tried to explain assignment as "naming" rather than "storing"), that there's an unnamed class (classes are also objects!), and overriding a method (which uses the dynamically defined stuff), without any boilerplate to get in the way.
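For anyone who hasn't played with Struct either, here is a small sketch of what that definition buys you. The host names and password are made up for illustration; everything else is stock Ruby.

```ruby
class Server < Struct.new(:host, :port, :password)
  def to_s
    port == 6600 ? host : "#{host}:#{port}"
  end
end

local  = Server.new("localhost", 6600)             # password is left as nil
remote = Server.new("example.org", 6601, "sekrit")

puts local      # puts goes through our to_s: localhost
puts remote     # example.org:6601

remote.port = 6600   # the writers come for free as well
puts remote     # example.org

# The class in between Server and Struct is the unnamed one.
p Server.superclass.superclass   # Struct
```

The constructor, readers and writers all come from the anonymous class; Server itself only adds to_s.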

So it's tempting to use this in a lesson. Unfortunately, Ruby is a little bit weird here, and you run into those distracting "practical" issues about the particular language you're working in. Your classes can't subclass Class, and thus you can't say Foo.new like you can Struct.new. Struct is actually implemented in C (or Java, or whatever's native). So you have to do some handwaving.
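A quick way to see the problem for yourself (Pair and Foo are throwaway names, nothing more):

```ruby
# Struct.new hands back a brand-new class...
Pair = Struct.new(:left, :right)
p Pair.class        # Class
p Pair.superclass   # Struct

# ...but Class itself refuses to be subclassed, so the obvious route to
# writing our own "class of classes" is closed.
error = nil
begin
  class Foo < Class; end
rescue TypeError => e
  error = e
  puts "TypeError: #{e.message}"
end
```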

This is because the "metaclasses" we have in Ruby are implemented, so to speak, as singleton classes of objects (including class objects). I do not mean this as a criticism of Matz at all, but they seem like more of a serendipitous thing than an intentional design -- "hey, if we implement classes in this particular way, we get this sort of metaclassing automatically [1]." (The clearest explanation I've found of why this is so is in the first chapter of Advanced Rails by Brad Ediger.)

I wanted to be able to just subclass Class, rather than have all that fun power that we normally get to abuse in Ruby, only because I think this is the clearest way to explain the abstraction. Even I still have trouble describing singletons in plain English. Struct is intuitively a class of classes, and factors out similar/boring stuff -- a good practice. It could be an example of how to refactor some simple classes, if only we could follow it.

I decided to give up on the idea for my tutorial, but it kept me up at night. Ruby metaclasses clearly can do anything that the sort of Struct-like metaclasses I have in my mind -- "parametric" classes, if you will -- can do, and I can dynamically define just about anything; why not make it happen? We can instantiate Class itself, but the tool we have to shape that class is singleton methods. This can certainly be abstracted away. So, I whipped something up.
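Stripped of any helpers, the raw mechanism looks roughly like this. It is only a sketch with hypothetical names (greeter_factory, Hello, Goodbye), not the code this post builds up to: an instance of Class whose singleton new method returns further classes.

```ruby
# Class.new gives us an anonymous class; a singleton method on that object
# is what we would normally call a "class method".
greeter_factory = Class.new

greeter_factory.define_singleton_method(:new) do |greeting|
  # Each call builds another anonymous class, closed over `greeting`.
  Class.new do
    define_method(:greet) { |name| "#{greeting}, #{name}!" }
  end
end

Hello   = greeter_factory.new("Hello")
Goodbye = greeter_factory.new("Goodbye")

puts Hello.new.greet("world")     # Hello, world!
puts Goodbye.new.greet("world")   # Goodbye, world!
```

So greeter_factory.new plays the same role as Struct.new: you call it with parameters and get a class back.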

Before getting to it, though, let's visit a new construct in Ruby 1.9 and 1.8.7: Object#tap. Apart from the obvious debugging use described in its documentation, it makes it quite easy to factor out this pattern:

def gimme_a_thing
  thing = Thing.new
  thing.do_stuff_to_it
  thing
end

Into something closer to the style of functional or declarative programming:

def gimme_a_thing
  Thing.new.tap do |thing|
    thing.do_stuff_to_it
  end
end

(Well, "do stuff" is obviously still procedural, but A for effort.)

Which one is "better" could probably be the subject of much debate, but I really prefer the second one. Even though it has exactly the same effect, it looks like "what" rather than "how" (something else I try to beat into impressionable young heads :-)), which I think is easier to write tests for. I'm going to use it here, because it's shorter.
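The detail that makes this work is that tap always returns its receiver and throws away whatever the block evaluates to. A small sketch, with a stand-in Thing class invented purely so there is something to run:

```ruby
# Hypothetical Thing: do_stuff_to_it mutates the object and returns
# something unrelated, to show that tap ignores the block's value.
class Thing
  attr_reader :done

  def do_stuff_to_it
    @done = true
    :some_other_return_value
  end
end

def gimme_a_thing
  Thing.new.tap do |thing|
    thing.do_stuff_to_it   # return value is discarded by tap
  end
end

thing = gimme_a_thing
p thing.class   # Thing
p thing.done    # true
```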

Now, Object#tap passes the object as an argument to the block; in our case, we are going to define a metaclass, so we want to work with self instead. So we define a new version, class_tap -- by analogy with class_def, I suppose -- which class_evals the block rather than simply calling it:

class Class
  def class_tap(&blk)
    tap {|_self| _self.class_eval(&blk) }
  end
end
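A small usage sketch, assuming nothing beyond the definition above (Point is a hypothetical example class). Inside the block self is the class, so attr_accessor and def behave as they would in an ordinary class body, and the whole expression still evaluates to the class:

```ruby
class Class
  def class_tap(&blk)
    tap { |_self| _self.class_eval(&blk) }
  end
end

Point = Class.new.class_tap do
  attr_accessor :x, :y          # self is the new class in here

  def norm
    Math.sqrt(x * x + y * y)
  end
end

pt = Point.new
pt.x = 3
pt.y = 4
p pt.norm   # 5.0
```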

And to do the following trickery, we make use of MetAid, written by why the lucky stiff -- it's very small, so we could always just incorporate the bits we want into the code here, but this short file provides a common vocabulary for talking about metaclass stuff which is quite valuable.
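For reference, these are the helpers in question, reconstructed from memory and lightly modernized, so treat this as a sketch and check the real file before relying on it. Dog and its methods are hypothetical, just there to exercise the helpers.

```ruby
class Object
  # The hidden singleton class behind every object.
  def metaclass; class << self; self; end; end

  # Evaluate a block with self set to that singleton class.
  def meta_eval(&blk); metaclass.instance_eval(&blk); end

  # Define a singleton method (a "class method" when self is a class).
  def meta_def(name, &blk)
    meta_eval { define_method(name, &blk) }
  end

  # Define an ordinary instance method from a block.
  def class_def(name, &blk)
    class_eval { define_method(name, &blk) }
  end
end

class Dog; end

Dog.class_def(:speak) { "Woof" }   # instance method
Dog.meta_def(:stray) { new }       # singleton method; self is Dog

puts Dog.stray.speak   # Woof
```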

Now, we can write a method to create our new classes on the fly. Here's what I came up with:

require 'metaid'

class << Class
  def meta(_super=Object, &blk)
    new.class_tap do                     # 1
      meta_def :new do |*args|           # 2
        Class.new(_super).class_tap do   # 3
          class_exec(*args, &blk)        # 4
        end
      end
    end
  end
end

(Note the "class << Class", opening the singleton, rather than "class Class", opening Class itself. Also, the distinction between new and Class.new -- they are the same method, but from inside the meta_def we're no longer in the Class class.) The lines of meta itself mean:

  1. The value of this thing is a generated class, which we will describe thusly:
  2. Its singleton class has a new method, which gives you a value that is:
  3. Another generated class, which is defined by:
  4. the original block, which now gets run with new's arguments.

Nary an assignment in sight!

The (embarrassing) caveat is that methods defined inside the block cannot use def as normal if they need the block's arguments or local variables. A def is evaluated in a totally fresh scope (I don't think I get the reasoning behind this decision, to be honest), so those variables are invisible inside it, and we need to use class_def instead, which takes a block and therefore closes over them. This is, I must admit, rather hideous. Perhaps someday the language will change.
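Here is the scoping problem in miniature, using plain define_method (which is what class_def wraps). The make_class helper and its method names are hypothetical.

```ruby
def make_class(greeting)
  Class.new do
    # def opens a fresh scope: `greeting` is not visible in here, so Ruby
    # treats it as a method call and fails when the method is invoked.
    def with_def
      greeting
    end

    # A block is a closure, so it can still see `greeting`.
    define_method(:with_block) { greeting }
  end
end

klass = make_class("hi")
puts klass.new.with_block   # hi

failure = nil
begin
  klass.new.with_def
rescue NameError => e
  failure = e
  puts "#{e.class} raised from the def version"
end
```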

But, first things first: now we can implement Struct in pure Ruby.

class MyStruct < Class.meta \
  do |*args|
    attr_accessor(*args)
    class_def :initialize do |*instance_args|
      args.zip(instance_args).each do |attr, val|
        instance_variable_set :"@#{attr}", val
      end
    end
  end
end

The real Struct class does a few other nice things for you, but this is the heart of it; I can go back to that Njiiri example and just swap in MyStruct for Struct (providing an instant performance gain of -200%... :-)).
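If you want to try this without installing anything, the following is a self-contained sketch of the whole pipeline. Two assumptions to flag: meta_def and class_def are my own stand-ins for the MetAid helpers of the same name, and MyStruct is bound with a plain constant assignment rather than by subclassing, which is enough for this purpose. MyServer is a hypothetical name for the Njiiri-style class.

```ruby
# Stand-ins for the MetAid helpers (assumed to behave like the originals).
class Object
  def meta_def(name, &blk)
    singleton_class.send(:define_method, name, &blk)
  end
end

class Class
  def class_def(name, &blk)
    class_eval { define_method(name, &blk) }
  end

  def class_tap(&blk)
    tap { |_self| _self.class_eval(&blk) }
  end
end

# Class.meta as in the post.
class << Class
  def meta(_super = Object, &blk)
    new.class_tap do
      meta_def :new do |*args|
        Class.new(_super).class_tap do
          class_exec(*args, &blk)
        end
      end
    end
  end
end

# A pure-Ruby Struct: accessors plus a positional initialize.
MyStruct = Class.meta do |*args|
  attr_accessor(*args)
  class_def :initialize do |*instance_args|
    args.zip(instance_args).each do |attr, val|
      instance_variable_set :"@#{attr}", val
    end
  end
end

class MyServer < MyStruct.new(:host, :port, :password)
  def to_s
    port == 6600 ? host : "#{host}:#{port}"
  end
end

puts MyServer.new("localhost", 6600)     # localhost
puts MyServer.new("example.org", 6601)   # example.org:6601
```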

Here's a "shapes" example from my abandoned OO lesson:

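# MyStruct rather than the built-in Struct: perimeter reads @sides directly,
# and only MyStruct keeps its members in instance variables.
class Polygon < MyStruct.new(:sides)
  def perimeter
    @sides.inject(&:+)
  end
end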

class RegularPolygonClass < Class.meta(Polygon) \
  do |n_sides|
    class_def :initialize do |side_length|
      @sides = [side_length] * n_sides
    end
    class_def :area do
      @sides.size * @sides.first**2 / Math.tan(Math::PI / @sides.size) / 4
    end
  end
end

class Square < RegularPolygonClass.new(4); end
class Pentagon < RegularPolygonClass.new(5); end

(Note that area has no free variables and thus could actually be defined with def. It would just look funny.)
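To check that the numbers come out right, here is a condensed, self-contained sketch of the same example. The assumptions: Class.meta is written with define_singleton_method instead of MetAid's meta_def, define_method stands in for class_def, Polygon is a plain class that stores @sides, and RegularPolygonClass is bound by assignment rather than subclassing.

```ruby
# Condensed stand-in for the post's Class.meta (assumed equivalent).
def Class.meta(_super = Object, &blk)
  Class.new.tap do |factory|
    factory.define_singleton_method(:new) do |*args|
      Class.new(_super) { class_exec(*args, &blk) }
    end
  end
end

class Polygon
  attr_reader :sides

  def initialize(sides)
    @sides = sides
  end

  def perimeter
    @sides.inject(&:+)
  end
end

RegularPolygonClass = Class.meta(Polygon) do |n_sides|
  define_method(:initialize) do |side_length|
    @sides = [side_length] * n_sides
  end

  # Area of a regular n-gon with side s: n * s^2 / (4 * tan(pi / n))
  define_method(:area) do
    @sides.size * @sides.first**2 / Math.tan(Math::PI / @sides.size) / 4
  end
end

class Square < RegularPolygonClass.new(4); end
class Pentagon < RegularPolygonClass.new(5); end

square = Square.new(2)
p square.perimeter     # 8
p square.area          # about 4.0, give or take floating point
p Pentagon.new(1).perimeter   # 5
p Square.ancestors.include?(Polygon)   # true
```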

You only have to change one number here to make new polygon classes, rather than accumulating parameter lint or explicitly subclassing and redefining something implicit just for the derived class [2]. In a way it is the exact analogue of the imperative vs. tap style described above. There is quite a bit of aesthetics involved; one way is not "right".

Apart from the syntactical wart, I like being able to do things this way. Ruby is, as they say, optimized for programmer happiness and the principle of least surprise. Still, I'm sure it is quite slow, and I don't particularly need it for any real-world application right now. For an intro to OO it's way too much complexity to have lurking unexplained beneath the surface and still requires getting bogged down in the language you happen to be using. But hey! It's neat.

[1] I basically try to do this whenever possible, anyway. Design decisions are like sex: they're better when they're free.
[2] Of course, if Ruby class variables were not leaked across the entire inheritance hierarchy, something as trivial as this one-variable example would be... trivial. At least, few people would actually miss metaprogramming features. Blub blub blub.

What's sup?

After some weeks of final testing, I've just uploaded packages for sup-mail to NEW. I'm pretty excited about this.

Sup is a console-based MUA, like mutt (which I have used for many years). A few things distinguish it from most mail readers targeted at geeks like us:

  • Sup has no folders, a la Gmail. After watching many friends and even fellow hackers switch to Gmail, I have to admit: this literal hierarchical organization thing doesn't scale. I was planning to totally redo my mail folder system Any Day Now for about six months prior to starting on this. It was never going to happen.
  • Sup uses a Ferret full-text index to make this approach plausible. Search is super fast and beats (for me) both any kind of "organization" I could have disciplined myself into and the fine-grained control of something like mutt's search. It's sort of like git: until you do it, you don't realize how much more productive you can be when previously-expensive operations become instantaneous.
  • Sup works with threads, not messages; this is another thing Gmail got right. I used to waste brain cells thinking about which messages in a thread were worthwhile enough to save or not. Given the absurdly cheap price of disk relative to what we can type out in plain text since, like, a decade ago, this is crazy. In the index, I only have to look at whether a thread has new chatter or not, not its size, shape, or where the new messages are relative to it. All that's in the thread-view buffer where I actually read content.
  • Sup is written in Ruby. Back in the dawn of time, I used Gnus, and while I wasn't very good at elisp, the hackability afforded by being written in a high-level language was very nice compared to programs mostly implemented in C (even if they had a tacked-on scripting language). Plus, I love Ruby right now.

Despite all of those wins, sup currently has many drawbacks, and I don't recommend it for everyone. (And I mean everyone who thinks that the above are good ideas and is interested in using it; plenty of people, I'm sure, already think everything about this is idiotic, not new, or inferior to their preferred MUA. That's fine! You can ignore it all.) Here's what's still problematic:

  • At version 0.6, sup is very much not-yet-1.0. While it handles insanely large amounts of email without breaking a sweat, I still keep an additional backup of everything. (If Ferret crashes, the original copies of mail will be untouched, but it never hurts to be paranoid.)
  • The flow of data from your physical mail store to the sup index is currently one-way only. Actually removing deleted/spam messages is a big hack (if it works at all), and labels/flags/etc live entirely in Ferret-land. If you want to manipulate an actual mailbox, mutt is still the tool for the job (and then, you need to re-sync sup). This is probably the deal-breaker for most of us. I jumped in anyway because I feel like it can be solved (or more likely, made irrelevant) later.
  • William (upstream) is currently re-designing the whole thing from scratch, replacing the index library with Sphinx, and decoupling the index from the console frontend. As a result, the previous item is pretty much a non-priority (and bugs in general are not going to get the same amount of love as usual). I am hoping that we end up dumping mail into the index directly, then writing more frontends to write to Maildir backup, serve as webmail/whatever, but this is a long way off. On the plus side, thanks to Thrift, they will not be limited to Ruby.
  • Ruby's ncurses library still doesn't handle Unicode correctly. It can be patched (still doesn't work totally right), but I'm trying to find a more permanent solution for Debian.

So, if you're interested enough that you want to deal with these warts for now, apt-get install sup-mail (as soon as it hits the archive) and join us! Hopefully being in Debian will increase the userbase and get things fixed faster. If you're unsure, stay tuned for the next-generation version later.

(There are screenshots and a few introductory docs over at Rubyforge that illustrate and explain all this in more depth, which I recommend checking out if you're still saying, "...huh." Me, I'm a sucker for any piece of software with a manifesto.)

Generated by Mnemosyne 0.12.