
Low-level introspection to save brain bits: Ruby object model, class hierarchy and method dispatching

I've often found it easier to remember some implementation detail of Ruby and infer from it how the language behaves, instead of trying to keep in mind the numerous implications themselves.

Take method dispatching. There are quite a lot of rules to consider if you try to remember what will happen in each of the cases shown below, yet it's all caused by a particularity of the implementation, which I'll showcase using evil.rb:

class A; def foo; "A#foo" end end
class B < A; end
module C; def foo; "C#foo" end end
module D; def foo; "D#foo" end end
class E < A; end
class F < A; end

a = A.new
b = B.new
e = E.new
f = F.new
a.foo                                              # => "A#foo"
b.foo                                              # => "A#foo"
e.foo                                              # => "A#foo"
class E; def foo; "E#foo" end end
e.foo                                              # => "E#foo"
class << e; include C end
e.foo                                              # => "C#foo"
class A; include C end
f.foo                                              # => "A#foo"
f.extend C
f.foo                                              # => "A#foo"
class F; include D end
f.foo                                              # => "D#foo"
class F; def foo; "F#foo" end end
f.foo                                              # => "F#foo"
def f.foo; "f#foo" end
f.foo                                              # => "f#foo"

module X; def bar; "X#bar" end end
b.foo                                              # => "A#foo"
class B; include X end
b.foo                                              # => "A#foo"
b.bar                                              # => "X#bar"
module X; include D end
b.foo                                              # => "A#foo"
class B; include X.clone end
b.foo                                              # => "D#foo"

Singletons, mixins, inclusion after inclusion and extension after extension, that's quite a lot of possibilities...
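Before reaching for evil.rb, a few of these cases can be checked by asking Ruby which class or module owns the method that would be called. Here's a minimal sketch using only standard reflection (Object#method and Method#owner). The helper name who_answers is made up for illustration, and the Symbol-based results assume a Ruby newer than the 1.8.4 used in this article.

```ruby
class A; def foo; "A#foo" end end
module C; def foo; "C#foo" end end
class E < A; end

# Hypothetical helper: the class or module whose definition of +name+
# wins when +name+ is sent to +obj+.
def who_answers(obj, name)
  obj.method(name).owner
end

e = E.new
first = who_answers(e, :foo)          # A: inherited from the superclass
class E; def foo; "E#foo" end end
second = who_answers(e, :foo)         # E: the object's own class is searched before A
class << e; include C end
third = who_answers(e, :foo)          # C: mixed into the singleton class, searched before E
```

The three answers change as definitions are added lower in the chain, which is the same progression the e.foo lines above go through.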

Looking around with evil.rb

First of all, we need to summon the power of darkness to deconstruct matz' creation:

require 'evil'

Object#internal is defined by evil.rb to return a handle that allows us to inspect and manipulate the low-level fields associated with the original object. For regular objects, the interesting fields are:

  • iv_tbl: a pointer to a hash table with the instance variables (st_tbl *, see st.c)
  • flags: contains information such as whether the object is frozen
  • klass: points to the class of the object

Time for some basic low-level introspection:

a = Object.new
a.internal.klass.to_i                              # => 3085110560
a.internal.iv_tbl.to_i                             # => 0
a.instance_variable_set(:@foo, 1)
a.internal.iv_tbl.to_i                             # => 135912160

The iv_tbl isn't initialized until an instance variable is set: not only the language, but also the implementation makes sense.
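The language-level counterpart can be observed without evil.rb. This sketch only shows what plain reflection reports (it says nothing about the iv_tbl pointer itself), and it assumes a Ruby where instance_variables returns Symbols (1.9 or later; 1.8 returns Strings).

```ruby
o = Object.new
before = o.instance_variables         # nothing has been set yet
o.instance_variable_set(:@foo, 1)
after = o.instance_variables          # now lists the one variable we set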

Classes and modules carry a bit of additional info:

  • m_tbl: st_tbl (hash table) holding instance methods
  • super: a pointer to the class/module higher in the hierarchy

class A; end
A.internal.super.to_i                              # => 3085110560
A.new.internal.klass.super.to_i                    # => 3085110560
# ...
Object.object_id * 2 + 2**32                       # => 3085110560

Exploring the class hierarchy

Let's begin with something simple:

o = Object.new
o.internal.klass.to_i                              # => 3085110560
Object.object_id*2 + 2**32                         # => 3085110560
OBJ(o.internal.klass)                              # => Object

class A
  def foo; "A#foo" end
end

a = A.new
OBJ(a.internal.klass)                              # => A
OBJ(a.internal.klass.super)                        # => Object

OBJ is a method that returns a Ruby object given its address (which is related to, but not identical to, its object_id). We can ignore it for now (there's a small hint in the above snippet if you want to try to figure out how it's implemented).
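The hint is presumably the line Object.object_id*2 + 2**32. On the 32-bit Ruby 1.8 build used here, it suggests that the address is twice the object_id, wrapped around to an unsigned 32-bit value. The following is only arithmetic on the numbers printed above: it doesn't dereference anything, the relation is an inference from the snippet, and address_from_object_id is a made-up name rather than the way OBJ is really implemented.

```ruby
# Assumption (inferred from the snippet, 32-bit Ruby 1.8 only):
# address == object_id * 2, reduced modulo 2**32.
def address_from_object_id(id)
  (id * 2) % 2**32
end

address = 3085110560                  # value printed for Object above
id = (address - 2**32) / 2            # the (negative) object_id that would produce it
round_trip = address_from_object_id(id)
```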

Nothing surprising so far: the implementation matches what we can see inside the language (without resorting to wicked methods).

someobj.internal.klass.to_i

returns the address of someobj.class, and

someclass.internal.super.to_i

that of someclass.superclass.
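The same walk can be done from inside the language, without addresses. A small sketch, with superclass_chain as a hypothetical helper name; on Ruby 1.9 and later the chain continues past Object to BasicObject, which didn't exist in 1.8.

```ruby
class A; end

# Hypothetical helper: follow superclass links upward until they run out,
# the way the snippets above follow internal.super by hand.
def superclass_chain(klass)
  chain = []
  while klass
    chain << klass
    klass = klass.superclass
  end
  chain
end

a = A.new
chain = superclass_chain(a.class)     # starts with A, then Object
```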

Singleton classes

Moving on to a slightly more complex example:

def a.foo; "singleton foo" end
s = OBJ(a.internal.klass)                          # => #<Class:#<A:0xb7dc9c2c>>
class << a; self end                               # => #<Class:#<A:0xb7dc9c2c>>
s.instance_methods(false)                          # => ["foo"]

OBJ(a.internal.klass)                              # => #<Class:#<A:0xb7dc9c2c>>
OBJ(a.internal.klass.super)                        # => A
OBJ(a.internal.klass.super.super)                  # => Object

This shows that the singleton class is being inserted before (lower than) the actual class in the klass chain. One could also say that the actual class is the singleton one, and that

 a.class

should always return the singleton class of the object; but that's not the way matz meant it to be*1.
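The visible side of this can be reproduced with plain reflection. A minimal sketch, assuming Ruby 1.9.2 or later (for Object#singleton_class and Symbol method names); it shows that the singleton class sits below A while a.class keeps skipping it.

```ruby
class A; def foo; "A#foo" end end

a = A.new
def a.foo; "singleton foo" end

s = a.singleton_class                 # same object as: class << a; self end
own_methods = s.instance_methods(false)
parent = s.superclass                 # the "real" class comes right after the singleton
reported = a.class                    # still A: the singleton class is skipped
```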


ICLASSes

The Pickaxe refers to them as proxy classes, but that term is somewhat overloaded so I prefer naming them ICLASSes (after the T_ICLASS constant used to tag them), which is reminiscent of their low-level and "only for the implementor's eyes" status.

But what are they? Let's see:

module B
  def foo; "B#foo" end 
end

class A
  include B
end

a = A.new
OBJ(a.internal.klass)                              # => A

Suspense...

 sup = OBJ(a.internal.klass.super)
 # ~> undefined method `inspect' for #<B:0xb7d8b028> (NoMethodError)

Trying harder:

 Object.instance_method(:inspect).bind(sup).call
 # ~>  TypeError: bind argument must be an instance of Array

"What the heck?"

OBJ(a.internal.klass.super.klass)                  # => B
OBJ(a.internal.klass.super.super)                  # => Object
a.foo                                              # => "A#foo"
B.internal.super                                   # => nil

What's going on is that the ICLASS was inserted between A and Object in the klass chain. Its klass field points to the module that was included (B). Why didn't #inspect work on it? The module is an Object, and a Module instance too, but the klass chain starting from the ICLASS stops at the included module and doesn't go up to Object. B.internal.klass is 0. It takes some time to internalize what the klass and super fields mean for modules, classes or ICLASSes.
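From inside the language the ICLASS itself stays hidden, but its effect is visible: ancestors lists the included module between A and Object, while superclass jumps straight over it. A short sketch with standard reflection only (the class and module are redefined here so that the snippet stands alone):

```ruby
module B; def foo; "B#foo" end end
class A
  def foo; "A#foo" end
  include B
end

ancestors  = A.ancestors.first(3)     # the included module shows up between A and Object...
superclass = A.superclass             # ...but superclass skips over it
owner      = A.instance_method(:foo).owner  # A's own method table is searched before B's
```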

Raw stuff

A few more bits to think about before I get to explain them properly:

class << a; end
OBJ(a.internal.klass)                              # => #<Class:#<A:0xb7dbbbb8>>
class << a; self end.ancestors                     # => [A, B, Object, Kernel]
a.extend B
a.foo                                              # => "A#foo"
class << a; self end.ancestors                     # => [A, B, Object, Kernel]

module C
  def foo; "C#foo" end
end

class << a; include C end
a.foo                                              # => "C#foo"
class << a; self end.ancestors                     # => [C, A, B, Object, Kernel]

OBJ(a.internal.klass)                              # => #<Class:#<A:0xb7dbbbb8>>

# OBJ(a.internal.klass.super)                        # => 
#~> undefined method `inspect' for #<C:0xb7d70334> (NoMethodError)

OBJ(a.internal.klass.super.super)                  # => A
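The two outcomes above (extend apparently doing nothing, include into the singleton class taking effect) can be reproduced without evil.rb. This is a sketch under the assumption that current Ruby versions still skip a module that is already present higher in the chain, as 1.8.4 does here; it needs Ruby 1.9.2 or later for singleton_class.

```ruby
module B; def foo; "B#foo" end end
module C; def foo; "C#foo" end end
class A
  def foo; "A#foo" end
  include B
end

a = A.new
a.extend B                            # B is already above A in the chain: nothing is inserted
after_extend = a.foo
count_b = a.singleton_class.ancestors.count(B)
class << a; include C end             # C is new to the chain: it lands between the singleton class and A
after_include = a.foo
chain = a.singleton_class.ancestors
```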


correction. - binary42 (2006-02-07 (Tue) 11:51:07)

I was scanning the article to make sure I knew what I thought I knew and it seems you have an error.

 f.foo #=> "A#foo"
 f.extend C
 f.foo #=> "A#foo" <-- this line surprised me at first. then I went to irb...

it should be "C#foo" if I didn't botch my test.


binary42 2006-02-07 (Tue) 12:03:48

Too bad I didn't think to format. The error location is obvious enough though.


mfp 2006-02-07 (Tue) 14:12:03

(reformatted your code)

Surprising, isn't it? I think "A#foo" is correct (the # => lines should be correct because they get added automatically by my xmp filter). Did you include C into A before extending f? Just to double-check:

   batsman@tux-chan:~/mess/current$ cat foo.rb
   class A; def foo; "A#foo" end end
   class F < A; end
   module C; def foo; "C#foo" end end
   
   f = F.new
   puts f.foo
   
   class A; include C end
   f.extend C
   puts f.foo
   
   batsman@tux-chan:~/mess/current$ ruby -v foo.rb
   ruby 1.8.4 (2005-12-24) [i686-linux]
   A#foo
   A#foo

My irb also agrees.


binary42 2006-02-07 (Tue) 15:45:19

Ah. right. I scanned too fast. And only did the partial test. Thanks for the pointer!

Last modified:2006/02/07 15:17:16

*1 Indeed, he considers even class << obj; self end somewhat abusive, i.e. singleton classes need not be accessible Ruby objects, so an alternative implementation could do without them.