A taste of evil.rb: using DL to unfreeze objects

More people are getting interested in ruby's internals*1, perhaps thanks to Caleb Tennis' latest series. So maybe the time has come to unveil evil.rb*2 as quite a useful learning tool that lets you mess with the interpreter's internal data structures at runtime.

I will show how evil.rb manages to unfreeze an object, introducing DL along the way. That's one of the simplest low-level manipulations one can do, just a first step towards a better understanding of the runtime structures, but this is the sort of thing that happens under the hood in the interpreter all the time.

evil.rb

The idea behind evil.rb was conceived by Florian Groß: it's as simple as using DL (a standard extension providing an interface to the dynamic linker) to access internal objects. That is, it allows you to manipulate them at the C-struct level. This enables you to do things like:

  • changing the class of an object
  • manipulating the inheritance chain for fun and profit
  • grabbing instance variables or singleton methods
  • swapping objects (ever heard of Object#become?)
  • changing the self context of a Proc (the way 1.9's instance_exec does)
  • messing with the flags of an object

Using DL

Ruby VALUEs for non-immediate objects are pointers to RVALUE slots inside heaps managed by Ruby. RVALUE is a union type taking 5 words (20 bytes on a 32-bit platform), whose contents can be interpreted as any of the "RStructs": the interpreter distinguishes internally between different object types (and other things like NODEs, which correspond to executable code, or scopes).

Object layout


A plain object is represented by an RObject struct like

struct RBasic {
    unsigned long flags;
    VALUE klass;
};

struct RObject {
    struct RBasic basic;
    struct st_table *iv_tbl;
};

flags will hold information like 'this object is frozen', and iv_tbl is a pointer to a hash table with the name=>value mapping for instance variables.

An Array is held in an RArray structure:

struct RArray {
    struct RBasic basic;
    long len;
    union {
	long capa;
	VALUE shared;
    } aux;
    VALUE *ptr;
};

Such "RStructures" exist for Objects, Classes/Modules, Floats, Strings, Arrays, Regexps, Hashes, Files, Data objects...
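The "same bytes, different interpretation" idea can be tried out in plain Ruby, without touching the interpreter at all. The following is only a sketch: it assumes a 32-bit little-endian build (five 4-byte words per slot), the module SlotViews and its methods are hypothetical names, and every numeric value is made up.

```ruby
# A simulated 20-byte RVALUE slot, viewed through two different "RStructs".
# Assumes 32-bit little-endian words; SlotViews is a hypothetical helper.
module SlotViews
  WORDS = "V5" # five unsigned 32-bit little-endian words = 20 bytes

  module_function

  # Build a raw slot from exactly five words.
  def make_slot(*words)
    raise ArgumentError, "a slot holds exactly 5 words" unless words.size == 5
    words.pack(WORDS)
  end

  # RObject view: flags, klass, iv_tbl (the last two words are unused).
  def as_robject(slot)
    flags, klass, iv_tbl = slot.unpack("V3")
    { flags: flags, klass: klass, iv_tbl: iv_tbl }
  end

  # RArray view: flags, klass, len, aux (capa or shared), ptr.
  def as_rarray(slot)
    flags, klass, len, aux, ptr = slot.unpack(WORDS)
    { flags: flags, klass: klass, len: len, aux: aux, ptr: ptr }
  end
end

slot = SlotViews.make_slot(0x07, 0x4000_1000, 3, 16, 0x0805_0000) # made-up words
puts slot.bytesize                        # 20
puts SlotViews.as_robject(slot)[:iv_tbl]  # 3
puts SlotViews.as_rarray(slot)[:len]      # 3
```

The third word reads as iv_tbl in one view and as len in the other, while flags and klass sit in the same place in both. That shared prefix is the RBasic part.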

Manipulating RObjects through DL

The first thing I have to do is tell DL about the layout of the structures I want to manipulate:

require 'dl/struct'

module Internal
  extend DL::Importable

  typealias "VALUE", nil, nil, nil, "unsigned long"
  typealias "ID", nil, nil, nil, "unsigned long"

  Basic = ["long flags", "VALUE klass"]

  RBasic = struct Basic

  RObject = struct(Basic + ["st_table *iv_tbl"])
end

Getting object addresses

I can now instantiate Internal::RObject objects, which will have three attribute accessors: flags, klass and iv_tbl. In order to do so, RObject.new must be given the address of the chunk of memory that will be interpreted as an RObject. This is fairly easy: Object#object_id is defined as

 VALUE
 rb_obj_id(VALUE obj)
 {
     if (SPECIAL_CONST_P(obj)) {
         return LONG2NUM((long)obj);
     }
     return (VALUE)((long)obj|FIXNUM_FLAG);
 }

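A Fixnum n is encoded as the VALUE 2n+1, and slot addresses are word-aligned (hence even), so OR-ing FIXNUM_FLAG into the address yields the Fixnum address/2. In other words, someobj.object_id returns the address of someobj divided by two.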
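The arithmetic can be checked on plain integers. This is a minimal sketch that only simulates the tagging done by the rb_obj_id shown above; the module TaggedValue, its method names and the address are hypothetical, and newer interpreters may compute object_id differently.

```ruby
# Simulates Fixnum tagging on ordinary integers; no interpreter internals involved.
module TaggedValue
  FIXNUM_FLAG = 0x01

  module_function

  # Encode an integer the way a Fixnum is tagged: shift left, set the low bit.
  def fixnum_to_value(n)
    (n << 1) | FIXNUM_FLAG
  end

  # Decode a tagged VALUE back into the integer it represents.
  def value_to_fixnum(value)
    value >> 1
  end

  # What rb_obj_id does for a non-immediate object: address | 1, read as a Fixnum.
  def simulated_object_id(address)
    value_to_fixnum(address | FIXNUM_FLAG)
  end
end

address = 0x4020_3a5c # hypothetical word-aligned slot address
id = TaggedValue.simulated_object_id(address)
puts id == address / 2 # true
puts id * 2 == address # true
```

Multiplying the id by two recovers the address only because the low bit of an aligned pointer is zero to begin with.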

Unfreezing objects

The VALUEs for immediate objects don't point to RVALUE slots, so we have to make sure we don't bump into any of those:

class Object
  def immediate?
    [Fixnum, Symbol, NilClass, TrueClass, FalseClass].any?{|klass| klass === self}
  end
end
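The class-based check above has a bit-level counterpart in the SPECIAL_CONST_P test used by rb_obj_id. The sketch below mimics it on plain integers. It assumes the tag values commonly cited for ruby 1.8 (Qfalse = 0, Qtrue = 2, Qnil = 4, FIXNUM_FLAG = 0x01, SYMBOL_FLAG = 0x0e, IMMEDIATE_MASK = 0x03); treat these as assumptions and check ruby.h for your version. ValueBits is a hypothetical name.

```ruby
# Bit-level sketch of SPECIAL_CONST_P on simulated VALUEs.
# All constants are assumed 1.8-era values, not read from the running interpreter.
module ValueBits
  FIXNUM_FLAG    = 0x01
  IMMEDIATE_MASK = 0x03
  SYMBOL_FLAG    = 0x0e
  QFALSE = 0
  QTRUE  = 2
  QNIL   = 4

  module_function

  # Immediates have one of the two low bits set; false and nil are the
  # only remaining VALUEs with no bits set outside of Qnil's bit.
  def special_const?(value)
    (value & IMMEDIATE_MASK) != 0 || (value & ~QNIL) == 0
  end
end

fixnum_value = (42 << 1) | ValueBits::FIXNUM_FLAG  # simulated Fixnum 42
symbol_value = (123 << 8) | ValueBits::SYMBOL_FLAG # simulated Symbol with a made-up ID
pointer      = 0x4020_3a5c                         # hypothetical slot address

puts ValueBits.special_const?(fixnum_value)      # true
puts ValueBits.special_const?(symbol_value)      # true
puts ValueBits.special_const?(ValueBits::QNIL)   # true
puts ValueBits.special_const?(pointer)           # false
```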

Finally, an object is marked as frozen when the FL_FREEZE bit in its flags field (part of the associated RBasic structure) is set, so I only have to clear it:

module Internal
  FL_FREEZE = 1 << 10
end

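class Object
  def unfreeze
    # Immediate values have no RVALUE slot, so there is nothing to unfreeze.
    return self if immediate?

    # object_id is the address divided by two; doubling it recovers the pointer.
    robject = Internal::RObject.new(DL::PtrData.new(object_id * 2))
    # Clear the frozen bit and leave every other flag untouched.
    robject.flags &= ~Internal::FL_FREEZE
    self
  end
end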

a = "foo"
a.freeze
begin
  a.upcase!
rescue
  puts "Couldn't change the frozen string: #{a}"
end
a.unfreeze
a.upcase!
puts "Now a is #{a}"

# >> Couldn't change the frozen string: foo
# >> Now a is FOO
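The bit manipulation at the heart of unfreeze can also be tried on an ordinary integer, with no DL involved. This is just a sketch: FlagBits and its methods are hypothetical names, the flags word is made up, and FL_FREEZE is the value used above.

```ruby
# Setting, testing and clearing the frozen bit on a simulated flags word.
module FlagBits
  FL_FREEZE = 1 << 10

  module_function

  # True if the frozen bit is set in the given flags word.
  def frozen_flag?(flags)
    (flags & FL_FREEZE) != 0
  end

  # Set the frozen bit, leaving all other bits alone.
  def freeze_flag(flags)
    flags | FL_FREEZE
  end

  # Clear the frozen bit, leaving all other bits alone.
  def unfreeze_flag(flags)
    flags & ~FL_FREEZE
  end
end

flags  = 0x07 # made-up flags word
frozen = FlagBits.freeze_flag(flags)
puts FlagBits.frozen_flag?(frozen)           # true
puts FlagBits.unfreeze_flag(frozen) == flags # true
```

Masking with ~FL_FREEZE rather than XOR-ing keeps the operation idempotent: applying it to an object that was never frozen changes nothing.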

Feel free to play with evil.rb inside irb, but think twice before using it for serious stuff!


what does DL stand for? - Anonymous (2006-02-03 (Fri) 13:26:05)

For us noobs, what is DL?


mfp 2006-02-03 (Fri) 13:31:43

Quoting from ext/dl/doc/dl.txt:

Ruby/DL provides an interface to the dynamic linker such as dlopen() on UNIX and LoadLibrary() on Windows.

In addition to calling functions from a shared library, you can perform low-level operations on C structures, etc.

Last modified:2006/02/03 13:36:57

*1 I refer to the implementation as ruby and the language as Ruby, following a convention AFAIK first defined with Perl

*2 I take the blame for the name, and for part of the implementation