eigenclass logo
MAIN  Index  Search  Changes  PageRank  Login

Inspecting a live Ruby process, easier if you cheat.

Are you still adding printf/puts calls and restarting your app to figure what went wrong? Sometimes, the problem is hard to reproduce, or you only discover it in production. You've got a process that exhibits the bug, but you didn't run it under ruby-debug, so there's no choice but kill it and reproduce after adding some code to inspect your program, right?

Sure not. Jamis Buck blogged about how to use GDB to inspect a live Ruby process, and showed how to get a stack-trace using Ruby's C API and some GDB scripting:

(gdb) set $ary = (int)backtrace(-1)
(gdb) set $count = *($ary+8)
(gdb) set $index = 0
(gdb) while $index < $count
>  x/1s *((int)rb_ary_entry($ary, $index)+12)
>  set $index = $index + 1
>end

But it gets much easier than that. How about this:

(gdb) eval "caller"

or

(gdb) eval "local_variables"

Once you've opened that door, you get a full-powered Ruby interpreter inside GDB. Ruby's introspection capabilities do the rest. Local variables, instance variables, classes, methods, Ruby threads, object counts... evil eval can bring us pretty far. You can find the scripts to turn GDB into a sort of IRB that can attach to running processes below.

A synthetic example

For the sake of illustration, let's take this trivial program:


def foo
  a = 1
  bar + a
end

def bar
  a = 2
  b = 2
  baz + 1
end

def baz
  c = 232
  c + gets.to_i
end

puts "foo returned #{foo}"

$ gdb ruby 
GNU gdb 6.1-debian
...
(gdb) attach 13658
Attaching to program: /home/batsman/usr/bin/ruby, process 13525
...
0xa7ec526e in read () from /lib/tls/libc.so.6
(gdb) 

I don't want to pollute gdb with lots of commands, so I've put the Ruby magic in a separate script that can be loaded with

(gdb) session-ruby

(I also have a session-asm that imports a number of commands useful for low-level inspection.)

At this point, I could just

 (gdb) eval "caller"

but that would #p() the caller's result in the stdout from the process being examined. So I first redirect $stdout with another command:

 (gdb) redirect_stdout
 $1 = 2
 (gdb) eval "caller"

This way, the result from eval is put in /tmp/ruby-debug.PID, and I get:

 $ tail /tmp/ruby-debug.13658 
 ["foo.rb:16:in `baz'", "foo.rb:11:in `bar'", "foo.rb:5:in `foo'",
 "foo.rb:19"]

Basic introspection

The stacktrace tells me where I am, but there's more much information available:

 (gdb) rb_object_counts
   220  String
   183  Class
    16  Module
    12  Array
     5  Float
     3  Bignum
     3  IO
     3  Object
     2  File
     1  Proc
     1  SystemStackError
     1  Binding
     1  fatal
     1  Thread
     1  ThreadGroup
     1  NoMemoryError
     1  Hash
(gdb) local_variables
["c"]

The last result might look surprising. How come there's a (Ruby) local variable, if we're running a method implemented in C (rb_f_gets)? It seems ruby doesn't get out of its way to overwrite the (Ruby) frame info when it calls a C method, so the data you get corresponds to the enclosing Ruby method.

Running until the end of the current method (aka "finish")

The process was interrupted in the middle of read(2):

(gdb) where
#0  0xa7ee826e in read () from /lib/tls/libc.so.6
#1  0xa7e88d18 in _IO_file_read () from /lib/tls/libc.so.6
#2  0xa7e87a6e in _IO_file_underflow () from /lib/tls/libc.so.6
#3  0xa7e89ffb in _IO_default_uflow () from /lib/tls/libc.so.6
#4  0xa7e89ddc in __uflow () from /lib/tls/libc.so.6
#5  0xa7e84db7 in getc () from /lib/tls/libc.so.6
#6  0x08073bbc in appendline (fptr=0x811ddf0, delim=10, 
    strp=0xaffcf638) at io.c:1566
#7  0x08073dab in rb_io_getline_fast (fptr=0x811ddf0, delim=Variable "delim" is not available.) at io.c:1662
#8  0x0807aa6e in argf_getline (argc=0, argv=0x0) at io.c:4498
#9  0x0807ac97 in rb_f_gets (argc=0, argv=0x0) at io.c:4565
#10 0x08053e6a in call_cfunc (func=0x807ac00 <rb_f_gets>, recv=2816571900, len=Variable "len" is not available.) at eval.c:5654 #11 0x0805bce0 in rb_call0 (klass=Variable "klass" is not available.) at eval.c:5810
#12 0x0805c8a8 in rb_call (klass=2816576680, recv=2816571900, mid=7345, argc=0, argv=0x0, scope=2) at eval.c:6048
#13 0x0805a191 in rb_eval (self=Variable "self" is not available.) at eval.c:3464

Ideally, something similar to GDB's finish command would allow one to resume execution until Ruby returns from the current method, instead of using finish after finish to get back to the right rb_eval context. I'd made a hackish, incomplete rb_finish command to makes this somewhat easier, but it'd break if you built ruby differently:

(gdb) rb_finish
$3 = 2816496980
Breakpoint 1 at 0x809988b: file process.c, line 1574.
rb_f_sleep (argc=1, argv=0x1) at process.c:1574
...
(gdb) finish
Run till exit from #0  rb_f_sleep (argc=1, argv=0x1) at process.c:1574
...

Then I can finish past call_trace_func, stopping on the 3rd rb_eval frame. This can be automated by just adding the appropriate (fixed) number of finish calls in rb_finish's definition. However, the number of stack frames will change depending on the optimization options used to compile Ruby, as some functions could get inlined. It can be hardcoded if all your ruby executables were built the same way.

In practice, I just finish manually (just keep pressing enter) until I get to rb_eval.

Local variables

(gdb) eval "local_variables"
["a", "b"]
(gdb) rb_locals
a = 2
b = 2
nil
(gdb) finish
...
(gdb) rb_locals
a = 1
nil
(gdb) eval "a = 1000"
1000
(gdb) rb_locals
a = 1000
nil
(gdb) restore_stdout
(gdb) cont

Finally, the programs outputs

foo returned 1275

as expected after I entered '42' and set the a variable in #foo to 1000.

A real example

I've been hacking my ruby-wmii scripts for the wmii window manager to use Ruby-IXP, and noticed that the process is taking up a lot of memory. ruby-wmii's wmiirc script controls the window manager of the X session I'm using while debugging it, so I must proceed with some care:

$ gdb ruby 6232
...
0xa7e84308 in select () from /lib/tls/libc.so.6
(gdb) rb_object_counts
166772  Array
107367  String
 74961  Hash
 16718  IXP::Rwalk
 16716  IXP::Ropen
 16715  IXP::Rclunk
 13805  IXP::Rwrite
  5226  Range
  3490  MatchData
  2999  IXP::Rread
  1322  Time
   652  IXP::Rerror
[...]

Rwalk, Ropen, Rclunk, Rwrite, etc. are objects representing the responses given by the 9P server. It seems that for some reason they're not being disposed of. I wanted to confirm that they were being referenced by an IXP::Connection object, so I just did:

(gdb) eval "ObjectSpace.each_object(IXP::Connection){|x| p x.instance_eval{@orders.size}}"

Obtaining

   65536
   6
   6
   5
   4

The ability to introspect on a live process comes really handy at times.

The scripts

I've split my GDB scripts into separate files so that I can load them separately and avoid polluting the command namespace by default. So I placed this in my $HOME/.gdbinit

define session-ruby
  source ~/.gdb/ruby
end

define session-asm
  source ~/.gdb/asm
end

.gdb/ruby is basically Nobu's with a few additional commands (eval, redirect_stdout, restore_stdout, rb_finish, rb_locals, rb_object_counts). I took it from an old (3 years or so?) posting of his to ruby-core and I haven't compared it to the one he just posted. Anyway it's worked well for me for a long time :)

ruby

Here's another GDB script for low-level (asm) debugging. I don't remember where I took it from, though.

asm

So useful - Marcus (2007-01-27 (Sat) 12:22:56)

Your script was so useful debugging ruby scripts!


dbx version - slumos (2006-09-29 (Fri) 16:20:10)

Thanks for this! Based on your example, I came up with this for Sun dbx:

# ~/.dbxrc
reval() {
  : ${1?"usage: $0 <expression> # eval Ruby expression"}
  result=$[rb_eval_string_protect("$1", 0)]
}

rp() {
  : ${1?"usage: $0 <expression> # print value of Ruby expression"}
  reval "$1"
  string=$[rb_inspect((VALUE)$result)]
  print -l ((`string.c`struct RString *)$string)->ptr
}

rwhere() {
  reval "caller(0).join($/)"
  echo $[-l ((`string.c`struct RString *)$result)->ptr]
}

rlocals() {
  reval "local_variables.map{|x| '%s = %s' % [x, eval(x)]}.join($/)"
  echo $[-l ((`string.c`struct RString *)$result)->ptr]
}

These echo the result in the debugger instead of outputting to the process stdout. So now I can say

(dbx) rwhere
db.rb:14:in `baz'
db.rb:9:in `bar'
db.rb:3:in `foo'
db.rb:18

ruby file is corrupt? - Matt Todd (2006-09-23 (Sat) 12:46:26)

For some reason, the ruby file doesn't work (I'm defining .gdbinit correctly). I put it through cat and it looks to be corrupted (there are a bunch of ??????s right before "define nd_type" and below as well. I've downloaded it a number of times and it stays consistent. I don't know if this is part of the actual file, but I think it's wrong. I'm almost certain the rest of how I have things set up is correct. (By the way, when I run "session-ruby", no response is given in gdb.)

M.T.

rubikitch 2006-09-23 (****) 18:20:15

(gdb) session-ruby (gdb) session-asm (gdb) local_variables Undefined command: "local_variables". Try "help".

I do not know why! Of course, there are .gdbinit, .gdb/ruby and .gdb/asm.

mfp 2006-09-23 (Sat) 18:44:41

M.T.:

The multibyte parts are nobu's original comments (in EUC-JP, I assume). gdb won't output anything when you run session-ruby, but the new commands should be available (you can verify that by typing 'rb' and pressing tab to autocomplete the command).

rubykitch:

it's either

(gdb) eval "local_variables"

or

(gdb) rb_locals

Also, keep in mind that the information will be sent to the stdout of the process being inspected, so you can

(gdb) redirect_stdout

first to have it redirected to /tmp/ruby-debug.PID (restore_stdout to restore the previous $stdout).

Aredridel 2006-09-23 (Sat) 19:07:24

You can replace HOME with ~ in .gdbinit and save the trouble of having to manually expand HOME.

Matt Todd 2006-09-23 (Sat) 19:38:30

With some good help from Aria on #ruby-talk, I got the files translated over into UTF-8, though, honestly, I don't know if that actually solved the problem. In my haste to test things out, I forgot 'session-ruby' almost every single time. DUH! So, I made it easy: I took out the function definition and simply pull in the source! (Thanks again to Aria for the good idea). Also, since I'm on a Mac, and have been having mental lapses all day long, I made the error of forgetting that /Users/mtodd/ was my home, instead of /home/mtodd/, as it is on my work machines, hehehe. DUH again! Zero for two!

But, not it works. Thanks for the cool files, et al.

Something I've done to make this a little easier for me is create a library file called inspector.rb (in my standard library location), looking like this:

trap("INT") do
  system "gdb ruby #{Process.pid}"
end

And if I want to be able to, at will, hop into gdb with the current process (by pressing C+c), I simply...

require 'inspector' if $DEBUG

Of course, the library file is being included if -d is on the shebang or in the command call, so don't forget it!

Hope this helps. :)

M.T.

Matt Todd 2006-09-23 (Sat) 21:49:39

Pastie for above code: http://pastie.caboo.se/14613

mfp 2006-09-24 (Sun) 05:23:47

Changed HOME -> ~; thanks Aria.

Matt: you can also change your .gdbinit to

source ~/.gdb/ruby
define session-asm
  source ~/.gdb/asm
end

if you want session-ruby's env to be loaded by default. inspector.rb seems really convenient for console apps/scripts... so now we have ruby call gdb to run ruby code, can we add yet another level? :-)


Last modified:2006/09/23 04:44:29
Keyword(s):[blog] [ruby] [frontpage] [live] [introspection] [debug] [gdb]
References: