May 03, 2005
Ruby debugger hacking
ZenSpider and I got together this morning since we hadn't hung out in quite a while. He showed me his new zenprofiler, which uses his RubyInline module to create a hybrid Ruby/C profiler that is much faster than the default pure-Ruby profiler, but with much more readability and less verbosity than a pure-C implementation hooked into Ruby itself.
Meta<Foo>
Ruby's profiler/debugger is a lot like Perl's. The stock introspection tools aren't written in C, they're written in the high level language. The core C engine of the language makes calls to callback hooks whenever a line of code is executed, a function is called, etc, but all the smarts and UI live in high level code. This is mostly a good thing, except it means that your debuggers and profilers can have a high overhead. Still, those are problems that can be fixed.
The philosophy of implementing core functionality in a higher-level language is one of the reasons Ryan is working on RubyToC and metaruby. He believes that if the Ruby engine itself were written in Ruby, it would be easier to find developers willing to add features to the language, and maintenance overall would be much easier.
Now, self-bootstrapping is one of those through-the-rabbit-hole things that freaks you out at first. But Ryan is one of those guys who just groks compilers and low level stuff, and he's a patient instructor. I'm not at a point where I can jump in and pair with him and Eric on tasks yet, but I can read the code and grok what's going on, and not make funny noises while I read the code.
If you want to read more about their RubyToC and metaruby projects, read their blog category for metaruby, and check out their overview slides for the project. It's really great stuff, and just a small slice of what the Ruby community as a whole is working on.
Ruby debugger guts
But, back to the zenprofiler and the debugger work we did today. zenprofiler uses RubyInline, a module Ryan wrote which was inspired by Brian Ingerson's set of Inline modules for Perl. These modules allow you to define C code inside your Ruby code, and at runtime, the C code is automatically built and linked into your program. The resulting shared libs are cached for future runs.
Now, the default Ruby profiler and debugger are very slow beasts. That's because Ruby's debugger and profiler tell the core engine to activate a trace function which is called every time a new line of code is executed, a wrapped C call is made, a function returns, etc, etc. The trace function receives information about the state of the interpreter, and does appropriate things with it, depending on whether it's a profiler or a debugger.
The C engine doesn't differentiate between code states when it calls the trace func. It just passes a string to the trace func which says "call", "c-call", "return", etc, etc. So your trace function is called for every line of code, even when it has nothing to do there.
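To make the overhead concrete, here is a minimal sketch using Kernel#set_trace_func, the stock hook that the pure-Ruby debugger and profiler are built on (current Ruby documentation marks it obsolete in favor of TracePoint, but it is still available). The helper name count_trace_events is my own invention for illustration and is not part of any library. It simply tallies how often the hook fires for each event string:

```ruby
# Hypothetical helper (not from zenprofiler or the stock debugger):
# runs the given block with a trace function installed and returns a
# hash mapping each event string ("line", "call", "c-call", ...) to the
# number of times the interpreter invoked the hook for that event.
def count_trace_events
  counts = Hash.new(0)
  set_trace_func proc { |event, _file, _line, _id, _binding, _klass|
    counts[event] += 1   # the hook runs for every event, wanted or not
  }
  yield
  counts
ensure
  set_trace_func(nil)    # always uninstall the hook, even on error
end

counts = count_trace_events { [3, 1, 2].sort.map { |n| n * 2 } }
counts.each { |event, n| puts "#{event}: #{n}" }
```

Even a one-line block produces a handful of hook invocations, and every one of them is a trip from C up into Ruby code.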
Our work today centered around pushing as much of the switching logic inside the debugger's trace function down into an inlined C class. There are still some segvs to be worked out, but we saw fairly encouraging initial results. In the next iteration, we're probably going to do something I've been wanting to do for quite some time. If you're in profiler mode, you have to make annotations for each line of code and each function call/return. Debuggers, not so much! With a debugger, you're setting breakpoints and telling the code to run, or you're single-stepping through the code. If you're single-stepping, then overhead isn't a big issue for you.
But if you're setting breakpoints, the code should run as close to non-debugger speed as possible! That's the next set of changes, I think. A small set of hooks that makes the trace func only get called on lines or functions that we want to break on, and in all other cases, skips the trace func altogether.
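As a rough illustration of the filtering involved, here is a pure-Ruby sketch. The helper run_with_breakpoints and its argument format are hypothetical, made up for this example. Note that it only approximates the idea: the hook below is still invoked for every event and merely bails out early, whereas the change described above would move that check into the interpreter so the trace func is never called for uninteresting lines.

```ruby
require "set"

# Hypothetical helper, not part of any Ruby debugger: runs the block and
# records a hit each time execution reaches a line listed in
# `breakpoints`, which is a list of [file, line] pairs.
def run_with_breakpoints(breakpoints)
  wanted = Set.new(breakpoints)
  hits = []
  set_trace_func proc { |event, file, line, _id, _binding, _klass|
    next unless event == "line"                 # ignore calls, returns, etc.
    next unless wanted.include?([file, line])   # cheap lookup, no breakpoint here
    hits << [file, line]                        # a real debugger would stop here
  }
  yield
  hits
ensure
  set_trace_func(nil)                           # always uninstall the hook
end
```

A Set keeps the per-line check cheap, but the cost of entering the hook at all is the part that only a change in the core engine can remove.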
With the stock debugger, we were seeing slowdowns of 100-250x when running some regression tests with REXML, a pure-Ruby XML parsing library. I think we can get code running under the debugger with breakpoints set to within 1.05-1.1x of its normal run time, with maybe 10-20 hours of work. RubyInline makes this kind of stuff a lot easier, although in this case, we'll have to make some core interpreter changes just because of what we want to do. But it will be pretty sweet.
How random hacking can pay off ten-fold down the road...
I've always been very interested in scripting language debuggers. Back when Amazon was using perl 5.003 in 1997, I back-ported the 5.004 debugger and added a bunch of features/macros to make it easier on me when fixing code. I spent a few hours a night for about 10 days, poring over the debugger's code, learning all of its intricacies and gotchas, and ended up with a pretty deep knowledge of perl internals because of it. The debugger used all sorts of tricks with symbolic references and symbol globs in order to peek at the symbol table and properly handle Perl's control flow, which we all know can be a bit... chaotic, to put it nicely.
And the funny thing about learning about that sort of stuff is that you never know when it will come in handy. When we were grafting Mason onto our new website display engine in 2001-2002, all that knowledge I picked up from hacking the Perl debugger paid off in spades, since I was able to work at different layers of our embedded perl engine and fix problems quickly, without having to mentally context switch too much. But when I originally worked on my debugger fixes, I was simply focused on peeling apart a tool I used often to see how it worked. I had no goal or plan. I just wanted a few annoying bugs to get fixed, and once I did that, I realized I had enough knowledge to start adding features. I think it's similar to people who move to France, and bemoan the fact that they can't learn French, until all of a sudden one day, they're talking to themselves about groceries they need to buy, and they realize they're doing it in French! I love aha! moments like that.
I started thinking about some favorite aha! moments over the years, but then I realized that's a separate blog entry altogether. But I do have fond memories of reverse-engineering saved game formats on my Apple IIgs (Bard's Tale, anyone?) and playing games methodically to figure out their logic, and then exploiting that logic in order to get past tough spots. I guess I always had a bit of a hacker spirit, even when I was a little kid.
I blame the legos.
Posted by djb at May 3, 2005 06:05 PM