
June 02, 2005

seattle.rb hackfest and monthly meeting

The seattle.rb hackfest last weekend turned out pretty well. I was only able to make it on Saturday afternoon, since I had homework, but it was a lot of fun.

We also had a seattle.rb meeting on Tuesday night, using the new meeting location at Amazon's building down in the International District. The ID is a great place for user group meetings because there are several great restaurants for pre-meeting meals. A group of us met up at Shanghai Garden, where we introduced each other to our favorite dishes. Afterwards, we walked through the Uwajimaya grocery store and got ice cream at the milk tea stand.

The meeting was interesting, and a little less structured than normal. Nine people showed up, with a mix of experience represented. We summarized the hackfest changes to rubygems, which led to an impromptu demo of 43 Things by Eric. That in turn led to a discussion of how different people were using Ruby at work and what shortcomings they had run into. The main complaints were that green threads make things difficult at times, that there is no unified bundling format for third-party Ruby code, and that there is a large amount of abandonware. rubygems is addressing the packaging and install aspects of third-party code, but there's a lot of good stuff out there that is either abandoned or only has docs in Japanese.

I got the room to do a little brainstorming on new debugger features, but it turned out that most of the users at the meeting hadn't used it much before. So we took some time to cover some of the debugger's more interesting features.

Eric also demoed his new database query grapher. It's pretty sweet. Rails keeps a detailed log of all queries done, so Eric wrote a script that parses the sql trace for each query and finds all queries that joined tables together. For each join pair, he created an edge in a graph, with the label being how many times those two tables were joined in the logs. He plugged the graph object into a Graphviz dot file, and then used the OS X Graphviz client (I use it too, it rocks!) to manipulate the graph to show us which tables were most heavily linked together. This was an interesting way to determine how to partition the existing 43 Things database into multiple databases.
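Here's a rough sketch of how a script like that could work. This is not Eric's code: the log lines, the table names, and the regex-based SQL parsing are all assumptions made for illustration, and it only catches explicit JOIN clauses (not comma-style joins).

```ruby
# Hypothetical sketch: count table pairs that appear together in JOIN queries
# and emit a Graphviz dot graph. The log format and table names are made up.

# Pull table names out of the FROM and JOIN clauses of one SQL statement.
def joined_tables(sql)
  sql.scan(/\b(?:FROM|JOIN)\s+`?(\w+)`?/i).flatten.map(&:downcase).uniq
end

# Count how often each unordered pair of tables shows up in the same query.
def join_counts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    next unless line =~ /\bJOIN\b/i
    joined_tables(line).sort.combination(2) { |a, b| counts[[a, b]] += 1 }
  end
  counts
end

# Render the counts as an undirected dot graph, with the count as the edge label.
def to_dot(counts)
  edges = counts.sort.map { |(a, b), n| %(  "#{a}" -- "#{b}" [label="#{n}"];) }
  (["graph joins {"] + edges + ["}"]).join("\n")
end

log = [
  "SELECT * FROM users JOIN goals ON goals.user_id = users.id",
  "SELECT * FROM goals INNER JOIN users ON goals.user_id = users.id",
  "SELECT * FROM goals JOIN entries ON entries.goal_id = goals.id",
  "SELECT * FROM users WHERE id = 1"
]
puts to_dot(join_counts(log))
```

The resulting text can be saved to a .dot file and opened in Graphviz; heavily joined tables show up as edges with large labels.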

I also discussed a change I'd like to make to the Ruby garbage collector. I posted back in March about how Ruby's garbage collector defeats copy-on-write semantics: when it does its mark pass through the list of Ruby objects, it sets the mark on the objects themselves instead of in a separate mark tree, so every memory page holding a live object gets written to and can no longer be shared with the parent process.
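To make the difference concrete, here's a toy model in plain Ruby. It is not the interpreter's C code: Node, mark_in_place, and mark_externally are hypothetical names, and a Set stands in for the separate mark structure. The point is only that the first approach writes to every live object, while the second reads the objects and writes somewhere else.

```ruby
require "set"

# Toy object graph: each node lists the nodes it references.
Node = Struct.new(:name, :refs, :marked)

# In-object marking: the flag lives on the node itself, so the mark pass
# writes to every reachable node. Returns the number of nodes written.
def mark_in_place(root)
  stack = [root]
  writes = 0
  until stack.empty?
    node = stack.pop
    next if node.marked
    node.marked = true # this write lands on the object
    writes += 1
    stack.concat(node.refs)
  end
  writes
end

# External marking: marks live in a separate set keyed by object identity,
# and the nodes are only read. Returns the set of marks.
def mark_externally(root)
  marks = Set.new
  stack = [root]
  until stack.empty?
    node = stack.pop
    next unless marks.add?(node.object_id) # add? returns nil if already marked
    stack.concat(node.refs)
  end
  marks
end

c = Node.new("c", [], false)
b = Node.new("b", [c], false)
a = Node.new("a", [b, c], false)

marks = mark_externally(a)
puts "external marks: #{marks.size}, nodes touched: #{[a, b, c].count(&:marked)}"
puts "in-place writes: #{mark_in_place(a)}"
```

In a forked child, every one of those in-place writes would dirty a page that was previously shared with the parent, which is where the extra memory comes from.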

To demonstrate, I used a small program that required several libraries and then forked off 10 children. In the normal case, the kids didn't do anything, so they used 250KB of memory apiece. When I flipped a flag so that each child forced a GC run as soon as it was spawned, each child ended up consuming 2500KB, or 90% of what the parent was using.
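A minimal sketch of that kind of test follows. It is a reconstruction, not the original program: the required libraries are stand-ins, the helper names are hypothetical, and reading Private_Dirty from /proc/self/smaps is a Linux-only way of approximating non-shared memory. On Ruby 2.0 and later, which keep mark bits in a separate bitmap, the gap between the two runs may be much smaller than the numbers above.

```ruby
# Stand-ins for "several libraries"; any sizeable set of requires would do.
require "set"
require "json"
require "time"

# Sum of private dirty pages in kB for the current process (Linux only).
# Returns nil where /proc/self/smaps is not available.
def private_dirty_kb
  smaps = "/proc/self/smaps"
  return nil unless File.readable?(smaps)
  File.read(smaps).scan(/^Private_Dirty:\s+(\d+) kB/).flatten.map(&:to_i).sum
end

# Fork `count` children. Each child optionally forces a GC run, then reports
# its private dirty memory back to the parent through a pipe.
def run_children(count, force_gc)
  children = Array.new(count) do
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      GC.start if force_gc
      writer.puts(private_dirty_kb.to_s)
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close
    [pid, reader]
  end

  children.map do |pid, reader|
    value = reader.read.strip
    reader.close
    Process.wait(pid)
    value.empty? ? nil : value.to_i
  end
end

[false, true].each do |force_gc|
  puts "force_gc=#{force_gc}: #{run_children(10, force_gc).inspect} kB private dirty"
end
```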

This might not seem like a big deal, since most small Ruby programs just use threads for concurrency, but that ignores the juggernaut that is Rails. Production Rails installs use small httpds for the frontend that connect to Rails FastCGI processes running on the backend. 43 Things is consuming approximately 50MB of RSS (i.e. non-shared and actually allocated) memory per FCGI child. They don't have to, though. Each child should only be consuming 2-5MB apiece, and if Ruby's GC didn't stomp on copy-on-write semantics, they would.

The work of moving the marks to a separate tree structure isn't as bad as you'd think, since all the code for traversing the nodes and marking them exists only in gc.c. Ruby already provides a nice C tree implementation, so it's basically a task of refactoring 500 lines of a 2000-line file. Once the work is done, forked children will consume much less memory than they do now, which means that Rails installs will be able to scale more linearly. Right now, you can only run so many Rails FCGI processes on a box before you start to thrash virtual memory.

I've got at least one hacker who's interested in doing the work with me, so we're going to tackle that over the next couple months. I'm thinking it would only take a week to make the bulk of the changes, and then a few more weeks to flush out the bugs and write some good unit tests to ensure correctness.

Posted by djb at June 2, 2005 10:56 AM

Comments

Interesting info about GC and Rails. I just passed the word along to bitsweat and minam ... maybe you'll get some other hackers who want to help.

Posted by: Pat Eyler at June 3, 2005 04:08 PM
