June 25, 2005
No more GC loops
I fixed the cycles in the mark phase. My changes had a few bugs where the mark table (my name for the hashtable that holds the addresses of marked nodes) wasn't getting updated properly, so certain nodes were being marked over and over in a runaway loop.
It was interesting to fix, since I couldn't tickle the problem with simple test scripts. I had to use the miniruby build calls used when compiling ruby from source. Debugging was difficult as well, since the cycle didn't occur immediately and it was hard to tell how much work had been done when cycles occurred. So, I added extensive logging statements to watch all the state changes and see what addresses were being marked in what order. I then wrote a small script to parse the GC debug logs and find the cycle nodes. It ended up working pretty well!
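A minimal Ruby sketch of that kind of log-scanning script follows. The log format (one `mark 0x...` line per marked address) and the name `repeated_marks` are assumptions for illustration, not the actual debug output or script from this work. The idea is that within one GC run each address should be marked at most once, so any address that shows up repeatedly is a candidate cycle node.

```ruby
# Hypothetical log format: one event per line, e.g. "mark 0x0804a3f0".
# Lines that are not mark events are ignored.
MARK_LINE = /\Amark\s+(0x[0-9a-f]+)\z/i

# Returns a hash of address => mark count, containing only the addresses
# that were marked more than once.
def repeated_marks(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = MARK_LINE.match(line.strip)
    counts[match[1].downcase] += 1 if match
  end
  counts.select { |_addr, n| n > 1 }
end

# Example with an in-memory log; a real run would read the lines from a file.
sample_log = [
  "mark 0x0804a3f0",
  "mark 0x0804a410",
  "mark 0x0804a3f0"
]
repeated_marks(sample_log).each do |addr, n|
  puts "#{addr} marked #{n} times"
end
```

Sorting the result by count would put the worst offenders first, which helps when the log is large.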
I still have some bugs to fix on the sweep phase, but the mark phase is looking relatively complete.
I've also got to put together a few small slides for the seattle.rb meeting on Tuesday night, since I'm giving a progress report on the GC fixes. I'm not sure if I will have the fixes working by Tuesday, but at least I'll be able to explain the design and implementation of the changes.
As an aside, it's been a while since I wrote bare C, so this project has been fun to work on. I don't miss C's tedious memory management when I'm working in Java, but there are times where I miss the terseness and flexibility of pointers, and being able to tightly control and keep track of allocated memory. Don't worry though, it doesn't take a lot of C to make me want to start doing Java and Ruby code again. It's just that having lots of constraints and steps can be oddly liberating sometimes.
Posted by djb at 11:54 PM | Comments (0)
June 19, 2005
GC update
I've been doing lots of move-related stuff, but I've been able to make progress on the Ruby GC work in the evenings. I'd say the work is 70% done. There are a few unintended cycles that I need to close, but I've got lots of printfs in there and I've spent a lot of time in the debugger. I wrapped all my changes in C preprocessor (CPP) conditionals, so I can flip a switch and run the old or new code as desired.
I'm hoping to have it in beta form for the seattle.rb meeting at the end of the month, which will be my second-to-last meeting.
While doing some packing in my office this week, I found an old handout from a GC tutorial I took at OOPSLA 2000. I had forgotten all about it, but reading through the slides, I realized it was full of useful hints and improvements for the mark/sweep family of garbage collectors. The handout is 130+ pages long and full of diagrams, plus handwritten notes I made while taking the tutorial, so it was a bit of a find. And very timely, considering the GC stuff I'm doing this month.
This upcoming week is mostly going to be spent going through all my unpacked boxes in the garage, and all the papers and books in my office, and deciding what I'm going to take with me. I aim to only take 30% of what I have now, just to keep the clutter down. It should be fun!
Posted by djb at 02:16 PM | Comments (0)
June 06, 2005
OS X Intel builds
Well, Apple announced today that they're moving to Intel. Wild!
I'm pretty pleased with this development, especially because now all the player haters get to eat crow. There were so many people saying that it couldn't be done, that the porting would be too tough, that Apple would lose market share... Well, the market share part is yet to be determined, but I think a move to Intel can only spell good news for Apple.
And one can only hope that as time goes on, OS X gains more market share. It's my favorite development platform, both for the ease of development, and wide range of features developers can use to build their apps. I don't know if Apple would ever just sell the OS and let it run on any type of x86 hardware, but pretty soon, developers will have the luxury of a single architecture that can more or less run Windows or OS X. How many people will stick with Windows, given the glut of spyware and viruses? I'm hoping there are more cheap Mac Mini-like machines coming down the road as well.
The other great upside is that video cards should speed up again. I don't think the drivers will be swappable, but at least we won't need separate video cards for our Macs anymore. You never know on the driver front, though: FreeBSD 6 has support for using binary Windows network drivers, so maybe there's some magic glue that will let OS X take advantage of the tuned Windows video card drivers. Either way, I'm hoping that Apple will spend time on OpenGL performance in Leopard, since Tiger is fast on the UI side, but not as fast as it could be when crunching OpenGL for things like video or image servers.
Posted by djb at 12:37 PM | Comments (0)
June 04, 2005
I've got a new job...
Well, this is exciting news... I'm moving to Silicon Valley in a couple months. I'll be living somewhere around San Jose, but I'm not quite sure exactly where yet. Well, that, and I'm also looking at living in San Francisco and taking mass transit south for my daily commute.
So the next couple months are going to be very busy for me. I'll be going through all my stuff, and trying to only take 25% of it down with me, with the rest going to Goodwill and the local dump. I'll be finding a new home for my Sony WEGA TV, because there's no way in hell I'm getting movers to carry that 350lb behemoth into my new place. The last two times I moved it were two times too many. I've had people move that TV into four residences in the past five years. I'll also be preparing my condo for sale, which should be fun. I'll end up making all the little improvements I've wanted to make over the past couple years but haven't had time for. Ideally, I wouldn't be selling it this summer but instead in early fall, but we'll see what kind of interest there is. I have a great view of Lake Union from my deck, so I'm thinking it will go pretty quickly.
I've worked out a budget for air travel, so barring any global crash in the oil market, I'll be able to fly back to Seattle every couple months and for Thanksgiving and Christmas. I'm really looking forward to exploring San Francisco's food options, and hoping that I'll be able to find a relatively urban place that has character. I wouldn't be happy in the suburban developments.
It's sad to leave Seattle, but I've never lived anywhere but here, so moving somewhere more cosmopolitan is very appealing to me. The next few months are going to be pretty wild.
Posted by djb at 05:50 PM | Comments (0)
June 02, 2005
seattle.rb hackfest and monthly meeting
The seattle.rb hackfest last weekend turned out pretty well. I was only able to make it on Saturday afternoon, since I had homework, but it was a lot of fun.
We also had a seattle.rb meeting on Tuesday night, using the new meeting location at Amazon's building down in the International District. The ID is a great place for user group meetings, since there are several great restaurants for pre-meeting meals. A group of us met up at Shanghai Garden, where we introduced each other to our favorite dishes. Afterwards, we walked through the Uwajimaya grocery store and got ice cream at the milk tea stand.
The meeting was interesting, and a little less structured than normal. We had nine people show up, with a mix of experience represented. We summarized the hackfest changes to rubygems, and that led to an impromptu demo of 43 Things by Eric. This led to a discussion of how different people were using Ruby at work, and what shortcomings they had run into. The main things mentioned were that green threads make things difficult at times, that there is no unified bundling format for third-party Ruby code, and that there is a large amount of abandonware. rubygems is addressing the packaging and install aspects of third-party code, but there's a lot of good stuff out there that is either abandoned or only has docs in Japanese.
I got the room to do a little brainstorming on new debugger features, but it turned out that most of the users at the meeting hadn't used the debugger much before. So we took some time to cover some of its more interesting features.
Eric also demoed his new database query grapher. It's pretty sweet. Rails keeps a detailed log of all queries done, so Eric wrote a script that parses the SQL trace for each query and finds all queries that joined tables together. For each join pair, he created an edge in a graph, with the label being how many times those two tables were joined in the logs. He wrote the graph out as a Graphviz dot file, and then used the OS X Graphviz client (I use it too, it rocks!) to manipulate the graph to show us which tables were most heavily linked together. This was an interesting way to determine how to partition the existing 43 Things database into multiple databases.
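Here is a rough Ruby sketch of the join-counting idea described above. It is not Eric's script: the function names `join_counts` and `to_dot` are made up for illustration, and it assumes each log line holds one SQL statement that spells joins out with explicit `FROM a ... JOIN b` syntax. Real Rails logs would need more careful parsing.

```ruby
# Counts how often each pair of tables is joined.
# Assumption: one SQL statement per line, using explicit JOIN syntax.
def join_counts(sql_lines)
  counts = Hash.new(0)
  sql_lines.each do |sql|
    from = sql[/\bFROM\s+(\w+)/i, 1]
    next unless from
    sql.scan(/\bJOIN\s+(\w+)/i).flatten.each do |joined|
      # Sort the pair so users/goals and goals/users share one edge.
      pair = [from.downcase, joined.downcase].sort
      counts[pair] += 1
    end
  end
  counts
end

# Renders the counts as an undirected Graphviz graph, one labeled edge
# per table pair.
def to_dot(counts)
  edges = counts.map do |(a, b), n|
    %(  "#{a}" -- "#{b}" [label="#{n}"];)
  end
  (["graph joins {"] + edges + ["}"]).join("\n")
end

sample = [
  "SELECT * FROM users JOIN goals ON goals.user_id = users.id",
  "SELECT * FROM goals JOIN users ON goals.user_id = users.id",
  "SELECT * FROM users"
]
puts to_dot(join_counts(sample))
```

The resulting text can be saved to a `.dot` file and opened in any Graphviz viewer. Heavily labeled edges point at tables that probably belong in the same database partition.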
I also discussed a change I'd like to make to the Ruby garbage collector. I posted back in March about how Ruby's garbage collector destroys copy-on-write semantics: when it does its mark pass through the list of Ruby objects, it writes the marks into the objects themselves instead of into a separate mark tree.
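To make the idea concrete, here is a toy mark pass in Ruby that records marks in a separate structure and never writes to the nodes it visits. This is only a sketch of the concept: the `Node` class and `mark_reachable` are invented for illustration, it uses a Set where the post talks about a tree, and the real change would live in C inside gc.c.

```ruby
require "set"

# Toy graph node; this is not Ruby's internal object layout.
class Node
  attr_reader :name, :children

  def initialize(name)
    @name = name
    @children = []
  end
end

# Returns the set of nodes reachable from the roots. Marks live in the
# set, so the nodes themselves are only read, never modified.
def mark_reachable(roots)
  marked = Set.new
  stack = roots.dup
  until stack.empty?
    node = stack.pop
    # add? returns nil when the node is already in the set, which also
    # keeps the traversal from looping forever on cyclic graphs.
    next unless marked.add?(node)
    stack.concat(node.children)
  end
  marked
end

a = Node.new("a")
b = Node.new("b")
c = Node.new("c")
a.children << b
b.children << a # cycle between a and b
marked = mark_reachable([a])
puts marked.map(&:name).sort.join(", ")
puts "c is garbage" unless marked.include?(c)
```

Because the mark pass only reads the nodes, the memory pages holding them are not dirtied, which is what lets a forked child keep sharing them with its parent.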
I used a small program that required several libraries and then forked off 10 children. In the normal case, the children didn't do anything, so they used 250KB of memory apiece. When I flipped a flag, each child forced a GC run as soon as it was spawned. The result was that each child consumed 2500KB, or 90% of what the parent was using.
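A test along those lines might look like the Ruby sketch below. It is not the original program: `with_forked_children` is a made-up helper, it assumes a Unix-like system where `fork` is available, and it leaves the memory measurement itself to an external tool.

```ruby
# Forks `count` children and keeps them alive while the given block runs.
# When force_gc is true, each child runs the garbage collector right
# after it is spawned. Assumes a platform that supports fork.
def with_forked_children(count, force_gc)
  release_r, release_w = IO.pipe
  ready_r, ready_w = IO.pipe
  pids = Array.new(count) do
    fork do
      release_w.close
      ready_r.close
      GC.start if force_gc   # the "flag": force a GC run in the child
      ready_w.write("x")     # tell the parent this child is set up
      ready_w.close
      release_r.read         # block until the parent closes its end
      exit!(0)
    end
  end
  release_r.close
  ready_w.close
  ready_r.read(count)        # wait until every child has checked in
  yield pids
ensure
  release_w.close if release_w && !release_w.closed?
  pids.each { |pid| Process.wait(pid) } if pids
end

with_forked_children(10, true) do |pids|
  puts "children running: #{pids.join(' ')}"
  # Inspect each child's memory here with the tool of your choice.
end
```

Running it once with `force_gc` set to `false` and once with `true`, and comparing the per-child memory in each case, reproduces the shape of the experiment.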
This doesn't seem like a big deal, since most small Ruby programs just use threads for concurrency, but this ignores the juggernaut that is Rails. Production Rails installs use small httpds for the frontend that connect to Rails FastCGI processes running on the backend. 43 Things is consuming approximately 50MB of RSS (i.e. non-shared and actually allocated) memory per FCGI child. It doesn't have to, though. Each child should only be consuming 2-5MB apiece, and if Ruby's GC didn't stomp on copy-on-write semantics, it would.
The work of moving the marks to a separate tree structure isn't as bad as you'd think, since all the code for traversing the nodes and marking them exists only in gc.c. Ruby already provides a nice C tree implementation, so it's basically a task of refactoring 500 lines of a 2,000-line file. Once the work is done, forked children will consume much less memory than they do now, which means that Rails installs will be able to scale more linearly. Right now, you can only run so many Rails FCGI processes on a box before you start to thrash virtual memory.
I've got at least one hacker who's interested in doing the work with me, so we're going to tackle that over the next couple months. I'm thinking it would only take a week to make the bulk of the changes, and then a few more weeks to flush out the bugs and write some good unit tests to ensure correctness.
Posted by djb at 10:56 AM | Comments (1)