The Observation Deck


Month: August 2010

When I joined Joyent, I mentioned that I was seeking to apply DTrace to the cloud, and that I was particularly excited about the development of node.js — leaving it implicit that the intersection of the two technologies would be naturally interesting. As it turns out, we have had an early opportunity to show the potential here: as you might have seen, the Node Knockout programming contest was held over the weekend. When I first joined Joyent (but four weeks ago!), Ryan was very interested in potentially using DTrace to provide a leaderboard for the competition, so I got to work adding USDT probes to node.js. To be fair, this still has some disabled overhead (namely, getting into and out of the node addon that has the true USDT probe), but it’s sufficiently modest to deploy DTrace-enabled nodes in production.
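
To give a flavor of what that entails: USDT probes are declared in a provider definition that the addon then fires from its native glue. Here is a minimal sketch of what such a provider file might look like, consistent with the arguments used in the enablings below (method, url, fd, remoteAddress); the type and member names are purely illustrative, not the actual ones in the addon:

provider node {
        /*
         * Double underscores in probe names appear as dashes to the
         * consumer, so this probe shows up as node*:::http-server-request.
         * Arguments: request info (method, url), then connection info (fd).
         */
        probe http__server__request(node_http_request_t *req,
            node_connection_t *conn);

        /* Fired when the response is sent; argument is the connection (fd). */
        probe http__server__response(node_connection_t *conn);

        /* Fired on connection establishment (remoteAddress, fd). */
        probe net__server__connection(node_connection_t *conn);
};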

And thanks to incredibly strong work by Joyent engineers, we were able to make available a new node.js service that allocated a container per user. This service allowed us to offer a DTrace-enabled node to contestants — and then observe all of it from the global zone.

As an example of the DTrace provider for node.js, here’s a simple enabling to print out HTTP requests as zones handle them (running on one of the Node Knockout machines):

# dtrace -n 'node*:::http-server-request{printf("%s: %s of %s\n", \
    zonename, args[0]->method, args[0]->url)}' -q
nodelay: GET of /poll6759.479651377309
nodelay: GET of /poll6148.392275444794
nodebodies: GET of /latest/
nodebodies: GET of /latest/
nodebodies: GET of /count/
nodebodies: GET of /count/
nodelay: GET of /poll8973.863890386003
nodelay: GET of /poll2097.9667574643568
awesometown: GET of /graphs/4c7a650eba12e9c41d000005.js
awesometown: POST of /graphs/4c7a650eba12e9c41d000005/appendValue
awesometown: GET of /graphs/4c7acd5ca121636840000002.js
awesometown: GET of /graphs/4c7a650eba12e9c41d000005.js
awesometown: GET of /graphs/4c7a650eba12e9c41d000005.js
awesometown: GET of /graphs/4c7a650eba12e9c41d000005.js
awesometown: GET of /graphs/4c7b2408546a64b81f000001.js
awesometown: POST of /faye
awesometown: POST of /faye
...

I added probes around both HTTP request and HTTP response; treating the file descriptor as a token that uniquely describes a request while it is pending (an assumption that would only be invalid in the presence of HTTP pipelining) allows one to actually determine the latency for requests:

# cat http.d
#pragma D option quiet

http-server-request
{
        ts[this->fd = args[1]->fd] = timestamp;
        vts[this->fd] = vtimestamp;
}

http-server-response
/this->ts = ts[this->fd = args[0]->fd]/
{
        @t[zonename] = quantize(timestamp - this->ts);
        @v[zonename] = quantize(vtimestamp - vts[this->fd]);
        ts[this->fd] = 0;
        vts[this->fd] = 0;
}

tick-1sec
{
        printf("Wall time:\n");
        printa(@t);

        printf("CPU time:\n");
        printa(@v);
}
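
Assuming the script is saved as http.d as shown above, running it is just:

# dtrace -s http.d

(Run from the global zone, this enables the node probes in every zone, which is what allows the aggregations to be keyed on zonename.)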

This script makes the distinction between wall time and CPU time; for wall time, you can see the effect of long-polling, e.g. (the values are nanoseconds):

    nodelay
           value  ------------- Distribution ------------- count
           32768 |                                         0
           65536 |                                         4
          131072 |@@@@@                                    52
          262144 |@@@@@@@@@@@@@@@@@@                       183
          524288 |@@@@@                                    55
         1048576 |@@@                                      27
         2097152 |@                                        9
         4194304 |                                         5
         8388608 |@                                        8
        16777216 |@                                        6
        33554432 |@                                        9
        67108864 |@                                        7
       134217728 |@                                        12
       268435456 |@                                        11
       536870912 |                                         1
      1073741824 |                                         4
      2147483648 |                                         1
      4294967296 |                                         5
      8589934592 |                                         0
     17179869184 |                                         1
     34359738368 |                                         1
     68719476736 |                                         0

You can also look at the CPU time to see those that are doing more actual work. For example, one zone with interesting CPU time outliers:

  nodebodies
           value  ------------- Distribution ------------- count
         4194304 |                                         0
         8388608 |@@@@@@@@@@@@                             57
        16777216 |@@@@                                     21
        33554432 |@@@@                                     18
        67108864 |@@@@@@@                                  34
       134217728 |@@@@@@@@@@@                              54
       268435456 |                                         0
       536870912 |                                         0
      1073741824 |                                         0
      2147483648 |                                         0
      4294967296 |@                                        3
      8589934592 |@                                        4
     17179869184 |                                         0

Note that because node has a single thread doing all processing, we cannot assume that the requests themselves are inducing the work — only that CPU work was done between request and response. Still, this data would probably be interesting to the nodebodies team…

I also added probes around connection establishment, so here’s a simple way of looking at new connections by zone:

# dtrace -n 'node*:::net-server-connection{@[zonename] = count()}'
dtrace: description 'node*:::net-server-connection' matched 44 probes
^C

  explorer-sox                                                      1
  nodebodies                                                        1
  anansi                                                           69
  nodelay                                                         102
  awesometown                                                     146

Or if we wanted to see which IP addresses were connecting to, say, our good friends at awesometown (with actual addresses in the output elided):

# dtrace -n 'node*:::net-server-connection \
    /zonename == "awesometown"/{@[args[0]->remoteAddress] = count()}'
dtrace: description 'node*:::net-server-connection' matched 44 probes
  XXX.XXX.XXX.XXX                                                   1
  XX.XXX.XX.XXX                                                     1
  XX.XXX.XXX.XXX                                                    1
  XX.XXX.XXX.XX                                                     1
  XXX.XXX.XX.XXX                                                    1
  XXX.XXX.XX.XX                                                     2
  XXX.XXX.XXX.XX                                                    8
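
And the two can of course be combined; as a minor variation on the above, aggregating on both zone name and remote address in a single enabling shows who is connecting to whom across the box:

# dtrace -n 'node*:::net-server-connection \
    {@[zonename, args[0]->remoteAddress] = count()}'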

Ryan saw the DTrace support I had added, and had a great idea: what if we took the IPs of incoming connections and geolocated them, throwing them on a world map and coloring them by team name? This was an idea that was just too exciting not to take a swing at, so we got to work. For the backend, the machinery was begging to be written in node, so I did a libdtrace addon for node and started building a scalable backend for processing the DTrace data from the different Node Knockout machines. Meanwhile, Joni came up with some mockups that had everyone drooling, and Mark contacted Brian from Nitobi about working on the front-end. Brian and crew were as excited about it as we were, and they put front-end engineer extraordinaire Yohei on the case — who worked with Rob on the Joyent side to pull it all together. Among Rob’s other feats, he managed to implement in JavaScript the logic for plotting longitude and latitude in the beautiful Robinson projection — which is a brutally complicated transformation. It was an incredible team, and we were pulling it off in such a short period of time and with such a firm deadline that we often felt like contestants ourselves!

The result — which it must be said works best in Safari and Chrome — is at http://leaderboard.no.de. In keeping with both the spirit of node and DTrace, the leaderboard is updated in real-time; from the time you connect to one of the Joyent-hosted (no.de) contestants, you should see yourself show up in the map in no more than 700 milliseconds (plus your network latency). For crowded areas like the Bay Area, it can be hard to see yourself — but try moving to Cameroon for best results. It’s fun to watch as certain contestants go viral (try both hovering over a particular data point and clicking on the team name in the leaderboard) — and you can know which continent you’re cursing at in http://saber-tooth-moose-lion.no.de (now known to the world as Swarmation).

Enjoy both the leaderboard and the terrific Node Knockout entries (be sure to vote for your favorites!) — and know that we’ve only scratched the surface of what DTrace and node.js can do together!

As many have seen, Oracle has elected to stop contributing to OpenSolaris. This decision is, to put it bluntly, stupid. Indeed, I would (and did) liken it to L. Paul Bremer’s decision to disband the Iraqi military after the fall of Saddam Hussein: beyond merely a foolish decision borne out of a distorted worldview, it has created combatants unnecessarily. As with Bremer’s infamous decision, the bitter irony is that the new combatants were formerly the strongest potential allies — and in Oracle’s case, it is the community itself.

As it apparently needs to be said, one cannot close an open source project — one can only fork it. So contrary to some reports, Oracle has not decided to close OpenSolaris; they have actually decided to fork it. That is, they have (apparently) decided that it is more in their interest to compete with the community than to cooperate with it — that they can in fact out-innovate the community. This confidence is surprising (and ironic) given that it comes exactly at the moment that the historic monopoly on Solaris talent has been indisputably and irrevocably broken — as most recently demonstrated by the departure of my former colleague, Adam Leventhal.

Adam’s case is instructive: Adam is a brilliantly creative engineer — one with whom it was my pleasure to work closely over nearly a decade. Time and time again, I saw Adam not only come up with innovative solutions to tough problems, but run those innovations through the punishing gauntlet that separates idea from product. One does not replace an engineer like Adam; one can only hope to grow another. And given his nine years of experience at the company and in the guts of the system, one cannot expect to grow a replacement quickly — if at all. Oracle’s loss, however, is the community’s gain; I hope I’m not tipping his hand too much to say that Adam will continue to be deeply engaged in the system, leading a new generation of engineers — but this time within a larger community that spans multiple companies and interests.

And in this way, odd as it may be, Oracle’s decision to fork is actually a relief to those of us whose businesses depend on OpenSolaris: instead of waiting for Oracle to engage the community, we can be secure in the knowledge that no engagement is forthcoming — and we can invest and plan accordingly. So instead of waiting for Oracle to fix a nagging driver bug or address a critical request for enhancement (a wait that has more often than not ended in disappointment anyway), we can tap our collective expertise as a community. And where that expertise doesn’t exist or is otherwise unavailable, those of us who are invested in the system can explicitly invest in building it — and then use it to give back to the community and contribute.

Speaking for Joyent, all of this has been tangibly liberating: just the knowledge that we are going to be cranking our own builds has allowed us to start thinking along new dimensions of innovation, giving us a renewed sense of control over our stack and our fate. I have already seen this shift in our engineers, who have begun to conceive of ideas that might not have been thought practical in a world in which Oracle’s engagement was so uncertain. Yes, hard problems lie ahead — but ideas are flowing, and the future feels alive with possibility; in short, innovation is afoot!

I went to the node.js meetup last night in Palo Alto, and it was an interesting affair on several levels. First (and least surprisingly), it was packed, with the Sencha folks joking that they would need to move to a bigger space just to be able to host the event. Second, the technical content itself was intriguing, with fellow Joyeur (and node BDFL) Ryan on dealing with flow control in node, Jed on (fab), future fellow Joyeur Isaac on npm, and Tim demo’ing some Connect-based apps, including a simple web-based shared world app in which the room could (and did) participate. Not surprisingly, the performance of this last demo was snappy under load — so much so that it merits repeating an observation that many are currently making: it is increasingly clear that an early space — if not the first — in which we are going to see broad deployment of node-based apps is online social gaming, a space in which node represents a decisive competitive advantage by offering the potential for much more interactive (and more social) gameplay, and one in which there is substantial code churn to begin with. (And of course, speaking from Joyent’s perspective, this is a fortunate confluence: online gaming is also a space that sorely needs the elasticity that the cloud alone can provide.)

So the attendance and content were certainly notable, but most interesting of all to me was the demographic: given that node has become something of the latest hotness (and especially given that its being in JavaScript casts a pretty wide net), one might expect node’s enthusiasts to be amateurs or novices. That this was emphatically not the case was clear to me shortly after arriving, when I had the unexpected pleasure of reuniting with fellow CS169 head TA Peter Griess. Not to be overly chummy or clubby, but walking into a meetup and seeing one of the tribe tells you something immediate about not just the room, but the technology itself: that it is not mere syntactic sugar or iconoclasm for its own sake, but rather a true revolution in the way certain classes of systems are designed and built. And indeed, over the course of the evening, it became clear that within the room there was an impressive amount of actual experience deploying real systems, with seasoned technologists like Matt Ranney who aren’t merely writing new apps in node but are rewriting old apps in node. This is a key point, and it goes to the fact that node is not just an easier way of doing things (though that too, certainly) but rather that it offers such a vastly improved runtime that it merits reevaluation of systems that one has already built and deployed.

To me, the systems experience in the room offered an implicit rebuttal to some of the inane criticism of node — criticism that essentially amounts to discrediting node merely because of its newness or its popularity. (And even more enlightened criticism ultimately disappoints with what essentially amounts to an attack on the basis of style, not substance.) To be sure, node is still a young technology, and there is much engineering work still to be done. (For a concrete example of this, see Paul’s description of the SSL problem.) But with so much deep systems experience in the community — and with the healthy, collaborative vibe that was on display last night — it’s hard to be anything but optimistic!

Back when Solaris was initially open sourced, there was a conscious effort to be mindful of the experiences of other projects. In particular — even though it was somewhat of a paradox — it was understood how important it was for the community to have the power to fork the operating system. As I wrote in January, 2005:

If there’s one thing we’ve learned from watching Linux, it’s to not become forkophobic. Paradoxically, in an environment where forks are actively encouraged (e.g. Linux) forking seems to be less of a problem than in environments where forking is viewed as apostasy (e.g. BSD).

Unfortunately — and now in hindsight — we know that OpenSolaris didn’t go far enough: even though the right to fork was understood, there was not enough attention paid to the power to fork. As a result, the operating system never quite got to being 100% open: there remained some annoying (but essential) little bits that could not be opened for one historical (i.e., legal) reason or another. When coupled with the fact that Sun historically had a monopoly or near-monopoly on Solaris engineering talent, the community was entirely deprived of the oxygen that it would have needed to exercise its right to fork.

But change is afoot: over the last six months, the monopoly over Solaris engineering talent has been broken. And now today, we as a community have turned an important corner with the announcement of the Illumos project. Thanks to the hard work of Garrett D’Amore and his band of co-conspirators, we have the beginning of open sourced variants of those final bits that will allow for not just the right but the power to fork. Not that anyone wants to set out to fork the system, of course, but that power is absolutely essential for the vitality of any open source community — and so will be for ours. Kudos to Garrett and crew; on behalf of all of us in the community, thank you!
