The Observation Deck

Month: July 2007

I finished Beautiful Code this week, and have been reflecting on the book and its development. In particular, I have thought back to some of the authors’ discussion, in which some advocated a different title. Many of us were very strongly in favor of the working title of “Beautiful Code”, and I weighed in with my specific views on the matter:

Date: Tue, 19 Sep 2006 16:18:22 -0700
From: Bryan Cantrill
To: [Beautiful Code Co-Authors]
Subject: Re: [Beautifulcode] reminder: outlines due by end of September
Probably pointless to pile on here, but I'm for "Beautiful Code", if
only because it appropriately expresses the nature of our craft.  We
suffer -- tremendously -- from a bias from traditional engineering that
writing code is like digging a ditch:  that it is a mundane activity best
left to day labor -- and certainly beneath the Gentleman Engineer.  This
belief is profoundly wrong because software is not like a dam or a
superhighway or a power plant:  in software, the blueprints _are_ the
thing; the abstraction _is_ the machine.

It is long past time that our discipline stood on its feet, stepped out
from the shadow of sooty 19th century disciplines, and embraced our
unique confluence between mathematics and engineering.  In short,
"Beautiful Code" is long, long overdue.

Now, I don’t disagree with my reasoning from last September (though I think that the Norma Rae-esque tone was probably taking it a bit too far), but having now read the book, I stand by the title for a very different reason: this book is so widely varied — there are so many divergent ideas here — that only the most subjective possible title could encompass them. That is, any term less subjective than “beautiful” would be willfully ignorant of the disagreements (if implicit) among the authors about what constitutes ideal software.

To give an idea of what I’m talking about, here is the breakdown of languages and their representations in chapters:

Language Chapters
C 11
Java 5
Scheme/Lisp 3
C++ 2
Fortran 2
Perl 2
Python 2
Ruby 2
C# 1
JavaScript 1
Haskell 1
VisualBASIC 1

(I’m only counting each chapter once, so for the very few chapters that included two languages, I took whatever appeared more frequently. Also, note that some chapters were about the implementation of one language feature in a different language — so for example, while there are two additional chapters on Python, both pertain more to the C-based implementation of those features than to their actual design or use in Python.)

Now, one could argue (and it would be an interesting argument) about how much choice of language matters in software or, for that matter, in thought. And indeed, in some (if not many) of these chapters, the language of implementation is completely orthogonal to the idea being discussed. But I believe that language does ultimately affect thought, and it’s hard to imagine how one could have a sense of beauty that is so uncritical as to equally accommodate all of these languages.

More specifically: read say, R. Kent Dybvig’s chapter on the implementation of syntax-case in Scheme and William Otte and Douglas Schmidt’s chapter on implementing a distributed logging service using an object-oriented C++ framework. It seems unlikely to me that one person will come away saying that both are beautiful to them. (And I’m not talking new-agey “beautiful to someone” kind of beautiful — I’m talking the “I want to write code like that” kind of beautiful.) This is not meant to be a value judgement on either of these chapters — just the observation that their definitions of beauty are (in my opinion, anyway) so wildly divergent as to be nearly mutually exclusive. And that’s why the title is perfect: both of these chapters are beautiful to their authors, and we can come away saying “Hey, if it’s beautiful to you, then great.”

So I continue to strongly recommend Beautiful Code, but perhaps not in the way that O’Reilly might intend: you should read this book not because it’s cover-to-cover perfection, but rather to hone your own sense of beauty. To that end, this is a book best read concurrently with one’s peers: discussing (and arguing about) what is beautiful, what isn’t beautiful, and why will help you discover and refine your own voice in your code. And doing this will enable you to write the most important code of all: code that is, if nothing else, beautiful to you.

So my copy of Beautiful Code showed up last week. Although I am one of the (many) authors and I have thus had access to the entire book online for some time, I do all of my pleasure reading in venues that need the printed page (e.g. the J Church) and have therefore waited for the printed copy to start reading.

Although I have only read the first twelve chapters or so, it’s already clear (and perhaps not at all surprising) that there are starkly different definitions of beauty here: the book’s greatest strength — and, frankly, its greatest weakness — is that the chapters are so incredibly varied. For one chapter, beauty is a small and titilating act of recursion; for the next, it’s that a massive and complicated integrated system could be delivered quickly and cheaply. (I might add that the definition of beauty in my own chapter draws something from both of these poles: that in software, the smallest and most devilish details can affect the system at the largest and most basic levels.)

If one can deal with the fact that the chapters are widely divergent, and that there is not even a token attempt to weave them together into a larger tapestry, this book (at least so far, anyway) is (if nothing else) exceptionally thought provoking; if Oprah were a code cranking propeller head, this would be the ideal choice for her book club.

Now in terms of some of my specific thoughts that have been provoked: as I mentioned, quite a few of my coauthors are enamored with the elegance of recursion. While I confess that I like writing a neatly recursive routine, I also find that I frequently end up having to unroll the recursion when I discover that I must deal with data structures that are bigger than I anticipated — and that my beautiful code is resulting (or can result) in a stack overflow. (Indeed, I spent several unpleasant days last week doing exactly this when I discovered that pathologically bad input could cause blown stacks in some software that I’m working on.)

To take a concrete example, Brian Kernighan has a great chapter in Beautiful Code about some tight, crisp code written by Rob Pike to perform basic globbing. And the code is indeed beautiful. But it’s also (at least in a way) busted: it overflows the stack on some categories of bad input. Admittedly, one is talking about very bad input here — strings that consist of hundreds of thousands of stars in this case — but this highlights exactly the problem I have with recursion: it leaves you with edge conditions that on the one hand really are edge conditions (deeply
pathological input), but with a failure mode (a stack overflow) that’s just too nasty to ignore.

Now, there are ways to deal with this. If one can stomach it, the simplest way to deal with this is to setup a sigaltstack and then siglongjmp out of a SIGSEGV/SIGBUS signal handler. You have to be very careful about doing this: the signal handler should look at the si_addr field in the siginfo and comparing it to the stack bounds to confirm that it’s a stack overflow, lest it end up siglongjmp‘ing out of a non-recursion induced SIGSEGV (which, needless to say, would make a bad problem much worse). While an alternative signal stack solution may sound hideous to some, at least the recursion doesn’t have to go under the knife in this approach. If having a SIGSEGV handler to catch this condition feels uncomfortably brittle (as well it might), or if one’s state cannot be neatly unwound after an arbitrary siglongjmp (as well it might not), the code will have to change: either a depth counter will have to be passed down and failure propagated when depth exceeds a reasonable maximum, or the recursion will have to be unrolled into iteration. For most aesthetic senses, none of these options is going to make the code more beautiful — but they will make it indisputably more correct.

I was actually curious about where exactly the Pike/Kernighan code would blow up, so I threw together a little program that uses sigaltstack along with sigsetjmp/siglongjmp to binary search to find the shortest input that induces the failure. My program, which (naturally) includes the Pike/Kernighan code, is here.

Here are the results of running my program on a variety of Solaris platforms, with each number denoting the maximum string length that can be processed by the Pike/Kernighan code without the possibility of stack overflow.

32-bit 64-bit 32-bit 64-bit
Sun cc, unoptimized 403265 187225 77649 38821
gcc, unoptimized 327651 218429 69883 40315
Sun cc, optimized 327651 327645 174723 95303
gcc, optimized 582489 524227 149769 87367

As can be seen, there is a tremendous range here, even across just two different ISAs, two different data models and two different compilers: from 38,821 on 64-bit SPARC using Sun cc without optimization to 582,489 on 32-bit x86 using gcc with optimization — an order of magnitude difference. So while recursion is a beautiful technique, it is one that ends up with the ugliest of implicit dependencies: on the CPU architecture, on the data model and on the compiler. And while recursion is still beautiful to me personally, it will always be a beauty that is more superficial than profound…

For those who didn’t see it, Team DTrace was on the Scoble Show. As I mention at the end of the interview, this was the day after my younger son was born (if you look closely at my right wrist, you will note that I am still wearing my hospital bracelet in the interview). You can see the cigars that I offered at the end of the interview (and they were damn fine cigars, by the way) in a photo of Team DTrace that Scoble took afterwards. After the photo, we returned to our office and smoked the cigars — and then had an unplanned conversation with the building management about never again smoking cigars in the office. (I responded that I was done having kids, so they had nothing to worry about.)

Finally, as for DTrace on the iPhone (to which we made brief reference in the interview): it is now our understanding that alas, DTrace is not on the iPhone — Apple has apparently not yet ported DTrace to the ARM — but that a DTrace port “may be” in the works. So the dream is alive!

Recent Posts

November 26, 2023
November 18, 2023
November 27, 2022
October 11, 2020
July 31, 2019
December 16, 2018
September 18, 2018
December 21, 2016
September 30, 2016
September 26, 2016
September 13, 2016
July 29, 2016
December 17, 2015
September 16, 2015
January 6, 2015
November 10, 2013
September 3, 2013
June 7, 2012
September 15, 2011
August 15, 2011
March 9, 2011
September 24, 2010
August 11, 2010
July 30, 2010
July 25, 2010
March 10, 2010
November 26, 2009
February 19, 2009
February 2, 2009
November 10, 2008
November 3, 2008
September 3, 2008
July 18, 2008
June 30, 2008
May 31, 2008
March 16, 2008
December 18, 2007
December 5, 2007
November 11, 2007
November 8, 2007
September 6, 2007
August 21, 2007
August 2, 2007
July 11, 2007
May 20, 2007
March 19, 2007
October 12, 2006
August 17, 2006
August 7, 2006
May 1, 2006
December 13, 2005
November 16, 2005
September 13, 2005
September 9, 2005
August 21, 2005
August 16, 2005