Debuggability. If you read just a couple of the
Opening Day entries,
you probably noticed a trend: many of the entries were about finding
some nasty bug in the system.
This is an accurate reflection of our ethos in developing Solaris:
the operating system must be reliable above all else, and we view
debugging the operating system as our primary responsibility.
This responsibility runs deeper than just the act of debugging, because
our needs so outstripped existing tools that
we designed and built
our own, most notably mdb and DTrace.
Fortunately, we ship these tools to you, so you can use them on your
own system and on your own applications.
There are many entries describing these tools and how they were used
to tackle a problem.
Fittingly, a good place to start is the
entry describing using mdb to debug a sendmail bug. This bug
has one of the
greatest bug synopses of all time: “sendmail died in a two SIGALRM fire.”
For more on the power of mdb,
take a look at the entry on
using mdb to debug a scheduling problem,
Ashish Mehta’s entry on using
mdb to debug
a race condition, and
Eric Kustarz’s entry demonstrating an mdb debugger command (“dcmd”) that he wrote to
retrieve NFSv4 recovery messages postmortem.
This last example is a particularly good one
because this is exactly the kind of custom debugging
infrastructure that mdb’s modular architecture makes easy to build.
For a comprehensive example of how we have developed subsystem-specific
debugging infrastructure, read
Sasha’s entry on the
dcmds related to STREAMS.
As Sasha mentions, the place to start for learning to write your
own modules is the mdb documentation,
but you can get a flavor for it by reading
the entry on writing
a module for kmdb.
kmdb is the in-situ kernel debugger that implements mdb, and when you
need it, nothing else will do — as
Dan Mick describes
in his entry on debugging with kmdb and moddebug.
For more details on kmdb itself,
see the entry on kmdb’s design and implementation.
To see how mdb can help debug your application, take a look at
Will Fiveash’s notes
on debugging application memory problems. Will
mentions ::findleaks, a debugger command that I originally
implemented for kernel crash dumps, and that was later
ported to work on application core files, with a substantial
rework in the process, as the engineer who did the port
mentions in his entry.
While mdb is the acme of postmortem debugging,
if the manifestation of a bug is non-fatal, it’s often more effective
to use DTrace to debug it.
For an example of this, see Bart’s
entry on using DTrace to debug jitter.
It was gratifying to see Bart debug this problem using DTrace, because
latency bubbles were actually one of the motivating pathologies behind DTrace.
And finally, debuggability doesn’t end with tools; subsystems must be
designed with debuggability in mind, as
described in the entry on
designing libuutil for debuggability.