If you didn’t see it,
picked up where
left off, adding
a blog entry pointing to
more Opening Day entries, this time
in the categories of
devices and device configuration, security, networking,
and standards. But there are still a ton of entries to
categorize, so picking up again in no particular order…
System calls are among the most fundamental mechanisms in operating systems:
they are the mechanism by which untrusted, unprivileged software requests
a service of trusted, privileged software. We are lucky to have two
great entries describing the architecture-specific mechanisms of
system calls in Solaris:
Russ Blaine’s entry on
system calls on x86, and a companion entry on system
calls on SPARC. Then, to understand the architecture-neutral aspects of
system calls, head over to the entry on
how to add a system call.
As a quick aside, that
last entry is a great example of how we in Solaris Kernel Development
are using blogs to write
down information that (believe it or not) has just been an unspoken part
of the craft before now. As
Tim Bray observed,
blogs have become a critical conduit of information for us — we believe
that they are the most scalable way to get information from the
people who have it to the people who need it. If (when?) you become
an OpenSolaris developer,
you can expect some friendly peer pressure to create a blog and
join the party.
Build process and workspace management.
We pride ourselves on a seamless build process,
and a couple of entries have gone into various aspects of this in depth.
To give you an idea of how seriously we take the build process — and
why — check out
Scott’s entry on using lint to find security vulnerabilities.
In particular, note what Scott says about adding a new lint option that
generated 500 new warnings: “I needed to fix all of these before integrating
my change to
Makefile.master because we require the Solaris source to be
lint-clean.” To which I add only, “dammit.”
Next, head over to
Jim’s entry describing the work he did to support
non-root builds. Jim’s entry demonstrates how difficult it is to
radically change the build process — and how he managed to pull it off.
Finally, if you want to really let your makefile flag fly,
check out the entry describing the build support for localized messages.
In terms of workspace management, you’ll want to check out
Will’s entry describing our workspace management tool, wx. For a long
time, wx was a shell script in
Bonwick’s home directory.
It was incredibly useful, but it was also easy to accidentally hurt
yourself with it. (As Bart is fond of saying, it
was “all blade and no handle.”) Will’s rewrite made for a much
safer, much more sophisticated wx, and it was a huge help to
us in automating the final approach of the
Debuggability. If you read just a couple of the
Opening Day entries,
you probably noticed a trend: many of the entries were about finding
some nasty bug in the system.
This is an accurate reflection of our ethos in developing Solaris:
the operating system must be reliable above all else, and we view
debugging the operating system as our primary responsibility.
This responsibility runs deeper than just the act of debugging, because
our needs so outstripped existing tools that
we designed and built
our own, most notably mdb and DTrace.
Fortunately, we ship these tools to you, so you can use them on your
own system and on your own applications.
There are many entries describing these tools and how they were used
to tackle a problem.
Fittingly, a good place to start is the
entry on using mdb to debug a sendmail bug. That bug
has one of the
greatest bug synopses of all time: “sendmail died in a two SIGALRM fire.”
For more on the power of mdb,
take a look at the entry on
using mdb to debug a scheduling problem,
Ashish Mehta’s entry on using
mdb to debug
a race condition, and
Eric Kustarz’s entry demonstrating an mdb debugger command (“dcmd”) that he wrote to
retrieve NFSv4 recovery messages postmortem.
This last example is a particularly good one
because this is exactly the kind of custom debugging
infrastructure that mdb’s modular architecture makes easy to build.
For a comprehensive example of how we have developed subsystem-specific
debugging infrastructure, read
Sasha’s entry on the
dcmds related to STREAMS.
As Sasha mentions, the place to start for learning to write your
own modules is the mdb documentation,
but you can get a flavor for it by reading
the entry on writing
a module for kmdb.
kmdb is the in-situ kernel debugger that implements mdb, and when you
need it, nothing else will do — as
Dan Mick describes
in his entry on debugging with kmdb and moddebug.
For more details on kmdb itself, see the entry on
kmdb’s design and implementation.
To see how mdb can help debug your application, take a look at
Will Fiveash’s notes
on debugging application memory problems. Will
mentions ::findleaks, a debugger command that I originally
implemented for kernel crash dumps, and that a colleague later
ported to work on application core files,
reworking it substantially in the process (as he mentions in
his entry).
While mdb is the acme of postmortem debugging,
if the manifestation of a bug is non-fatal, it’s often more
effective to use DTrace to debug it.
For an example of this, see
Bart’s entry on using DTrace to debug jitter.
It was gratifying to see Bart debug this problem using DTrace, because
latency bubbles were actually one of the motivating pathologies behind
DTrace. And finally, debuggability doesn’t end with tools; subsystems must be
designed with debuggability in mind, as
described in the entry on
designing libuutil for debuggability.
I think that about does it for today. As someone pointed out on Liane’s
blog, we need a Wiki for this; we agree, and it’s on the list of planned
additions to opensolaris.org. Until then,
stay tuned for more sifting…