The Observation Deck


Sifting through the blogs…

June 15, 2005

Yesterday was Opening Day for OpenSolaris, and
we welcomed OpenSolaris with a torrent of blog entries
on various aspects of
the implementation.
The breadth and depth of our blogging
will hopefully
put to rest any notion that open sourcing Solaris isn’t a grass-roots
effort: if nothing else, it should be clear that we in the trenches
are very excited to finally be able to talk about the system
that we have poured so much of our lives into — and to welcome
new would-be contributors into the fold.

In our excitement, we may have overwhelmed a tad:
there was so much content yesterday that it would have been impossible
for anyone to keep up — we blogged over 200,000 words (over 800 pages!)
yesterday alone.
So over the next few days, I want to highlight some entries that you
might have missed, broken down by subject area. In no particular order…

  • Fault management. Fault management in Solaris 10 has been completely
    revolutionized by the new predictive self-healing feature pioneered
    by my longtime co-conspirator
    Mike Shapiro. There are
    two must-read entries in this area:
    Andy Rudoff’s entry
    providing a
    predictive self-healing overview, and
Dilpreet Bindra’s
entry going into more depth on PCI error handling. (If for nothing else,
read Dilpreet’s entry for his Reading of the Vows between OpenSolaris and the
PCI bus.)

  • Virtual memory.
    The virtual memory system is core to any modern operating system, and
    there are several interesting entries here.
    Start with
Eric Lowe’s
    extensive entry
    describing page fault handling. As Eric rightly points out,
    page fault handling is the epicenter of the VM system; one can learn a
    tremendous amount about the system just by following page fault processing —
    and Eric is a great guide on this journey.
    Once you’ve read Eric’s entry,
check out Michael Corcoran’s
entry on page coalescing,
a technique to assure availability of
large-sized pages — which are in turn necessary to increase TLB reach.
And discussion of page_t’s naturally
brings you to
Rick Mesta’s
entry describing a
big performance win involving
these structures during boot.

    A less-discussed aspect of virtual memory is the virtual memory layout
    of the kernel itself. To learn about some of the complexities of this,
    check out
    Kit Chow’s entry
    on address space limitations on 32-bit kernels.
    The limitation that Kit describes is one of the nasty gotchas of running
    32-bit x86 in flat mode. As Kit mentions, the best workaround is to run
    a 64-bit kernel — but if you’re stuck with a 32-bit x86 chip, you’ll want
    to read Kit’s suggestions carefully. Kit’s entry is a good segue to
    Prakash Sangappa’s
    entry describing his work on
    dynamic segkp for 32-bit x86 systems. Prakash’s work was critical
    for getting some more breathing space on 32-bit x86 systems — saving hundreds
    of megabytes of precious VA. Of course, the ultimate breathing space is
    that afforded by 64 bits of VA — and in this vein check out
Nils Nieuwejaar’s
    entry on the kernel address space layout on x64. Both
    Prakash and Nils
    quote one of those comments in the kernel source code that you really need to
    know about if you’re going to do serious kernel development: the comment
    describing the address space layout in
i86pc/os/startup.c.
This comment is one of the canonical ASCII-art comments (more on these
eventually), and I usually find these comments in startup.c by
searching forward for “----”.

  • Linking and Loading. One of the most polished subsystems in Solaris
    is the linker and loader — the craftsmanship of the engineers that have
    built it has been an ongoing inspiration for many of us in Solaris
    development. To learn more about the linker,
    start with
    Rod Evans’ entry
    taking you on
    a source tour of the link-editors, and then head over to
    Mike Walker’s
    entry describing library bindings.
    As long as you’re checking out
    the linker, be sure to look at past entries like
Rod’s entry
on the tracing of a link-edit.
As you can imagine, because the
    dynamic linker is invoked whenever a dynamically-linked binary is executed,
    it’s a natural place to improve performance — especially with
    complicated programs like Mozilla or StarOffice that are linked
    to hundreds (!) of shared objects. We’ve certainly found some big wins
    in the linker over the years, but we’ve also discovered that it’s difficult
    to help megaprograms without hurting nanoprograms — and vice versa.
    For an interesting description of this tradeoff, check out
    David Safford’s
entry on dynamic
linker performance. If nothing else, you’ll see from David’s work
    the research element of operating system development: we often aren’t
    assured of success when we endeavor to improve the system.

  • Scheduling. CPU scheduling is one of the most basic properties
    of a multitasking operating system. Despite being an old problem,
    we find ourselves constantly improving and extending this subsystem.
    To learn about CPU scheduling, start with
    Bill Kucharski’s
    entry describing
architecture-specific elements of context switching. Then head
    over to
    Gavin Maltby’s
    entry describing
short-term prevention of thread migration. (Before Gavin introduced
    this facility, the only way to prevent migration was to prevent kernel
    preemption — an overly blunt mechanism that led to
a really nasty latency bubble
    that I debugged many years ago.)

    If you’re going to understand thread dispatching, you’ll need to understand
    the way thread state is manipulated — and for that you’ll want to look at
    Saurabh Mishra’s
entry describing
thread locks. Thread locks are different from normal synchronization primitives,
as you can infer from
my own entry describing
a bug in user-level
priority inheritance,
which is a good segue to a more general problem when dealing with
    thread control: how does one change the scheduling properties of a
    running thread?
    For an idea of how tricky this can be,
    check out
    Andrei Dorofeev’s
entry describing
the binding of processes to resource pools.
Andrei’s problem was even more challenging than traditional thread
    manipulation, as he needed to
    change the scheduling properties of a group of threads atomically.
    If for no other reason,
    you should read Andrei’s entry to learn of the
    “curse of disp.c.”
Speaking of the cursed, wrap up your tour of scheduling with
Eric Saxe’s entry describing
    a wedged kernel
    — you’ll see from Eric’s odyssey
    that scheduling problems can require a lot of brain-bending (and patience) to debug!

Okay, I think that’s enough for today — and yet it
barely scratches the surface! I didn’t even touch on gigantic
topics with many Opening Day entries
like security, networking, I/O, filesystems, performance,
service management, observability, etc. etc. Stay tuned — or check out
Opening Day entries
for yourself…
