Recently, Karim Yaghmour posted the following to the linux-kernel mailing list:
As I noted when discussing this with Andrew, we've been trying to get LTT into the kernel for the past five (5) years. During that time we've repeatedly encountered the same type of arguments for not including it, and have provided proof as to why those arguments are not substantiated. Lately I've at least got Andrew to admit that there were no maintenance issues with the LTT trace statements (given that they've literally remained unchanged ever since LTT was introduced.) In an effort to address the issues regarding the usefulness of such a tool, I direct those interested to this article on DTrace, a trace utility for Solaris: http://www.theregister.co.uk/2004/07/08/dtrace_user_take/ <rant> With LTT and DProbes, we've basically got almost everything this tool claims to provide, save that we would be even further down the road if we did not need to spend so much time updating patches ... </rant> Karim -- Author, Speaker, Developer, Consultant
Now, Karim’s really only interested in DTrace in that it helps him make his larger point that his project has been unfairly (or unwisely) denied entry into the Linux kernel.
His is a legitimate point, and something that is often lost in the assertions that Linux is developed faster than other operating systems: for all of its putative development speed, Linux has a surprising number of otherwise valuable projects that have been repeatedly denied entry for reasons that seem to be petty and non-technical. DProbes/LTT is certainly one example of such a project, and LKCD is probably another.
But what of Karim’s assertion that LTT and DProbes “basically [have] everything [DTrace] claims to provide”? This claim is false, and indicates that while Karim may have scanned The Register article, he didn’t bother to browse even our USENIX paper, let alone our documentation. From these, one will note that while LTT lacks many DTrace niceties, it also lacks several vital features. Two among these are aggregations and thread-local variables: two features that are not syntactic sugar or bolted-on afterthoughts, but rather are core to the DTrace architecture. These features turn out to be essential in using DTrace to quickly resolve problems. For an example of how these features are used, see Section 9 of our USENIX paper, and note that every script that we wrote to debug that problem used aggregations, and that several critical steps were only possible with thread-local variables.
And fortunately, you don’t even have to take my word for it: Red Hat developer Daniel Berrangé has posted a comparison of DTrace and DProbes/LTT that reaches roughly the same conclusions…
4 Responses
LTT hasn’t been unfairly denied entry. It’s been denied entry because it introduces lots of ugly, permanent, trace points (and earlier patches were also utterly vile; it’s a little better now). Also, Karim has not delivered the patch in the standard way (sets of small, obvious patches).
kprobes, if anything, is what’s been treated unfairly, though I’ll note that the first serious attempt to submit it happened during the 2.5 feature freeze.
The current LTT patch should be abandoned, and later re-worked to work completely on top of kprobes using nop-insn instrumentation techniques. It might stand a chance then.
I certainly didn’t intend to come across as defending Karim or the LTT patch — I’m sure there are plenty of good reasons for not accepting the patch. That said, Karim’s obviously right in that not having DProbes/LTT in Linux ends up costing the DProbes/LTT developers, as they spend much of their time issuing new patches instead of undertaking new development. But whether this cost is more or less than the cost would have been to Linux had DProbes/LTT been accepted is really not for me to judge — I’m much more concerned about making clear that DProbes/LTT lack features considered fundamental in DTrace.
I don’t know how to explain these advantages to those who don’t understand perlish languages like D. Would it be accurate to say that DTrace aggregation allows you to gather data from an arbitrary database of virtual tracepoints throughout the kernel? Do thread-local variables give you the ability to preserve parameters passed to the kernel from specific threads? (i.e. you don’t just know that some thread called read(), you know who called read() and with what parameters.)
DTrace aggregation allows you to collate data at the source — eliminating unnecessary processing. And thread-local variables certainly allow you to preserve parameters. Combining them allows you to (say) aggregate time for a read() based on file descriptor.
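To make that last example a bit more concrete, here is a rough Python model of the idea. It is not D, and it says nothing about how DTrace is actually implemented; `ReadTimer`, `SumAggregation`, and the hand-fed timestamps are all hypothetical names invented for illustration. The sketch only assumes two things from the discussion above: each thread can stash its own entry state (the thread-local variable), and elapsed time is folded into a per-key running sum at the point of measurement (the aggregation) rather than logged as raw events for later processing.

```python
import threading
from collections import defaultdict


class SumAggregation:
    """Hypothetical stand-in for an aggregation: a keyed running sum."""

    def __init__(self):
        self._lock = threading.Lock()
        self._totals = defaultdict(int)

    def add(self, key, value):
        # Collate at the source: only the running total is kept,
        # never the individual events.
        with self._lock:
            self._totals[key] += value

    def snapshot(self):
        with self._lock:
            return dict(self._totals)


class ReadTimer:
    """Models timing read() per file descriptor (illustrative only)."""

    def __init__(self):
        # Each thread sees its own fd/start_ns, like a thread-local variable.
        self._tls = threading.local()
        self.time_by_fd = SumAggregation()

    def on_entry(self, fd, now_ns):
        # Preserve the parameter and the entry time for this thread only.
        self._tls.fd = fd
        self._tls.start_ns = now_ns

    def on_return(self, now_ns):
        start = getattr(self._tls, "start_ns", None)
        if start is None:
            return  # a return with no matching entry on this thread
        self.time_by_fd.add(self._tls.fd, now_ns - start)
        self._tls.start_ns = None


if __name__ == "__main__":
    timer = ReadTimer()

    def worker(fd, start_ns, end_ns):
        timer.on_entry(fd, start_ns)
        timer.on_return(end_ns)

    # Made-up (fd, entry time, return time) triples, one per thread.
    calls = [(3, 100, 250), (3, 400, 450), (5, 0, 1000)]
    threads = [threading.Thread(target=worker, args=c) for c in calls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(sorted(timer.time_by_fd.snapshot().items()))
    # prints [(3, 200), (5, 1000)]
```

The point of the model is that neither piece works well alone: without per-thread state, concurrent read() calls would overwrite each other's entry timestamps, and without the keyed sum you would be left post-processing a stream of individual events.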