The Observation Deck

Search
Close this search box.

Month: August 2005

It’s been an exciting few weeks for
DTrace.
The party got started with
Wez Furlong’s new
PHP
DTrace provider
at OSCON. Then
Devon O’Dell
announced that he was starting to work in earnest on a
DTrace
port to FreeBSD
. And now,
Rich Lowe
has made available a prototype
Ruby
DTrace provider
.
To install this, grab
Ruby 1.8.2,
apply
Rich’s
patch
, and run ./configure with
the --enable-dtrace option.
When you run the resulting ruby, you’ll see two probes:
function-entry and
function-return.
The arguments to these probes are as follows:

  • arg0 is the name of the class (a pointer to a string within Ruby)

  • arg1 is the name of the method (also a pointer to
    a string within Ruby)

  • arg2 is the name of the file containing the call site (again,
    a pointer to a string within Ruby)

  • arg3 is the line number of the call site.

So if, for example, you’d like to know the classes and methods that
are called in a particular Ruby script, you could do it with this
simple D script:

#pragma D option quiet
ruby$target:::function-entry
{
@[copyinstr(arg0), copyinstr(arg1)] = count();
}
END
{
printf("%15s %30s   %s\n", "CLASS", "METHOD", "COUNT");
printa("%15s %30s   %@d\n", @);
}

To run this against the cal.rb that ships in the sample
directory of Ruby, call the above script whatmethods.d and
run it this way:

# dtrace -s ./whatmethods.d -c "../ruby ./cal.rb"
August 2005
S  M Tu  W Th  F  S
1  2  3  4  5  6
7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
CLASS                         METHOD   COUNT
Array                             <=   2
Hash                           each   2
Hash                           keys   2
Module                           attr   2
Module               method_undefined   2
Module                         public   2
Rational                         coerce   2
Array                              +   3
Class                    civil_to_jd   3
Hash                             []   3
Object                        collect   3
Array                        collect   4
Class                      inherited   4
Range                           each   4
String                           size   4
Module           private_class_method   5
Object                           eval   5
Object                        require   5
String                           gsub   5
Class                     jd_to_wday   7
Class                           once   7
Date                       __8713__   7
Date                           wday   7
Fixnum                              %   8
Array                           join   10
Hash                            []=   10
String                              +   10
Array                           each   11
NilClass                           to_s   11
Module                   alias_method   22
Module                        private   22
Symbol                           to_s   26
Module                    module_eval   28
Date                           mday   31
Object                           send   31
Date                            mon   42
Date                      __11105__   43
Class                    jd_to_civil   45
Date                           succ   47
Class                            os?   48
Date                              +   49
Fixnum                             <=   49
String                          rjust   49
Class                      ajd_to_jd   50
Class                        clfloor   50
Date                      __10417__   50
Integer                           to_i   50
Object                        Integer   50
Rational                         divmod   50
Rational                           to_i   50
Date                               51
Date                            ajd   51
Date                             jd   51
Rational                               51
Class                           new0   52
Date                     initialize   52
Integer                           to_r   52
Object         singleton_method_added   67
Date                          civil   75
Symbol                           to_i   91
Float                              *   96
Float                         coerce   96
Fixnum                              /   97
Object                        frozen?   100
Rational                              -   104
Fixnum                           to_s   123
Array                             []   141
Object                          class   150
Module                   method_added   154
Float                              /   186
Module                            ===   200
Rational                              /   204
Rational                              +   248
Float                          floor   282
Fixnum                             <<   306
Class                         reduce   356
Integer                            gcd   356
Object                       Rational   356
Fixnum                              +   414
Class                           new!   610
Rational                     initialize   610
Class                            new   612
Fixnum                            abs   712
Fixnum                            div   762
Fixnum                              *   1046
Fixnum                                 1970
Fixnum                              -   2398
Object                       kind_of?   2439
Fixnum                             >>   4698
Fixnum                             []   7689
Fixnum                             ==   11436

This may leave us with many questions. For example, there are a couple
of calls to construct new objects — where are they coming from? To
answer that question:

#pragma D option quiet
ruby$target:::function-entry
/copyinstr(arg1) == "initialize"/
{
@[copyinstr(arg0), copyinstr(arg2), arg3] = count();
}
END
{
printf("%-10s %-40s %-10s %s\n", "CLASS",
"INITIALIZED IN FILE", "AT LINE", "COUNT");
printa("%-10s %-40s %-10d %@d\n", @);
}

Calling the above whereinit.d, we can run it in a similar
manner:

# dtrace -s ./whereinit.d -c "../ruby ./cal.rb"
August 2005
S  M Tu  W Th  F  S
1  2  3  4  5  6
7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
CLASS      INITIALIZED IN FILE                      AT LINE    COUNT
Cal        ./cal.rb                                 144        1
Date       /usr/local/lib/ruby/1.8/date.rb          593        1
Date       /usr/local/lib/ruby/1.8/date.rb          703        1
Date       /usr/local/lib/ruby/1.8/date.rb          916        1
Time       /usr/local/lib/ruby/1.8/date.rb          702        1
Date       /usr/local/lib/ruby/1.8/date.rb          901        49
Rational   /usr/local/lib/ruby/1.8/rational.rb      374        610

Looking at the Date class, it’s interesting to look at line 901 of
file /usr/local/lib/ruby/1.8/date.rb:

897   # If +n+ is not a Numeric, a TypeError will be thrown.  In
898   # particular, two Dates cannot be added to each other.
899   def + (n)
900     case n
   901     when Numeric; return self.class.new0(@ajd + n, @of, @sg)
902     end
903     raise TypeError, 'expected numeric'
904   end

This makes sense: we’re initializing new Date
instances in the +
method for Date. And where are those coming from?
It’s not hard to build a script that will tell us the file and line
for the call site of an arbitrary class and method:

#pragma D option quiet
ruby$target:::function-entry
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
@[copyinstr(arg2), arg3] = count();
}
END
{
printf("%-40s %-10s %s\n", "FILE", "LINE", "COUNT");
printa("%-40s %-10d %@d\n", @);
}

For this particular example (Date#+()), call the above
wherecall.d and run it this way:

# dtrace -s ./wherecall.d "Date" "+" -c "../ruby ./cal.rb"
August 2005
S  M Tu  W Th  F  S
1  2  3  4  5  6
7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
FILE                                     LINE       COUNT
./cal.rb                                 102        2
./cal.rb                                 60         6
./cal.rb                                 63         41

And looking at the indicated lines in cal.rb:

55   def pict(y, m)
56     d = (1..31).detect{|d| Date.valid_date?(y, m, d, @start)}
57     fi = Date.new(y, m, d, @start)
58     fi -= (fi.jd - @k + 1) % 7
59
    60     ve  = (fi..fi +  6).collect{|cu|
61       %w(S M Tu W Th F S)[cu.wday]
62     }
    63     ve += (fi..fi + 41).collect{|cu|
64       if cu.mon == m then cu.send(@da) end.to_s
65     }
66

So this is doing exactly what we would expect, given the code.
Now, if we were interested in making this perform a little better,
we might be interested to know the work that is being induced
by Date#+(). Here’s a script that reports the classes and
methods called by a given class/method — and its callees:

#pragma D option quiet
ruby$target:::function-entry
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
self->date = 1;
}
ruby$target:::function-entry
/self->date/
{
@[strjoin(strjoin(copyinstr(arg0), "#"),
copyinstr(arg1))] = count();
}
ruby$target:::function-return
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
self->date = 0;
ndates++;
}
END
{
normalize(@, ndates);
printf("Each call to %s#%s() induced:\n\n", $$1, $$2);
printa("%@8d call(s) to %s()\n", @);
}

Calling the above whatcalls.d, we can answer the question
about Date#+() this way:

# dtrace -s ./whatcalls.d "Date" "+" -c "../ruby ./cal.rb"
August 2005
S  M Tu  W Th  F  S
1  2  3  4  5  6
7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
Each call to Date#+() induced:
1 call(s) to Class#new0()
1 call(s) to Class#reduce()
1 call(s) to Date#+()
1 call(s) to Date#initialize()
1 call(s) to Fixnum#+()
1 call(s) to Fixnum#<<()
1 call(s) to Integer#gcd()
1 call(s) to Module#===()
1 call(s) to Object#Rational()
1 call(s) to Object#class()
2 call(s) to Class#new()
2 call(s) to Class#new!()
2 call(s) to Fixnum#abs()
2 call(s) to Fixnum#div()
2 call(s) to Rational#+()
2 call(s) to Rational#initialize()
3 call(s) to Fixnum#*()
3 call(s) to Fixnum#()
23 call(s) to Fixnum#>>()
37 call(s) to Fixnum#[]()
52 call(s) to Fixnum#==()

That’s a lot of work for something that should be pretty simple! Indeed,
it’s counterintuitive that, say, Integer#gcd() would be called
from Date#+() — and it certainly seems suboptimal. I’ll leave
further exploration into this as an exercise to the reader, but suffice
it to say that this has to do with the use of a rational number in the
Date class —
the elimination of which would elminate most of the above calls and
presumably greatly improve the performance of Date#+().

Now, Ruby aficionados may note that some of the above functionality
has been available in Ruby by setting the
set_trace_func function (upon which the
Ruby profiler
is implemented). While that’s true (to a point — the set_trace_func
seems to be a pretty limited mechanism),
the Ruby DTrace provider is nonetheless a great lurch forward for
Ruby developers: it allows developers to use DTrace-specific constructs
like aggregations and thread-local variables to hone in on a problem;
it allows Ruby-related work performed lower in the stack (e.g.,
in the I/O subsystem, CPU dispatcher or network stack)
to be connected to the Ruby
code inducing it; it allows a running Ruby script to be
instrumented (and reinstrumented) without stopping or restarting it; and
it allows multiple, disjoint Ruby scripts to be coherently observed and
understood as a single entity. More succinctly, it’s just damned cool.
So thanks to Rich for developing the prototype provider — and if you’re
a Ruby developer, enjoy!

One of the exciting things about OpenSolaris is that it’s released under a license — the CDDL — that allows ports of individual components to other systems. In particular, at my OSCON presentation two weeks ago, I discussed some of the expertise required to port one such component, DTrace, to another system. I’m happy to now report that Devon O’Dell has started working on a port to one such system, FreeBSD. This has been talked about before (in some cases, with braggadocio), but Devon is the first to start the work in earnest. And indeed, work it will be: DTrace isn’t a simple system, and it has several dependencies on other, Solaris-specific system components. That said, it should certainly be possible, and we on Team DTrace are available to help out in any way we can. So if you’re interested in working on this, you should ping Devon — I know that he’ll welcome the help. And if you have specific questions about DTrace internals (or anything, for that matter), swing by #opensolaris and join the party!

Today,
Mike,
Adam
and I had lunch with
Jeff Waugh, who is in town for LinuxWorld.
He showed us his, we showed him ours, and a great time was had by all.
I think everyone agreed that
a system with Debian packages,
Ubuntu package management,
the Solaris Service Management Facility
and (of course) DTrace would be one hell of a system.
And with OpenSolaris, this paradise seems tantalizingly attainable — or
does original sin
prevent us from reentering the garden?

If you were at my presentation at OSCON yesterday, I apologize for the brisk pace — there was a ton of material to cover, and forty-five minutes isn’t much time. Now that I’ve got a little more time, I’d like to get into the details of the PHP demonstration that I did during the presentation. Here is the (very) simple PHP program that I was using:

<?php
function blastoff()
{
        echo "Blastoff!\\n\\n";
}

function one()
{
        echo "One...\\n";
        blastoff();
}

function two()
{
        echo "Two...\\n";
        one();
}

function three()
{
        echo "Three...\\n";
        two();
}

function launch()
{
        three();
}

while (1)
{
        launch();
        sleep(1);
}
?>

Running this in a window just repeats this output:

% php ./blastoff.php
php ./blastoff.php
Content-type: text/html
X-Powered-By: PHP/5.1.0-dev

Three...
Two...
One...
Blastoff!

Three...
Two...
One...
Blastoff!
...

Now, because I have specified Wez Furlong’s new dtrace.so as an extension in my php.ini file, when I run the above, two probes show up:

# dtrace -l | grep php
4   php806       dtrace.so          php_dtrace_execute function-return
5   php806       dtrace.so          php_dtrace_execute function-entry

The function-entry and function-return probes have three arguments:

  • arg0 is the name of the function (a pointer to a string within PHP)
  • arg1 is the name of the file containing the call site (also a pointer to a string within PHP)
  • arg2 is the line number of the call site a pointer to a string within PHP)

So for starters, let’s just get an idea of which functions are being called in my PHP program:

# dtrace -n function-entry'{printf("called %s() in %s at line %d\n", \
copyinstr(arg0), copyinstr(arg1), arg2)}' -q
called launch() in /export/php/blastoff.php at line 32
called three() in /export/php/blastoff.php at line 27
called two() in /export/php/blastoff.php at line 22
called one() in /export/php/blastoff.php at line 16
called blastoff() in /export/php/blastoff.php at line 10
called launch() in /export/php/blastoff.php at line 32
called three() in /export/php/blastoff.php at line 27
called two() in /export/php/blastoff.php at line 22
called one() in /export/php/blastoff.php at line 16
called blastoff() in /export/php/blastoff.php at line 10
^C

If you’re new to DTrace, note that you have the power to trace an arbitrary D expression in your action. For example, instead of printing out the file and line number of the call site, we could trace the walltimestamp:

# dtrace -n function-entry'{printf("called %s() at %Y\n", \
copyinstr(arg0), walltimestamp)}' -q
called launch() at 2005 Aug  5 08:08:24
called three() at 2005 Aug  5 08:08:24
called two() at 2005 Aug  5 08:08:24
called one() at 2005 Aug  5 08:08:24
called blastoff() at 2005 Aug  5 08:08:24
called launch() at 2005 Aug  5 08:08:25
called three() at 2005 Aug  5 08:08:25
called two() at 2005 Aug  5 08:08:25
called one() at 2005 Aug  5 08:08:25
called blastoff() at 2005 Aug  5 08:08:25
^C

Note, too, that (unless I direct it not to) this will aggregate across PHP instances. So, for example:

#!/usr/sbin/dtrace -s

#pragma D option quiet

php*:::function-entry
{
        @bypid[pid] = count();
        @byfunc[copyinstr(arg0)] = count();
        @bypidandfunc[pid, copyinstr(arg0)] = count();
}

END
{
        printf("By pid:\n\n");
        printa("  %-40d %@d\n", @bypid);

        printf("\nBy function:\n\n");
        printa("  %-40s %@d\n", @byfunc);

        printf("\nBy pid and function:\n\n");
        printa("  %-9d %-30s %@d\n", @bypidandfunc);
}

If I run three instances of blastoff.php and then the above script, I see the following after I ^C:

By pid:

  806                                      30
  875                                      30
  889                                      30

By function:

  launch                                   18
  three                                    18
  two                                      18
  one                                      18
  blastoff                                 18

By pid and function:

  875       two                            6
  875       three                          6
  875       launch                         6
  875       blastoff                       6
  889       blastoff                       6
  806       launch                         6
  889       one                            6
  806       three                          6
  889       two                            6
  806       two                            6
  889       three                          6
  806       one                            6
  889       launch                         6
  806       blastoff                       6
  875       one                            6

The point is that DTrace allows you to aggregate across PHP instances, allowing you to understand not just what a particular PHP is doing, but what PHP is doing more generally.

If we’re interested in better understanding the code flow in a particular PHP instance, we can write a script that uses a thread-local variable to follow the code flow:

#pragma D option quiet

self int indent;

php$target:::function-entry
/copyinstr(arg0) == "launch"/
{
        self->follow = 1;
}

php$target:::function-entry
/self->follow/
{
        printf("%*s", self->indent, "");
        printf("-> %s\n", copyinstr(arg0));
        self->indent += 2;
}

php$target:::function-return
/self->follow/
{
        self->indent -= 2;
        printf("%*s", self->indent, "");
        printf("follow = 0;
        exit(0);
}

Running the above requires giving the script a particular PHP process; assuming the above script is named blast.d, it might look like this:

# dtrace -s ./blast.d -p `pgrep -n php`
-> launch
  -> three
    -> two
      -> one
        -> blastoff
        <- blastoff
      <- one
    <- two
  <- three
<- launch

This shows us all of the PHP functions that were called from launch(), but it doesn’t tell the full story — we know that our PHP functions are calling into native code to do some of their work. To add this to the mix, we’ll write a slightly longer script:

#pragma D option quiet

self int indent;

php$target:::function-entry
/copyinstr(arg0) == "launch"/
{
        self->follow = 1;
}

php$target:::function-entry
/self->follow/
{
        printf("%\*s", self->indent, "");
        printf("-> %-20s %\*sphp\\n", copyinstr(arg0),
            46 - self->indent, "");
        self->indent += 2;
}

php$target:::function-return
/self->follow/
{
        self->indent -= 2;
        printf("%\*s", self->indent, "");
        printf("indent, "");
}

pid$target:libc.so.1::entry
/self->follow/
{
        printf("%\*s", self->indent, "");
        printf("-> %-20s %\*s%s\\n", probefunc, 46 - self->indent, "",
            probemod);
        self->indent += 2;
}

pid$target:libc.so.1::return
/self->follow/
{
        self->indent -= 2;
        printf("%\*s", self->indent, "");
        printf("indent, "",
            probemod);
}

php$target:::function-return
/copyinstr(arg0) == "launch"/
{
        self->follow = 0;
        exit(0);
}

Running the above yields the following output:

-> launch                                          php
  -> three                                         php
    -> write                                       libc.so.1
      -> _save_nv_regs                             libc.so.1
       _write                                    libc.so.1
      <- _write                                    libc.so.1
     two                                         php
      -> write                                     libc.so.1
        -> _save_nv_regs                           libc.so.1
         _write                                  libc.so.1
        <- _write                                  libc.so.1
       one                                       php
        -> write                                   libc.so.1
          -> _save_nv_regs                         libc.so.1
           _write                                libc.so.1
          <- _write                                libc.so.1
         blastoff                                php
          -> write                                 libc.so.1
            -> _save_nv_regs                       libc.so.1
             _write                              libc.so.1
            <- _write                              libc.so.1
          <- write                                 libc.so.1
        <- blastoff                                php
      <- one                                       php
    <- two                                         php
  <- three                                         php
<- launch                                          php

This shows us what’s really going on — or at least more of what’s really going on: our PHP functions are inducing work from libc. This is much more information, but of course, we’re still only seeing what’s going on at user-level. To add the kernel into the mix, add the following to the script:

fbt:genunix::entry
/self->follow/
{
        printf("%\*s", self->indent, "");
        printf("-> %-20s %\*skernel\\n", probefunc, 46 - self->indent, "");
        self->indent += 2;
}

fbt:genunix::return
/self->follow/
{
        self->indent -= 2;
        printf("%\*s", self->indent, "");
        printf("indent, "");
}

Running this new script generates much more output:

-> launch                                          php
  -> three                                         php
    -> write                                       libc.so.1
      -> _save_nv_regs                             libc.so.1
       _write                                    libc.so.1
        -> syscall_mstate                          kernel
          -> gethrtime_unscaled                    kernel
          <- gethrtime_unscaled                    kernel
         syscall_entry                           kernel
          -> pre_syscall                           kernel
           copyin_args32                         kernel
            -> copyin_nowatch                      kernel
              -> watch_disable_addr                kernel
              <- watch_disable_addr                kernel
            <- copyin_nowatch                      kernel
          <- copyin_args32                         kernel
         write32                                 kernel
          -> write                                 kernel
            -> getf                                kernel
              -> set_active_fd                     kernel
              <- set_active_fd                     kernel
             releasef                            kernel
              -> cv_broadcast                      kernel
              <- cv_broadcast                      kernel
            <- releasef                            kernel
          <- write                                 kernel
         syscall_mstate                          kernel
          -> gethrtime_unscaled                    kernel
          <- gethrtime_unscaled                    kernel
        <- syscall_mstate                          kernel
      <- _write                                    libc.so.1
     two                                         php
      -> write                                     libc.so.1
        -> _save_nv_regs                           libc.so.1
         _write                                  libc.so.1
          -> syscall_mstate                        kernel
            -> gethrtime_unscaled                  kernel
            <- gethrtime_unscaled                  kernel
          <- syscall_mstate                        kernel
          ...

Now we’re seeing everything that our PHP program is doing, from the PHP through the native code, into the kernel and back. So using DTrace on PHP has a number of unique advantages: you can look at the entire system, and you can look at the entire stack — and you can do it all in production. Thanks again to Wez for putting together the PHP provider — and if you’re a PHP developer, bon appetit!

Tonight during our OpenSolaris BOF at OSCON, PHP core developer Wez Furlong was busy adding a DTrace provider to PHP. After a little bit of work (and a little bit of debugging), we got it working — and damn is it cool. Wez implemented it as a shared object, which may then be loaded via an explicit extension directive in php.ini. Once loaded, two probes show up: function-entry and function-return. These probes have as their arguments a pointer to the function name, a pointer to the file name, and a line number. This allows one to, for example, get a count of all PHP functions being called:

# dtrace -n function-entry'{@[copyinstr(arg0)] = count()}'

Or you can aggregate on file name and quantize by line number:

# dtrace -n function-entry'{@[copyinstr(arg1)] = lquantize(arg2, 0, 5000)}'

Or you can determine the amount of wall time taken by a given PHP function:

# dtrace -n function-entry'/copyinstr(arg0) == "myfunc"/{self->ts = timestamp}'
-n function-return'/self->ts/{@ = avg(timestamp - self->ts); self->ts = 0)}'

And because it’s DTrace, this can all be done on a production box — and without regard to the number of PHP processes. (So if you have 200 Apache processes handling PHP, the above invocations would aggregate across them.) When I get back, I’ll download Wez’s provider and post some more comprehensive examples. In the meantime, if you’re a PHP developer at OSCON, stop Wez if you see him and ask him to give you a demo — it’s the kind of thing that needs to be seen to be appreciated…

Finally, if you’re interested in adding your own DTrace provider to the application, language or system that you care about, be sure to check out my presentation on DTrace tomorrow at 4:30 in Portland room 255. (Hopefully this time I won’t be tortured by memories of being mindfucked by Inside UFO 54-40.)

If you’re at OSCON and you’re interested in an after-party with — I’m told — “premium drinks”, swing by the OpenSolaris suite (room 1560 at the Doubletree) from 10pm on. If for nothing else, you’ll want to be there in the early morning hours when Keith throws the furniture out the window in a drunken rage against gcc’s SPARC backend.

Recent Posts

November 26, 2023
November 18, 2023
November 27, 2022
October 11, 2020
July 31, 2019
December 16, 2018
September 18, 2018
December 21, 2016
September 30, 2016
September 26, 2016
September 13, 2016
July 29, 2016
December 17, 2015
September 16, 2015
January 6, 2015
November 10, 2013
September 3, 2013
June 7, 2012
September 15, 2011
August 15, 2011
March 9, 2011
September 24, 2010
August 11, 2010
July 30, 2010
July 25, 2010
March 10, 2010
November 26, 2009
February 19, 2009
February 2, 2009
November 10, 2008
November 3, 2008
September 3, 2008
July 18, 2008
June 30, 2008
May 31, 2008
March 16, 2008
December 18, 2007
December 5, 2007
November 11, 2007
November 8, 2007
September 6, 2007
August 21, 2007
August 2, 2007
July 11, 2007
May 20, 2007
March 19, 2007
October 12, 2006
August 17, 2006
August 7, 2006
May 1, 2006
December 13, 2005
November 16, 2005
September 13, 2005
September 9, 2005
August 21, 2005
August 16, 2005

Archives