[htdig3-dev] More gprof results


Subject: [htdig3-dev] More gprof results
From: loic@ceic.com
Date: Mon Feb 28 2000 - 04:28:26 PST


Geoff Hutchison writes:
> Anyway, here are the gprof results. The attachment has the full
> results and here are the top 25 calls. Is there anything we can do
> about making some of these faster? I haven't looked at the WordKey
> code, but it's being called a *lot*, which is causing a big slowdown.
> The slowdown in Retriever doesn't even look very important from these
> results. (I'm also amazed by the sheer number of calls--2 billion?)
>
> -Geoff
>
> Each sample counts as 0.01 seconds.
> % cumulative self self total
> time seconds seconds calls ms/call ms/call name
> 11.49 125.99 125.99 2046356275 0.00 0.00 WordKey::Info(void)
> 10.80 244.43 118.44 2420354574 0.00 0.00
> WordKeyInfo::Instance(void)
> 10.20 356.27 111.84 1613865799 0.00 0.00 WordKey::NFields(void)
> 6.84 431.24 74.97 461307291 0.00 0.00
> WordKey::Set(int, unsigned int)
> 5.29 489.28 58.04 99298535 0.00 0.00 WordKey::Clear(void)
> 5.27 547.08 57.80 30122303 0.00 0.00 Object::~Object(void)
> 4.27 593.93 46.85 25878449 0.00 0.01
> WordKey::CopyFrom(WordKey const &)

...

 I was first amazed to see this. It looks really bad. In a way it is.
The NFields() method is used by the compression without restriction or
cache, hence the huge number of calls. However, NFields() and Info() are
inlined and only show when compiling with -g -p. Even inlined, given the
number of calls, a cache of the result is more than necessary.

 I've added a cache for the NFields() method and checked with the following
(profiling the t_htdig test):

before:

Each sample counts as 0.01 seconds.
  % cumulative self self total
 time seconds seconds calls us/call us/call name
 11.43 0.04 0.04 __bam_search
 11.43 0.08 0.04 memp_fget
  8.57 0.11 0.03 __bam_cmp
  5.71 0.13 0.02 112949 0.18 0.18 WordKey::Set(int, unsigned int)
  5.71 0.15 0.02 24381 0.82 1.96 WordKey::CopyFrom(WordKey const &)
  5.71 0.17 0.02 Dbt::Dbt(void *, unsigned int)
  5.71 0.19 0.02 __bam_c_get
  5.71 0.21 0.02 memp_fset
  2.86 0.22 0.01 766407 0.01 0.01 WordKeyInfo::Instance(void)
  2.86 0.23 0.01 499098 0.02 0.04 WordKey::Info(void)
  2.86 0.24 0.01 418869 0.02 0.07 WordKey::NFields(void)
  2.86 0.25 0.01 41513 0.24 0.62 WordKey::Clear(void)
  2.86 0.26 0.01 34738 0.29 0.29 Object::~Object(void)

after:

Each sample counts as 0.01 seconds.
  % cumulative self self total
 time seconds seconds calls us/call us/call name
 10.26 0.04 0.04 45146 0.89 0.89 stat
 10.26 0.08 0.04 __bam_search
  7.69 0.11 0.03 __bam_c_get
  7.69 0.14 0.03 __bam_c_put
  7.69 0.17 0.03 memp_fget
  5.13 0.19 0.02 83026 0.24 0.24 WordKeyInfo::Instance(void)
  5.13 0.21 0.02 5333 3.75 17.02 Retriever::got_word(char const *, int, int)
  5.13 0.23 0.02 7 2857.14 15820.53 HTML::parse(Retriever &, URL &)
  5.13 0.25 0.02 __bam_defpfx
  5.13 0.27 0.02 __db_rdonly
  2.56 0.28 0.01 282611 0.04 0.04 WordKey::Clear(void)
  2.56 0.29 0.01 76884 0.13 0.13 WordKey::WordKey(WordKey const &)
  2.56 0.30 0.01 22045 0.45 0.45 WordKey::Info(void)

 And WordKey::NFields does not even show in the list of functions reported
by gprof. I will try to run profiling without -g to see how it performs,
this will use inlining and provide a basis for comparison.

 Thanks a lot for doing this profiling and apologies for this performance
bottleneck.

 Cheers,

-- 
		Loic Dachary

24 av Secretan 75019 Paris Tel: 33 1 42 45 09 16 e-mail: loic@dachary.org URL: http://www.senga.org/

------------------------------------ To unsubscribe from the htdig3-dev mailing list, send a message to htdig3-dev-unsubscribe@htdig.org You will receive a message to confirm this.



This archive was generated by hypermail 2b28 : Mon Feb 28 2000 - 05:33:01 PST