.. highlight:: c++

.. feed_date:: 2011-01-08 20:47

.. summary::

    #include <valgrind/callgrind.h>
    CALLGRIND_START_INSTRUMENTATION;
    //treatment
    CALLGRIND_STOP_INSTRUMENTATION;
    CALLGRIND_DUMP_STATS;

    valgrind --tool=callgrind --instr-atstart=no ./my_soft

How to profile only a part of the code using callgrind
======================================================

Callgrind profiles an application. With kcachegrind, it's easy to get a picture of your application's CPU consumption.

.. NOTE::

    Callgrind records only CPU consumption; if your application is slowed down by I/O, Callgrind won't help you.

Code with a long initialisation
-------------------------------

Sometimes it's convenient to profile only a small part of your code. For example, what if you have a long initialisation phase?

Let's take a small dummy example (code follows). Let's say I'm interested in the CPU consumption of the ``f1`` call.

::

    #include <fstream>
    #include <iostream>

    using namespace std;

    // long initialisation will produce a big tmp.txt file filled with numbers
    void longInitialisation()
    {
        ofstream fout( "tmp.txt", ios_base::out | ios_base::trunc );
        for (int i = 0; i < 100000; ++i)
        {
            fout << i << endl;
        }
    }

    // no inline so I see it in callgrind report
    __attribute__((noinline)) int f2(int i)
    {
        return i*2;
    }

    __attribute__((noinline)) int f1(int seed)
    {
        int res = 0;
        for (int i = 0; i < 100; ++i)
        {
            res += f2(i) * seed;
        }
        return res;
    }

    int main(int argc, char * argv[])
    {
        longInitialisation();
        int a = f1(argc); //take argc as param to prevent optimisations
        cout << a << endl;
        return 0;
    }

When running this example with callgrind:

.. highlights::

    valgrind --tool=callgrind ./my_example

The output mainly shows the cost of my ``longInitialisation`` function:

.. image:: img/callgrind_piece_full.png

Only showing part of interest
-----------------------------

Callgrind provides a nice feature: the application itself can ask the instrumentation to start and stop.
The code becomes:

::

    #include <fstream>
    #include <iostream>
    #include <valgrind/callgrind.h>

    using namespace std;

    void longInitialisation()
    {
        ofstream fout( "tmp.txt", ios_base::out | ios_base::trunc );
        for (int i = 0; i < 100000; ++i)
        {
            fout << i << endl;
        }
    }

    __attribute__((noinline)) int f2(int i)
    {
        return i*2;
    }

    __attribute__((noinline)) int f1(int seed)
    {
        int res = 0;
        for (int i = 0; i < 100; ++i)
        {
            res += f2(i) * seed;
        }
        return res;
    }

    int main(int argc, char * argv[])
    {
        longInitialisation();
        CALLGRIND_START_INSTRUMENTATION;  // start recording here
        int a = f1(argc);
        CALLGRIND_STOP_INSTRUMENTATION;   // stop recording
        CALLGRIND_DUMP_STATS;             // write out what was collected
        cout << a << endl;
        return 0;
    }

Three macros defined in the ``callgrind.h`` header are used to control the callgrind instrumentation: one starts it, one stops it, and one dumps the collected statistics.

Run callgrind once again (notice the ``--instr-atstart=no`` option, which asks the instrumentation not to begin at the start of the program):

.. highlights::

    valgrind --tool=callgrind --instr-atstart=no ./my_example

And the result is much cleaner:

.. image:: img/callgrind_piece_only_f11.png
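
If several code paths need this treatment, the start/stop pair can be wrapped in a small scope guard so the stop and the dump are never forgotten, even on an early return. The sketch below is only one way to do it: ``ScopedInstrumentation``, ``profiledF1`` and ``HAVE_CALLGRIND`` are names made up for this example and are not part of Valgrind. It also assumes a compiler that supports ``__has_include`` (standard since C++17), and falls back to empty macros when the Valgrind headers are not installed, so the program still builds.

```cpp
// Use the real Callgrind macros when the Valgrind headers are installed.
// HAVE_CALLGRIND is a hypothetical name chosen for this sketch.
#ifdef __has_include
#  if __has_include(<valgrind/callgrind.h>)
#    include <valgrind/callgrind.h>
#    define HAVE_CALLGRIND 1
#  endif
#endif

// Assumption: without the headers, replace the three macros with no-ops
// so the code still compiles (nothing is profiled in that case).
#ifndef HAVE_CALLGRIND
#  define CALLGRIND_START_INSTRUMENTATION do {} while (0)
#  define CALLGRIND_STOP_INSTRUMENTATION  do {} while (0)
#  define CALLGRIND_DUMP_STATS            do {} while (0)
#endif

// Hypothetical helper (not part of Valgrind): instruments exactly one scope.
class ScopedInstrumentation
{
public:
    ScopedInstrumentation() { CALLGRIND_START_INSTRUMENTATION; }

    ~ScopedInstrumentation()
    {
        CALLGRIND_STOP_INSTRUMENTATION;
        CALLGRIND_DUMP_STATS;
    }

    // Copying a guard would stop the instrumentation twice, so forbid it.
    ScopedInstrumentation(const ScopedInstrumentation&) = delete;
    ScopedInstrumentation& operator=(const ScopedInstrumentation&) = delete;
};

// Same dummy functions as in the article.
__attribute__((noinline)) int f2(int i)
{
    return i * 2;
}

__attribute__((noinline)) int f1(int seed)
{
    int res = 0;
    for (int i = 0; i < 100; ++i)
    {
        res += f2(i) * seed;
    }
    return res;
}

// Hypothetical wrapper: only the f1 call is instrumented.
int profiledF1(int seed)
{
    ScopedInstrumentation guard; // starts here, stops and dumps at scope exit
    return f1(seed);
}
```

In ``main``, calling ``profiledF1(argc)`` after ``longInitialisation()`` would play the same role as the three macros above, and the program is still run with ``--instr-atstart=no``. Outside of valgrind the macros do nothing, so the guard does not change the program's behaviour.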