Reference documentation for deal.II version 9.1.0-pre
TimerOutput Class Reference

#include <deal.II/base/timer.h>

Classes

class  Scope
 
struct  Section
 

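Public Types

enum  OutputFrequency { every_call, summary, every_call_and_summary, never }
 
enum  OutputData { total_cpu_time, total_wall_time, n_calls }
 
enum  OutputType { cpu_times, wall_times, cpu_and_wall_times, cpu_and_wall_times_grouped }
 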

Public Member Functions

 TimerOutput (std::ostream &stream, const OutputFrequency output_frequency, const OutputType output_type)
 
 TimerOutput (ConditionalOStream &stream, const OutputFrequency output_frequency, const OutputType output_type)
 
 TimerOutput (MPI_Comm mpi_comm, std::ostream &stream, const OutputFrequency output_frequency, const OutputType output_type)
 
 TimerOutput (MPI_Comm mpi_comm, ConditionalOStream &stream, const OutputFrequency output_frequency, const OutputType output_type)
 
 ~TimerOutput ()
 
void enter_subsection (const std::string &section_name)
 
void enter_section (const std::string &section_name)
 
void leave_subsection (const std::string &section_name=std::string())
 
void exit_section (const std::string &section_name=std::string())
 
std::map< std::string, double > get_summary_data (const OutputData kind) const
 
void print_summary () const
 
void disable_output ()
 
void enable_output ()
 
void reset ()
 

Private Attributes

OutputFrequency output_frequency
 
OutputType output_type
 
Timer timer_all
 
std::map< std::string, Section > sections
 
ConditionalOStream out_stream
 
bool output_is_enabled
 
std::list< std::string > active_sections
 
MPI_Comm mpi_communicator
 
Threads::Mutex mutex
 

Detailed Description

This class can be used to generate formatted output from time measurements of different subsections in a program. It is possible to define several sections, each covering a certain part of the program. A section can be entered several times, in which case its times are accumulated. By changing the options in OutputFrequency and OutputType, the user can choose whether output should be generated every time a section is left or only at the end of the program. Moreover, it is possible to show CPU times, wall times, or both.

Usage

Use of this class could be as follows:

TimerOutput timer (std::cout, TimerOutput::summary, TimerOutput::wall_times);
timer.enter_subsection ("Setup dof system");
setup_dofs();
timer.leave_subsection();
timer.enter_subsection ("Assemble");
assemble_system_1();
timer.leave_subsection();
timer.enter_subsection ("Solve");
solve_system_1();
timer.leave_subsection();
timer.enter_subsection ("Assemble");
assemble_system_2();
timer.leave_subsection();
timer.enter_subsection ("Solve");
solve_system_2();
timer.leave_subsection();
// do something else...

When run, this program will produce output like this:

+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |      88.8s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| Assemble                        |         2 |      19.7s |        22% |
| Solve                           |         2 |      3.03s |       3.4% |
| Setup dof system                |         1 |      3.97s |       4.5% |
+---------------------------------+-----------+------------+------------+

The output shows that we entered the "Assemble" and "Solve" sections twice each, and reports how much time we spent there. Moreover, the class measures the total time spent from start to termination of the TimerOutput object. In this case, the program did a lot of other work outside the timed sections, so the percentages of the sections we measured add up to far less than 100 percent.
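The accumulation behavior described above (a section entered several times keeps one entry whose call count and time grow) can be mimicked with a small stand-in class. This is only a sketch for illustration and not deal.II's implementation: the names SectionEntry and MiniSectionTimer are hypothetical, nesting of sections is not supported, and only wall time is recorded.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical per-section record (not a deal.II type).
struct SectionEntry
{
  unsigned int n_calls      = 0;   // how often the section was entered
  double       wall_seconds = 0.0; // wall time accumulated over all calls
};

// Hypothetical stand-in that accumulates wall time per section name.
class MiniSectionTimer
{
public:
  void enter(const std::string &name)
  {
    assert(!running && "this sketch does not support nested sections");
    current = name;
    running = true;
    ++entries[name].n_calls; // creates the entry on first use
    start = std::chrono::steady_clock::now();
  }

  void leave()
  {
    assert(running && "leave() called without a matching enter()");
    const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
    entries[current].wall_seconds += elapsed.count();
    running = false;
  }

  const std::map<std::string, SectionEntry> &data() const
  {
    return entries;
  }

private:
  std::map<std::string, SectionEntry>   entries;
  std::string                           current;
  std::chrono::steady_clock::time_point start;
  bool                                  running = false;
};
```

Entering "Assemble" twice with this sketch leaves a single "Assemble" entry with n_calls equal to 2, which corresponds to the "no. calls" column in the table above.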

Using scoped timers

The scheme above, which requires matching calls to TimerOutput::enter_subsection() and TimerOutput::leave_subsection(), is awkward if the code between these calls contains return statements or may throw exceptions. In that case, it is easy to forget that one nevertheless needs to leave the section somehow, somewhere. An easier approach is to use "scoped" sections. A scoped section is a variable that enters a section when it is created and leaves the section when it is destroyed. If it is a variable local to a particular scope (a code block between curly braces) and you leave this scope due to a return statement or an exception, then the variable is destroyed and the timed section is left automatically. Consequently, we could have written the code piece above as follows, with exactly the same result but now exception-safe:

{
  TimerOutput::Scope timer_section(timer, "Setup dof system");
  setup_dofs();
}
{
  TimerOutput::Scope timer_section(timer, "Assemble");
  assemble_system_1();
}
{
  TimerOutput::Scope timer_section(timer, "Solve");
  solve_system_1();
}
{
  TimerOutput::Scope timer_section(timer, "Assemble");
  assemble_system_2();
}
{
  TimerOutput::Scope timer_section(timer, "Solve");
  solve_system_2();
}
// do something else...
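Why a scoped variable is exception-safe can be seen in a small self-contained sketch. It does not use deal.II: EventLog, ScopedSection, and failing_step are hypothetical names, and the "timer" only records enter and leave events. The point is that the destructor runs during stack unwinding, so the section is left even though the function exits through an exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical recorder standing in for a timer object; it only logs events.
struct EventLog
{
  std::vector<std::string> events;

  void enter(const std::string &name) { events.push_back("enter " + name); }
  void leave(const std::string &name) { events.push_back("leave " + name); }
};

// RAII guard: enters the section on construction, leaves it on destruction.
class ScopedSection
{
public:
  ScopedSection(EventLog &log, const std::string &name)
    : log_(log)
    , name_(name)
  {
    assert(!name_.empty() && "this sketch requires a non-empty section name");
    log_.enter(name_);
  }

  ~ScopedSection() { log_.leave(name_); }

  // A guard owns exactly one enter/leave pair, so copying is disabled.
  ScopedSection(const ScopedSection &) = delete;
  ScopedSection &operator=(const ScopedSection &) = delete;

private:
  EventLog         &log_;
  const std::string name_;
};

// A step that leaves its scope through an exception instead of a return.
inline void failing_step(EventLog &log)
{
  ScopedSection section(log, "Assemble");
  throw std::runtime_error("assembly failed");
}
```

If a caller wraps failing_step() in a try block, the log afterwards contains "enter Assemble" followed by "leave Assemble", without any explicit call to leave the section.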

Usage in parallel programs using MPI

In a parallel program built on MPI, using the class as shown above means that each process times the corresponding sections and then prints its own timing information at the end. This is annoying because you get a lot of output: one set of timing information from each process.

This can be avoided by only letting one processor generate screen output, typically by using an object of type ConditionalOStream instead of std::cout to write to screen (see, for example, step-17, step-18, step-32 and step-40, all of which use this method).

This way, only a single processor outputs timing information, typically the first process in the MPI universe. However, taking the above code snippet as an example, imagine what happens if setup_dofs() is fast on processor zero and slow on at least one of the other processors, and if the first thing assemble_system_1() does is something that requires all processors to communicate. In this case, the section "Setup dof system" will show a short run time on processor zero, whereas the section "Assemble" will take a long time: not because assemble_system_1() takes particularly long, but because the processor on which we time (or, rather, the one on which we generate output) has to wait until the other processor is finally done with setup_dofs() and starts to participate in assemble_system_1(). In other words, the reported timing is unreliable because it reflects run times on other processors. Furthermore, the run time of this section on processor zero has nothing to do with the run time of the same section on other processors, but instead with the run time of the previous section on another processor.

The usual way to avoid this is to introduce a barrier into the parallel code just before we start and stop timing sections. This ensures that all processes are at the same place, and the timing information then reflects the maximal run time across all processors. To achieve this, you need to initialize the TimerOutput object with an MPI communicator object, that is, use one of the constructors documented below that take an MPI_Comm as their first argument, and pass pcout as the stream.

Here, pcout is an object of type ConditionalOStream that makes sure that we only generate output on a single processor. See the step-32, step-40, and step-42 tutorial programs for this kind of usage of this class.
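The timing distortion discussed in this section can be made concrete with a little arithmetic. The following sketch is a plain C++ model and not MPI code: the function names are hypothetical, it assumes the second section starts with a collective operation, and it assumes the second section takes the same time on every process. Index 0 of the input plays the role of processor zero.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Wall times that processor zero attributes to two consecutive sections.
struct Reported
{
  double first;
  double second;
};

// Without barriers: processor zero stops timing the first section after its
// own work, then waits inside the second section until the slowest process
// arrives. The waiting time is booked under the second section.
inline Reported without_barrier(const std::vector<double> &first_times,
                                const double               second_time)
{
  assert(!first_times.empty());
  const double slowest =
    *std::max_element(first_times.begin(), first_times.end());
  const double waiting = slowest - first_times[0];
  return {first_times[0], waiting + second_time};
}

// With a barrier before the timer of the first section is stopped: the
// waiting happens inside the first section, which therefore reports the
// maximal run time across all processes.
inline Reported with_barrier(const std::vector<double> &first_times,
                             const double               second_time)
{
  assert(!first_times.empty());
  const double slowest =
    *std::max_element(first_times.begin(), first_times.end());
  return {slowest, second_time};
}
```

With made-up numbers, say the first section takes 1 s on processor zero and 5 s on another process, and the second section takes 2 s: without barriers, processor zero reports 1 s and 6 s; with barriers, it reports 5 s and 2 s, which matches where the time was actually spent.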

Author
M. Kronbichler, 2009.

Definition at line 580 of file timer.h.

Member Enumeration Documentation

enum TimerOutput::OutputFrequency

An enumeration data type that describes whether to generate output every time we exit a section, only at the end, both, or never.

Enumerator
every_call 

Generate output after every call.

summary 

Generate output in summary at the end.

every_call_and_summary 

Generate output both after every call and in summary at the end.

never 

Never generate any output.

Definition at line 630 of file timer.h.

enum TimerOutput::OutputData

An enumeration data type that describes the type of data to return when fetching the data from the timer.

Enumerator
total_cpu_time 

Output CPU times.

total_wall_time 

Output wall clock times.

n_calls 

Output number of calls.

Definition at line 654 of file timer.h.

enum TimerOutput::OutputType

An enumeration data type that describes whether to show CPU times, wall times, or both CPU and wall times whenever we generate output.

Enumerator
cpu_times 

Output CPU times.

wall_times 

Output wall clock times.

cpu_and_wall_times 

Output both CPU and wall clock times in separate tables.

cpu_and_wall_times_grouped 

Output both CPU and wall clock times in a single table.

Definition at line 674 of file timer.h.

Constructor & Destructor Documentation

TimerOutput::TimerOutput ( std::ostream &  stream,
const OutputFrequency  output_frequency,
const OutputType  output_type 
)

Constructor.

Parameters
stream            The stream (of type std::ostream) to which output is written.
output_frequency  A variable indicating when output is to be written to the given stream.
output_type       A variable indicating what kind of timing the output should represent (CPU or wall time).

Definition at line 316 of file timer.cc.

TimerOutput::TimerOutput ( ConditionalOStream &  stream,
const OutputFrequency  output_frequency,
const OutputType  output_type 
)

Constructor.

Parameters
stream            The stream (of type ConditionalOStream) to which output is written.
output_frequency  A variable indicating when output is to be written to the given stream.
output_type       A variable indicating what kind of timing the output should represent (CPU or wall time).

Definition at line 328 of file timer.cc.

TimerOutput::TimerOutput ( MPI_Comm  mpi_comm,
std::ostream &  stream,
const OutputFrequency  output_frequency,
const OutputType  output_type 
)

Constructor that takes an MPI communicator as input. A timer constructed this way will sum up the CPU times over all processors in the MPI network when reporting CPU time, or take the maximum over all processors when reporting wall time, depending on the value of output_type. See the documentation of this class for the rationale for this constructor and an example.

Parameters
mpi_comm          An MPI communicator across which we should accumulate or otherwise synchronize the timing information we produce on every MPI process.
stream            The stream (of type std::ostream) to which output is written.
output_frequency  A variable indicating when output is to be written to the given stream.
output_type       A variable indicating what kind of timing the output should represent (CPU or wall time). In this parallel context, when this argument selects CPU time, then times are accumulated over all processes participating in the MPI communicator. If this argument selects wall time, then reported times are the maximum over all processors' run times for this section. (The latter is computed by placing an MPI_Barrier call before starting and stopping the timer for each section.)

Definition at line 340 of file timer.cc.

TimerOutput::TimerOutput ( MPI_Comm  mpi_comm,
ConditionalOStream &  stream,
const OutputFrequency  output_frequency,
const OutputType  output_type 
)

Constructor that takes an MPI communicator as input. A timer constructed this way will sum up the CPU times over all processors in the MPI network when reporting CPU time, or take the maximum over all processors when reporting wall time, depending on the value of output_type. See the documentation of this class for the rationale for this constructor and an example.

Parameters
mpi_comm          An MPI communicator across which we should accumulate or otherwise synchronize the timing information we produce on every MPI process.
stream            The stream (of type ConditionalOStream) to which output is written.
output_frequency  A variable indicating when output is to be written to the given stream.
output_type       A variable indicating what kind of timing the output should represent (CPU or wall time). In this parallel context, when this argument selects CPU time, then times are accumulated over all processes participating in the MPI communicator. If this argument selects wall time, then reported times are the maximum over all processors' run times for this section. (The latter is computed by placing an MPI_Barrier call before starting and stopping the timer for each section.)

Definition at line 353 of file timer.cc.

TimerOutput::~TimerOutput ( )

Destructor. Calls print_summary() in case the option for writing the summary output is set.

Definition at line 366 of file timer.cc.

Member Function Documentation

void TimerOutput::enter_subsection ( const std::string &  section_name)

Open a section with the given name. If a section with this name already exists, that section is entered once again and times are accumulated.

Definition at line 413 of file timer.cc.

void TimerOutput::enter_section ( const std::string &  section_name)
inline

Same as enter_subsection.

Definition at line 1005 of file timer.h.

void TimerOutput::leave_subsection ( const std::string &  section_name = std::string())

Leave a section. If no name is given, the last section that was entered is left.

Definition at line 455 of file timer.cc.

void TimerOutput::exit_section ( const std::string &  section_name = std::string())
inline

Same as leave_subsection.

Definition at line 1013 of file timer.h.

std::map< std::string, double > TimerOutput::get_summary_data ( const OutputData  kind) const

Get a map with the collected data of the specified type for each subsection.

Definition at line 519 of file timer.cc.
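A map of the type returned here, std::map< std::string, double >, is easy to post-process. The sketch below converts per-section wall times into percentages of a total run time. It is an illustration only: percent_of_total is a hypothetical helper, the map is assumed to have been filled from a call such as get_summary_data(TimerOutput::total_wall_time), and the total run time is assumed to be supplied by the caller.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: express each section's time as a percentage of a
// caller-supplied total run time.
inline std::map<std::string, double>
percent_of_total(const std::map<std::string, double> &seconds,
                 const double                         total_seconds)
{
  assert(total_seconds > 0.0 && "the total run time must be positive");

  std::map<std::string, double> result;
  for (const auto &entry : seconds)
    result[entry.first] = 100.0 * entry.second / total_seconds;
  return result;
}
```

Fed with the numbers from the sample table in the detailed description (19.7 s for "Assemble" out of 88.8 s in total), the helper yields roughly 22 percent for that section.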

void TimerOutput::print_summary ( ) const

Print a formatted table that summarizes the time consumed in the various sections.

Definition at line 545 of file timer.cc.

void TimerOutput::disable_output ( )

By calling this function, all output can be disabled. This function together with enable_output() can be useful if one wants to control the output in a flexible way without putting a lot of if clauses in the program.

Definition at line 860 of file timer.cc.

void TimerOutput::enable_output ( )

This function re-enables output of this class if it was previously disabled with disable_output(). This function together with disable_output() can be useful if one wants to control the output in a flexible way without putting a lot of if clauses in the program.

Definition at line 868 of file timer.cc.

void TimerOutput::reset ( )

Resets the recorded timing information.

Definition at line 874 of file timer.cc.

Member Data Documentation

OutputFrequency TimerOutput::output_frequency
private

When to output information to the output stream.

Definition at line 854 of file timer.h.

OutputType TimerOutput::output_type
private

Whether to show CPU times, wall times, or both CPU and wall times.

Definition at line 859 of file timer.h.

Timer TimerOutput::timer_all
private

A timer object for the overall run time. If we are using MPI, this timer also accumulates over all MPI processes.

Definition at line 866 of file timer.h.

std::map<std::string, Section> TimerOutput::sections
private

A list of all the sections and their information.

Definition at line 883 of file timer.h.

ConditionalOStream TimerOutput::out_stream
private

The stream object to which we are to output.

Definition at line 888 of file timer.h.

bool TimerOutput::output_is_enabled
private

A boolean variable that sets whether output of this class is currently on or off.

Definition at line 894 of file timer.h.

std::list<std::string> TimerOutput::active_sections
private

A list of the sections that have been entered and not exited. The list is kept in the order in which sections have been entered, but elements may be removed in the middle if an argument is given to the exit_section() function.

Definition at line 902 of file timer.h.
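The bookkeeping described for this list can be sketched in a few lines: leaving without a name removes the most recently entered section, while leaving with a name may remove an element from the middle. ActiveSections is a hypothetical stand-in written for illustration; it is not the code in timer.cc and does no timing.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for the list of entered-but-not-yet-left sections.
class ActiveSections
{
public:
  void enter(const std::string &name) { active.push_back(name); }

  // Leave the named section, or the most recently entered one if no name
  // is given.
  void leave(const std::string &name = std::string())
  {
    assert(!active.empty() && "no section is currently active");
    if (name.empty())
      {
        active.pop_back();
        return;
      }
    const auto it = std::find(active.begin(), active.end(), name);
    assert(it != active.end() && "the named section is not active");
    active.erase(it); // may remove an element from the middle of the list
  }

  const std::list<std::string> &get() const { return active; }

private:
  std::list<std::string> active; // kept in the order of entry
};
```

After entering "A", "B", and "C" in that order, leave("B") leaves the list as "A", "C", and a subsequent leave() without an argument removes "C".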

MPI_Comm TimerOutput::mpi_communicator
private

The MPI communicator across which timing information is accumulated or synchronized.

Definition at line 907 of file timer.h.

Threads::Mutex TimerOutput::mutex
private

A lock that makes sure that this class gives reasonable results even when used with several threads.

Definition at line 913 of file timer.h.


The documentation for this class was generated from the following files: timer.h and timer.cc.