#include <deal.II/matrix_free/matrix_free.h>

Public Types

Public Member Functions
	AdditionalData (const TasksParallelScheme tasks_parallel_scheme=partition_partition, const unsigned int tasks_block_size=0, const UpdateFlags mapping_update_flags=update_gradients\|update_JxW_values, const UpdateFlags mapping_update_flags_boundary_faces=update_default, const UpdateFlags mapping_update_flags_inner_faces=update_default, const UpdateFlags mapping_update_flags_faces_by_cells=update_default, const unsigned int level_mg_handler=numbers::invalid_unsigned_int, const bool store_plain_indices=true, const bool initialize_indices=true, const bool initialize_mapping=true, const bool overlap_communication_computation=true, const bool hold_all_faces_to_owned_cells=false, const bool cell_vectorization_categories_strict=false)

Public Attributes
TasksParallelScheme	tasks_parallel_scheme

unsigned int	tasks_block_size

UpdateFlags	mapping_update_flags

UpdateFlags	mapping_update_flags_boundary_faces

UpdateFlags	mapping_update_flags_inner_faces

UpdateFlags	mapping_update_flags_faces_by_cells

unsigned int	level_mg_handler

bool	store_plain_indices

bool	initialize_indices

bool	initialize_mapping

bool	overlap_communication_computation

bool	hold_all_faces_to_owned_cells

std::vector< unsigned int >	cell_vectorization_category

bool	cell_vectorization_categories_strict

Detailed Description

template<int dim, typename Number = double>
struct MatrixFree< dim, Number >::AdditionalData

Collects the options for initialization of the MatrixFree class. The first parameter specifies the MPI communicator to be used, the second the parallelization options in shared memory (task-based parallelism, where one can choose between no parallelism and three schemes that avoid that cells with access to the same vector entries are accessed simultaneously), the third with the block size for task parallel scheduling, the fourth the update flags that should be stored by this class.

The fifth parameter specifies the level in the triangulation from which the indices are to be used. If the level is set to numbers::invalid_unsigned_int, the active cells are traversed, and otherwise the cells in the given level. This option has no effect in case a DoFHandler or hp::DoFHandler is given.

The parameter initialize_plain_indices indicates whether the DoFInfo class should also allow for access to vectors without resolving constraints.

The two parameters initialize_indices and initialize_mapping allow the user to disable some of the initialization processes. For example, if only the scheduling that avoids touching the same vector/matrix indices simultaneously is to be found, the mapping needs not be initialized. Likewise, if the mapping has changed from one iteration to the next but the topology has not (like when using a deforming mesh with MappingQEulerian), it suffices to initialize the mapping only.

The two parameters cell_vectorization_categories and cell_vectorization_categories_strict control the formation of batches for vectorization over several cells. It is used implicitly when working with hp adaptivity but can also be useful in other contexts, such as in local time stepping where one would like to control which elements together form a batch of cells. The array cell_vectorization_categories is accessed by the number given by cell->active_cell_index() when working on the active cells with level_mg_handler set to -1 and by cell->index() for the level cells. By default, the different categories in cell_vectorization_category can be mixed and the algorithm is allowed to merge lower category numbers with the next higher categories if it is necessary inside the algorithm, in order to avoid partially filled SIMD lanes as much as possible. This gives a better utilization of the vectorization but might need special treatment, in particular for face integrals. If set to true, the algorithm will instead keep different categories separate and not mix them in a single vectorized array.

Definition at line 172 of file matrix_free.h.

Member Enumeration Documentation

template<int dim, typename Number = double>

enum MatrixFree::AdditionalData::TasksParallelScheme

Collects options for task parallelism. See the documentation of the member variable MatrixFree::AdditionalData::tasks_parallel_scheme for a thorough description.

Enumerator
none	Perform application in serial.
partition_partition	Partition the cells into two levels and afterwards form chunks.
partition_color	Partition on the global level and color cells within the partitions.
color	Use the traditional coloring algorithm: this is like TasksParallelScheme::partition_color, but only uses one partition.

Definition at line 179 of file matrix_free.h.

Constructor & Destructor Documentation

template<int dim, typename Number = double>

MatrixFree< dim, Number >::AdditionalData::AdditionalData	(	const TasksParallelScheme	tasks_parallel_scheme = `partition_partition`,
		const unsigned int	tasks_block_size = `0`,
		const UpdateFlags	mapping_update_flags = `update_gradients \| update_JxW_values`,
		const UpdateFlags	mapping_update_flags_boundary_faces = `update_default`,
		const UpdateFlags	mapping_update_flags_inner_faces = `update_default`,
		const UpdateFlags	mapping_update_flags_faces_by_cells = `update_default`,
		const unsigned int	level_mg_handler = `numbers::invalid_unsigned_int`,
		const bool	store_plain_indices = `true`,
		const bool	initialize_indices = `true`,
		const bool	initialize_mapping = `true`,
		const bool	overlap_communication_computation = `true`,
		const bool	hold_all_faces_to_owned_cells = `false`,
		const bool	cell_vectorization_categories_strict = `false`
	)

inline

Constructor for AdditionalData.

Definition at line 205 of file matrix_free.h.

Member Data Documentation

template<int dim, typename Number = double>

TasksParallelScheme MatrixFree< dim, Number >::AdditionalData::tasks_parallel_scheme

Set the scheme for task parallelism. There are four options available. If set to none, the operator application is done in serial without shared memory parallelism. If this class is used together with MPI and MPI is also used for parallelism within the nodes, this flag should be set to none. The default value is partition_partition, i.e. we actually use multithreading with the first option described below.

The first option partition_partition is to partition the cells on two levels in onion-skin-like partitions and forming chunks of tasks_block_size after the partitioning. The partitioning finds sets of independent cells that enable working in parallel without accessing the same vector entries at the same time.

The second option partition_color is to use a partition on the global level and color cells within the partitions (where all chunks within a color are independent). Here, the subdivision into chunks of cells is done before the partitioning, which might give worse partitions but better cache performance if degrees of freedom are not renumbered.

The third option color is to use a traditional algorithm of coloring on the global level. This scheme is a special case of the second option where only one partition is present. Note that for problems with hanging nodes, there are quite many colors (50 or more in 3D), which might degrade parallel performance (bad cache behavior, many synchronization points).

Note: Threading support is currently experimental for the case inner face integrals are performed and it is recommended to use MPI parallelism if possible. While the scheme has been verified to work with the partition_partition option in case of usual DG elements, no comprehensive tests have been performed for systems of more general elements, like combinations of continuous and discontinuous elements that add face integrals to all terms.

Definition at line 272 of file matrix_free.h.

template<int dim, typename Number = double>

unsigned int MatrixFree< dim, Number >::AdditionalData::tasks_block_size

Set the number of so-called macro cells that should form one partition. If zero size is given, the class tries to find a good size for the blocks based on MultithreadInfo::n_threads() and the number of cells present. Otherwise, the given number is used. If the given number is larger than one third of the number of total cells, this means no parallelism. Note that in the case vectorization is used, a macro cell consists of more than one physical cell.

Definition at line 283 of file matrix_free.h.

template<int dim, typename Number = double>

UpdateFlags MatrixFree< dim, Number >::AdditionalData::mapping_update_flags

This flag determines the mapping data on cells that is cached. This class can cache data needed for gradient computations (inverse Jacobians), Jacobian determinants (JxW), quadrature points as well as data for Hessians (derivative of Jacobians). By default, only data for gradients and Jacobian determinants times quadrature weights, JxW, are cached. If quadrature points or second derivatives are needed, they must be specified by this field (even though second derivatives might still be evaluated on Cartesian cells without this option set here, since there the Jacobian describes the mapping completely).

Definition at line 296 of file matrix_free.h.

template<int dim, typename Number = double>

UpdateFlags MatrixFree< dim, Number >::AdditionalData::mapping_update_flags_boundary_faces

This flag determines the mapping data on boundary faces to be cached. Note that MatrixFree uses a separate loop layout for face integrals in order to effectively vectorize also in the case of hanging nodes (which require different subface settings on the two sides) or some cells in the batch of a VectorizedArray of cells that are adjacent to the boundary and others that are not.

If set to a value different from update_general (default), the face information is explicitly built. Currently, MatrixFree supports to cache the following data on faces: inverse Jacobians, Jacobian determinants (JxW), quadrature points, data for Hessians (derivative of Jacobians), and normal vectors.

Note: In order to be able to perform a face_operation or boundary_operation in the MatrixFree::loop()`, either this field or mapping_update_flags_inner_faces must be set to a value different from UpdateFlags::update_default.

Definition at line 317 of file matrix_free.h.

template<int dim, typename Number = double>

UpdateFlags MatrixFree< dim, Number >::AdditionalData::mapping_update_flags_inner_faces

This flag determines the mapping data on interior faces to be cached. Note that MatrixFree uses a separate loop layout for face integrals in order to effectively vectorize also in the case of hanging nodes (which require different subface settings on the two sides) or some cells in the batch of a VectorizedArray of cells that are adjacent to the boundary and others that are not.

If set to a value different from update_general (default), the face information is explicitly built. Currently, MatrixFree supports to cache the following data on faces: inverse Jacobians, Jacobian determinants (JxW), quadrature points, data for Hessians (derivative of Jacobians), and normal vectors.

Note: In order to be able to perform a face_operation or boundary_operation in the MatrixFree::loop()`, either this field or mapping_update_flags_boundary_faces must be set to a value different from UpdateFlags::update_default.

Definition at line 338 of file matrix_free.h.

template<int dim, typename Number = double>

UpdateFlags MatrixFree< dim, Number >::AdditionalData::mapping_update_flags_faces_by_cells

This flag determines the mapping data for faces in a different layout with respect to vectorizations. Whereas mapping_update_flags_inner_faces and mapping_update_flags_boundary_faces trigger building the data in a face-centric way with proper vectorization, the current data field attaches the face information to the cells and their way of vectorization. This is only needed in special situations, as for example for block-Jacobi methods where the full operator to a cell including its faces are evaluated. This data is accessed by FEFaceEvaluation::reinit(cell_batch_index, face_number). However, currently no coupling terms to neighbors can be computed with this approach because the neighbors are not laid out by the VectorizedArray data layout with an array-of-struct-of-array-type data structures.

Note that you should only compute this data field in case you really need it as it more than doubles the memory required by the mapping data on faces.

If set to a value different from update_general (default), the face information is explicitly built. Currently, MatrixFree supports to cache the following data on faces: inverse Jacobians, Jacobian determinants (JxW), quadrature points, data for Hessians (derivative of Jacobians), and normal vectors.

Definition at line 366 of file matrix_free.h.

template<int dim, typename Number = double>

unsigned int MatrixFree< dim, Number >::AdditionalData::level_mg_handler

This option can be used to define whether we work on a certain level of the mesh, and not the active cells. If set to invalid_unsigned_int (which is the default value), the active cells are gone through, otherwise the level given by this parameter. Note that if you specify to work on a level, its dofs must be distributed by using dof_handler.distribute_mg_dofs(fe);.

Definition at line 376 of file matrix_free.h.

template<int dim, typename Number = double>

bool MatrixFree< dim, Number >::AdditionalData::store_plain_indices

Controls whether to allow reading from vectors without resolving constraints, i.e., just read the local values of the vector. By default, this option is disabled, so if you want to use FEEvaluationBase::read_dof_values_plain, this flag needs to be set.

Definition at line 384 of file matrix_free.h.

template<int dim, typename Number = double>

bool MatrixFree< dim, Number >::AdditionalData::initialize_indices

Option to control whether the indices stored in the DoFHandler should be read and the pattern for task parallelism should be set up in the initialize method of MatrixFree. The default value is true. Can be disabled in case the mapping should be recomputed (e.g. when using a deforming mesh described through MappingEulerian) but the topology of cells has remained the same.

Definition at line 395 of file matrix_free.h.

template<int dim, typename Number = double>

bool MatrixFree< dim, Number >::AdditionalData::initialize_mapping

Option to control whether the mapping information should be computed in the initialize method of MatrixFree. The default value is true. Can be disabled when only some indices should be set up (e.g. when only a set of independent cells should be computed).

Definition at line 404 of file matrix_free.h.

template<int dim, typename Number = double>

bool MatrixFree< dim, Number >::AdditionalData::overlap_communication_computation

Option to control whether the loops should overlap communications and computations as far as possible in case the vectors passed to the loops support non-blocking data exchange. In most situations, overlapping is faster in case the amount of data to be sent is more than a few kilobytes. If less data is sent, the communication is latency bound on most clusters (point-to-point latency is around 1 microsecond on good clusters by 2016 standards). Depending on the MPI implementation and the fabric, it may be faster to not overlap and wait for the data to arrive. The default is true, i.e., communication and computation are overlapped.

Definition at line 418 of file matrix_free.h.

template<int dim, typename Number = double>

bool MatrixFree< dim, Number >::AdditionalData::hold_all_faces_to_owned_cells

By default, the face part will only hold those faces (and ghost elements behind faces) that are going to be processed locally. In case MatrixFree should have access to all neighbors on locally owned cells, this option enables adding the respective faces at the end of the face range.

Definition at line 427 of file matrix_free.h.

template<int dim, typename Number = double>

std::vector<unsigned int> MatrixFree< dim, Number >::AdditionalData::cell_vectorization_category

This data structure allows to assign a fraction of cells to different categories when building the information for vectorization. It is used implicitly when working with hp adaptivity but can also be useful in other contexts, such as in local time stepping where one would like to control which elements together form a batch of cells.

This array is accessed by the number given by cell->active_cell_index() when working on the active cells with level_mg_handler set to -1 and by cell->index() for the level cells.

Note: This field is empty upon construction of AdditionalData. It is the responsibility of the user to resize this field to triangulation.n_active_cells() or triangulation.n_cells(level) when filling data.

Definition at line 445 of file matrix_free.h.

template<int dim, typename Number = double>

bool MatrixFree< dim, Number >::AdditionalData::cell_vectorization_categories_strict

By default, the different categories in cell_vectorization_category can be mixed and the algorithm is allowed to merge lower categories with the next higher categories if it is necessary inside the algorithm. This gives a better utilization of the vectorization but might need special treatment, in particular for face integrals. If set to true, the algorithm will instead keep different categories separate and not mix them in a single vectorized array.

Definition at line 456 of file matrix_free.h.

The documentation for this struct was generated from the following file:

deal.II/matrix_free/matrix_free.h

Public Types

Public Member Functions

Public Attributes

Detailed Description

template<int dim, typename Number = double> struct MatrixFree< dim, Number >::AdditionalData

Member Enumeration Documentation

Constructor & Destructor Documentation

Member Data Documentation

template<int dim, typename Number = double>
struct MatrixFree< dim, Number >::AdditionalData