This tutorial depends on step-15, step-22.

Table of contents

Introduction
The commented program

Results
The plain program

This program was contributed by Liang Zhao and Timo Heister.

This material is based upon work partially supported by National Science Foundation grant DMS1522191 and the Computational Infrastructure in Geodynamics initiative (CIG), through the National Science Foundation under Award No. EAR-0949446 and The University of California-Davis.

Note: If you use this program as a basis for your own work, please consider citing it in your list of references. The initial version of this work was contributed to the deal.II project by the authors listed in the following citation:

Introduction

Navier Stokes Equations

In this tutorial we show how to solve the incompressible Navier Stokes equations (NSE) by Newton's method. The flow we consider here is assumed to be steady. In a domain \(\Omega \subset \mathbb{R}^{d}\), \(d=2,3\), with a piecewise smooth boundary \(\partial \Omega\), and a given force field \(\textbf{f}\), we seek a velocity field \(\textbf{u}\) and a pressure field \(\textbf{p}\) satisfying

\begin{eqnarray*} - \nu \Delta\textbf{u} + (\textbf{u} \cdot \nabla)\textbf{u} + \nabla p &=& \textbf{f}\\ - \nabla \cdot \textbf{u} &=& 0 \end{eqnarray*}

Different from the Stokes equations as discussed in step-22, the NSE are a nonlinear system because of the convective term \((\textbf{u} \cdot \nabla)\textbf{u}\). The first step of computing a numerical solution is to linearize the system and this will be done using Newton's method. A time-dependent problem is discussed in step-35, where the system is linearized using the solution from the last time step and no nonlinear solve is necessary.

Linearization of Navier-Stokes Equations

Moving the right-hand side terms to the left, a nonlinear function is created as

\begin{eqnarray*} F(\mathbf{u}, p) = \left( \begin{array}{c} - \nu \Delta\mathbf{u} + (\mathbf{u} \cdot \nabla)\mathbf{u} + \nabla p - \mathbf{f} \\ - \nabla \cdot \mathbf{u} \\ \end{array} \right). \end{eqnarray*}

\(F(\textbf{u}, p)\) is a nonlinear function whose root is the solution to the NSE. Assuming the initial guess is good enough to guarantee the convergence of Newton's iteration and denoting \(\textbf{x} = (\textbf{u}, p)\), Newton's iteration on a vector field can be defined as

\begin{eqnarray*} \textbf{x}^{k+1} = \textbf{x}^{k} - (\nabla F(\textbf{x}^{k}))^{-1} F(\textbf{x}^{k}), \end{eqnarray*}

where \(\textbf{x}^{k+1}\) is the approximate solution in step k+1, \(\textbf{x}^{k}\) represents the solution from the last step, and \(\nabla F(\textbf{x}^{k})\) is the Jacobian matrix evaluated at \(\textbf{x}^{k}\). A similar iteration can be found in step-15.

From Newton's iteration formula, we can observe that the new solution is obtained by adding an update term to the old solution. Instead of evaluating the Jacobian matrix and taking its inverse, we consider the update term as a whole, that is

\begin{eqnarray*} \delta \textbf{x}^{k} = - (\nabla F(\textbf{x}^{k}))^{-1} F(\textbf{x}^{k}), \end{eqnarray*}

where \(x^{k+1}=x^{k}+\delta x^{k}\).

Then we can evaluate the update term by solving the system

\begin{eqnarray*} \nabla F(\textbf{x}^{k}) \delta \textbf{x}^{k} = -F(\textbf{x}^{k}). \end{eqnarray*}

Here, the left of the previous equation represents the directional gradient of \(F(\textbf{x})\) along \(\delta \textbf{x}^{k}\) at \(\textbf{x}^{k}\). By definition, the directional gradient is given by

\begin{eqnarray*} & &\nabla F(\mathbf{u}^{k}, p^{k}) (\delta \mathbf{u}^{k}, \delta p^{k}) \\ \\ &=& \lim_{\epsilon \to 0} \frac{1}{\epsilon} (F(\mathbf{u}^{k}+\epsilon \delta \mathbf{u}^{k}, p^{k}+\epsilon\nabla\delta p^{k}) - (F(\mathbf{u}^{k}, p^{k}))\\ \\ &=& \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left( \begin{array}{c} - \epsilon\nu\Delta\delta \mathbf{u}^{k} + \epsilon\mathbf{u}^{k}\cdot\nabla\delta\mathbf{u}^{k}+\epsilon\delta\mathbf{u}^{k}\cdot\nabla\mathbf{u}^{k}+\epsilon^{2}\delta\mathbf{u}^{k}\cdot\nabla\delta\mathbf{u}^{k}+\epsilon \nabla\delta p^{k}\\ - \epsilon \nabla \cdot\delta \mathbf{u}^{k}\\ \end{array} \right)\\ \\ &=& \left( \begin{array}{c} - \nu\Delta\delta \mathbf{u}^{k} + \mathbf{u}^{k}\cdot\nabla\delta\mathbf{u}^{k}+\delta\mathbf{u}^{k}\cdot\nabla\mathbf{u}^{k}+ \nabla\delta p^{k}\\ - \nabla \cdot\delta \mathbf{u}^{k}\\ \end{array} \right). \end{eqnarray*}

Therefore, we arrive at the linearized system:

\begin{eqnarray*} - \nu\Delta\delta\mathbf{u}^{k} + \mathbf{u}^{k}\cdot\nabla\delta\mathbf{u}^{k}+\delta\mathbf{u}^{k}\cdot\nabla\mathbf{u}^{k}+ \nabla\delta p^{k} = \mathbf{g}, \\ - \nabla \cdot\delta \mathbf{u}^{k} = \nabla\cdot\mathbf{u}^{k}, \end{eqnarray*}

where \(\textbf{g} =\textbf{f}+\nu \Delta\textbf{u}^k -(\textbf{u}^k \cdot \nabla)\textbf{u}^k -\nabla p^k\) and \(\textbf{u}^k\) and \(p^k\) are the solutions from the previous iteration. Additionally, the right hand side of the second equation is not zero since the discrete solution is not exactly divergence free (divergence free for the continuous solution). The right hand side here acts as a correction which leads the discrete solution of the velocity to be divergence free along Newton's iteration. In this linear system, the only unknowns are the update terms \(\delta \textbf{u}^{k}\) and \(\delta p^{k}\), and we can use a similar strategy to the one used in step-22. The weak form is derived like it is done in step-22.

Now, Newton's iteration can be used to solve for the update terms:

Initialization: Initial guess \(u_0\) and \(p_0\), tolerance \(\tau\);
Linear solve to compute update term \(\delta\textbf{u}^{k}\) and \(\delta p^k\);
Update the approximation: \(\textbf{u}^{k+1} = \textbf{u}^{k} + \delta\textbf{u}^{k}\) and \(p^{k+1} = p^{k} + \delta p^{k}\);
Check residual norm: \(E^{k+1} = \|F(\mathbf{u}^{k+1}, p^{k+1})\|\):
- If \(E^{k+1} \leq \tau\), STOP.
- If \(E^{k+1} > \tau\), back to step 2.

Finding an Initial Guess

Getting Newton's method to converge, the initial guess needs to be close enough to the solution, so it is crucial to find a good starting value.

When the viscosity \(\nu\) is large, a good initial guess can be obtained by solving the Stokes equation with viscosity \(\nu\). While problem dependent, this works for \(\nu \geq 1/400\) for the test problem considered here.

However, the convective term \((\mathbf{u}\cdot\nabla)\mathbf{u}\) will be dominant if the viscosity is small, like 1/7500 in test case 2. In this situation, we use a continuation method to set up a series of auxiliary NSE with viscosity approaching the one in the target NSE. Correspondingly, we create a sequence \(\{\nu_{i}\}\) with \(\nu_{n}= \nu\), and accept that the solutions to two NSE with viscosity \(\nu_{i}\) and \(\nu_{i+1}\) are close if \(|\nu_{i} - \nu_{i+1}|\) is small. Then we use the solution to the NSE with viscosity \(\nu_{i}\) as the initial guess of the NSE with \(\nu_{i+1}\). This can be thought of as a staircase from the Stokes equations to the NSE we want to solve.

That is, we first solve a Stokes problem

\begin{eqnarray*} - \nu_{1} \Delta\textbf{u} + \nabla p &=& \textbf{f}\\ - \nabla \cdot \textbf{u} &=& 0 \end{eqnarray*}

to get the initial guess for

\begin{eqnarray*} - \nu_{1} \Delta\textbf{u} + (\textbf{u} \cdot \nabla)\textbf{u} + \nabla p &=& \textbf{f},\\ - \nabla \cdot \textbf{u} &=& 0, \end{eqnarray*}

which also acts as the initial guess of the continuation method. Here \(\nu_{1}\) is relatively large so that the solution to the Stokes problem with viscosity \(\nu_{1}\) can be used as an initial guess for the NSE in Newton's iteration.

Then the solution to

\begin{eqnarray*} - \nu_{i} \Delta\textbf{u} + (\textbf{u} \cdot \nabla)\textbf{u} + \nabla p &=& \textbf{f},\\ - \nabla \cdot \textbf{u} &=& 0. \end{eqnarray*}

acts as the initial guess for

\begin{eqnarray*} - \nu_{i+1} \Delta\textbf{u} + (\textbf{u} \cdot \nabla)\textbf{u} + \nabla p &=& \textbf{f},\\ - \nabla \cdot \textbf{u} &=& 0. \end{eqnarray*}

This process is repeated with a sequence of viscosities, \(\{\nu_i\}\) that is determined experimentally so that the final solution can used as a starting guess for the Newton iteration.

The Solver and Preconditioner

At each step of Newton's iteration, the problem results in solving a saddle point systems of the form

\begin{eqnarray*} \left( \begin{array}{cc} A & B^{T} \\ B & 0 \\ \end{array} \right) \left( \begin{array}{c} U \\ P \\ \end{array} \right) = \left( \begin{array}{c} F \\ 0 \\ \end{array} \right). \end{eqnarray*}

This system matrix has the same block structure as the one in step-22. However, the matrix \(A\) at (1, 1) corner is not symmetric because of the nonlinear term. Instead of solving the above system, we can solve the equivalent system

\begin{eqnarray*} \left( \begin{array}{cc} A + \gamma B^TW^{-1}B & B^{T} \\ B & 0 \\ \end{array} \right) \left( \begin{array}{c} U \\ P \\ \end{array} \right) = \left( \begin{array}{c} F \\ 0 \\ \end{array} \right) \end{eqnarray*}

with a parameter \(\gamma\) and an invertible matrix W. Here \(\gamma B^TW^{-1}B\) is the Augmented Lagrangian term and see [1] for details.

Denoting the system matrix of the new system by \(G\) and the right-hand side by \(b\), we solve it iteratively with right preconditioning \(P^{-1}\) as \(GP^{-1}y = b\), where

\begin{eqnarray*} P^{-1} = \left(\begin{array}{cc} \tilde{A} & B^T \\ 0 & \tilde{S} \end{array}\right)^{-1}, \end{eqnarray*}

with \(\tilde{A} = A + \gamma B^TW^{-1}B\) and \(\tilde{S}\) is the corresponding Schur complement \(\tilde{S} = B^T \tilde{A}^{-1} B\). We let \(W = M_p\) where \(M_p\) is the pressure mass matrix, then \(\tilde{S}^{-1}\) can be approximated by

\begin{eqnarray*} \tilde{S}^{-1} \approx -(\nu+\gamma)M_p^{-1}. \end{eqnarray*}

See [1] for details.

We decompose \(P^{-1}\) as

\begin{eqnarray*} P^{-1} = \left(\begin{array}{cc} \tilde{A}^{-1} & 0 \\ 0 & I \end{array}\right) \left(\begin{array}{cc} I & -B^T \\ 0 & I \end{array}\right) \left(\begin{array}{cc} I & 0 \\ 0 & \tilde{S}^{-1} \end{array}\right). \end{eqnarray*}

Here two inexact solvers will be needed for \(\tilde{A}^{-1}\) and \(\tilde{S}^{-1}\), respectively (see [1]). Since the pressure mass matrix is symmetric and positive definite, CG with ILU as a preconditioner is appropriate to use for \(\tilde{S}^{-1}\). For simplicity, we use the direct solver UMFPACK for \(\tilde{A}^{-1}\). The last ingredient is a sparse matrix-vector product with \(B^T\). Instead of computing the matrix product in the augmented Lagrangian term in \(\tilde{A}\), we assemble Grad-Div stabilization \((\nabla \cdot \phi _{i}, \nabla \cdot \phi _{j}) \approx (B^T M_p^{-1}B)_{ij}\), as explained in [2].

Test Case

Here we use the lid driven cavity flow as our test case, see [3] for details. The computational domain is the unit square and the right-hand side \(f=0\). The boundary condition is

\begin{eqnarray*} (u(x, y), v(x,y)) &=& (1,0) \qquad\qquad \textrm{if}\ y = 1 \\ (u(x, y), v(x,y)) &=& (0,0) \qquad\qquad \textrm{else}. \end{eqnarray*}

When solving this problem, the error consists of the nonlinear error (from Newton's iteration) and the discretization error (depending on mesh size). The nonlinear part decreases with each Newton iteration and the discretization error reduces with mesh refinement. In this example, the solution from the coarse mesh is transferred to successively finer meshes and used as an initial guess. Therefore, the nonlinear error is always brought below the tolerance of Newton's iteration and the discretization error is reduced with each mesh refinement.

Inside the loop, we involve three solvers: one for \(\tilde{A}^{-1}\), one for \(M_p^{-1}\) and one for \(Gx=b\). The first two solvers are invoked in the preconditioner and the outer solver gives us the update term. Overall convergence is controlled by the nonlinear residual and Newton's method does not have to require an exact Jacobian, so for the outer linear solver we employ FGMRES with a relative tolerance of only 1e-4. In fact, we use the truncated Newton solve for this system. As described in step-22, the inner linear solves are also not required to be done very accurately. Here we use CG with a relative tolerance of 1e-6 for the pressure mass matrix. As expected, we still see convergence of the nonlinear residual down to 1e-14. Also, we use a simple line search algorithm for globalization of the Newton method.

The cavity reference values for Re=400 and Re=7500 are from [4] and [5], respectively, where "Re" represents the Reynold number and can be located at [8]. Here the viscosity is defined by 1/Re. Even though we can still find a solution for Re=10000 and the references contain results for comparison, we limit our discussion here to Re=7500. This is because the solution is no longer stationary starting around Re=8000 but instead becomes periodic, see [7] for details.

Reference

An Augmented Lagrangian-Based Approach to the Oseen Problem, M. Benzi and M. Olshanskii, SIAM J. SCI. COMPUT. 2006
Efficient augmented Lagrangian-type preconditioning for the Oseen problem using Grad-Div stabilization, Timo Heister and Gerd Rapin
http://www.cfd-online.com/Wiki/Lid-driven_cavity_problem
High-Re solution for incompressible flow using the Navier-Stokes Equations and a Multigrid Method, U. Ghia, K. N. Ghia, and C. T. Shin
Numerical solutions of 2-D steady incompressible driven cavity flow at high Reynolds numbers, E. Erturk, T.C. Corke and C. Gokcol
Implicit Weighted ENO Schemes for the Three-Dimensional Incompressible Navier-Stokes Equations, Yang et al, 1998
The 2D lid-driven cavity problem revisited, C. Bruneau and M. Saad, 2006
https://en.wikipedia.org/wiki/Reynolds_number

The commented program

Include files

As usual, we start by including some well-known files:

#include <deal.II/base/quadrature_lib.h>
#include <deal.II/base/function.h>
#include <deal.II/base/logstream.h>
#include <deal.II/base/utilities.h>
#include <deal.II/base/tensor.h>
#include <deal.II/lac/block_vector.h>
#include <deal.II/lac/full_matrix.h>
#include <deal.II/lac/block_sparse_matrix.h>
#include <deal.II/lac/dynamic_sparsity_pattern.h>
#include <deal.II/lac/solver_cg.h>
#include <deal.II/lac/solver_gmres.h>
#include <deal.II/lac/precondition.h>
#include <deal.II/lac/affine_constraints.h>
#include <deal.II/lac/sparse_direct.h>
#include <deal.II/grid/tria.h>
#include <deal.II/grid/grid_generator.h>
#include <deal.II/grid/tria_accessor.h>
#include <deal.II/grid/tria_iterator.h>
#include <deal.II/grid/manifold_lib.h>
#include <deal.II/grid/grid_refinement.h>
#include <deal.II/grid/grid_tools.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/dofs/dof_renumbering.h>
#include <deal.II/dofs/dof_accessor.h>
#include <deal.II/dofs/dof_tools.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/fe/fe_system.h>
#include <deal.II/fe/fe_values.h>
#include <deal.II/numerics/vector_tools.h>
#include <deal.II/numerics/matrix_tools.h>
#include <deal.II/numerics/data_out.h>
#include <deal.II/numerics/error_estimator.h>

To transfer solutions between meshes, this file is included:

#include <deal.II/numerics/solution_transfer.h>

This file includes UMFPACK: the direct solver:

#include <deal.II/lac/sparse_direct.h>

And the one for ILU preconditioner:

#include <deal.II/lac/sparse_ilu.h>
#include <fstream>
#include <iostream>
namespace Step57
{
  using namespace dealii;

The `NavierStokesProblem` class template

This is the main function and its member functions. As explained in the introduction, what we obtain at each step is the Newton's update, so we define two variables: the present solution and the update. Additionally, the evaluation point is for temporarily holding Newton update in line search. A sparse matrix for the pressure mass matrix is created for the operator of a block Schur complement preconditioner. We use one AffineConstraints for Dirichlet boundary conditions at the initial step and a zero AffineConstraints for the Newton is defined by 1/Re which has been discussed in the introduction.

template <int dim>
class StationaryNavierStokes
{
public:
  StationaryNavierStokes(const unsigned int degree);
  void run(const unsigned int refinement);
private:
  void setup_dofs();
  void initialize_system();
  void assemble(const bool initial_step, const bool assemble_matrix);
  void assemble_system(const bool initial_step);
  void assemble_rhs(const bool initial_step);
  void solve(const bool initial_step);
  void refine_mesh();
  void process_solution(unsigned int refinement);
  void output_results(const unsigned int refinement_cycle) const;
  void newton_iteration(const double       tolerance,
                        const unsigned int max_iteration,
                        const unsigned int n_refinements,
                        const bool         is_initial_step,
                        const bool         output_result);
  void compute_initial_guess(double step_size);
  double                               viscosity;
  double                               gamma;
  const unsigned int                   degree;
  std::vector<types::global_dof_index> dofs_per_block;
  Triangulation<dim> triangulation;
  FESystem<dim>      fe;
  DoFHandler<dim>    dof_handler;
  AffineConstraints<double> zero_constraints;
  AffineConstraints<double> nonzero_constraints;
  BlockSparsityPattern      sparsity_pattern;
  BlockSparseMatrix<double> system_matrix;
  SparseMatrix<double>      pressure_mass_matrix;
  BlockVector<double> present_solution;
  BlockVector<double> newton_update;
  BlockVector<double> system_rhs;
  BlockVector<double> evaluation_point;
};

Boundary values and right hand side

In this problem we set the velocity along the upper surface of the cavity to be one and zero on the other three walls. The right hand side function is zero so we do not need to set the right hand side function in this tutorial. The number of components of the boundary function is dim+1. In practice, the boundary values are applied to our solution through AffineConstraints which is obtained by using VectorTools::interpolate_boundary_values. The components of boundary value functions are required to be chosen according to the finite element space. Therefore we have to define the boundary value of pressure even though we actually do not need it.

The following function represents the boundary values:

template <int dim>
class BoundaryValues : public Function<dim>
{
public:
  BoundaryValues()
    : Function<dim>(dim + 1)
  {}
  virtual double value(const Point<dim> & p,
                       const unsigned int component) const override;
  virtual void vector_value(const Point<dim> &p,
                            Vector<double> &  values) const override;
};
template <int dim>
double BoundaryValues<dim>::value(const Point<dim> & p,
                                  const unsigned int component) const
{
  Assert(component < this->n_components,
         ExcIndexRange(component, 0, this->n_components));
  if (component == 0 && std::abs(p[dim - 1] - 1.0) < 1e-10)
    return 1.0;
  return 0;
}
template <int dim>
void BoundaryValues<dim>::vector_value(const Point<dim> &p,
                                       Vector<double> &  values) const
{
  for (unsigned int c = 0; c < this->n_components; ++c)
    values(c) = BoundaryValues<dim>::value(p, c);
}

BlockSchurPreconditioner for Navier Stokes equations

The block Schur complement preconditioner is defined in this part. As discussed in the introduction, the preconditioner in Krylov iterative methods is implemented as a matrix-vector product operator. In practice, the Schur complement preconditioner is decomposed as a product of three matrices (as presented in the first section). The \(\tilde{A}^{-1}\) in the first factor involves a solve for the linear system \(\tilde{A}x=b\). Here we solve this system via a direct solver for simplicity. The computation involved in the second factor is a simple matrix-vector multiplication. The Schur complement \(\tilde{S}\) can be well approximated by the pressure mass matrix and its inverse can be obtained through an inexact solver. Because the pressure mass matrix is symmetric and positive definite, we can use CG to solve the corresponding linear system.

template <class PreconditionerMp>
class BlockSchurPreconditioner : public Subscriptor
{
public:
  BlockSchurPreconditioner(double                           gamma,
                           double                           viscosity,
                           const BlockSparseMatrix<double> &S,
                           const SparseMatrix<double> &     P,
                           const PreconditionerMp &         Mppreconditioner);
  void vmult(BlockVector<double> &dst, const BlockVector<double> &src) const;
private:
  const double                     gamma;
  const double                     viscosity;
  const BlockSparseMatrix<double> &stokes_matrix;
  const SparseMatrix<double> &     pressure_mass_matrix;
  const PreconditionerMp &         mp_preconditioner;
  SparseDirectUMFPACK              A_inverse;
};

We can notice that the initialization of the inverse of the matrix at (0,0) corner is completed in the constructor. If so, every application of the preconditioner then no longer requires the computation of the matrix factors.

template <class PreconditionerMp>
BlockSchurPreconditioner<PreconditionerMp>::BlockSchurPreconditioner(
  double                           gamma,
  double                           viscosity,
  const BlockSparseMatrix<double> &S,
  const SparseMatrix<double> &     P,
  const PreconditionerMp &         Mppreconditioner)
  : gamma(gamma)
  , viscosity(viscosity)
  , stokes_matrix(S)
  , pressure_mass_matrix(P)
  , mp_preconditioner(Mppreconditioner)
{
  A_inverse.initialize(stokes_matrix.block(0, 0));
}
template <class PreconditionerMp>
void BlockSchurPreconditioner<PreconditionerMp>::vmult(
  BlockVector<double> &      dst,
  const BlockVector<double> &src) const
{
  Vector<double> utmp(src.block(0));
  {
    SolverControl solver_control(1000, 1e-6 * src.block(1).l2_norm());
    SolverCG<>    cg(solver_control);
    dst.block(1) = 0.0;
    cg.solve(pressure_mass_matrix,
             dst.block(1),
             src.block(1),
             mp_preconditioner);
    dst.block(1) *= -(viscosity + gamma);
  }
  {
    stokes_matrix.block(0, 1).vmult(utmp, dst.block(1));
    utmp *= -1.0;
    utmp += src.block(0);
  }
  A_inverse.vmult(dst.block(0), utmp);
}

StationaryNavierStokes class implementation

StationaryNavierStokes::StationaryNavierStokes

The constructor of this class looks very similar to the one in step-22. The only difference is the viscosity and the Augmented Lagrangian coefficient gamma.

template <int dim>
StationaryNavierStokes<dim>::StationaryNavierStokes(const unsigned int degree)
  : viscosity(1.0 / 7500.0)
  , gamma(1.0)
  , degree(degree)
  , triangulation(Triangulation<dim>::maximum_smoothing)
  , fe(FE_Q<dim>(degree + 1), dim, FE_Q<dim>(degree), 1)
  , dof_handler(triangulation)
{}

StationaryNavierStokes::setup_dofs

This function initializes the DoFHandler enumerating the degrees of freedom and constraints on the current mesh.

template <int dim>
void StationaryNavierStokes<dim>::setup_dofs()
{
  system_matrix.clear();
  pressure_mass_matrix.clear();

The first step is to associate DoFs with a given mesh.

dof_handler.distribute_dofs(fe);

We renumber the components to have all velocity DoFs come before the pressure DoFs to be able to split the solution vector in two blocks which are separately accessed in the block preconditioner.

std::vector<unsigned int> block_component(dim + 1, 0);
block_component[dim] = 1;
DoFRenumbering::component_wise(dof_handler, block_component);
dofs_per_block.resize(2);
DoFTools::count_dofs_per_block(dof_handler,
                               dofs_per_block,
                               block_component);
unsigned int dof_u = dofs_per_block[0];
unsigned int dof_p = dofs_per_block[1];

In Newton's scheme, we first apply the boundary condition on the solution obtained from the initial step. To make sure the boundary conditions remain satisfied during Newton's iteration, zero boundary conditions are used for the update \(\delta u^k\). Therefore we set up two different constraint objects.

  FEValuesExtractors::Vector velocities(0);
  {
    nonzero_constraints.clear();
    DoFTools::make_hanging_node_constraints(dof_handler, nonzero_constraints);
    VectorTools::interpolate_boundary_values(dof_handler,
                                             0,
                                             BoundaryValues<dim>(),
                                             nonzero_constraints,
                                             fe.component_mask(velocities));
  }
  nonzero_constraints.close();
  {
    zero_constraints.clear();
    DoFTools::make_hanging_node_constraints(dof_handler, zero_constraints);
    VectorTools::interpolate_boundary_values(dof_handler,
                                             0,
                                             Functions::ZeroFunction<dim>(
                                               dim + 1),
                                             zero_constraints,
                                             fe.component_mask(velocities));
  }
  zero_constraints.close();
  std::cout << "   Number of active cells: " << triangulation.n_active_cells()
            << std::endl
            << "   Number of degrees of freedom: " << dof_handler.n_dofs()
            << " (" << dof_u << '+' << dof_p << ')' << std::endl;
}

StationaryNavierStokes::initialize_system

On each mesh the sparsity pattern and the size of the linear system are different. This function initializes them after mesh refinement.

template <int dim>
void StationaryNavierStokes<dim>::initialize_system()
{
  {
    BlockDynamicSparsityPattern dsp(dofs_per_block, dofs_per_block);
    DoFTools::make_sparsity_pattern(dof_handler, dsp, nonzero_constraints);
    sparsity_pattern.copy_from(dsp);
  }
  system_matrix.reinit(sparsity_pattern);
  present_solution.reinit(dofs_per_block);
  newton_update.reinit(dofs_per_block);
  system_rhs.reinit(dofs_per_block);
}

StationaryNavierStokes::assemble

This function builds the system matrix and right hand side that we currently work on. The initial_step argument is used to determine which set of constraints we apply (nonzero for the initial step and zero for the others). The assemble_matrix flag determines whether to assemble the whole system or only the right hand side vector, respectively.

template <int dim>
void StationaryNavierStokes<dim>::assemble(const bool initial_step,
                                           const bool assemble_matrix)
{
  if (assemble_matrix)
    system_matrix = 0;
  system_rhs = 0;
  QGauss<dim> quadrature_formula(degree + 2);
  FEValues<dim> fe_values(fe,
                          quadrature_formula,
                          update_values | update_quadrature_points |
                            update_JxW_values | update_gradients);
  const unsigned int dofs_per_cell = fe.dofs_per_cell;
  const unsigned int n_q_points    = quadrature_formula.size();
  const FEValuesExtractors::Vector velocities(0);
  const FEValuesExtractors::Scalar pressure(dim);
  FullMatrix<double> local_matrix(dofs_per_cell, dofs_per_cell);
  Vector<double>     local_rhs(dofs_per_cell);
  std::vector<types::global_dof_index> local_dof_indices(dofs_per_cell);

For the linearized system, we create temporary storage for present velocity and gradient, and present pressure. In practice, they are all obtained through their shape functions at quadrature points.

std::vector<Tensor<1, dim>> present_velocity_values(n_q_points);
std::vector<Tensor<2, dim>> present_velocity_gradients(n_q_points);
std::vector<double>         present_pressure_values(n_q_points);
std::vector<double>         div_phi_u(dofs_per_cell);
std::vector<Tensor<1, dim>> phi_u(dofs_per_cell);
std::vector<Tensor<2, dim>> grad_phi_u(dofs_per_cell);
std::vector<double>         phi_p(dofs_per_cell);
for (const auto &cell : dof_handler.active_cell_iterators())
  {
    fe_values.reinit(cell);
    local_matrix = 0;
    local_rhs    = 0;
    fe_values[velocities].get_function_values(evaluation_point,
                                              present_velocity_values);
    fe_values[velocities].get_function_gradients(
      evaluation_point, present_velocity_gradients);
    fe_values[pressure].get_function_values(evaluation_point,
                                            present_pressure_values);

The assembly is similar to step-22. An additional term with gamma as a coefficient is the Augmented Lagrangian (AL), which is assembled via grad-div stabilization. As we discussed in the introduction, the bottom right block of the system matrix should be zero. Since the pressure mass matrix is used while creating the preconditioner, we assemble it here and then move it into a separate SparseMatrix at the end (same as in step-22).

    for (unsigned int q = 0; q < n_q_points; ++q)
      {
        for (unsigned int k = 0; k < dofs_per_cell; ++k)
          {
            div_phi_u[k]  = fe_values[velocities].divergence(k, q);
            grad_phi_u[k] = fe_values[velocities].gradient(k, q);
            phi_u[k]      = fe_values[velocities].value(k, q);
            phi_p[k]      = fe_values[pressure].value(k, q);
          }
        for (unsigned int i = 0; i < dofs_per_cell; ++i)
          {
            if (assemble_matrix)
              {
                for (unsigned int j = 0; j < dofs_per_cell; ++j)
                  {
                    local_matrix(i, j) +=
                      (viscosity *
                         scalar_product(grad_phi_u[j], grad_phi_u[i]) +
                       present_velocity_gradients[q] * phi_u[j] * phi_u[i] +
                       grad_phi_u[j] * present_velocity_values[q] *
                         phi_u[i] -
                       div_phi_u[i] * phi_p[j] - phi_p[i] * div_phi_u[j] +
                       gamma * div_phi_u[j] * div_phi_u[i] +
                       phi_p[i] * phi_p[j]) *
                      fe_values.JxW(q);
                  }
              }
            double present_velocity_divergence =
              trace(present_velocity_gradients[q]);
            local_rhs(i) +=
              (-viscosity * scalar_product(present_velocity_gradients[q],
                                           grad_phi_u[i]) -
               present_velocity_gradients[q] * present_velocity_values[q] *
                 phi_u[i] +
               present_pressure_values[q] * div_phi_u[i] +
               present_velocity_divergence * phi_p[i] -
               gamma * present_velocity_divergence * div_phi_u[i]) *
              fe_values.JxW(q);
          }
      }
    cell->get_dof_indices(local_dof_indices);
    const AffineConstraints<double> &constraints_used =
      initial_step ? nonzero_constraints : zero_constraints;
    if (assemble_matrix)
      {
        constraints_used.distribute_local_to_global(local_matrix,
                                                    local_rhs,
                                                    local_dof_indices,
                                                    system_matrix,
                                                    system_rhs);
      }
    else
      {
        constraints_used.distribute_local_to_global(local_rhs,
                                                    local_dof_indices,
                                                    system_rhs);
      }
  }
if (assemble_matrix)
  {

Finally we move pressure mass matrix into a separate matrix:

pressure_mass_matrix.reinit(sparsity_pattern.block(1, 1));

pressure_mass_matrix.copy_from(system_matrix.block(1, 1));

Note that settings this pressure block to zero is not identical to not assembling anything in this block, because this operation here will (incorrectly) delete diagonal entries that come in from hanging node constraints for pressure DoFs. This means that our whole system matrix will have rows that are completely zero. Luckily, FGMRES handles these rows without any problem.

      system_matrix.block(1, 1) = 0;
    }
}
template <int dim>
void StationaryNavierStokes<dim>::assemble_system(const bool initial_step)
{
  assemble(initial_step, true);
}
template <int dim>
void StationaryNavierStokes<dim>::assemble_rhs(const bool initial_step)
{
  assemble(initial_step, false);
}

StationaryNavierStokes::solve

In this function, we use FGMRES together with the block preconditioner, which is defined at the beginning of the program, to solve the linear system. What we obtain at this step is the solution vector. If this is the initial step, the solution vector gives us an initial guess for the Navier Stokes equations. For the initial step, nonzero constraints are applied in order to make sure boundary conditions are satisfied. In the following steps, we will solve for the Newton update so zero constraints are used.

template <int dim>
void StationaryNavierStokes<dim>::solve(const bool initial_step)
{
  const AffineConstraints<double> &constraints_used =
    initial_step ? nonzero_constraints : zero_constraints;
  SolverControl                     solver_control(system_matrix.m(),
                               1e-4 * system_rhs.l2_norm(),
                               true);
  SolverFGMRES<BlockVector<double>> gmres(solver_control);
  SparseILU<double> pmass_preconditioner;
  pmass_preconditioner.initialize(pressure_mass_matrix,
                                  SparseILU<double>::AdditionalData());
  const BlockSchurPreconditioner<SparseILU<double>> preconditioner(
    gamma,
    viscosity,
    system_matrix,
    pressure_mass_matrix,
    pmass_preconditioner);
  gmres.solve(system_matrix, newton_update, system_rhs, preconditioner);
  std::cout << " ****FGMRES steps: " << solver_control.last_step()
            << std::endl;
  constraints_used.distribute(newton_update);
}

StationaryNavierStokes::refine_mesh

After finding a good initial guess on the coarse mesh, we hope to decrease the error through refining the mesh. Here we do adaptive refinement similar to step-15 except that we use the Kelly estimator on the velocity only. We also need to transfer the current solution to the next mesh using the SolutionTransfer class.

template <int dim>
void StationaryNavierStokes<dim>::refine_mesh()
{
  Vector<float> estimated_error_per_cell(triangulation.n_active_cells());
  FEValuesExtractors::Vector velocity(0);
  KellyErrorEstimator<dim>::estimate(
    dof_handler,
    QGauss<dim - 1>(degree + 1),
    std::map<types::boundary_id, const Function<dim> *>(),
    present_solution,
    estimated_error_per_cell,
    fe.component_mask(velocity));
  GridRefinement::refine_and_coarsen_fixed_number(triangulation,
                                                  estimated_error_per_cell,
                                                  0.3,
                                                  0.0);
  triangulation.prepare_coarsening_and_refinement();
  SolutionTransfer<dim, BlockVector<double>> solution_transfer(dof_handler);
  solution_transfer.prepare_for_coarsening_and_refinement(present_solution);
  triangulation.execute_coarsening_and_refinement();

First the DoFHandler is set up and constraints are generated. Then we create a temporary vector "tmp", whose size is according with the solution on the new mesh.

setup_dofs();
BlockVector<double> tmp(dofs_per_block);

Transfer solution from coarse to fine mesh and apply boundary value constraints to the new transferred solution. Note that present_solution is still a vector corresponding to the old mesh.

solution_transfer.interpolate(present_solution, tmp);

nonzero_constraints.distribute(tmp);

Finally set up matrix and vectors and set the present_solution to the interpolated data.

  initialize_system();
  present_solution = tmp;
}

StationaryNavierStokes<dim>::newton_iteration

This function implements the Newton iteration with given tolerance, maximum number of iterations, and the number of mesh refinements to do. "is_initial_step" is the flag to tell us whether "setup_system" is necessary, and which part, system matrix or right hand side vector, should be assembled. If we do a line search, the right hand side is already assembled while checking the residual norm in the last iteration. Therefore, we just need to assemble the system matrix at the current iteration. The last argument "output_result" is whether output should be produced.

template <int dim>
void StationaryNavierStokes<dim>::newton_iteration(
  const double       tolerance,
  const unsigned int max_iteration,
  const unsigned int max_refinement,
  const bool         is_initial_step,
  const bool         output_result)
{
  double current_res;
  double last_res;
  bool   first_step = is_initial_step;
  for (unsigned int refinement = 0; refinement < max_refinement + 1;
       ++refinement)
    {
      unsigned int outer_iteration = 0;
      last_res                     = 1.0;
      current_res                  = 1.0;
      std::cout << "*****************************************" << std::endl;
      std::cout << "************  refinement = " << refinement
                << " ************ " << std::endl;
      std::cout << "viscosity= " << viscosity << std::endl;
      std::cout << "*****************************************" << std::endl;
      while ((first_step || (current_res > tolerance)) &&
             outer_iteration < max_iteration)
        {
          if (first_step)
            {
              setup_dofs();
              initialize_system();
              evaluation_point = present_solution;
              assemble_system(first_step);
              solve(first_step);
              present_solution = newton_update;
              nonzero_constraints.distribute(present_solution);
              first_step       = false;
              evaluation_point = present_solution;
              assemble_rhs(first_step);
              current_res = system_rhs.l2_norm();
              std::cout << "******************************" << std::endl;
              std::cout << " The residual of initial guess is " << current_res
                        << std::endl;
              std::cout << " Initialization complete!  " << std::endl;
              last_res = current_res;
            }
          else
            {
              evaluation_point = present_solution;
              assemble_system(first_step);
              solve(first_step);

To make sure our solution is getting close to the exact solution, we let the solution be updated with a weight alpha such that the new residual is smaller than the one of last step, which is done in the following loop. Also the line search method can be located in step-15.

              for (double alpha = 1.0; alpha > 1e-5; alpha *= 0.5)
                {
                  evaluation_point = present_solution;
                  evaluation_point.add(alpha, newton_update);
                  nonzero_constraints.distribute(evaluation_point);
                  assemble_rhs(first_step);
                  current_res = system_rhs.l2_norm();
                  std::cout << " alpha = " << std::setw(6) << alpha
                            << std::setw(0) << " res = " << current_res
                            << std::endl;
                  if (current_res < last_res)
                    break;
                }
              {
                present_solution = evaluation_point;
                std::cout << " ---- Iteration " << outer_iteration
                          << " residual: " << current_res << std::endl;
                last_res = current_res;
              }
            }
          ++outer_iteration;
          if (output_result)
            {
              output_results(max_iteration * refinement + outer_iteration);
              if (current_res <= tolerance)
                process_solution(refinement);
            }
        }
      if (refinement < max_refinement)
        {
          refine_mesh();
          std::cout << "*****************************************"
                    << std::endl
                    << "        Do refinement ------   " << std::endl;
        }
    }
}

StationaryNavierStokes::compute_initial_guess

This function will provide us with an initial guess by using a continuation method as we discussed in the introduction. The Reynolds number is increased step-by-step until we reach the target value. By experiment, the solution to Stokes is good enough to be the initial guess of NSE with Reynolds number 1000 so we start there. To make sure the solution from previous problem is close enough to the next one, the step size must be small enough.

template <int dim>
void StationaryNavierStokes<dim>::compute_initial_guess(double step_size)
{
  const double target_Re = 1.0 / viscosity;
  bool is_initial_step = true;
  for (double Re = 1000.0; Re < target_Re;
       Re        = std::min(Re + step_size, target_Re))
    {
      viscosity = 1.0 / Re;
      std::cout << "*****************************************" << std::endl;
      std::cout << " Searching for initial guess with Re = " << Re
                << std::endl;
      std::cout << "*****************************************" << std::endl;
      newton_iteration(1e-12, 50, 0, is_initial_step, false);
      is_initial_step = false;
    }
}

StationaryNavierStokes::output_results

This function is the same as in step-22 except that we choose a name for the output file that also contains the Reynolds number (i.e., the inverse of the viscosity in the current context).

template <int dim>
void StationaryNavierStokes<dim>::output_results(
  const unsigned int output_index) const
{
  std::vector<std::string> solution_names(dim, "velocity");
  solution_names.emplace_back("pressure");
  std::vector<DataComponentInterpretation::DataComponentInterpretation>
    data_component_interpretation(
      dim, DataComponentInterpretation::component_is_part_of_vector);
  data_component_interpretation.push_back(
    DataComponentInterpretation::component_is_scalar);
  DataOut<dim> data_out;
  data_out.attach_dof_handler(dof_handler);
  data_out.add_data_vector(present_solution,
                           solution_names,
                           DataOut<dim>::type_dof_data,
                           data_component_interpretation);
  data_out.build_patches();
  std::ofstream output(std::to_string(1.0 / viscosity) + "-solution-" +
                       Utilities::int_to_string(output_index, 4) + ".vtk");
  data_out.write_vtk(output);
}

StationaryNavierStokes::process_solution

In our test case, we do not know the analytical solution. This function outputs the velocity components along x=0.5 and y from 0 to 1 so they can be compared with data from the literature.

template <int dim>
void StationaryNavierStokes<dim>::process_solution(unsigned int refinement)
{
  std::ofstream f(std::to_string(1.0 / viscosity) + "-line-" +
                  std::to_string(refinement) + ".txt");
  f << "# y u_x u_y" << std::endl;
  Point<dim> p;
  p(0) = 0.5;
  p(1) = 0.5;
  f << std::scientific;
  for (unsigned int i = 0; i <= 100; ++i)
    {
      p(dim - 1) = i / 100.0;
      Vector<double> tmp_vector(dim + 1);
      VectorTools::point_value(dof_handler, present_solution, p, tmp_vector);
      f << p(dim - 1);
      for (int j = 0; j < dim; j++)
        f << " " << tmp_vector(j);
      f << std::endl;
    }
}

StationaryNavierStokes::run

This is the last step of this program. In this part, we generate the grid and run the other functions respectively. The max refinement can be set by the argument.

template <int dim>
void StationaryNavierStokes<dim>::run(const unsigned int refinement)
{
  GridGenerator::hyper_cube(triangulation);
  triangulation.refine_global(5);
  const double Re = 1.0 / viscosity;

If the viscosity is smaller than 1/1000, we have to first search for an initial guess via a continuation method. What we should notice is the search is always on the initial mesh, that is the \(8 \times 8\) mesh in this program. After that, we just do the same as we did when viscosity is larger than 1/1000: run Newton's iteration, refine the mesh, transfer solutions, and repeat.

if (Re > 1000.0)
  {
    std::cout << "       Searching for initial guess ... " << std::endl;
    const double step_size = 2000.0;
    compute_initial_guess(step_size);
    std::cout << "*****************************************" << std::endl
              << "       Initial guess obtained            " << std::endl
              << "                  *                      " << std::endl
              << "                  *                      " << std::endl
              << "                  *                      " << std::endl
              << "                  *                      " << std::endl
              << "*****************************************" << std::endl;
    std::cout << "       Computing solution with target Re = " << Re
              << std::endl;
    viscosity = 1.0 / Re;
    newton_iteration(1e-12, 50, refinement, false, true);
  }
else
  {

When the viscosity is larger than 1/1000, the solution to Stokes equations is good enough as an initial guess. If so, we do not need to search for the initial guess using a continuation method. Newton's iteration can be started directly.

        newton_iteration(1e-12, 50, refinement, true, true);
      }
  }
} // namespace Step57
int main()
{
  try
    {
      using namespace dealii;
      using namespace Step57;
      StationaryNavierStokes<2> flow(/ * degree = * / 1);
      flow.run(4);
    }
  catch (std::exception &exc)
    {
      std::cerr << std::endl
                << std::endl
                << "----------------------------------------------------"
                << std::endl;
      std::cerr << "Exception on processing: " << std::endl
                << exc.what() << std::endl
                << "Aborting!" << std::endl
                << "----------------------------------------------------"
                << std::endl;
      return 1;
    }
  catch (...)
    {
      std::cerr << std::endl
                << std::endl
                << "----------------------------------------------------"
                << std::endl;
      std::cerr << "Unknown exception!" << std::endl
                << "Aborting!" << std::endl
                << "----------------------------------------------------"
                << std::endl;
      return 1;
    }
  return 0;
}

Results

Now we use the method we discussed above to solve Navier Stokes equations with viscosity 1/400 and 1/7500.

Test case 1: Re=400

In the first test case the viscosity is set to be 1/400. As we discussed in the introduction, the initial guess is the solution to the corresponding Stokes problem. In the following table, the residuals at each Newton's iteration on every mesh is shown. The data in the table shows that Newton's iteration converges quadratically.

Re=400	Mesh0		Mesh1		Mesh2		Mesh3		Mesh4
Newton iter	Residual	FGMRES	Residual	FGMRES	Residual	FGMRES	Residual	FGMRES	Residual	FGMRES
1	3.7112e-03	5	6.4189e-03	3	2.4338e-03	3	1.0570e-03	3	4.9499e-04	3
2	7.0849e-04	5.0000e+00	9.9458e-04	5	1.1409e-04	6	1.3544e-05	6	1.4171e-06	6
3	1.9980e-05	5.0000e+00	4.5007e-05	5	2.9020e-08	5	4.4021e-10	6	6.3435e-11	6
4	2.3165e-09	6.0000e+00	1.6891e-07	5	1.2338e-14	7	1.8506e-14	8	8.8563e-15	8
5	1.2585e-13	7.0000e+00	1.4520e-11	6	1.9044e-13	8
6			1.3998e-15	8

The following figures show the sequence of generated grids. For the case of Re=400, the initial guess is obtained by solving Stokes on an \(8 \times 8\) mesh, and the mesh is refined adaptively. Between meshes, the solution from the coarse mesh is interpolated to the fine mesh to be used as an initial guess.

This picture is the graphical streamline result of lid-driven cavity with Re=400.

Then the solution is compared with a reference solution from [4] and the reference solution data can be found in the file "ref_2d_ghia_u.txt".

Test case 2: Re=7500

Newton's iteration requires a good initial guess. However, the nonlinear term dominates when the Reynold number is large, so that the solution to the Stokes equations may be far away from the exact solution. If the Stokes solution acts as the initial guess, the convergence will be lost. The following picture shows that the nonlinear iteration gets stuck and the residual no longer decreases in further iterations.

The initial guess, therefore, has to be obtained via a continuation method which has been discussed in the introduction. Here the step size in the continuation method, that is \(|\nu_{i}-\nu_{i+1}|\), is 2000 and the initial mesh is of size \(32 \times 32\). After obtaining an initial guess, the mesh is refined as in the previous test case. The following picture shows that at each refinement Newton's iteration has quadratic convergence. 52 steps of Newton's iterations are executed for solving this test case.

We also show the residual from each step of Newton's iteration on every mesh. The quadratic convergence is clearly visible in the table.

Re=7500	Mesh0		Mesh1		Mesh2		Mesh3		Mesh4
Newton iter	Residual	FGMRES	Residual	FGMRES	Residual	FGMRES	Residual	FGMRES	Residual	FGMRES
1	1.8922e-06	6	4.2506e-03	3	1.4299e-03	3	4.8793e-04	2	1.8998e-04	2
2	3.1644e-09	8	1.3732e-03	7	4.1506e-04	7	9.1119e-05	8	1.3555e-05	8
3	1.7611e-14	9	2.1946e-04	6	1.7881e-05	6	5.2678e-07	7	9.3739e-09	7
4			8.8269e-06	6	6.8210e-09	7	2.2770e-11	8	1.2588e-13	9
5			1.2974e-07	7	1.2515e-13	9	1.7801e-14	1
6			4.4352e-11	7
7			6.2863e-15	9

The sequence of generated grids looks like this:

We compare our solution with reference solution from [5].

The following picture presents the graphical result.

Furthermore, the error consists of the nonlinear error, which is decreasing as Newton's iteration going on, and the discretization error, which depends on the mesh size. That is why we have to refine the mesh and repeat Newton's iteration on the next finer mesh. From the table above, we can see that the residual (nonlinear error) is below 1e-12 on each mesh, but the following picture shows us the difference between solutions on subsequently finer meshes.

Possibilities for extensions

Compare to other solvers

It is easy to compare the currently implemented linear solver to just using UMFPACK for the whole linear system. You need to remove the nullspace containing the constant pressures and it is done in step-56. More interesting is the comparison to other state of the art preconditioners like PCD. It turns out that the preconditioner here is very competitive, as can be seen in the paper [2].

The following table shows the timing results between our iterative approach (FGMRES) compared to a direct solver (UMFPACK) for the whole system with viscosity set to 1/400. Even though we use the same direct solver for the velocity block in the iterative solver, it is considerably faster and consumes less memory. This will be even more pronounced in 3d.

Ref	DoFs	Iterative: Total/s (Setup/s)	Direct: Total/s (Setup/s)
5	9539	0.10 (0.06)	0.13 (0.12)
6	37507	0.58 (0.37)	1.03 (0.97)
7	148739	3.59 (2.73)	7.78 (7.53)
8	592387	29.17 (24.94)	(>4GB RAM)

3D computations

The code is set up to also run in 3d. Of course the reference values are different, see [6] for example. High resolution computations are not doable with this example as is, because a direct solver for the velocity block does not work well in 3d. Rather, a parallel solver based on algebraic or geometric multigrid is needed. See below.

Parallelization

For larger computations, especially in 3D, it is necessary to implement MPI parallel solvers and preconditioners. A good starting point would be step-55, which uses algebraic multigrid for the velocity block for the Stokes equations.

The plain program

/* ---------------------------------------------------------------------
 *
 * Copyright (C) 2008 - 2018 by the deal.II authors
 *
 * This file is part of the deal.II library.
 *
 * The deal.II library is free software; you can use it, redistribute
 * it, and/or modify it under the terms of the GNU Lesser General
 * Public License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 * The full text of the license can be found in the file LICENSE.md at
 * the top level directory of deal.II.
 *
 * ---------------------------------------------------------------------
 *
 * Author: Liang Zhao and Timo Heister, Clemson University, 2016
 */
#include <deal.II/base/quadrature_lib.h>
#include <deal.II/base/function.h>
#include <deal.II/base/logstream.h>
#include <deal.II/base/utilities.h>
#include <deal.II/base/tensor.h>
#include <deal.II/lac/block_vector.h>
#include <deal.II/lac/full_matrix.h>
#include <deal.II/lac/block_sparse_matrix.h>
#include <deal.II/lac/dynamic_sparsity_pattern.h>
#include <deal.II/lac/solver_cg.h>
#include <deal.II/lac/solver_gmres.h>
#include <deal.II/lac/precondition.h>
#include <deal.II/lac/affine_constraints.h>
#include <deal.II/lac/sparse_direct.h>
#include <deal.II/grid/tria.h>
#include <deal.II/grid/grid_generator.h>
#include <deal.II/grid/tria_accessor.h>
#include <deal.II/grid/tria_iterator.h>
#include <deal.II/grid/manifold_lib.h>
#include <deal.II/grid/grid_refinement.h>
#include <deal.II/grid/grid_tools.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/dofs/dof_renumbering.h>
#include <deal.II/dofs/dof_accessor.h>
#include <deal.II/dofs/dof_tools.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/fe/fe_system.h>
#include <deal.II/fe/fe_values.h>
#include <deal.II/numerics/vector_tools.h>
#include <deal.II/numerics/matrix_tools.h>
#include <deal.II/numerics/data_out.h>
#include <deal.II/numerics/error_estimator.h>
#include <deal.II/numerics/solution_transfer.h>
#include <deal.II/lac/sparse_direct.h>
#include <deal.II/lac/sparse_ilu.h>
#include <fstream>
#include <iostream>
namespace Step57
{
  using namespace dealii;
  template <int dim>
  class StationaryNavierStokes
  {
  public:
    StationaryNavierStokes(const unsigned int degree);
    void run(const unsigned int refinement);
  private:
    void setup_dofs();
    void initialize_system();
    void assemble(const bool initial_step, const bool assemble_matrix);
    void assemble_system(const bool initial_step);
    void assemble_rhs(const bool initial_step);
    void solve(const bool initial_step);
    void refine_mesh();
    void process_solution(unsigned int refinement);
    void output_results(const unsigned int refinement_cycle) const;
    void newton_iteration(const double       tolerance,
                          const unsigned int max_iteration,
                          const unsigned int n_refinements,
                          const bool         is_initial_step,
                          const bool         output_result);
    void compute_initial_guess(double step_size);
    double                               viscosity;
    double                               gamma;
    const unsigned int                   degree;
    std::vector<types::global_dof_index> dofs_per_block;
    Triangulation<dim> triangulation;
    FESystem<dim>      fe;
    DoFHandler<dim>    dof_handler;
    AffineConstraints<double> zero_constraints;
    AffineConstraints<double> nonzero_constraints;
    BlockSparsityPattern      sparsity_pattern;
    BlockSparseMatrix<double> system_matrix;
    SparseMatrix<double>      pressure_mass_matrix;
    BlockVector<double> present_solution;
    BlockVector<double> newton_update;
    BlockVector<double> system_rhs;
    BlockVector<double> evaluation_point;
  };
  template <int dim>
  class BoundaryValues : public Function<dim>
  {
  public:
    BoundaryValues()
      : Function<dim>(dim + 1)
    {}
    virtual double value(const Point<dim> & p,
                         const unsigned int component) const override;
    virtual void vector_value(const Point<dim> &p,
                              Vector<double> &  values) const override;
  };
  template <int dim>
  double BoundaryValues<dim>::value(const Point<dim> & p,
                                    const unsigned int component) const
  {
    Assert(component < this->n_components,
           ExcIndexRange(component, 0, this->n_components));
    if (component == 0 && std::abs(p[dim - 1] - 1.0) < 1e-10)
      return 1.0;
    return 0;
  }
  template <int dim>
  void BoundaryValues<dim>::vector_value(const Point<dim> &p,
                                         Vector<double> &  values) const
  {
    for (unsigned int c = 0; c < this->n_components; ++c)
      values(c) = BoundaryValues<dim>::value(p, c);
  }
  template <class PreconditionerMp>
  class BlockSchurPreconditioner : public Subscriptor
  {
  public:
    BlockSchurPreconditioner(double                           gamma,
                             double                           viscosity,
                             const BlockSparseMatrix<double> &S,
                             const SparseMatrix<double> &     P,
                             const PreconditionerMp &         Mppreconditioner);
    void vmult(BlockVector<double> &dst, const BlockVector<double> &src) const;
  private:
    const double                     gamma;
    const double                     viscosity;
    const BlockSparseMatrix<double> &stokes_matrix;
    const SparseMatrix<double> &     pressure_mass_matrix;
    const PreconditionerMp &         mp_preconditioner;
    SparseDirectUMFPACK              A_inverse;
  };
  template <class PreconditionerMp>
  BlockSchurPreconditioner<PreconditionerMp>::BlockSchurPreconditioner(
    double                           gamma,
    double                           viscosity,
    const BlockSparseMatrix<double> &S,
    const SparseMatrix<double> &     P,
    const PreconditionerMp &         Mppreconditioner)
    : gamma(gamma)
    , viscosity(viscosity)
    , stokes_matrix(S)
    , pressure_mass_matrix(P)
    , mp_preconditioner(Mppreconditioner)
  {
    A_inverse.initialize(stokes_matrix.block(0, 0));
  }
  template <class PreconditionerMp>
  void BlockSchurPreconditioner<PreconditionerMp>::vmult(
    BlockVector<double> &      dst,
    const BlockVector<double> &src) const
  {
    Vector<double> utmp(src.block(0));
    {
      SolverControl solver_control(1000, 1e-6 * src.block(1).l2_norm());
      SolverCG<>    cg(solver_control);
      dst.block(1) = 0.0;
      cg.solve(pressure_mass_matrix,
               dst.block(1),
               src.block(1),
               mp_preconditioner);
      dst.block(1) *= -(viscosity + gamma);
    }
    {
      stokes_matrix.block(0, 1).vmult(utmp, dst.block(1));
      utmp *= -1.0;
      utmp += src.block(0);
    }
    A_inverse.vmult(dst.block(0), utmp);
  }
  template <int dim>
  StationaryNavierStokes<dim>::StationaryNavierStokes(const unsigned int degree)
    : viscosity(1.0 / 7500.0)
    , gamma(1.0)
    , degree(degree)
    , triangulation(Triangulation<dim>::maximum_smoothing)
    , fe(FE_Q<dim>(degree + 1), dim, FE_Q<dim>(degree), 1)
    , dof_handler(triangulation)
  {}
  template <int dim>
  void StationaryNavierStokes<dim>::setup_dofs()
  {
    system_matrix.clear();
    pressure_mass_matrix.clear();
    dof_handler.distribute_dofs(fe);
    std::vector<unsigned int> block_component(dim + 1, 0);
    block_component[dim] = 1;
    DoFRenumbering::component_wise(dof_handler, block_component);
    dofs_per_block.resize(2);
    DoFTools::count_dofs_per_block(dof_handler,
                                   dofs_per_block,
                                   block_component);
    unsigned int dof_u = dofs_per_block[0];
    unsigned int dof_p = dofs_per_block[1];
    FEValuesExtractors::Vector velocities(0);
    {
      nonzero_constraints.clear();
      DoFTools::make_hanging_node_constraints(dof_handler, nonzero_constraints);
      VectorTools::interpolate_boundary_values(dof_handler,
                                               0,
                                               BoundaryValues<dim>(),
                                               nonzero_constraints,
                                               fe.component_mask(velocities));
    }
    nonzero_constraints.close();
    {
      zero_constraints.clear();
      DoFTools::make_hanging_node_constraints(dof_handler, zero_constraints);
      VectorTools::interpolate_boundary_values(dof_handler,
                                               0,
                                               Functions::ZeroFunction<dim>(
                                                 dim + 1),
                                               zero_constraints,
                                               fe.component_mask(velocities));
    }
    zero_constraints.close();
    std::cout << "   Number of active cells: " << triangulation.n_active_cells()
              << std::endl
              << "   Number of degrees of freedom: " << dof_handler.n_dofs()
              << " (" << dof_u << '+' << dof_p << ')' << std::endl;
  }
  template <int dim>
  void StationaryNavierStokes<dim>::initialize_system()
  {
    {
      BlockDynamicSparsityPattern dsp(dofs_per_block, dofs_per_block);
      DoFTools::make_sparsity_pattern(dof_handler, dsp, nonzero_constraints);
      sparsity_pattern.copy_from(dsp);
    }
    system_matrix.reinit(sparsity_pattern);
    present_solution.reinit(dofs_per_block);
    newton_update.reinit(dofs_per_block);
    system_rhs.reinit(dofs_per_block);
  }
  template <int dim>
  void StationaryNavierStokes<dim>::assemble(const bool initial_step,
                                             const bool assemble_matrix)
  {
    if (assemble_matrix)
      system_matrix = 0;
    system_rhs = 0;
    QGauss<dim> quadrature_formula(degree + 2);
    FEValues<dim> fe_values(fe,
                            quadrature_formula,
                            update_values | update_quadrature_points |
                              update_JxW_values | update_gradients);
    const unsigned int dofs_per_cell = fe.dofs_per_cell;
    const unsigned int n_q_points    = quadrature_formula.size();
    const FEValuesExtractors::Vector velocities(0);
    const FEValuesExtractors::Scalar pressure(dim);
    FullMatrix<double> local_matrix(dofs_per_cell, dofs_per_cell);
    Vector<double>     local_rhs(dofs_per_cell);
    std::vector<types::global_dof_index> local_dof_indices(dofs_per_cell);
    std::vector<Tensor<1, dim>> present_velocity_values(n_q_points);
    std::vector<Tensor<2, dim>> present_velocity_gradients(n_q_points);
    std::vector<double>         present_pressure_values(n_q_points);
    std::vector<double>         div_phi_u(dofs_per_cell);
    std::vector<Tensor<1, dim>> phi_u(dofs_per_cell);
    std::vector<Tensor<2, dim>> grad_phi_u(dofs_per_cell);
    std::vector<double>         phi_p(dofs_per_cell);
    for (const auto &cell : dof_handler.active_cell_iterators())
      {
        fe_values.reinit(cell);
        local_matrix = 0;
        local_rhs    = 0;
        fe_values[velocities].get_function_values(evaluation_point,
                                                  present_velocity_values);
        fe_values[velocities].get_function_gradients(
          evaluation_point, present_velocity_gradients);
        fe_values[pressure].get_function_values(evaluation_point,
                                                present_pressure_values);
        for (unsigned int q = 0; q < n_q_points; ++q)
          {
            for (unsigned int k = 0; k < dofs_per_cell; ++k)
              {
                div_phi_u[k]  = fe_values[velocities].divergence(k, q);
                grad_phi_u[k] = fe_values[velocities].gradient(k, q);
                phi_u[k]      = fe_values[velocities].value(k, q);
                phi_p[k]      = fe_values[pressure].value(k, q);
              }
            for (unsigned int i = 0; i < dofs_per_cell; ++i)
              {
                if (assemble_matrix)
                  {
                    for (unsigned int j = 0; j < dofs_per_cell; ++j)
                      {
                        local_matrix(i, j) +=
                          (viscosity *
                             scalar_product(grad_phi_u[j], grad_phi_u[i]) +
                           present_velocity_gradients[q] * phi_u[j] * phi_u[i] +
                           grad_phi_u[j] * present_velocity_values[q] *
                             phi_u[i] -
                           div_phi_u[i] * phi_p[j] - phi_p[i] * div_phi_u[j] +
                           gamma * div_phi_u[j] * div_phi_u[i] +
                           phi_p[i] * phi_p[j]) *
                          fe_values.JxW(q);
                      }
                  }
                double present_velocity_divergence =
                  trace(present_velocity_gradients[q]);
                local_rhs(i) +=
                  (-viscosity * scalar_product(present_velocity_gradients[q],
                                               grad_phi_u[i]) -
                   present_velocity_gradients[q] * present_velocity_values[q] *
                     phi_u[i] +
                   present_pressure_values[q] * div_phi_u[i] +
                   present_velocity_divergence * phi_p[i] -
                   gamma * present_velocity_divergence * div_phi_u[i]) *
                  fe_values.JxW(q);
              }
          }
        cell->get_dof_indices(local_dof_indices);
        const AffineConstraints<double> &constraints_used =
          initial_step ? nonzero_constraints : zero_constraints;
        if (assemble_matrix)
          {
            constraints_used.distribute_local_to_global(local_matrix,
                                                        local_rhs,
                                                        local_dof_indices,
                                                        system_matrix,
                                                        system_rhs);
          }
        else
          {
            constraints_used.distribute_local_to_global(local_rhs,
                                                        local_dof_indices,
                                                        system_rhs);
          }
      }
    if (assemble_matrix)
      {
        pressure_mass_matrix.reinit(sparsity_pattern.block(1, 1));
        pressure_mass_matrix.copy_from(system_matrix.block(1, 1));
        system_matrix.block(1, 1) = 0;
      }
  }
  template <int dim>
  void StationaryNavierStokes<dim>::assemble_system(const bool initial_step)
  {
    assemble(initial_step, true);
  }
  template <int dim>
  void StationaryNavierStokes<dim>::assemble_rhs(const bool initial_step)
  {
    assemble(initial_step, false);
  }
  template <int dim>
  void StationaryNavierStokes<dim>::solve(const bool initial_step)
  {
    const AffineConstraints<double> &constraints_used =
      initial_step ? nonzero_constraints : zero_constraints;
    SolverControl                     solver_control(system_matrix.m(),
                                 1e-4 * system_rhs.l2_norm(),
                                 true);
    SolverFGMRES<BlockVector<double>> gmres(solver_control);
    SparseILU<double> pmass_preconditioner;
    pmass_preconditioner.initialize(pressure_mass_matrix,
                                    SparseILU<double>::AdditionalData());
    const BlockSchurPreconditioner<SparseILU<double>> preconditioner(
      gamma,
      viscosity,
      system_matrix,
      pressure_mass_matrix,
      pmass_preconditioner);
    gmres.solve(system_matrix, newton_update, system_rhs, preconditioner);
    std::cout << " ****FGMRES steps: " << solver_control.last_step()
              << std::endl;
    constraints_used.distribute(newton_update);
  }
  template <int dim>
  void StationaryNavierStokes<dim>::refine_mesh()
  {
    Vector<float> estimated_error_per_cell(triangulation.n_active_cells());
    FEValuesExtractors::Vector velocity(0);
    KellyErrorEstimator<dim>::estimate(
      dof_handler,
      QGauss<dim - 1>(degree + 1),
      std::map<types::boundary_id, const Function<dim> *>(),
      present_solution,
      estimated_error_per_cell,
      fe.component_mask(velocity));
    GridRefinement::refine_and_coarsen_fixed_number(triangulation,
                                                    estimated_error_per_cell,
                                                    0.3,
                                                    0.0);
    triangulation.prepare_coarsening_and_refinement();
    SolutionTransfer<dim, BlockVector<double>> solution_transfer(dof_handler);
    solution_transfer.prepare_for_coarsening_and_refinement(present_solution);
    triangulation.execute_coarsening_and_refinement();
    setup_dofs();
    BlockVector<double> tmp(dofs_per_block);
    solution_transfer.interpolate(present_solution, tmp);
    nonzero_constraints.distribute(tmp);
    initialize_system();
    present_solution = tmp;
  }
  template <int dim>
  void StationaryNavierStokes<dim>::newton_iteration(
    const double       tolerance,
    const unsigned int max_iteration,
    const unsigned int max_refinement,
    const bool         is_initial_step,
    const bool         output_result)
  {
    double current_res;
    double last_res;
    bool   first_step = is_initial_step;
    for (unsigned int refinement = 0; refinement < max_refinement + 1;
         ++refinement)
      {
        unsigned int outer_iteration = 0;
        last_res                     = 1.0;
        current_res                  = 1.0;
        std::cout << "*****************************************" << std::endl;
        std::cout << "************  refinement = " << refinement
                  << " ************ " << std::endl;
        std::cout << "viscosity= " << viscosity << std::endl;
        std::cout << "*****************************************" << std::endl;
        while ((first_step || (current_res > tolerance)) &&
               outer_iteration < max_iteration)
          {
            if (first_step)
              {
                setup_dofs();
                initialize_system();
                evaluation_point = present_solution;
                assemble_system(first_step);
                solve(first_step);
                present_solution = newton_update;
                nonzero_constraints.distribute(present_solution);
                first_step       = false;
                evaluation_point = present_solution;
                assemble_rhs(first_step);
                current_res = system_rhs.l2_norm();
                std::cout << "******************************" << std::endl;
                std::cout << " The residual of initial guess is " << current_res
                          << std::endl;
                std::cout << " Initialization complete!  " << std::endl;
                last_res = current_res;
              }
            else
              {
                evaluation_point = present_solution;
                assemble_system(first_step);
                solve(first_step);
                for (double alpha = 1.0; alpha > 1e-5; alpha *= 0.5)
                  {
                    evaluation_point = present_solution;
                    evaluation_point.add(alpha, newton_update);
                    nonzero_constraints.distribute(evaluation_point);
                    assemble_rhs(first_step);
                    current_res = system_rhs.l2_norm();
                    std::cout << " alpha = " << std::setw(6) << alpha
                              << std::setw(0) << " res = " << current_res
                              << std::endl;
                    if (current_res < last_res)
                      break;
                  }
                {
                  present_solution = evaluation_point;
                  std::cout << " ---- Iteration " << outer_iteration
                            << " residual: " << current_res << std::endl;
                  last_res = current_res;
                }
              }
            ++outer_iteration;
            if (output_result)
              {
                output_results(max_iteration * refinement + outer_iteration);
                if (current_res <= tolerance)
                  process_solution(refinement);
              }
          }
        if (refinement < max_refinement)
          {
            refine_mesh();
            std::cout << "*****************************************"
                      << std::endl
                      << "        Do refinement ------   " << std::endl;
          }
      }
  }
  template <int dim>
  void StationaryNavierStokes<dim>::compute_initial_guess(double step_size)
  {
    const double target_Re = 1.0 / viscosity;
    bool is_initial_step = true;
    for (double Re = 1000.0; Re < target_Re;
         Re        = std::min(Re + step_size, target_Re))
      {
        viscosity = 1.0 / Re;
        std::cout << "*****************************************" << std::endl;
        std::cout << " Searching for initial guess with Re = " << Re
                  << std::endl;
        std::cout << "*****************************************" << std::endl;
        newton_iteration(1e-12, 50, 0, is_initial_step, false);
        is_initial_step = false;
      }
  }
  template <int dim>
  void StationaryNavierStokes<dim>::output_results(
    const unsigned int output_index) const
  {
    std::vector<std::string> solution_names(dim, "velocity");
    solution_names.emplace_back("pressure");
    std::vector<DataComponentInterpretation::DataComponentInterpretation>
      data_component_interpretation(
        dim, DataComponentInterpretation::component_is_part_of_vector);
    data_component_interpretation.push_back(
      DataComponentInterpretation::component_is_scalar);
    DataOut<dim> data_out;
    data_out.attach_dof_handler(dof_handler);
    data_out.add_data_vector(present_solution,
                             solution_names,
                             DataOut<dim>::type_dof_data,
                             data_component_interpretation);
    data_out.build_patches();
    std::ofstream output(std::to_string(1.0 / viscosity) + "-solution-" +
                         Utilities::int_to_string(output_index, 4) + ".vtk");
    data_out.write_vtk(output);
  }
  template <int dim>
  void StationaryNavierStokes<dim>::process_solution(unsigned int refinement)
  {
    std::ofstream f(std::to_string(1.0 / viscosity) + "-line-" +
                    std::to_string(refinement) + ".txt");
    f << "# y u_x u_y" << std::endl;
    Point<dim> p;
    p(0) = 0.5;
    p(1) = 0.5;
    f << std::scientific;
    for (unsigned int i = 0; i <= 100; ++i)
      {
        p(dim - 1) = i / 100.0;
        Vector<double> tmp_vector(dim + 1);
        VectorTools::point_value(dof_handler, present_solution, p, tmp_vector);
        f << p(dim - 1);
        for (int j = 0; j < dim; j++)
          f << " " << tmp_vector(j);
        f << std::endl;
      }
  }
  template <int dim>
  void StationaryNavierStokes<dim>::run(const unsigned int refinement)
  {
    GridGenerator::hyper_cube(triangulation);
    triangulation.refine_global(5);
    const double Re = 1.0 / viscosity;
    if (Re > 1000.0)
      {
        std::cout << "       Searching for initial guess ... " << std::endl;
        const double step_size = 2000.0;
        compute_initial_guess(step_size);
        std::cout << "*****************************************" << std::endl
                  << "       Initial guess obtained            " << std::endl
                  << "                  *                      " << std::endl
                  << "                  *                      " << std::endl
                  << "                  *                      " << std::endl
                  << "                  *                      " << std::endl
                  << "*****************************************" << std::endl;
        std::cout << "       Computing solution with target Re = " << Re
                  << std::endl;
        viscosity = 1.0 / Re;
        newton_iteration(1e-12, 50, refinement, false, true);
      }
    else
      {
        newton_iteration(1e-12, 50, refinement, true, true);
      }
  }
} // namespace Step57
int main()
{
  try
    {
      using namespace dealii;
      using namespace Step57;
      StationaryNavierStokes<2> flow(/* degree = */ 1);
      flow.run(4);
    }
  catch (std::exception &exc)
    {
      std::cerr << std::endl
                << std::endl
                << "----------------------------------------------------"
                << std::endl;
      std::cerr << "Exception on processing: " << std::endl
                << exc.what() << std::endl
                << "Aborting!" << std::endl
                << "----------------------------------------------------"
                << std::endl;
      return 1;
    }
  catch (...)
    {
      std::cerr << std::endl
                << std::endl
                << "----------------------------------------------------"
                << std::endl;
      std::cerr << "Unknown exception!" << std::endl
                << "Aborting!" << std::endl
                << "----------------------------------------------------"
                << std::endl;
      return 1;
    }
  return 0;
}