PINNs

Resources

Main Idea

PINNs are highly nonlinear (global) Collocation Methods for forward solutions of, or Inverse Problems in, Partial Differential Equations. The number of collocation points is generally unrelated to the number of unknowns (the parameters of the neural network), and the solution is obtained by minimizing a loss rather than by solving a system of equations.

Unlike collocation methods that rely on recursive or analytic derivatives of basis functions, PINNs rely on Automatic Differentiation to construct the derivatives appearing in the residual.
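
As a minimal sketch of this (in PyTorch, an arbitrary choice here), the residual of an illustrative PDE $u_t - \nu u_{xx} = 0$ can be assembled with reverse-mode autodiff; the network size, the value of $\nu$, and the uniform collocation sampling are assumptions for illustration only:

```python
import torch
import torch.nn as nn

# Small fully connected network u_theta(x, t); the architecture is an arbitrary choice.
net = nn.Sequential(
    nn.Linear(2, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)

def pde_residual(x, t, nu=0.01):
    """Residual of the illustrative PDE u_t - nu * u_xx = 0 at collocation points."""
    x = x.requires_grad_(True)
    t = t.requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    ones = torch.ones_like(u)
    # Derivatives of the network output via Automatic Differentiation.
    u_t = torch.autograd.grad(u, t, grad_outputs=ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, grad_outputs=ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, grad_outputs=ones, create_graph=True)[0]
    return u_t - nu * u_xx

# The number of collocation points is unrelated to the number of network parameters;
# the PDE term of the loss is just the mean squared residual over those points.
x_c, t_c = torch.rand(1000, 1), torch.rand(1000, 1)
loss = pde_residual(x_c, t_c).pow(2).mean()
```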

Vanilla PINNs suffer from inexact satisfaction of BCs, ICs, and the Governing PDE. In simple cases, some of these can be combated through clever architecture changes, as described below.

Training

Recent work suggests that second-order/quasi-Newton Optimization or Multi-Objective Optimization is essential for good performance, outweighing many other factors such as architecture and collocation point sampling. When trying to improve performance, this should generally be the first thing to try.
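
As a sketch of what that looks like in practice, reusing the `net` and `pde_residual` sketch above, with L-BFGS standing in for the quasi-Newton step (the iteration count and line search are arbitrary choices):

```python
# Quasi-Newton refinement with L-BFGS; PyTorch's implementation needs a closure
# that re-evaluates the loss whenever the optimizer asks for it.
optimizer = torch.optim.LBFGS(net.parameters(), max_iter=500,
                              line_search_fn="strong_wolfe")

def closure():
    optimizer.zero_grad()
    loss = pde_residual(x_c, t_c).pow(2).mean()  # add BC/IC loss terms here as well
    loss.backward()
    return loss

optimizer.step(closure)
```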

Exact Periodic BCs

If our domain is $\Omega \times [0,T] = [0,L] \times [0,T]$, and we wish for our neural network representing the state, $u_\theta(x,t)$, to satisfy periodic BCs by construction, we can use the following trick:

$$u_\theta(x,t) = \mathrm{DNN}_\theta\big(v(x), t\big),$$

where

$$v(x) = \left[\, \sin\!\left(\tfrac{2\pi x}{L}\right),\ \cos\!\left(\tfrac{2\pi x}{L}\right) \,\right].$$

We can add higher frequencies to $v$ arbitrarily by scaling the arguments by integers. With $m$ frequencies, $v$ constructs a set of $2m$ orthogonal basis functions, each of which satisfies the periodic BCs (equal value, and equal value of all derivatives, at the two ends of the domain).

This is very similar to a spectral method, where we construct a matrix $\Phi(x)$ and its derivatives analytically, and write the governing PDE in terms of the expansion $u(x) = \Phi(x)\,c$. We compute the residual by plugging in $u_x = \Phi_x c,\ u_{xx} = \Phi_{xx} c, \ldots$ and solve for $c$. For a linear equation this is a linear system; otherwise, we iteratively solve the linearized system. We only construct $\Phi$ and its derivatives once, as they depend only on the "mesh" (the evaluation or collocation points of the strong form). Since the basis functions all satisfy the periodic boundary conditions, any solution in their span must as well. In fact, any function composition relying on this $u(x)$ that does not elsewhere involve $x$ satisfies the same periodic BCs. This is essentially what the PINNs construction above uses.
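
A sketch of this embedding with $m$ frequencies (PyTorch again; the class name, $m$, $L$, and the downstream network are all illustrative assumptions):

```python
import math
import torch
import torch.nn as nn

class PeriodicEmbedding(nn.Module):
    """v(x) = [sin(2*pi*k*x/L), cos(2*pi*k*x/L)] for k = 1..m, i.e. 2m features
    that are all L-periodic with matching derivatives at the boundary."""
    def __init__(self, L, m=1):
        super().__init__()
        self.register_buffer("freqs", 2.0 * math.pi * torch.arange(1, m + 1) / L)

    def forward(self, x):                 # x: (N, 1)
        args = x * self.freqs             # (N, m)
        return torch.cat([torch.sin(args), torch.cos(args)], dim=1)

# u_theta(x, t) = DNN_theta(v(x), t): any network composed with v(x), and not using
# x elsewhere, is periodic in x by construction.
m, L = 2, 1.0
embed = PeriodicEmbedding(L, m)
dnn = nn.Sequential(nn.Linear(2 * m + 1, 32), nn.Tanh(), nn.Linear(32, 1))

def u_theta(x, t):
    return dnn(torch.cat([embed(x), t], dim=1))
```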

Exact Initial Conditions

We want $u_\theta(x,0) = g(x)$. Let $\mathrm{DNN}_\theta(x,t)$ represent some arbitrary function which will be used in this construction. It could itself involve other compositions, such as satisfying periodic boundary conditions as described above.
Construct

$$u_\theta(x,t) = \big(1 - \phi(t)\big)\,\mathrm{DNN}_\theta(x,t) + \phi(t)\,g(x),$$

where $$\phi(t) = \exp\left ( {-\frac{\lambda t}{T}} \right ) \cdot \left ( 1 - \frac{t}{T} \right ),$$
with $T$ denoting the maximum simulation time and $\lambda \ge 0$. Other forms of $\phi$ can be used, although the influence of this arbitrary function is overshadowed by the tunable parameters $\theta$. Note that $\phi(0) = 1$ and $\phi(T) = 0$; in the construction of $u_\theta$, these recover $g(x)$ and $\mathrm{DNN}_\theta(x,t)$ respectively. We could select $\lambda$ based on some domain-specific knowledge, for example the characteristic time scale in Turbulence, which is the "time required to forget the initial conditions".
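
A sketch of this construction, reusing `u_theta` from the periodic-BC sketch above (`T_max`, `lam`, and the choice of `g` are illustrative assumptions):

```python
import math
import torch

T_max, lam = 1.0, 2.0   # maximum simulation time and decay rate (illustrative values)

def phi(t):
    """Blending weight: phi(0) = 1 (exact IC), phi(T_max) = 0 (pure network)."""
    return torch.exp(-lam * t / T_max) * (1.0 - t / T_max)

def g(x):
    """Initial condition u(x, 0); an illustrative choice."""
    return torch.sin(2.0 * math.pi * x)

def u_exact_ic(x, t):
    # u(x, 0) = g(x) holds by construction, independent of the network parameters.
    return (1.0 - phi(t)) * u_theta(x, t) + phi(t) * g(x)
```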

Uncertainty

Suppose parameters μ are added as parametric variables to the PDE. How can we solve the PDE for new values of μ?

  1. Naively build a PINNs solution for each new μ, throwing away previous results.
  2. Take uθ(x,t;μ) and train this now with samples over μ. This implies a sort of continuity or smoothness of u over μ, which may or may not be the case. There is one training over all μ.
  3. Do some sneaky hypernetwork tricks, i.e. have θ(μ) as a guess from some hypernetwork, then detach and use this as the initial guess for regular training. This has been done by giving uθ low-rank updates, e.g. $\theta = U \operatorname{diag}(s) V^T$, and only training $s$ during this second phase. $U$ and $V^T$ are shared across all μ, but are still learned (with an orthogonality constraint); see "Hypernetwork-Based Meta-Learning for Low-Rank Physics-Informed Neural Networks", and the sketch after this list. Learning $U$ and $V^T$ here can be interpreted as Meta-Learning.
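
A minimal sketch of the low-rank idea in item 3 (this is not the cited paper's code; the class name, rank, and initializations are assumptions, and the orthogonality constraint is omitted):

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Linear layer with weight W = U diag(s) V^T. U and V can be shared (and
    meta-learned) across mu; only s and the bias are adapted to a new mu."""
    def __init__(self, in_features, out_features, rank):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, rank) / rank ** 0.5)
        self.V = nn.Parameter(torch.randn(in_features, rank) / rank ** 0.5)
        self.s = nn.Parameter(torch.ones(rank))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        W = self.U @ torch.diag(self.s) @ self.V.T
        return x @ W.T + self.bias

layer = LowRankLinear(32, 32, rank=4)
# Second phase (new mu): freeze the shared factors, train only s and the bias.
for name, p in layer.named_parameters():
    p.requires_grad_(name in ("s", "bias"))
```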

Bayesian PINNs

To facilitate UQ and have a probabilistic version of uθ (and inverse parameters too), we can use B-PINNs. Using Bayesian Methods, these effectively assume a prior distribution over θ, p(θ), and aim to compute p(θ|data,residuals). This really only differs from the standard Bayesian methods by including the PDE residuals.

As part of this, we must construct a likelihood function, p(data,residuals|θ). We assume that the errors on the data and the residuals are independent of one another, and that each instance is itself independently drawn from the same Normal Distribution with known standard deviation σu or σf for the data and PDE residuals respectively. With this in hand, we can use either Variational Inference (e.g. VAEs) or Markov Chain Monte Carlo to get the posterior distribution. Then, we have samples of parameters from p(θ|data,residuals) that we can use to give predictions for uθ.
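
A sketch of the resulting unnormalized log-posterior that an MCMC sampler or VI scheme would then target, reusing `net` and `pde_residual` from the earlier sketches (the standard normal prior and the noise scales are assumptions):

```python
import torch

sigma_u, sigma_f = 0.05, 0.05   # assumed known noise scales for data and PDE residuals

def log_posterior(theta_vec, x_d, t_d, u_d, x_c, t_c):
    """Unnormalized log p(theta | data, residuals) with independent Gaussian errors
    on the data and on the residuals, and a standard normal prior over theta."""
    # Write the flat parameter vector theta into the network.
    torch.nn.utils.vector_to_parameters(theta_vec, net.parameters())
    log_prior = -0.5 * (theta_vec ** 2).sum()
    data_misfit = net(torch.cat([x_d, t_d], dim=1)) - u_d
    log_lik_data = -0.5 * (data_misfit ** 2).sum() / sigma_u ** 2
    res = pde_residual(x_c, t_c)
    log_lik_res = -0.5 * (res ** 2).sum() / sigma_f ** 2
    return log_prior + log_lik_data + log_lik_res
```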

For inverse problems, we can simply group those parameters with θ and continue as above, similar to the straightforward generalization of standard PINNs to inverse problems.

Todo

  • Include other PINNs variants: training, architecture, and sampling
  • Discuss problems in training
  • Highlight scaling to higher dimensions