Lifting Operator

Resources

Problem Formulation

Let $U^r$ and $U^p$ be appropriate Banach / Hilbert spaces serving as solution spaces for

$$u_t = N_r(u, u_x, u_{xx}), \qquad u_t = N_p(u, u_x, u_{xx}).$$

For instance, we may have $u^r \in U^r = H^1(\Omega)$ and $u^p \in U^p = H^1(\Omega)$ (for PDEs posed on the same domain $\Omega$).
The goal of the lifting approach is to find a lifting operator

$$M : U^r \to U^p.$$

Then,

$$u^p = M(u^r).$$

First, let's look at some properties of operators.

Linearity

The operator $A : U \to V$ is linear if, for $\alpha, \beta \in K$ ($= \mathbb{R}$ here),

$$A(\alpha u + \beta v) = \alpha A(u) + \beta A(v), \qquad \forall u, v \in U, \ \alpha, \beta \in K.$$

Such a linear operator is bounded (equivalently, continuous) if there exists $c > 0$ such that

$$\|A(u)\|_V \le c\, \|u\|_U \qquad \forall u \in U.$$

This is similar to Lipschitz continuity. The smallest such $c$ (the infimum) is the operator norm. For $U = V = \mathbb{R}^n$ and a symmetric operator, the operator norm is the largest absolute eigenvalue: for the corresponding eigenvector $u$ we have $Au = \lambda_{\max} u$, so $\|Au\| = |\lambda_{\max}|\, \|u\|$. In this case, the operator norm coincides with the spectral norm. For non-square and non-symmetric matrices, it is the square root of the largest eigenvalue of $A^T A$.

For operators between function spaces, an integral kernel operator and a (linear) differential operator are both linear:

$$A(u) = \int K(x, s, t)\, u(s, t)\, ds, \qquad A(u) = a(x, t)\, \frac{\partial^n u}{\partial x^n}.$$

Further, the sum and composition of linear operators are again linear operators. Thus, we can combine and repeat these separate forms while remaining linear.
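These properties are easy to check numerically. The following sketch (using an arbitrary random matrix as the linear operator) verifies linearity, boundedness, and the claim that the operator norm is the square root of the largest eigenvalue of $A^T A$, i.e. the spectral norm:

```python
import numpy as np

rng = np.random.default_rng(0)

# A generic (non-symmetric) linear operator on R^n, represented as a matrix.
n = 5
A = rng.standard_normal((n, n))
u, v = rng.standard_normal(n), rng.standard_normal(n)
alpha, beta = 2.0, -3.0

# Linearity: A(alpha*u + beta*v) == alpha*A(u) + beta*A(v).
assert np.allclose(A @ (alpha * u + beta * v), alpha * (A @ u) + beta * (A @ v))

# Operator norm: square root of the largest eigenvalue of A^T A ...
op_norm = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))
# ... which matches the spectral (matrix 2-) norm.
assert np.isclose(op_norm, np.linalg.norm(A, 2))

# Boundedness: ||A u|| <= op_norm * ||u|| for any u.
w = rng.standard_normal(n)
assert np.linalg.norm(A @ w) <= op_norm * np.linalg.norm(w) + 1e-12

# For a symmetric matrix the operator norm is the largest absolute eigenvalue.
S = (A + A.T) / 2
assert np.isclose(np.linalg.norm(S, 2), np.max(np.abs(np.linalg.eigvalsh(S))))
```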

Example: Operator from Diffusion to Advection Diffusion

$$u_t = N_r(u, u_x, u_{xx}) = \kappa u_{xx}, \qquad u_t = N_p(u, u_x, u_{xx}) = \kappa u_{xx} - c u_x.$$

Let $U^r$ and $U^p$ be solution spaces for these equations. Without sufficient initial / boundary conditions, multiple functions belong to these spaces (which are themselves subsets of $H^1$). We seek an operator / map of the form given above. Further, by leaving the diffusivity unspecified, we could let $U^r$ depend on it: $U^r = U^r(\kappa)$. However, we take the same $\kappa$ in both systems to simplify things. With this, we assume a particular form of $M$:

$$u^p(x,t) = M(u^r) = A(x,t)\, u^r(x,t).$$

By applying $\partial_t$ and $N_p$ to the right-most term, we can derive an expression involving only $A(x,t)$ and $u^r(x,t)$. This eventually gives

$$u^p(x,t) = A_0 \exp\!\left(-\frac{c^2}{4\kappa}\, t + \frac{c}{2\kappa}\, x\right) u^r(x,t),$$

for any $A_0 \neq 0$. Here we found not just a single operator $M$, but a family of such operators; thus the operator $M$ is not necessarily unique. Intuitively, $u^r$ can be scaled by an arbitrary constant and the r-system still holds, as it is a linear PDE. By imposing an initial condition, we should have fewer solutions. If the r-system has initial condition $u^r_0 = u^r(x, 0)$, then

$$u^p_0 = u^p(x,0) = A(x,0)\, u^r_0 = A_0 \exp\!\left(\frac{c}{2\kappa}\, x\right) u^r_0.$$

For the case $c = 0$, we see that $u^p_0 = A_0 u^r_0$. Thus, we still have the same scaling ambiguity. We can think of this as having multiple r-systems: there are infinitely many diffusion systems (scaled by the constant $A_0$) that can be transformed into a single p-system. I doubt that adding boundary conditions would fix this issue, as it still persists after investigating the initial conditions. Let us suppose that we know the initial condition and choose $u^r$ to satisfy it. In other words, we align $u^p$ and $u^r$ at $t = 0$:

$$u^p(x,0) = u^r(x,0) = M(u^r)(x,0).$$

This requires

$$A_0 \exp\!\left(\frac{c}{2\kappa}\, x\right) = 1,$$

which only holds for $c = 0$ (and $A_0 = 1$), which means $u^r = u^p$. Thus, for the nontrivial case ($u^r \neq u^p$), there is no such transformation of the assumed form $M(u^r)(x,t) = A_0 \exp\!\left(-\frac{c^2}{4\kappa} t + \frac{c}{2\kappa} x\right) u^r(x,t)$. Due to the generality of the derivation, I also suspect there is no transformation of the more general form $M(u^r)(x,t) = A(x,t)\, u^r(x,t)$.
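Although no identity-respecting lifting exists here, the derived family of multipliers can still be sanity-checked numerically. The sketch below takes assumed values for $\kappa$, $c$, and $A_0$, uses the heat kernel as an exact r-system solution, and verifies by central finite differences that $u^p = A(x,t)\, u^r$ satisfies the advection-diffusion equation:

```python
import numpy as np

kappa, c, A0 = 0.5, 1.0, 2.0   # diffusivity, advection speed, free scaling (assumed values)

def ur(x, t):
    # Heat-kernel solution of the r-system u_t = kappa * u_xx (valid for t > 0).
    return np.exp(-x**2 / (4 * kappa * t)) / np.sqrt(4 * np.pi * kappa * t)

def up(x, t):
    # Lifted solution: u^p = A(x, t) u^r with the multiplier derived above.
    return A0 * np.exp(-c**2 * t / (4 * kappa) + c * x / (2 * kappa)) * ur(x, t)

# Residual of the p-system, u_t - kappa*u_xx + c*u_x, via central differences.
x, t, h = 0.3, 1.0, 1e-3
u_t  = (up(x, t + h) - up(x, t - h)) / (2 * h)
u_x  = (up(x + h, t) - up(x - h, t)) / (2 * h)
u_xx = (up(x + h, t) - 2 * up(x, t) + up(x - h, t)) / h**2
residual = u_t - kappa * u_xx + c * u_x
assert abs(residual) < 1e-5   # u^p solves the advection-diffusion equation
```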

Example: Operator from Viscous Burgers' to Diffusion

$$u_t = N_r(u, u_x, u_{xx}) = \nu u_{xx}, \qquad u_t = N_p(u, u_x, u_{xx}) = \nu u_{xx} - \left(\frac{u^2}{2}\right)_x.$$

We use the first (Cole-Hopf) transformation

$$u(x,t) = -2\nu\, \big[\log \phi(x,t)\big]_x.$$

Plugging this in and simplifying gives

$$-2\nu \left[\frac{1}{\phi}\left(\phi_t - \nu \phi_{xx}\right)\right]_x = 0.$$

We integrate with respect to $x$, which introduces an arbitrary function of $t$ alone, $c(t) = f'(t)$, and

$$\phi_t - \nu \phi_{xx} = \phi\, c(t).$$

Next, we introduce another transformation, $u^r(x,t) = \phi(x,t)\, \exp(-f(t))$. Solving for $\phi$ and plugging into the above gives the diffusion equation,

$$u^r_t = \nu\, u^r_{xx}.$$

Yet, for the initial condition, we have that

$$\log u^r_0(x) = -\frac{1}{2\nu} \int u(x, 0)\, dx.$$

This indefinite integral introduces an arbitrary constant. Taking $U(x)$ as an antiderivative of the initial condition $u(x, 0)$, this gives

$$u^r_0(x) = e^{C}\, e^{-\frac{1}{2\nu} U(x)}.$$

Note that $e^C$ is an arbitrary multiplicative constant. Again, only under very special conditions will the initial conditions of the two systems coincide. Also note that for this case the first transformation is nonlinear. The second transformation seems linear, but as a whole the mapping from $u^p$ to $u^r$ (i.e. $M^{-1}$) is nonlinear.
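The transformation can be checked numerically. The sketch below takes an assumed positive heat-equation solution $\phi$ (with an assumed viscosity and wavenumber), builds $u = -2\nu (\log \phi)_x$, and verifies by finite differences that $u$ satisfies the viscous Burgers equation:

```python
import numpy as np

nu, k = 0.1, 2.0   # viscosity and an assumed wavenumber for the test solution

def phi(x, t):
    # A positive solution of the heat equation phi_t = nu * phi_xx.
    return 1.0 + 0.5 * np.exp(-nu * k**2 * t) * np.cos(k * x)

def u(x, t):
    # Transformed field: u = -2*nu*(log phi)_x = -2*nu*phi_x/phi.
    phi_x = -0.5 * k * np.exp(-nu * k**2 * t) * np.sin(k * x)
    return -2 * nu * phi_x / phi(x, t)

# Burgers residual u_t - nu*u_xx + u*u_x, via central differences.
x, t, h = 0.7, 0.5, 1e-3
u_t  = (u(x, t + h) - u(x, t - h)) / (2 * h)
u_x  = (u(x + h, t) - u(x - h, t)) / (2 * h)
u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2
residual = u_t - nu * u_xx + u(x, t) * u_x
assert abs(residual) < 1e-5   # u solves viscous Burgers
```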

Initial Condition

We require that the p-system and the r-system begin with the same state, $u_0(x)$. That is,

$$u^r(x,0) = u^p(x,0) = M(u^r)(x,0).$$

In other words, the initial condition requires that $M = I$ at $t = 0$. For instance, consider the form $M(u^r)(x,t) = A(x,t)\, u^r(x,t)$. Then,

$$u^r(x,0) = A(x,0)\, u^r(x,0) \quad \Rightarrow \quad A(x,0) = 1.$$

For a kernel operator, this requires that $K(x, s, 0) = \delta(x - s)$. A differential operator may not work without specific requirements on the initial condition.
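The delta-kernel requirement has a simple discrete analogue: on a grid, the kernel matrix at $t = 0$ should act as the identity. A sketch using a heat kernel as an illustrative $K$ (an assumed choice that tends to $\delta(x - s)$ as $t \to 0^+$):

```python
import numpy as np

# Discretize a kernel operator (A u)(x) = integral of K(x, s, t) u(s) ds on a grid.
n = 1000
s = np.linspace(-5.0, 5.0, n)
ds = s[1] - s[0]

def apply_kernel(u, t, nu=1.0):
    # Heat kernel: a family K(x, s, t) that approaches delta(x - s) as t -> 0+.
    K = np.exp(-(s[:, None] - s[None, :])**2 / (4 * nu * t)) / np.sqrt(4 * np.pi * nu * t)
    return (K @ u) * ds

u0 = np.exp(-s**2)                      # a smooth test state
u_small_t = apply_kernel(u0, t=1e-3)    # kernel evaluated just after t = 0

# Near t = 0 the discretized kernel acts as the identity (up to quadrature
# and smoothing error), the discrete analogue of K(x, s, 0) = delta(x - s).
assert np.max(np.abs(u_small_t - u0)) < 1e-2
```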

Continuity

There must be some regularity requirement based on the continuity of both $u^r(x,t)$ and $u^p(x,t)$. If these are both continuous, then it is reasonable to expect that $M$ is in some sense continuous too. For example, we may expect $A(x,t)$ to be continuous. The kernel $K$ would have similar, though perhaps slightly looser, requirements, due to the smoothing introduced by the integral operator.

Operator as PDE Discovery with Solution Operator

One option to consider for $M$ is to modify the original PDE. We wish to construct $M(u^r) = (M_3 \circ M_2 \circ M_1)(u^r)$. First, $M_1$ constructs the residual according to $u^r_t - N_r(u^r, u^r_x, u^r_{xx})$, and stacks the original state as the input to $M_2$. Then, $M_2$ modifies the residual, potentially with a new term resembling $N_\phi(u^r, u^r_x, u^r_{xx})$. Finally, $M_3$ solves this modified PDE, returning the state, which is ideally $u^p$. The modified PDE aims to approximate the $N_p$ term (keeping the time derivative).

Using a simpler $M$, we can instead map directly between the state spaces, without relying on the PDE structure in the mapping itself. Conversely, we can think of this simpler mapping as "inducing" some PDE for $N_p$.

Parameterization

We take some inspiration from the exact imposition of initial conditions in PINNs and parameterize in the following general form,

$$\tilde{M}(u^r) = (1 - \phi(t))\, M(u^r)(x,t) + \phi(t)\, u_0(x),$$

taking for instance

$$\phi(t) = \exp\!\left(-\lambda \frac{t}{T}\right)\left(1 - \frac{t}{T}\right).$$

The choice of the form of $\phi$ is discussed in the PINNs literature, notably with $\phi(0) = 1$, $\phi(T) = 0$, and $\lambda \ge 0$. For our application, $\phi$ describes how close $u^r$ is to $u^p$ at a given time.
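A quick numerical sanity check of this form of $\phi$, with assumed values for $T$ and $\lambda$:

```python
import numpy as np

T, lam = 1.0, 3.0   # final time and decay rate (assumed values)

def phi(t):
    # Blending weight: phi(0) = 1, phi(T) = 0, and decreasing for lambda >= 0.
    return np.exp(-lam * t / T) * (1 - t / T)

assert np.isclose(phi(0.0), 1.0)     # at t = 0, the parameterization returns u_0 exactly
assert np.isclose(phi(T), 0.0)       # by t = T, the operator output is used exclusively
t = np.linspace(0.0, T, 101)
assert np.all(np.diff(phi(t)) < 0)   # monotone handoff from u_0 to M(u^r)
```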

We should consider more closely the error term

$$e(t) = \|u^r(\cdot, t) - u^p(\cdot, t)\|_{L^2(\Omega)},$$

as this informs the functional form of $\phi(t)$. Due to the accumulation of errors in the time integration, we may expect $e(t) \sim \exp(t)$. This is further discussed later.

As mentioned above, $M$ maps between function spaces, e.g. $H^1 \to H^1$, or more generally $U^r \to U^p$. Let us choose subspaces $U^r_h \subset U^r$ and $U^p_h \subset U^p$ which have finite bases:

$$U^r_h = \mathrm{Span}\{\phi^r_i\}_{i=1}^{n_r}, \qquad U^p_h = \mathrm{Span}\{\phi^p_j\}_{j=1}^{n_p}.$$

In other words,

$$u^r(x,t) = \sum_{i=1}^{n_r} c^r_i\, \phi^r_i(x,t), \qquad u^p(x,t) = \sum_{j=1}^{n_p} c^p_j\, \phi^p_j(x,t).$$

Now, the operator $M_h : U^r_h \to U^p_h$ can be fully defined by a map $\mathbf{M} : \mathbb{R}^{n_r} \to \mathbb{R}^{n_p}$ on the coefficients, $\mathbf{M}(c^r) = c^p$.
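As an illustration, if $\mathbf{M}$ is linear it is simply an $n_p \times n_r$ matrix, and it can be recovered from snapshot pairs of coefficients by least squares. The names and data below are hypothetical; in practice the coefficient pairs would come from projecting r- and p-system solutions onto their bases:

```python
import numpy as np

rng = np.random.default_rng(0)
n_r, n_p, n_snapshots = 8, 6, 50

# Hypothetical snapshot pairs of basis coefficients (c^r, c^p), generated
# here from a known linear map so that recovery can be verified.
C_r = rng.standard_normal((n_snapshots, n_r))
M_true = rng.standard_normal((n_p, n_r))
C_p = C_r @ M_true.T

# A linear M_h : R^{n_r} -> R^{n_p} is just a matrix, fit by least squares.
M_fit, *_ = np.linalg.lstsq(C_r, C_p, rcond=None)
M_fit = M_fit.T

assert np.allclose(M_fit, M_true)   # exact recovery for noiseless linear data
```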

First implementation

In the spirit motivated above, a first implementation may be as follows:

  1. Construct $u^r(x,t)$ to satisfy the initial and boundary conditions. This may use the r-system PDE too; in other words, it can be a variant of a Physics-Informed Neural Network, without using the available system data (which comes from the p-system).
  2. We will evaluate $u^r(x,t)$ at a collection of points $x$, for a fixed time $t_k$, giving $u^r_k$.
  3. Our transformation will apply at multiple times: rather than mapping from $u^r(x,t)$ to $u^p(x,t)$, it will map from $u^r(x, t_k)$ to $u^p(x, t_k)$. By applying this repeatedly, we recover the original operator that maps over both $x$ and $t$. This will also map the discretized form, $M(u^r_k) = u^p_k$, i.e. $M : \mathbb{R}^{n_x} \to \mathbb{R}^{n_x}$. Notably, for the discretized initial condition $u^r_0$, this operator should behave as the identity; it may be worthwhile to design the operator in such a manner.
  4. Train both of these objects simultaneously. They serve different purposes, so hopefully they do not conflict with one another. It might make sense to train $u^r(x,t)$ first.

For a time-invariant correction, we require that $M(u_0) = u_0$. One such form is

$$\tilde{M}(u) = \delta_{u_0}(u)\, u_0 + \big(1 - \delta_{u_0}(u)\big)\, M(u), \qquad \delta_{u_0}(u) = \begin{cases} 1 & \text{if } u = u_0, \\ 0 & \text{otherwise.} \end{cases}$$

However, $\delta_{u_0}(u)$ is discontinuous. Thus, we suggest using a new measure $e_{u_0} : \mathbb{R}^{n_x} \to [0, 1]$. If we consider $u_0$ as an input to this function, then it maps $\mathbb{R}^{n_x} \times \mathbb{R}^{n_x} \to [0, 1]$. This resembles normalized inner products. For instance, we may use

$$e_{u_0}(u) = \frac{|\langle u, u_0 \rangle|}{\|u\|\, \|u_0\|}.$$

This is not differentiable everywhere (it fails where $\langle u, u_0 \rangle = 0$ or $u = 0$), but that might be okay.
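A minimal sketch of the smooth correction, using the normalized inner product as $e_{u_0}$ and a hypothetical placeholder map standing in for $M$:

```python
import numpy as np

def e_u0(u, u0):
    # Normalized inner product (cosine similarity), mapped to [0, 1] via abs.
    return abs(u @ u0) / (np.linalg.norm(u) * np.linalg.norm(u0))

def corrected_M(u, u0, M):
    # Smooth blend: reproduces u0 exactly when u = u0, since e_u0(u0, u0) = 1.
    w = e_u0(u, u0)
    return w * u0 + (1 - w) * M(u)

rng = np.random.default_rng(0)
u0 = rng.standard_normal(16)
M = lambda u: 2.0 * u + 1.0   # placeholder operator for illustration only

# At the initial state, the corrected operator acts as the identity.
assert np.allclose(corrected_M(u0, u0, M), u0)

# Elsewhere, the gate value stays in [0, 1] as required.
u = rng.standard_normal(16)
assert 0.0 <= e_u0(u, u0) <= 1.0
```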