SD-PINN: Physics Informed Neural Networks for Spatially Dependent PDEs

The physics-informed neural network (PINN) can identify partial differential equation (PDE) coefficients directly from physical measurements, provided the coefficients are constant across space. In this paper, we propose a modification of the PINN, named SD-PINN, which recovers the coefficients of spatially dependent PDEs using only one neural network and without requiring domain-specific physical knowledge. The network is a simple fully connected neural network, and physical information such as the time invariance and spatial smoothness of the PDE coefficients is incorporated through loss functions. Owing to these physical constraints the method is robust to noise, which is verified by experiments.


INTRODUCTION
Many natural phenomena are described by partial differential equations (PDEs), which consist of multiple PDE terms. A PDE governing the dynamical behavior of a field $U$ can be described by a weighted combination of partial derivatives of $U$, Eq. (1), where the partial derivatives $U_x, U_y, U_t, \ldots$ are the PDE terms and $a_1, a_2, \ldots$ are the PDE coefficients. The PDE coefficients can be spatially dependent, which indicates an inhomogeneous medium for the dynamics. In this work, we use the attenuating wave equation as the example:

$$U_{tt} + \alpha U_t - c^2 U_{xx} = N[U], \tag{2}$$

where $U_t$ and $U_{tt}$ are the first- and second-order temporal derivatives of $U$ and $U_{xx}$ is the second-order spatial derivative of $U$. The PDE coefficient $\alpha \ge 0$ is the wave attenuation factor and $c^2 > 0$ is the square of the phase speed $c$.
Given coordinates $(x, t)$ and the measurements $U(x, t)$ at these spatio-temporal points, the PINN learns the function that maps the coordinates to the measurements with an $L$-layer fully connected neural network (FCN), as shown in Fig. 1. The PINN assumes that we know from prior knowledge which PDE governs the physical system and only want to find the coefficients within the PDE. For example, if we know that the PDE is the wave equation (2), the PINN can be employed to recover the PDE coefficients, i.e., the attenuation $\alpha$ and the squared phase speed $c^2$. We denote the weights and biases in all layers as the neural network parameter $\theta$. In reality, the PDE governing the dynamics within a field can be spatially dependent, e.g., the phase speed $c$ can vary within a medium (like water) due to spatial variations of the temperature or the density. The PINN assumes that the PDE coefficients are identical across the whole region of interest (ROI) and is thus unable to identify the PDE in these cases. Only a few works use the PINN to recover spatially dependent PDE coefficients [15,16], and they construct two neural networks: one to approximate the dynamical behavior of the field and another to estimate the material parameters. The two networks are then joined via domain-specific physical knowledge such as the stress-strain relationship [15] or Maxwell's equations [16].
In this paper, we propose the spatially dependent physics-informed neural network (SD-PINN), which is capable of recovering spatially dependent PDEs. In contrast to [15,16], we recover the spatially dependent coefficients while approximating the PDE solution using only one neural network, and we do not require domain-specific physical knowledge. Compared to previous work on spatially dependent PDE recovery based on least squares regression [17], the major advantage of the SD-PINN is its robustness against noise in the input signals.

THEORY
As for the original PINN, we assume that the kind of PDE is known, and the task is to recover the coefficient of each term in the assumed PDE at all locations within the ROI. In addition, the true PDE coefficients at the spatial boundaries of the ROI are given in this work, and we also know the sign (non-positive or non-negative) of each coefficient to be recovered. Given that the kind of PDE is known, the sign information is not a strong additional assumption because it is fixed by the physical background of the PDE. For example, in (2) the coefficient $-c^2$ of $U_{xx}$ must be non-positive since the phase speed $c$ is a real number, and $\alpha$ must be non-negative for a system without external energy input.

Formulation
We fix one term of the PDE as the left-hand side (LHS) with its coefficient arbitrarily set to one at every location, and recover the coefficients of all other terms, placed on the right-hand side (RHS), at all locations. In this part we use the wave equation as the example; the method works in the same way for other PDEs. We focus on time-invariant homogeneous PDEs here, e.g., $N[U] = 0$ in (2), which indicates that there is no source in the ROI and that $\alpha$ and $c^2$ do not change with time.
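Since $N[U] = 0$, we can write the wave equation (2) as

$$U_{tt} = -\alpha U_t + c^2 U_{xx}. \tag{3}$$

For $M$ spatial locations within the ROI, if the PDE is spatially dependent, then (3) becomes

$$U_{tt}(x_m, t) = -\alpha_m U_t(x_m, t) + c_m^2 U_{xx}(x_m, t), \quad m = 1, \ldots, M. \tag{4}$$

Generally, if there are $K$ PDE terms $d_k$, $k = 1, \ldots, K$, on the RHS with coefficients $\lambda_k$, $k = 1, \ldots, K$, then for location $m$ at time step $j$ the $k$th term on the RHS with its coefficient is

$$\lambda_{mk} d_{mk}^j, \tag{5}$$

where $d_{mk}^j$ is the $k$th PDE term evaluated at location $m$ and time $j$, and the coefficient $\lambda_{mk}$ is identical for all times $j$ because the coefficients are time-invariant. For all $M$ locations, the RHS can be denoted by

$$\sum_{k=1}^{K} \boldsymbol{\lambda}_k \circ \mathbf{d}_k^j, \tag{6}$$

where $\boldsymbol{\lambda}_k = [\lambda_{1k}, \ldots, \lambda_{Mk}]^T$, $\mathbf{d}_k^j = [d_{1k}^j, \ldots, d_{Mk}^j]^T$, and $\circ$ denotes the element-wise product. For the wave equation (4), $K = 2$, and $d_{m1}^j$, $d_{m2}^j$ are $U_t$ and $U_{xx}$, respectively; $\lambda_{m1}$ and $\lambda_{m2}$ denote $-\alpha_m$ and $c_m^2$. Equation (6) indicates one difference between this work and the conventional PINN [1-3], in which there are only $K$ unknown PDE coefficients $\{\lambda_1, \ldots, \lambda_K\}$ to be recovered, since the PDE is assumed to be spatially independent and the coefficients are thus identical for all locations.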
Let the LHS at the same location $m$ and time step $j$ be denoted by $\ell_m^j$; then the PDE is written as

$$\ell_m^j = \sum_{k=1}^{K} \lambda_{mk} d_{mk}^j. \tag{7}$$

For the wave equation (4), $\ell_m^j$ is $U_{tt}$ evaluated at time $j$ and location $m$.
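As a minimal sketch of the formulation above, the following Python snippet evaluates the mismatch between the LHS term and the coefficient-weighted RHS terms at every location and time step. The function name `pde_residual`, the nested-list data layout and the toy numbers are assumptions made for illustration only; they are not taken from the SD-PINN implementation.

```python
def pde_residual(lhs, terms, coeffs):
    """Residual lhs[m][j] - sum_k coeffs[m][k] * terms[k][m][j].

    lhs[m][j]      : LHS term (e.g. U_tt) at location m, time step j
    terms[k][m][j] : k-th RHS term (e.g. U_t, U_xx) at location m, time step j
    coeffs[m][k]   : time-invariant, location-dependent coefficient
    """
    M, T, K = len(lhs), len(lhs[0]), len(terms)
    return [
        [lhs[m][j] - sum(coeffs[m][k] * terms[k][m][j] for k in range(K))
         for j in range(T)]
        for m in range(M)
    ]


# Toy data (hypothetical): M = 2 locations, T = 2 time steps, K = 2 terms.
u_t = [[1.0, 2.0], [0.5, -1.0]]
u_xx = [[3.0, -1.0], [2.0, 4.0]]
lam = [[-0.1, 4.0], [-0.2, 9.0]]  # each row is [-alpha_m, c_m^2]

# Build U_tt so that the spatially dependent wave equation holds exactly.
u_tt = [[lam[m][0] * u_t[m][j] + lam[m][1] * u_xx[m][j] for j in range(2)]
        for m in range(2)]

res = pde_residual(u_tt, [u_t, u_xx], lam)
```

With coefficients that satisfy the PDE exactly, every entry of `res` vanishes; a wrong coefficient at one location only changes the residuals of that location, which is what makes a per-location recovery possible.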

Loss functions
When training the SD-PINN, we minimize the overall loss

$$\mathrm{loss} = \mathrm{loss}_u + w_f \, \mathrm{loss}_f + w_{sm} \, \mathrm{loss}_{sm} + w_b \, \mathrm{loss}_b + w_{si} \, \mathrm{loss}_{si}, \tag{8}$$

which is a linear combination of five losses $\{\mathrm{loss}_u, \mathrm{loss}_f, \mathrm{loss}_{sm}, \mathrm{loss}_b, \mathrm{loss}_{si}\}$ with weights $w_f$, $w_{sm}$, $w_b$ and $w_{si}$. They can be grouped into three categories: (i) the data fitting loss $\mathrm{loss}_u$ is a function of only the neural network parameters $\theta$ (weights and biases); (ii) the functional loss $\mathrm{loss}_f$ is a function of both $\theta$ and the PDE coefficients $\lambda$; (iii) the smoothness, boundary and sign losses $\{\mathrm{loss}_{sm}, \mathrm{loss}_b, \mathrm{loss}_{si}\}$ are functions of only the PDE coefficients $\lambda$. They are detailed as follows.
The SD-PINN first learns the dynamics by maximizing the similarity between the composite function described by the neural network and the true function that maps the time steps $t$ and spatial locations $x$ to the corresponding physical measurements $u$ in the training dataset. Assuming there are $M$ locations and $T$ time indices in the training dataset, the method minimizes the data fitting loss $\mathrm{loss}_u$ in (9), which measures the mismatch between the measurements and the network outputs at the measured coordinates. The network shown in Fig. 1, parametrized by $\theta$ (including the weights $\mathbf{W}_l$ and biases $\mathbf{b}_l$ of all $L$ layers), is denoted by $\mathrm{Net}_\theta$; it is a nonlinear function of $(x, t)$ in which the tanh activation is applied element-wise.
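The data fitting step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: the layer sizes, the uniform initialization, the linear output layer and the normalization of the loss by $MT$ are all choices made here for the example, and the names `init_params`, `net` and `data_loss` are hypothetical.

```python
import math
import random


def init_params(layer_sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        params.append((W, b))
    return params


def net(params, x, t):
    """Map one coordinate (x, t) to a scalar estimate of the field."""
    h = [x, t]
    last = len(params) - 1
    for i, (W, b) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, h)) + b_i for row, b_i in zip(W, b)]
        # tanh on hidden layers only; the linear output layer is an assumption.
        h = z if i == last else [math.tanh(v) for v in z]
    return h[0]


def data_loss(params, xs, ts, u):
    """Mean squared mismatch between measurements u[m][j] and the network."""
    total = sum((u[m][j] - net(params, x, t)) ** 2
                for m, x in enumerate(xs) for j, t in enumerate(ts))
    return total / (len(xs) * len(ts))


params = init_params([2, 8, 8, 1])       # inputs (x, t), two hidden layers, output u
xs, ts = [0.0, 0.5, 1.0], [0.0, 0.1]     # M = 3 locations, T = 2 time indices
u_exact = [[net(params, x, t) for t in ts] for x in xs]
```

Here `data_loss(params, xs, ts, u_exact)` is zero by construction, and shifting every measurement by a constant raises the loss by the square of that constant. In practice the parameters would be updated by a gradient-based optimizer rather than evaluated once as in this sketch.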
After the estimated measurement $\hat{u}_{m,j} = \mathrm{Net}_\theta(x_m, t_j)$ is calculated, the derivatives involved in the PDE are computed by automatic differentiation [18]. The estimated $u_x$ at $(x_m, t_j)$ is obtained by evaluating the derivative $\partial \mathrm{Net}_\theta(x, t) / \partial x$ at $(x_m, t_j)$; this derivative is also a function parametrized by $\theta$. Automatic differentiation is also used to compute $u_t$ and the LHS $\ell_m^j$, e.g., the estimated $u_{tt}$ in (3). After $d_{mk}^j$ and $\ell_m^j$ are estimated by automatic differentiation, inspired by [1-3], $\lambda_{mk}$ could be estimated by minimizing (10), the mismatch between the two sides of (7) with the RHS terms given by (5). However, we use a modified version of (10), as detailed in the following.
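To illustrate why the derivative of the network is itself a function of the network parameters, the sketch below writes out the chain rule by hand for a network with a single tanh hidden layer and compares it with a central finite difference. Automatic differentiation frameworks apply the same rule mechanically to networks of any depth. The one-hidden-layer architecture, the parameter values and the names `net` and `net_dx` are assumptions chosen for the example.

```python
import math


def net(W1, b1, w2, b2, x, t):
    """u = w2 . tanh(W1 [x, t]^T + b1) + b2 with one hidden layer."""
    return sum(w * math.tanh(r[0] * x + r[1] * t + b)
               for w, r, b in zip(w2, W1, b1)) + b2


def net_dx(W1, b1, w2, b2, x, t):
    """Exact du/dx from the chain rule, using d tanh(z)/dz = 1 - tanh(z)^2."""
    return sum(w * (1.0 - math.tanh(r[0] * x + r[1] * t + b) ** 2) * r[0]
               for w, r, b in zip(w2, W1, b1))


# Hypothetical parameters for three hidden units.
W1 = [[0.7, -0.3], [-1.1, 0.4], [0.2, 0.9]]
b1 = [0.1, -0.2, 0.05]
w2 = [0.5, -0.8, 1.2]
b2 = 0.3

x0, t0, h = 0.4, 1.5, 1e-5
exact = net_dx(W1, b1, w2, b2, x0, t0)
approx = (net(W1, b1, w2, b2, x0 + h, t0)
          - net(W1, b1, w2, b2, x0 - h, t0)) / (2 * h)
```

The two values agree up to the truncation error of the finite difference, while `net_dx` involves no step size and reuses exactly the parameters of `net`.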
To increase the robustness against noise, we add "virtual measurements" evenly located between neighboring true measurements in both space and time. Let us evenly insert $P$ virtual measurements between every two neighboring spatial locations and $Q$ virtual measurements between every two neighboring time points; we then have $(M-1)(P+1)+1$ spatial coefficients for each PDE term, and every such spatial coefficient is invariant across all $(T-1)(Q+1)+1$ time steps. These virtual measurements are computed from the forward pass of the SD-PINN, e.g.,

$$\hat{u}_{\left(m+\frac{p}{P+1}\right),\left(j+\frac{q}{Q+1}\right)} = \mathrm{Net}_\theta(x, t)\Big|_{x = x_m + \frac{p}{P+1}\Delta x,\; t = t_j + \frac{q}{Q+1}\Delta t}$$

for $p \in \{1, \ldots, P\}$ and $q \in \{1, \ldots, Q\}$, where $\Delta x$ and $\Delta t$ are the step sizes between neighboring true measurements along space and time. In this work, we use $P = 1$ for simplicity, so that only one virtual measurement is inserted at the midpoint of two neighboring spatial locations. We define the functional loss $\mathrm{loss}_f$ in (11). Updating (10) to (11) means that the recovered PDE at any location $m$ must also describe the dynamics at the $Q$ inserted time steps between every pair of neighboring time steps of the true $T$ observations. Meanwhile, all recovered $\lambda_{mk}$ with non-integer $m$ are used for smoothing. The smoothness penalty $\mathrm{loss}_{sm}$ is given in (12). Minimizing (12) encourages a smoother transition of the recovered PDE coefficients between two neighboring sensors.
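The bookkeeping for the virtual measurements can be sketched as below. The helper `refine_grid` reproduces the point counts stated above; `smoothness_penalty` uses a sum of squared differences between neighboring coefficients, which is an assumed form chosen for illustration and may differ from the penalty actually used in (12). Both function names are hypothetical.

```python
def refine_grid(points, n_insert):
    """Insert n_insert evenly spaced virtual points between neighbours."""
    refined = []
    for a, b in zip(points[:-1], points[1:]):
        for p in range(n_insert + 1):
            refined.append(a + p * (b - a) / (n_insert + 1))
    refined.append(points[-1])
    return refined


def smoothness_penalty(lam):
    """Sum of squared differences between neighbouring coefficients.

    lam holds one coefficient per point of the refined spatial grid.
    The squared-difference form is an assumption made for this sketch.
    """
    return sum((b - a) ** 2 for a, b in zip(lam[:-1], lam[1:]))


# M = 21 true locations with P = 1 gives (M - 1)(P + 1) + 1 = 41 points.
space = refine_grid(list(range(3, 24)), 1)

# T = 191 true time steps with Q = 3 gives (T - 1)(Q + 1) + 1 = 761 points.
time = refine_grid(list(range(5, 196)), 3)
```

A constant coefficient profile gives zero penalty, while any jump between neighboring points is penalized quadratically, so the virtual midpoints tie the coefficients of adjacent sensors together.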
To accelerate the training, we assume that the PDE coefficients on the spatial boundaries of the ROI are known. We thus define the boundary loss $\mathrm{loss}_b$ in (13) according to the difference between the estimated and true coefficients on the boundaries. In addition, the assumed sign of each unknown PDE coefficient is fixed. For example, in the wave equation (4), $c_m^2$ must be non-negative for all $m$ to be physically meaningful, as it is the squared phase speed. Thus a loss function $\mathrm{loss}_{si}$ that penalizes any $\lambda_{mk}$ whose value violates the desired sign can be designed. We define $\mathrm{loss}_{si}$ in (14), where $\mathrm{sign}(x) = 1$ for $x > 0$ and $-1$ for $x < 0$. Note that $\mathrm{sign}(\lambda_{mk})$ is determined by the assumed PDE form; it therefore does not depend on the recovered value of $\lambda_{mk}$ and is independent of $m$. For example, in (4), $\mathrm{sign}(\lambda_{mk}) = -1$ for $k = 1$ since $\lambda_{m1}$ denotes $-\alpha_m \le 0$ (if $\lambda_{mk}$ can be zero, as here, we only consider its possible sign when it is non-zero). ReLU is the rectified linear unit, defined as $\mathrm{ReLU}(x) = x$ for $x > 0$ and 0 otherwise. By minimizing (14), $\lambda_{mk}$ is encouraged to have its assumed sign. Fig. 2 illustrates how the inputs, the neural network parameter $\theta$, and the PDE coefficients $\lambda$ are related by the losses discussed above.
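A possible reading of the boundary and sign losses is sketched below. The exact expressions are assumptions: the sign loss is written as the sum of $\mathrm{ReLU}(-\mathrm{sign}(\lambda_{mk})\lambda_{mk})$, which is zero whenever a coefficient has its assumed sign, and the boundary loss as a sum of squared differences at the first and last locations. The function names and the toy values are hypothetical.

```python
def relu(x):
    """Rectified linear unit: x for x > 0 and 0 otherwise."""
    return x if x > 0 else 0.0


def sign_loss(lam, signs):
    """Penalize coefficients whose sign differs from the assumed one.

    lam[m][k] : recovered coefficient of term k at location m
    signs[k]  : assumed sign (+1 or -1) of term k, the same for every m
    """
    return sum(relu(-signs[k] * row[k])
               for row in lam for k in range(len(signs)))


def boundary_loss(lam, true_first, true_last):
    """Squared mismatch of the coefficients at the two spatial boundaries."""
    return sum((lam[0][k] - true_first[k]) ** 2
               + (lam[-1][k] - true_last[k]) ** 2
               for k in range(len(true_first)))


# Wave equation: k = 0 is -alpha_m (non-positive), k = 1 is c_m^2 (non-negative).
signs = [-1, +1]
lam = [[-0.1, 4.0],   # both signs as assumed: no penalty
       [0.2, -1.0]]   # both signs violated: penalty 0.2 + 1.0

loss_si = sign_loss(lam, signs)
loss_b = boundary_loss(lam, [-0.1, 4.0], [0.0, 1.0])
```

Because the ReLU is zero for correctly signed coefficients, this term only becomes active when the optimizer drifts into a physically meaningless region.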

Robustness to noise
The advantage of the SD-PINN over PDE recovery methods based on finite difference (FD) [19] is its robustness against noise. For the SD-PINN, the LHS $\ell$ and the RHS PDE terms $d$ in (6) are both computed by automatic differentiation from the neural network estimate $\hat{u} = \mathrm{Net}_\theta(x, t)$, which is expected to be less noisy than the noisy measurements $u$ because it is regularized by $\mathrm{loss}_f$, $\mathrm{loss}_{sm}$ and $\mathrm{loss}_{si}$ in addition to the data fitting loss $\mathrm{loss}_u$. Both $\mathrm{loss}_f$ and $\mathrm{loss}_{si}$ help suppress the noise by introducing physical constraints from the assumed PDE knowledge, and $\mathrm{loss}_{sm}$ helps suppress the noise by introducing a smoothness constraint between the recovered coefficients at neighboring locations.
We use two least squares regression (LSQ) based methods as baselines for comparison, in which the PDE terms are computed directly by FD from the noisy measurements and LSQ is then applied to find the PDE coefficients. For example, if the aim is to recover $-\alpha_m$ and $c_m^2$ in (4), the $T \times 1$ vectors $\mathbf{u}_t$, $\mathbf{u}_{tt}$ and $\mathbf{u}_{xx}$ are first computed by FD using the data around the $m$-th location in the noisy measurements, and then $[-\alpha_m \;\; c_m^2]^T = [\mathbf{u}_t \;\; \mathbf{u}_{xx}]^{\dagger} \mathbf{u}_{tt}$, where $\dagger$ denotes the pseudo-inverse (which is the least squares solution of $\mathbf{u}_{tt} = [\mathbf{u}_t \;\; \mathbf{u}_{xx}] [-\alpha_m \;\; c_m^2]^T$). If a naive FD [19] is used to compute the PDE terms, we name the method FD-LSQ. The naive FD can be made more robust against noise by adding a total variation (TV) regularization [20], and we name the method using this TV-regularized FD TVR-FD-LSQ.
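The regression step of the LSQ baselines can be sketched in a few lines. For a two-column matrix with full column rank, applying the pseudo-inverse is equivalent to solving the 2 × 2 normal equations, which is what the hypothetical helper `lsq_two_terms` below does in plain Python. The derivative vectors are synthetic and noise-free here; in the baselines they would come from finite differences of the noisy measurements.

```python
def lsq_two_terms(u_t, u_xx, u_tt):
    """Least-squares solution (a, b) of u_tt ~ a * u_t + b * u_xx.

    Solves the 2 x 2 normal equations, which matches the pseudo-inverse
    solution when the columns u_t and u_xx are linearly independent.
    """
    s11 = sum(a * a for a in u_t)
    s12 = sum(a * b for a, b in zip(u_t, u_xx))
    s22 = sum(b * b for b in u_xx)
    r1 = sum(a * y for a, y in zip(u_t, u_tt))
    r2 = sum(b * y for b, y in zip(u_xx, u_tt))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("u_t and u_xx are linearly dependent")
    return (s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det


# Synthetic derivatives at one location m (hypothetical values, T = 4).
u_t = [1.0, 2.0, 0.5, -1.0]
u_xx = [3.0, -1.0, 2.0, 4.0]
minus_alpha, c2 = -0.1, 4.0
u_tt = [minus_alpha * a + c2 * b for a, b in zip(u_t, u_xx)]

est_minus_alpha, est_c2 = lsq_two_terms(u_t, u_xx, u_tt)
```

With exact derivatives the true coefficients are recovered; the weakness of the baselines lies in the finite-difference step, which amplifies measurement noise before the regression is carried out.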

EXPERIMENTS
We conduct two experiments, one for recovering one PDE coefficient (phase speed) from measurements with large noise, and the other for recovering two PDE coefficients (phase speed and the more implicit attenuation factor) from observations with noise.The datasets are generated by finite difference modeling [21].In both experiments, the SD-PINN is trained by Adam [22].

Recovering phase speeds from noisy measurements
We assume no attenuation in the wave field for this experiment, so $\alpha_m = 0$ is set everywhere in (4). There is thus only $K = 1$ PDE coefficient to recover, namely $c_m^2$, represented by $\lambda_{m1}$. The network has 9 layers, and $w_f = w_{sm} = w_b = w_{si} = 10$ in the loss function (8). We set $Q = 3$. The wavefield is shown in Fig. 3a, in which the ROI is between $[3, 23]\Delta x$ in space (thus $M = 21$) and $[5, 195]$

Recovering attenuation and phase speeds from noisy data
With attenuation, the wave equation (4) is used. The attenuation $\alpha$ is harder to recover since its absolute value can be much smaller, and the wave propagation is easier to observe than the attenuation.
To recover (4), we still keep $U_{tt}$ on the LHS and set the number of PDE terms on the RHS to $K = 2$. The index $k = 1$ is for $U_t$ and $k = 2$ is for $U_{xx}$. Thus, $\lambda_{m1}$ is $-\alpha$ and $\lambda_{m2}$ is $c^2$ at location $m$.
We use a dataset similar to that in Sec. 3.1 for this experiment, see Fig. 3b. The initial state and the phase speed distribution are the same as before; the only difference is that the attenuation is nonzero for a part of the ROI, as shown by the "True" line in Fig. 4b. The measurements are polluted by zero-mean Gaussian noise with SNR = 30 dB.
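For readers who want to reproduce a comparable noise level, the sketch below adds zero-mean Gaussian noise to a signal at a prescribed SNR. It assumes the common definition of the SNR as ten times the base-10 logarithm of the ratio of mean signal power to noise power; the paper does not spell out its definition, and the names `add_noise` and `snr_db` as well as the sinusoidal test signal are hypothetical.

```python
import math
import random


def add_noise(signal, snr_db_target, seed=0):
    """Return signal plus zero-mean Gaussian noise at the requested SNR (dB)."""
    rng = random.Random(seed)
    power = sum(s * s for s in signal) / len(signal)
    sigma = math.sqrt(power / 10 ** (snr_db_target / 10))
    return [s + rng.gauss(0.0, sigma) for s in signal]


def snr_db(clean, noisy):
    """Empirical SNR in dB, assuming SNR = 10 log10(P_signal / P_noise)."""
    p_signal = sum(c * c for c in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)


clean = [math.sin(0.01 * i) for i in range(10000)]
noisy = add_noise(clean, 30.0)
measured = snr_db(clean, noisy)
```

For a long signal the empirical value `measured` is close to the requested 30 dB, with small deviations due to the finite number of noise samples.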
The network has 5 layers here, and $w_f = w_{sm} = w_b = w_{si} = 10$ in the loss function (8). We set $Q = 1$. The SD-PINN works well: the recovered $\alpha$ and $c^2$ are closer to the ground truth than those of the baseline methods, as shown in Fig. 4b. The MSEs between the true PDE coefficients and those recovered by the various methods are recorded in the rows for Sec. 3.2 in Table 1, in which the coefficients recovered by SD-PINN have much smaller MSEs.

CONCLUSION
In this work, we proposed a neural network termed SD-PINN that can recover spatially dependent PDE coefficients using only one network and without domain knowledge specific to a particular application. The network is a simple FCN, and the physical information about the PDE is encoded in the loss functions. The SD-PINN is robust to noise, as demonstrated in two experiments.

Fig. 1: The structure of the neural network used in PINN and SD-PINN.

Fig. 2: A demonstration of the losses of SD-PINN. For each loss, only a small part of the summation, involving $(x_m, t_j)$, $(x_{m+0.5}, t_j)$ and $(x_{m+1}, t_j)$, is shown. The input data are shown in blue. The spatial index $m$ and time index $j$ correspond to a true measurement instead of an inserted "virtual measurement", and $u_{m+0.5,j}$ does not affect $\mathrm{loss}_u$ since it is virtual. The dashed lines are for automatic differentiation, which is computed by evaluating functions parametrized by $\theta$ at given coordinates, e.g., $(x_m, t_j)$. The $\mathrm{loss}_b$ is included only if $m = 1$ or $m + 1 = M$ (i.e., at boundaries). The loss functions are functions of $\theta$ and/or $\lambda$, and the neural network is parametrized by $\theta$. The three $\mathrm{Net}_\theta$ blocks denote just one neural network parametrized by $\theta$ and structured as in Fig. 1, with the difference being the inputs and outputs.
$\Delta t$ in time as indicated by the red lines. The governing PDE for the field is the wave equation (4) without the $U_t$ term, and the spatially dependent $c^2$ is indicated by the "True" line in Fig. 4a. The measurements of the field are polluted by zero-mean Gaussian noise with the signal-to-noise ratio SNR = 20 dB. The recovered $\lambda_{m1}$, i.e., $c^2$ with respect to location, is shown in Fig. 4a. The phase speeds recovered by least squares regression (LSQ) are also included for comparison, in which $U_{tt}$ and $U_{xx}$ are first calculated numerically by naive finite difference or TVR finite difference, and the coefficient for $U_{xx}$ is then computed by least squares regression. As shown in the row for Sec. 3.1 in Table 1, the spatially dependent $c^2$ recovered by SD-PINN has a much smaller MSE (mean squared error) with respect to the true values.

Fig. 3: The noisy measurements of (a) the waves without attenuation; (b) the waves with attenuation. The ROI covering $3 \le x \le 23$ and $5 \le t \le 195$ is indicated by red lines.

Fig. 4: (a) For the noisy data shown in Fig. 3a, the true $c^2$ and the recovered values from SD-PINN and the least squares regression (LSQ). (b) For the noisy data shown in Fig. 3b, the true $-\alpha$, $c^2$ and the recovered values from SD-PINN and the LSQ.

Table 1: For Sec. 3.1 and 3.2, the MSE between the true PDE coefficients and the recovered ones from various methods.

| Experiment | Coefficient | SD-PINN | FD-LSQ | TVR-FD-LSQ |
| --- | --- | --- | --- | --- |
| Sec. 3.1 | $c^2$ | $1.25 \times 10^{-2}$ | $7.87 \times 10^{-1}$ | $1.49$ |
| Sec. 3.2 | $-\alpha$ | $1.80 \times 10^{-5}$ | $5.31 \times 10^{-4}$ | $1.84 \times 10^{-4}$ |
| Sec. 3.2 | $c^2$ | $5.41 \times 10^{-3}$ | $2.30 \times 10^{-1}$ | $2.52 \times 10^{-1}$ |