Neural Operators and Operator Networks vs Parametric Approach: A General Comparison
We explore the use of different neural operator architectures for solving partial differential equations (PDEs). Specifically, we investigate the performance of the Feed-Forward Network (FFN), Deep Operator Network (DeepONet), Fourier Neural Operator (FNO), Convolutional Neural Operator (CNO), and Koopman Neural Operator (KNO) in solving the heat, wave, and Poisson equations. The models are trained and evaluated based on their ability to accurately predict the solutions of these PDEs under varying parameters.
Introduction
Partial differential equations (PDEs) play a crucial role in describing various physical phenomena and are widely used in many scientific and engineering domains. Solving PDEs accurately is essential for understanding and predicting the behavior of complex systems. Traditional numerical methods for solving PDEs often rely on discretization techniques and iterative algorithms, which can be computationally expensive and time-consuming. In recent years, there has been a growing interest in leveraging neural networks and deep learning techniques to solve PDEs more efficiently.
Neural operators have emerged as a promising approach for solving PDEs using neural networks. These architectures aim to learn the underlying equations directly from data, enabling faster and more accurate predictions of PDE solutions. In this paper, we compare and evaluate several neural operator architectures, namely the Feed-Forward Network (FFN), Deep Operator Network (DeepONet), Fourier Neural Operator (FNO), Convolutional Neural Operator (CNO), and Koopman Neural Operator (KNO), in the context of solving the heat, wave, and Poisson equations.
The heat equation describes the diffusion of heat in a medium, while the wave equation models the propagation of waves. On the other hand, the Poisson equation represents a static equation that does not depend on time. These equations have different characteristics and require different approaches for accurate solution approximation. By examining the performance of various neural operator architectures on these equations under varying parameters, we can gain insights into their strengths and weaknesses. Our code can be publicly accessed at https://github.com/walkerchi/DeepLearning-in-ScientificComputing-PartB
Equations
Heat Equation
The heat equation is a fundamental partial differential equation (PDE) that describes the behavior of heat distribution over time. It can be expressed as follows:
$$\frac{\partial u}{\partial t} = \Delta u, \qquad x \in D, \; t \in [0, T]$$
Here, $x = (x_1, x_2)$ represents the position vector, which lies within the domain $D$, indicating that both components $x_1$ and $x_2$ reside in the same interval. The time variable $t$ spans the domain $[0, T]$, where $T$ is the final time.
The problem specification comprises an initial condition $u(x, 0) = u_0(x)$, a boundary condition on $\partial D$, and a closed-form solution that serves as the ground truth.
Wave Equation
The wave equation is another important PDE that describes the propagation of waves, such as sound waves or electromagnetic waves. It can be represented as follows:
$$\frac{\partial^2 u}{\partial t^2} = c^2 \Delta u, \qquad x \in D, \; t \in [0, T]$$
In this equation, $x = (x_1, x_2)$ is the position vector, confined within the domain $D$, indicating that the components $x_1$ and $x_2$ lie within the same interval. The time variable $t$ is restricted to the domain $[0, T]$. The symbol $c$ represents the propagation speed.
As with the heat equation, the problem specification comprises an initial condition, a boundary condition on $\partial D$, and a closed-form solution that serves as the ground truth.
Poisson Equation
The Poisson equation is a static PDE that describes the distribution of scalar fields in various physical phenomena, such as electrostatics or fluid flow. It can be defined as follows:
$$-\Delta u = f, \qquad x \in D$$
Unlike the heat equation and the wave equation, the Poisson equation does not depend on time; $f$ represents the source function.
The problem specification comprises a source function $f$, a boundary condition on $\partial D$, and a closed-form solution that serves as the ground truth.
Methodology
In the following illustration, we sample the spatial points $N$ times for each equation parameter $\mu$. Therefore, the $i$-th sample consists of a position $x_i$ and the corresponding solution value $u_i = u(x_i; \mu)$.
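A minimal sketch of how such pairs $(x_i, u_i)$ might be assembled; the closed-form ground truth `u_exact`, the unit-square domain, and the point count are illustrative assumptions, not the paper's exact setup:

```python
import torch

def make_dataset(u_exact, mu, n_points=1024):
    """Sample positions on a regular mesh and evaluate a (hypothetical)
    closed-form solution u_exact to obtain training pairs (x_i, u_i)."""
    side = int(n_points ** 0.5)
    g = torch.linspace(0.0, 1.0, side)                      # assumed unit-square domain
    x1, x2 = torch.meshgrid(g, g, indexing="ij")
    x = torch.stack([x1.flatten(), x2.flatten()], dim=-1)   # (side*side, 2) positions
    u = u_exact(x, mu)                                      # (side*side,) target values
    return x, u
```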
Parameter Approach: Feed-Forward Network (FFN)
The objective of the parameter methodology is to establish a function $G$ that bridges the position vector $x$ and the parameter $\mu$ of the equation to the resulting value of the equation, $G: (x, \mu) \mapsto u(x; \mu)$.
With the aid of the universal approximation theorem for the Multilayer Perceptron (MLP), we implement an MLP to closely approximate the function $G$.
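A minimal sketch of this parametric model, assuming the 4-layer, hidden-size-64, ReLU configuration described in the Setup section; the input layout (position, time, and parameter concatenated) is an assumption:

```python
import torch
import torch.nn as nn

class FFN(nn.Module):
    """Parametric model G: (x1, x2[, t], mu) -> u(x; mu).
    4 layers, hidden size 64, ReLU, as in the Setup section."""
    def __init__(self, in_dim: int = 4, hidden: int = 64, layers: int = 4):
        super().__init__()
        blocks, d = [], in_dim
        for _ in range(layers):
            blocks += [nn.Linear(d, hidden), nn.ReLU()]
            d = hidden
        blocks.append(nn.Linear(d, 1))      # scalar solution value
        self.net = nn.Sequential(*blocks)

    def forward(self, inp):                 # inp: (batch, in_dim)
        return self.net(inp)
```

For the time-independent Poisson equation, `in_dim` would drop the time component.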
DeepONet
In contrast to the parameter approach, DeepONet adopts a different strategy. It learns a mapping $G: \mu \mapsto u(\cdot\,; \mu)$, which gives rise to a continuous function that maps the position vector $x$ to the value of the equation:
$$G(\mu)(x) = u(x; \mu)$$
The $u(\cdot)$ is a short notation of $u(\cdot\,; \mu)$.
To execute this, DeepONet approximates the continuous function by an inner product of two networks' outputs. It mainly consists of two networks, the branch network $b$ and the trunk network $t$, combined as
$$u(x; \mu) \approx \sum_{k=1}^{p} b_k(\mu)\, t_k(x).$$
In DeepONet, the $i$-th sampled position $x_i$, pertaining to the different parameters $\mu$, is preserved for further inference within the framework. Both $b$ and $t$ take the form of a Multilayer Perceptron (MLP).
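A minimal sketch of this branch-trunk construction, assuming the same 4-layer, hidden-size-64, ReLU MLPs as in the Setup section; the branch input `sensors` (sampled input values) and the latent dimension `p` are illustrative assumptions:

```python
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=64, layers=4):
    blocks, d = [], in_dim
    for _ in range(layers):
        blocks += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    blocks.append(nn.Linear(d, out_dim))
    return nn.Sequential(*blocks)

class DeepONet(nn.Module):
    """u(x; mu) ~ sum_k b_k(mu) * t_k(x): the branch encodes the sampled
    input, the trunk encodes the query coordinate, joined by a dot product."""
    def __init__(self, sensor_dim: int, coord_dim: int = 2, p: int = 64):
        super().__init__()
        self.branch = mlp(sensor_dim, p)   # b: sampled input -> p coefficients
        self.trunk = mlp(coord_dim, p)     # t: query point -> p basis values

    def forward(self, sensors, coords):
        # sensors: (batch, sensor_dim); coords: (batch, n, coord_dim)
        b = self.branch(sensors)                       # (batch, p)
        t = self.trunk(coords)                         # (batch, n, p)
        return torch.einsum("bp,bnp->bn", b, t)        # (batch, n) predictions
```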
Fourier Neural Operator
In contrast to both the parameter approach and the DeepONet approach, the Fourier Neural Operator (FNO) demands that the input points conform to a mesh structure; it maps a function sampled on a mesh to another function on the same mesh. We refer to this type of neural operator as a Mesh Neural Operator.
The FNO layer adopts the form
$$v_{l+1} = \sigma\!\left(W_l v_l + \mathcal{F}^{-1}\big(R_l \cdot \mathcal{F}(v_l)\big) + b_l\right).$$
In this equation, $v_l$ represents the hidden state at the $l$-th layer. $W_l$ refers to the convolution operation with a kernel size of $1$ at the $l$-th layer. The symbols $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the Fourier transform and its inverse, respectively. $R_l$ and $b_l$ stand for the spectral weights and bias of the FNO network at the $l$-th layer, respectively. Finally, $\sigma$ signifies the activation function.
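A minimal sketch of one such layer under these definitions, using PyTorch's FFT; the channel and mode counts are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SpectralConv2d(nn.Module):
    """Fourier part of an FNO layer: FFT -> multiply the retained low
    modes by learned complex weights R -> inverse FFT."""
    def __init__(self, channels: int, modes: int):
        super().__init__()
        self.modes = modes
        scale = 1.0 / (channels * channels)
        self.weights = nn.Parameter(
            scale * torch.randn(channels, channels, modes, modes, dtype=torch.cfloat)
        )

    def forward(self, v):                       # v: (batch, channels, h, w)
        v_hat = torch.fft.rfft2(v)              # (batch, channels, h, w//2 + 1)
        out_hat = torch.zeros_like(v_hat)
        m = self.modes
        out_hat[:, :, :m, :m] = torch.einsum(   # mix channels on low modes only
            "bixy,ioxy->boxy", v_hat[:, :, :m, :m], self.weights
        )
        return torch.fft.irfft2(out_hat, s=v.shape[-2:])

class FNOLayer(nn.Module):
    """v_{l+1} = sigma(W v_l + spectral_conv(v_l) + b), W a 1x1 convolution."""
    def __init__(self, channels: int, modes: int):
        super().__init__()
        self.spectral = SpectralConv2d(channels, modes)
        self.w = nn.Conv2d(channels, channels, kernel_size=1)  # bias acts as b
        self.act = nn.ReLU()

    def forward(self, v):
        return self.act(self.w(v) + self.spectral(v))
```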
Convolutional Neural Operator
Inspired by the architectural principles of UNet and the activation mechanisms of StyleGAN3, the Convolutional Neural Operator (CNO) integrates the structural design of UNet with the Leaky Rectified Linear Unit (LReLU) filter from StyleGAN3, utilized as an activation layer.
The activation layer, as depicted in the corresponding figure, operates through a sequence of transformations on the feature map. Initially, the feature map undergoes an upsampling process, followed by the application of a Finite Impulse Response (FIR) filter. Subsequently, the LeakyReLU activation function is applied, introducing non-linearity to the feature map. A second FIR filter is then applied, followed by a downsampling operation.
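A simplified sketch of this filtered activation; the bilinear resampling used here stands in for the FIR filters of the actual StyleGAN3 implementation, so this is an approximation of the idea rather than the paper's CUDA kernel:

```python
import torch.nn as nn
import torch.nn.functional as F

class FilteredLReLU(nn.Module):
    """Anti-aliased activation in the spirit of CNO/StyleGAN3:
    upsample -> LeakyReLU -> downsample, with resampling standing
    in for the FIR filter pair."""
    def __init__(self, factor: int = 2, negative_slope: float = 0.2):
        super().__init__()
        self.factor = factor
        self.act = nn.LeakyReLU(negative_slope)

    def forward(self, v):                       # v: (batch, channels, h, w)
        h, w = v.shape[-2:]
        v = F.interpolate(v, scale_factor=self.factor,
                          mode="bilinear", align_corners=False)
        v = self.act(v)                         # non-linearity at higher resolution
        return F.interpolate(v, size=(h, w),
                             mode="bilinear", align_corners=False)
```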
Koopman Neural Operator
In order to encapsulate complex long-term dynamics, the Koopman Neural Operator (KNO) has been proposed. This operator is designed to learn the Koopman operator, an infinite-dimensional linear operator that governs all observations of a dynamic system. The KNO is applied to the evolution mapping of the dynamic system’s solution, thereby capturing the system’s behavior over time.
Contrasting with the Fourier Neural Operator, the KNO employs parameter sharing across layers for the spectral convolution:
$$z = \mathcal{F}^{-1}\big(W \cdot \mathcal{F}(v)\big) + b$$
In this equation, $z$ represents the output of the spectral convolution layer, $\mathcal{F}^{-1}$ is the inverse Fourier transformation, $W$ and $b$ are the weight and bias parameters shared across layers, and $\mathcal{F}(v)$ is the Fourier transformation of the input $v$ to the layer.
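A sketch of how this parameter sharing might look, reusing the `SpectralConv2d` from the FNO sketch above; the channel, mode, and step counts are illustrative assumptions:

```python
import torch.nn as nn

class KNOPropagator(nn.Module):
    """KNO-style sharing: one spectral convolution (the learned Koopman
    operator) is reused at every propagation step, unlike FNO, which
    instantiates fresh weights R_l, b_l per layer."""
    def __init__(self, channels: int = 64, modes: int = 12, steps: int = 4):
        super().__init__()
        self.shared = SpectralConv2d(channels, modes)  # single shared operator
        self.steps = steps

    def forward(self, v):
        for _ in range(self.steps):    # apply the same operator repeatedly
            v = self.shared(v)
        return v
```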
Experiment
Setup
- CPU/GPU: We use an Intel i5-11300H CPU and an Nvidia MX450 GPU for these experiments.
- Equations: The equation parameter $\mu$ was systematically varied across a fixed set of values, so a separate set of weights is trained for each parameter value in the experiments. In the heat equation, where the attenuation of high-frequency signals over time is observed, the corresponding parameter was assigned a fixed value. In the wave equation, the propagation speed $c$, the radial decay parameter, and one further constant were assigned fixed values. Notably, in the case of the Poisson equation, which exhibits time-independence, the radial decay parameter was consistently assigned a fixed value.
- Sampling: The mesh sampler is employed for all models. It generates a regular grid of spatial points for the training and validation datasets.
- Model: For the FFN, a Multi-Layer Perceptron (MLP) with 4 layers was used, each layer having a hidden size of 64 and employing a ReLU activation function. For the DeepONet, both the branch and trunk networks were configured with the same settings as the FFN. The Fourier Neural Operator (FNO) and Koopman Neural Operator (KNO) also followed the same settings as the FFN. To keep the complexity comparable across models, the Convolutional Neural Operator (CNO) and UNet used a depth of 3, hidden channel counts within a fixed range, and residual blocks of length 2.
- Training: The Adam optimizer was used for training, with a learning rate of 0.001 and 1000 epochs. Model validation was performed every 100 epochs using the validation dataset, and the best validation model was stored (a minimal training-loop sketch follows this list).
- Acceleration: To expedite the activation computation on the GPU, we utilized the CUDA implementation of the filtered activation from StyleGAN3.
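A minimal training-loop sketch matching these settings (Adam, learning rate 0.001, 1000 epochs, validation every 100 epochs, best model kept); the data loaders and the use of an MSE training loss are assumptions:

```python
import copy
import torch

def train(model, train_loader, val_loader, epochs=1000, lr=1e-3, val_every=100):
    """Adam with lr=0.001 for 1000 epochs; validate every 100 epochs
    and keep the weights with the lowest validation error."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    best_err, best_state = float("inf"), None
    for epoch in range(1, epochs + 1):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        if epoch % val_every == 0:
            model.eval()
            with torch.no_grad():
                err = sum(loss_fn(model(x), y).item() for x, y in val_loader)
            if err < best_err:
                best_err = err
                best_state = copy.deepcopy(model.state_dict())
    if best_state is not None:
        model.load_state_dict(best_state)   # restore the best validation model
    return model
```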
Prediction
In this experiment, we set the equation parameter to a fixed value for the heat equation. A single sample of the initial condition and the corresponding solution was generated. We compare the predictions of each model with the ground truth $u$, as depicted in the Heat Prediction figure, and quantify the error with a relative error metric. Notably, the Fourier Neural Operator and Koopman Neural Operator models exhibit sensitivity to the presence of high-frequency signals. The FFN and DeepONet models struggle to effectively capture the high-frequency components of the signal. Conversely, the Convolutional Neural Operator and UNet models demonstrate good fitting capability, accurately representing the signal.
Varying Parameter
In this experiment, we generated 64 samples for each value of the equation parameter within its range. It is important to note that a separate model was trained for each parameter value of the equation.
The results of the experiment are illustrated in the Heat Varying figure. The shaded bands in the figure represent the uncertainty, while the curve is fitted using logistic regression. Analyzing the figure, it becomes evident that the Koopman Neural Operator consistently achieved the lowest error across all variations of the parameter.
However, it is important to note that no discernible ascending or descending pattern is observed for certain models as the parameter varies. This behavior indicates that the models respond differently to different equations, and the relationship between the parameter and a model's performance is not strictly monotonic.
Conclusion
In conclusion, neural operator architectures show great promise in solving PDEs efficiently and accurately. This study contributes to the understanding of different neural operator architectures and their performance in solving the heat, wave, and Poisson equations. Further research can explore the application of neural operators to more complex and diverse PDEs, as well as the development of hybrid approaches that combine the strengths of different architectures.