Gradient Harmony Optimization-driven Physics-informed Neural Network for Solving the Neutron Diffusion Equation

  • Abstract: In recent years, the rise of physics-informed neural networks (PINNs) has provided a new approach to solving the reactor neutron diffusion equation. However, conventional PINN training often suffers from low optimization efficiency and unstable convergence, because the gradient directions of the multiple terms in the loss function conflict with one another. This paper proposes a residual-adaptive resampling physics-informed neural network architecture driven by gradient harmonization optimization (GHO_R2-PINN). The method adopts a convolutional neural network backbone with shortcut connections, combined with a residual-adaptive resampling mechanism and a fixed-kernel convolutional derivative computation, which effectively mitigates vanishing gradients and training instability. The GHO strategy computes the pseudo-inverse of the matrix of loss gradients to obtain an optimization direction free of conflicts among the loss terms, greatly improving training efficiency and convergence stability. Numerical experiments on the classic one-dimensional bare slab reactor and the IAEA 2D benchmark show that, compared with the conventional PINN architecture, the GHO_R2-PINN framework reduces training time by about 50% and improves overall prediction accuracy by about 1.5 times; for the IAEA 2D benchmark, training time drops from about 4 h with traditional methods to about 1.5 h. The proposed method offers clear advantages in convergence speed, accuracy, and training stability, providing an efficient and reliable option for future computations of complex problems.
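As a concrete illustration of the one-dimensional bare slab problem mentioned above, the sketch below checks that the analytic cosine flux of a critical one-group bare slab satisfies the diffusion-equation residual that a PINN would drive to zero. The cross sections and slab width are hypothetical illustrative values, not the paper's data, and a finite-difference second derivative stands in for automatic differentiation.

```python
import numpy as np

# Hypothetical one-group constants (illustrative only, not from the paper)
D, sig_a, a = 1.0, 0.07, 50.0       # diffusion coef. (cm), absorption (1/cm), half-width (cm)
B = np.pi / (2.0 * a)               # geometric buckling of a critical bare slab
nu_sig_f = sig_a + D * B**2         # choose the fission source so that k_eff = 1

x = np.linspace(-a, a, 401)
phi = np.cos(B * x)                 # analytic critical flux shape, zero at +/- a

# PDE residual a PINN would minimize: -D*phi'' + Sigma_a*phi - nu*Sigma_f*phi
phi_xx = np.gradient(np.gradient(phi, x), x)   # finite-difference stand-in for autodiff
residual = -D * phi_xx + sig_a * phi - nu_sig_f * phi

print(float(np.max(np.abs(residual[5:-5]))))   # interior residual is ~0
```

Because phi'' = -B^2 * phi for the cosine mode, the interior residual vanishes up to finite-difference truncation error, which is exactly the condition the PDE term of the PINN loss enforces.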


    Abstract: Solving the neutron diffusion equation is a fundamental problem in nuclear reactor physics calculations. In recent years, physics-informed neural networks (PINNs), a paradigm that relies on neither traditional meshing nor analytical solutions, have been introduced for the reactor neutron diffusion equation and are gradually entering the mainstream. Conventional PINNs, however, have a core limitation: the loss function is typically composed of multiple sub-objectives, such as the partial differential equation (PDE) residual term, the initial condition error term, and the boundary condition error term. During training, the gradient directions of these sub-terms frequently exhibit severe conflicts and scale differences, so the network becomes trapped in local optima, optimization efficiency is low, convergence is slow, and predictions are unstable. This gradient conflict is particularly pronounced in complex coupled scenarios such as multi-group, multi-material reactors, and has become a critical bottleneck limiting PINN performance. This paper proposes a residual-adaptive resampling physics-informed neural network architecture driven by gradient harmonization optimization (GHO_R2-PINN), aimed at systematically mitigating the negative effects of this gradient imbalance in practical optimization. The GHO_R2-PINN architecture first adopts a convolutional neural network with shortcut connections (S-CNN) as its backbone, which alleviates the vanishing gradient problem common in deep network training and enhances generalization. On top of this backbone, it incorporates a residual-adaptive resampling (RAR) mechanism that dynamically adjusts the distribution of sampling points according to the physical equation residuals during training.
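The RAR idea described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the residual function, pool size, and batch size below are all hypothetical. The step evaluates the PDE residual on a large candidate pool and moves the k worst candidates into the collocation set.

```python
import numpy as np

rng = np.random.default_rng(0)

def pde_residual(x):
    """Stand-in for |PDE residual| of the current network; peaked near x = 0.8."""
    return np.exp(-50.0 * (x - 0.8) ** 2)

train_pts = rng.uniform(0.0, 1.0, size=64)   # current collocation points

def rar_step(train_pts, n_candidates=1024, k=16):
    """One residual-adaptive resampling step: add the k candidates
    with the largest residual magnitude to the training set."""
    cand = rng.uniform(0.0, 1.0, size=n_candidates)
    worst = cand[np.argsort(pde_residual(cand))[-k:]]
    return np.concatenate([train_pts, worst])

train_pts = rar_step(train_pts)
# The newly added points cluster where the residual is largest (near x = 0.8)
print(float(train_pts[-16:].mean()))
```

Repeating this step during training concentrates sampling where the network currently fits the physics worst, which is the behavior the abstract attributes to RAR.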
This enables the model to concentrate, in the later stages of optimization, on regions with larger errors and more demanding physical constraints, improving the fit to complex geometries and physical field variations. Concurrently, the GHO optimization strategy dynamically generates a single conflict-free optimization direction by computing the pseudo-inverse of the matrix formed by all loss gradients. This direction forms an acute angle with every independent loss gradient, so the multi-objective sub-losses descend in a coordinated, consistent manner, and no single loss term dominates training at the expense of the others' accuracy. Furthermore, exploiting the complementary characteristics of GHO and traditional optimizers, this paper also proposes a hybrid "fast start, stable end" optimization strategy: in the initial training phase, GHO significantly accelerates robust convergence, strengthens global search, and rapidly steers the network parameters away from poor local minima. After a set number of iterations, optimization transitions smoothly to a traditional optimizer (such as L-BFGS or Adam), which completes the final stage of fine error reduction and stable convergence in the high-accuracy regime. The effectiveness and robustness of the GHO_R2-PINN architecture were validated on three theoretical benchmarks and engineering cases: the nonlinear Burgers equation, the one-group, one-material 1D bare reactor problem, and the multi-group, multi-material IAEA 2D benchmark. In the Burgers equation experiment, the GHO method reduces the maximum prediction error by approximately a factor of 15 compared with the original gradient method.
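The pseudo-inverse construction described above can be illustrated directly. The sketch below is a hedged rendering of the general conflict-free-gradient idea, not necessarily the paper's exact algorithm: stacking the per-loss gradients into a matrix G and solving G d = 1 with the Moore-Penrose pseudo-inverse yields a direction d whose inner product with every loss gradient is positive, i.e. d forms an acute angle with each of them even when the raw gradients conflict.

```python
import numpy as np

# Two deliberately conflicting loss gradients (their angle exceeds 90 degrees):
g1 = np.array([1.0, 0.2, 0.0])
g2 = np.array([-0.9, 1.0, 0.1])
G = np.vstack([g1, g2])            # one row per loss term

assert g1 @ g2 < 0                 # a naive summed gradient would be conflicted

# Conflict-free direction: solve G d = 1 via the Moore-Penrose pseudo-inverse,
# so the first-order descent rate along every loss gradient is equal and positive.
d = np.linalg.pinv(G) @ np.ones(G.shape[0])

print(G @ d)                       # both entries positive -> acute angle with each g_i
```

Because the rows of G are linearly independent and there are fewer losses than parameters, the pseudo-inverse returns the minimum-norm d with G d = 1 exactly, so stepping along -d decreases every loss term simultaneously.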
Compared with the traditional PINN architecture, GHO_R2-PINN shortens total training time by about 50% and improves overall prediction accuracy by approximately 1.5 times. For the IAEA 2D benchmark in particular, the optimal hybrid strategy reduces training time from about 4 hours with traditional methods to about 1.5 hours, demonstrating its efficiency in practical applications. Moreover, once training is complete, a single neutron flux prediction finishes in milliseconds, providing good real-time responsiveness. In summary, the proposed GHO_R2-PINN architecture effectively mitigates the critical problem of gradient competition in PINN training and markedly improves convergence speed, prediction accuracy, and training stability. The efficient and reliable solution presented here offers a promising technical pathway for future coupled multi-physics calculations in nuclear reactor design.
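The "fast start, stable end" hybrid schedule can be sketched on a toy two-objective problem. Everything below is a hypothetical illustration: the quadratic losses, step sizes, norm cap, and switch point are invented, and plain gradient descent stands in for Adam/L-BFGS in the second phase.

```python
import numpy as np

# Toy two-objective problem standing in for PDE/BC loss terms (hypothetical):
A = np.array([1.5, 0.5])
B = np.array([0.5, 1.5])
loss_terms = lambda w: (np.sum((w - A) ** 2), 10.0 * np.sum((w - B) ** 2))
grad_rows  = lambda w: np.vstack([2.0 * (w - A), 20.0 * (w - B)])

w = np.array([5.0, -5.0])
for step in range(200):
    G = grad_rows(w)
    if step < 50:
        # Phase 1, "fast start": GHO-style conflict-free direction via the
        # pseudo-inverse, with its norm capped for stability in this toy setting.
        d = np.linalg.pinv(G) @ np.ones(2)
        d /= max(1.0, np.linalg.norm(d))
        w = w - 0.5 * d
    else:
        # Phase 2, "stable end": plain gradient descent on the summed loss,
        # standing in for a traditional optimizer such as Adam or L-BFGS.
        w = w - 0.02 * G.sum(axis=0)

print(w, float(sum(loss_terms(w))))
```

The first phase moves the parameters quickly toward the region where both losses are small without letting the larger-scale loss dominate; the second phase then settles onto the minimizer of the summed loss, mirroring the two-stage schedule described in the abstract.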

