Abstract:
CPU-GPU heterogeneous systems offer a way to accelerate whole-core MOC (method of characteristics) neutron transport calculations. A performance analysis model was proposed to identify the factors that significantly affect the parallel efficiency of a 2D MOC heterogeneous parallel algorithm running on a CPU-GPU heterogeneous system. Guided by this analysis, the overall parallel efficiency was then improved by overlapping the transport sweep with data movement. The numerical results demonstrate that the parallel algorithm maintains the desired accuracy. Data movement, which includes both MPI communication and data copies between CPU and GPU, is the main factor limiting the parallel efficiency of the heterogeneous parallel algorithm. Overlapping the transport sweep with data movement improves both the overall performance and the strong scaling efficiency: when 5 heterogeneous nodes (20 GPUs in total) are used for the simulation, the overall performance improves by about 8% and the strong scaling efficiency rises from 87% to 95%. Compared with CPU-based parallelization, 4 CPU-GPU heterogeneous nodes outperform 20 CPU nodes in overall performance.