The spectral method involves the transformation of data between the physical, Fourier, and spectral domains. Each of these domains is two-dimensional. To evaluate the discrete spectral transform, the method performs Fourier transforms in the longitude direction, followed by summation in the latitude direction. A simple way of parallelizing the spectral code is to decompose the physical problem domain in the latitude direction only. This allows an optimized sequential FFT algorithm to be used in the longitude direction. However, this approach limits the number of processors that can be brought to bear on the problem. Decomposing the problem over both directions allows the parallelism inherent in the problem to be exploited more effectively: the grain size is reduced, so more processors can be used.
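The difference between the two decompositions can be illustrated with a minimal sketch. The function names (`block_ranges`, `decompose`, `grain_size`) and the 64 x 128 grid are hypothetical choices made for illustration and are not taken from the report. The sketch assumes a simple contiguous block distribution and measures grain size as the largest number of grid points owned by any one processor.

```python
def block_ranges(n, parts):
    """Split range(n) into `parts` contiguous blocks whose sizes differ by at most 1."""
    base, extra = divmod(n, parts)
    ranges = []
    start = 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def decompose(nlat, nlon, p_lat, p_lon):
    """Map each processor (i, j) of a p_lat x p_lon grid to its (lat range, lon range)."""
    lat_blocks = block_ranges(nlat, p_lat)
    lon_blocks = block_ranges(nlon, p_lon)
    return {
        (i, j): (lat_blocks[i], lon_blocks[j])
        for i in range(p_lat)
        for j in range(p_lon)
    }


def grain_size(nlat, nlon, p_lat, p_lon):
    """Largest number of grid points owned by any single processor."""
    blocks = decompose(nlat, nlon, p_lat, p_lon).values()
    return max((la[1] - la[0]) * (lo[1] - lo[0]) for la, lo in blocks)


# Hypothetical 64 (latitude) x 128 (longitude) grid.
# Latitude-only: at most 64 processors, each holding one full latitude row.
lat_only = grain_size(64, 128, 64, 1)
# Both directions: 16 x 16 = 256 processors, each holding a 4 x 8 patch.
both = grain_size(64, 128, 16, 16)
```

With `p_lon = 1`, every processor holds complete longitude rows, so a sequential FFT can be applied locally. Once `p_lon > 1`, each row is split across processors, and the longitude transform needs interprocessor communication.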
Results are presented showing that decomposing over both directions does yield a more rapid solution of the problem. For a given problem and number of processors, the optimum decomposition has approximately equal numbers of processors in each direction. Load imbalance also affects the performance of the method. The importance of minimizing communication latency and of overlapping communication with calculation is stressed. General methods for doing this, which may be applied to many other problems, are discussed.
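As a rough illustration of these two observations, the sketch below enumerates the possible processor grids for a given processor count, picks the one closest to square, and computes a simple load-imbalance ratio. The helper names and the grid sizes are assumptions made for this example. The imbalance measure counts only grid points per processor, so it is not a performance model and does not reproduce the report's measurements.

```python
def factor_pairs(p):
    """All (p_lat, p_lon) pairs with p_lat * p_lon == p."""
    return [(a, p // a) for a in range(1, p + 1) if p % a == 0]


def most_square(p):
    """Factor pair of p whose two factors are closest to each other."""
    return min(factor_pairs(p), key=lambda ab: abs(ab[0] - ab[1]))


def load_imbalance(nlat, nlon, p_lat, p_lon):
    """Largest block size divided by the mean block size (1.0 means balanced)."""
    # -(-a // b) is ceiling division: the size of the largest block in a direction.
    max_block = -(-nlat // p_lat) * -(-nlon // p_lon)
    mean_block = nlat * nlon / (p_lat * p_lon)
    return max_block / mean_block


grid_64 = most_square(64)    # a perfect square processor count
grid_128 = most_square(128)  # not a perfect square, so the nearest pair is chosen

# Hypothetical 64 x 128 grid: 8 x 16 divides evenly, 6 x 16 does not.
balanced = load_imbalance(64, 128, 8, 16)
unbalanced = load_imbalance(64, 128, 6, 16)
```

When the grid dimensions are not divisible by the processor counts, some processors own more points than others and the ratio rises above 1.0; the busiest processor then sets the pace for the whole computation.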
D. W. Walker, P. H. Worley and J. B. Drake, "Parallelizing the Spectral Transform Method - Part II", Technical Report ORNL/TM-11855, Oak Ridge National Laboratory, July 1991. A modified version of this report was later published as a journal article.