Some time ago, in order to improve the efficiency of parallelised calculations on large systems, I started a thread titled "How to improve parallelised calculation ?". After some discussion with Danny and Alex, they suggested that I run some tests on the parameters NPAR, NSIM and LPLANE.
The test system is Tb0.25Dy0.75Fe2, which has 2 Tb atoms, 6 Dy atoms and 16 Fe atoms in the unit cell; the k-points are generated automatically on a 7*7*7 mesh, etc. I used 2 nodes (24 cores in total, i.e. 12 cores per node), and the network is Infiniband (20GB/S).
With the default values for NPAR, NSIM and LPLANE, the calculation takes approximately 3000 seconds.
I tested NPAR = 1, 2, 4, 6, 8, 12, 24, and for each NPAR value I tested NSIM = 1, 2, 4, 6, 8, 12, 24, which gives 7*7 = 49 cases.
NPAR = 2, NSIM = 1 is the fastest. Increasing NSIM gave no clear speed-up and in most cases increased the calculation time. NPAR should be increased with care, because the best value depends on your specific case. Here are my results (calculation time in seconds):
===========================================
NSIM\NPAR     NPAR=1     NPAR=2     NPAR=4     NPAR=6     NPAR=8    NPAR=12    NPAR=24
NSIM=1     10712.798   1533.092   2302.704   2220.371   2470.454   2834.889   2941.860
NSIM=2     10889.813   1940.413   2429.192   2239.376   2515.769   2891.733   2948.944
NSIM=4     10622.640   1917.540   2221.515   2385.977   2502.756   2929.271   3033.390
NSIM=6     10836.125   2111.760   2393.558   2395.906   2623.324   2913.558   2990.683
NSIM=8     11168.838   2107.752   2378.309   2296.263   2668.595   2934.627   3109.094
NSIM=12    11148.837   2056.108   2279.254   2339.886   2624.820   2934.643   3207.204
NSIM=24    10512.837   1967.503   2253.869   2260.493   2626.288   3016.769   3725.165
===========================================
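For anyone who wants to check the table themselves, here is a minimal Python sketch that looks up the fastest NSIM for each NPAR and the fastest combination overall. The numbers are copied from the table above; the function names (`best_nsim_per_npar`, `fastest_overall`) are only illustrative and not part of any VASP tool.

```python
# Calculation times in seconds, copied from the table above.
# Each row is one NSIM value; columns follow NPAR_VALUES.
NPAR_VALUES = [1, 2, 4, 6, 8, 12, 24]
TIMINGS = {
    1:  [10712.798, 1533.092, 2302.704, 2220.371, 2470.454, 2834.889, 2941.860],
    2:  [10889.813, 1940.413, 2429.192, 2239.376, 2515.769, 2891.733, 2948.944],
    4:  [10622.640, 1917.540, 2221.515, 2385.977, 2502.756, 2929.271, 3033.390],
    6:  [10836.125, 2111.760, 2393.558, 2395.906, 2623.324, 2913.558, 2990.683],
    8:  [11168.838, 2107.752, 2378.309, 2296.263, 2668.595, 2934.627, 3109.094],
    12: [11148.837, 2056.108, 2279.254, 2339.886, 2624.820, 2934.643, 3207.204],
    24: [10512.837, 1967.503, 2253.869, 2260.493, 2626.288, 3016.769, 3725.165],
}


def best_nsim_per_npar(timings, npar_values):
    """Return {npar: (nsim, seconds)} with the fastest NSIM for each NPAR."""
    best = {}
    for col, npar in enumerate(npar_values):
        # Pick the row (NSIM) whose entry in this column is smallest.
        nsim, row = min(timings.items(), key=lambda item: item[1][col])
        best[npar] = (nsim, row[col])
    return best


def fastest_overall(timings, npar_values):
    """Return (npar, nsim, seconds) for the fastest case in the whole table."""
    best = best_nsim_per_npar(timings, npar_values)
    npar = min(best, key=lambda n: best[n][1])
    nsim, seconds = best[npar]
    return npar, nsim, seconds


if __name__ == "__main__":
    for npar, (nsim, seconds) in best_nsim_per_npar(TIMINGS, NPAR_VALUES).items():
        print(f"NPAR={npar:<2d}  best NSIM={nsim:<2d}  {seconds:10.3f} s")
    npar, nsim, seconds = fastest_overall(TIMINGS, NPAR_VALUES)
    print(f"fastest overall: NPAR={npar}, NSIM={nsim}, {seconds:.3f} s")
```

In this table NSIM = 1 is the fastest choice for five of the seven NPAR values; the exceptions are NPAR = 1 (NSIM = 24) and NPAR = 4 (NSIM = 4).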
I also tested LPLANE = .TRUE., which saves approximately another 50% of the calculation time.
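If someone wants to repeat this kind of scan on their own system, the sketch below shows one way to generate the INCAR lines for every NPAR/NSIM combination. It only produces the three tags discussed here (NPAR, NSIM, LPLANE); the helper names `incar_tags` and `scan_cases` are hypothetical, and how the lines are merged into the rest of the INCAR and how the jobs are submitted is left open.

```python
from itertools import product

# The values scanned in this post, used for both NPAR and NSIM.
VALUES = [1, 2, 4, 6, 8, 12, 24]


def incar_tags(npar, nsim, lplane=None):
    """Return the INCAR lines for one test case as a single string.

    lplane=None leaves LPLANE out, so VASP would use its default.
    """
    lines = [f"NPAR = {npar}", f"NSIM = {nsim}"]
    if lplane is not None:
        lines.append(f"LPLANE = {'.TRUE.' if lplane else '.FALSE.'}")
    return "\n".join(lines) + "\n"


def scan_cases(values=VALUES):
    """Return every (NPAR, NSIM) pair of the 7*7 scan."""
    return list(product(values, values))


if __name__ == "__main__":
    cases = scan_cases()
    print(len(cases), "cases")
    # Example: the fastest case from the table, with LPLANE switched on.
    print(incar_tags(2, 1, lplane=True))
```

These lines would have to be appended to the other settings of the calculation; the sketch does not write any files or start VASP.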
My questions are as follows:
(1) The VASP user guide says that increasing NSIM should improve the performance, but my data do not show this. Which one is correct?
(2) Since NPAR = 2, NSIM = 1 is the fastest, I thought that calculating one band per node might be optimal, so I expected a further improvement when using 3 nodes with NPAR = 3, NSIM = 1. Even so, the calculation time is 1515 seconds, only 18 seconds faster. I don't know why.
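To put a number on question (2), here is a small Python sketch of the speed-up and scaling efficiency when going from 2 to 3 nodes. It uses only the two times quoted in this post (1533.092 s and 1515 s); the function names are illustrative, and the calculation only measures the scaling, it does not explain it.

```python
def speedup(t_ref, t_new):
    """Ratio of the reference time to the new time (larger is better)."""
    return t_ref / t_new


def relative_efficiency(t_ref, nodes_ref, t_new, nodes_new):
    """Measured speed-up divided by the ideal speed-up nodes_new / nodes_ref."""
    return speedup(t_ref, t_new) / (nodes_new / nodes_ref)


# Times in seconds quoted in this post.
T_2_NODES = 1533.092  # 2 nodes, NPAR = 2, NSIM = 1
T_3_NODES = 1515.0    # 3 nodes, NPAR = 3, NSIM = 1

if __name__ == "__main__":
    s = speedup(T_2_NODES, T_3_NODES)
    e = relative_efficiency(T_2_NODES, 2, T_3_NODES, 3)
    print(f"speed-up 2 -> 3 nodes: {s:.3f} (ideal would be 1.5)")
    print(f"efficiency relative to 2 nodes: {e:.1%}")
```

So the third node gives a speed-up of only about 1.01 where perfect scaling would give 1.5.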
I hope my results will be useful to all VASP users, and I look forward to everybody's replies to my questions.
Thanks:)
Hui