Parallelization problems
Hello everyone,
I am working with a large system: a 99-atom nanoparticle with 370 bands, sampled on a 1×1×1 Gamma mesh. I have access to 40 computational cores (across 2 sockets and 4 nodes). For tasks like DOS, band structure, or SCF calculations, what would be the optimal parallelization settings, particularly for band structure optimization? When I use 8 cores, my calculations run very slowly and often terminate prematurely.
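To make the numbers concrete, here is a minimal sketch of the arithmetic behind the question. It assumes, without reference to any particular code, that bands are distributed over groups of cores and that a balanced split needs the number of band groups to divide both the core count and the band count. The function names `divisors` and `balanced_band_groups` are hypothetical helpers for illustration, not part of any electronic-structure package.

```python
def divisors(n):
    """Return all positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def balanced_band_groups(n_cores, n_bands):
    """Return band-group counts that divide both the core count and the band count.

    Assumption: a split is 'balanced' when every group gets the same number
    of cores and the same number of bands.
    """
    return [g for g in divisors(n_cores) if n_bands % g == 0]


# Values taken from the post above.
N_CORES = 40
N_BANDS = 370

for groups in balanced_band_groups(N_CORES, N_BANDS):
    cores_per_group = N_CORES // groups
    bands_per_group = N_BANDS // groups
    print(f"{groups:2d} band groups: {cores_per_group:2d} cores per group, "
          f"{bands_per_group:3d} bands per group")
```

Since 370 = 2 × 5 × 37 and 40 = 2³ × 5, only 1, 2, 5, or 10 band groups give an even split on 40 cores under this assumption, and only 1 or 2 on 8 cores. With a single k-point there is also nothing to distribute over k-points, so any speed-up would have to come from band or plane-wave parallelization.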