Clarification on number of MPI ranks for VASP tests.
Moderators: Global Moderator, Moderator
- Full Member
- Posts: 189
- Joined: Tue Oct 13, 2020 11:32 pm
Clarification on number of MPI ranks for VASP tests.
Dear VASP Forum,
I am currently setting up tests for the VASP software and came across some information specifying that "Reference files have been generated with 4 MPI ranks. Note that tests might fail if another number of ranks is used!"
Would it be possible to obtain clarification regarding the use of a different number of MPI ranks for the tests?
As far as VASP testing is concerned, I would like to know whether it is feasible to use a larger number of MPI ranks to distribute the computation more broadly, or a smaller number if my resources are limited.
Could you please provide some guidance on the potential pitfalls, if any, of deviating from 4 MPI ranks? And if certain tests are more sensitive to the number of ranks used, could you please specify which ones?
Any advice or resources on this topic would be much appreciated and will greatly assist me in properly configuring my test environment.
Many thanks for your assistance in this matter.
Best regards,
Zhao
- Full Member
- Posts: 189
- Joined: Tue Oct 13, 2020 11:32 pm
Re: Clarification on number of MPI ranks for VASP tests.
Below is my testing report for the HEG_333_LW example shipped with the vasp.6.4.2 test suite, run with various core counts and with/without the "-genv I_MPI_FABRICS=shm" option.
The tests were run starting with one core and increasing the core count by one each time.
Here are the two sets of commands I used, where 'n' is a placeholder for the number of cores:
Without "-genv I_MPI_FABRICS=shm" option:
Code:
module load vasp/6.4.2-intel-oneapi.2023.2.0
export VASP_TESTSUITE_EXE_STD="mpirun -np n vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun -np n vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun -np n vasp_gam"
cd vasp.6.4.2
time VASP_TESTSUITE_TESTS=HEG_333_LW make test
With "-genv I_MPI_FABRICS=shm" option:
Code:
module load vasp/6.4.2-intel-oneapi.2023.2.0
export VASP_TESTSUITE_EXE_STD="mpirun -np n -genv I_MPI_FABRICS=shm vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun -np n -genv I_MPI_FABRICS=shm vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun -np n -genv I_MPI_FABRICS=shm vasp_gam"
cd vasp.6.4.2
time VASP_TESTSUITE_TESTS=HEG_333_LW make test
Below are my findings:
1. The testing either stalls or fails when run on more than 12 cores.
2. The testing succeeds when run on 12 or fewer cores, with or without "-genv I_MPI_FABRICS=shm". With the "-genv I_MPI_FABRICS=shm" option, however, there is an observable speed-up of one to several seconds.
I would appreciate it if you could provide some insight into these findings. Are these expected behaviors, or are there optimizations that could be made to enhance the test suite further?
Looking forward to your response.
Best Regards,
Zhao
- Global Moderator
- Posts: 74
- Joined: Fri Aug 04, 2023 11:07 am
Re: Clarification on number of MPI ranks for VASP tests.
Dear Zhao,
The sentence just states that the reference (against which your calculation will be checked) was generated with four MPI ranks. You may change the number of ranks used to run the tests via the VASP_TESTSUITE_EXE_STD, VASP_TESTSUITE_EXE_GAM, and VASP_TESTSUITE_EXE_NCL environment variables. Please note that changing the parallelization strategy can change certain parameters within the calculation (such as NBANDS: see https://www.vasp.at/wiki/index.php/NBANDS) and hence might lead to some tests failing. There is currently no list of tests that are sensitive to the number of ranks used.
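For example, a minimal sketch (assuming an Intel-MPI-style mpirun and that the vasp_std/vasp_ncl/vasp_gam binaries are on your PATH; the choice of 8 ranks is only an illustration):
Code:
# Run the whole test suite with 8 MPI ranks instead of the default 4
export VASP_TESTSUITE_EXE_STD="mpirun -np 8 vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun -np 8 vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun -np 8 vasp_gam"
cd vasp.6.4.2
make test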
Sudarshan
- Full Member
- Posts: 189
- Joined: Tue Oct 13, 2020 11:32 pm
Re: Clarification on number of MPI ranks for VASP tests.
Here is the test script I used; if others have similar needs, they can adjust it to their own situation:
Code:
#!/usr/bin/env bash
# Source a user-specific helper script (assumed to define the 'cwd' helper used on the next line).
script_name_sh=$HOME/.local/libexec/script_name.sh
source $script_name_sh ${BASH_SOURCE[0]}
cwd
if [ $# -ne 1 ]; then
    echo "This script requires exactly one argument!"
    echo "Please provide a feature_set as the argument (for example, 1 for intel or 2 for intel_omp)"
    exit 1
fi

feature_set=$1

# Change directory to the correct path
cd releases/vasp.6.4.2
case "$feature_set" in
1) echo "Running code with intel feature_set"
#https://www.vasp.at/forum/viewtopic.php?f=2&t=18373#p21390
#This issued could be related to the fabrics control in the Intel MPI library.
#Can you try setting the argument I_MPI_FABRICS=shm in your mpirun command?
#mpirun -np 64 -genv I_MPI_FABRICS=shm vasp_std
# Load the required module
module --force purge
module load vasp/6.4.2-intel-oneapi.2023.2.0
# Iterate over different numbers of cores starting from 1
for ((n=1; n<=12; n++)); do
echo "Testing with $n cores without -genv option"
export VASP_TESTSUITE_EXE_STD="mpirun -np $n vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun -np $n vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun -np $n vasp_gam"
# Run the test
time VASP_TESTSUITE_TESTS=HEG_333_LW make test | grep SUCCESS
echo "Testing with $n cores with -genv option"
export VASP_TESTSUITE_EXE_STD="mpirun -np $n -genv I_MPI_FABRICS=shm vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun -np $n -genv I_MPI_FABRICS=shm vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun -np $n -genv I_MPI_FABRICS=shm vasp_gam"
# Run the test
time VASP_TESTSUITE_TESTS=HEG_333_LW make test | grep SUCCESS
done
;;
2) echo "Running code with intel_omp feature_set"
# The following setting is adapted from testsuite/impi+omp.conf,
# and the guidance describe on https://www.vasp.at/wiki/index.php/Validation_tests
#https://stackoverflow.com/questions/71181984/find-the-optimal-combination-of-setting-values-for-number-of-processes-and-om
# Load the required module
module --force purge
module load vasp/6.4.2-intel_omp-oneapi.2023.2.0
# Set up an array with the number of MPI processes (ranks) and threads per process
nranks=(2 4 6 8 10 12) # an array including the number of MPI processes for each test
nthrd=2 # number of threads per process for all tests, utilizing hyperthreading (2 threads per core)
# Iterate over each test
for nrank in ${nranks[*]}; do
# Set up MPI parameters
mpi_params="-np $nrank -genv OMP_NUM_THREADS=$nthrd -genv I_MPI_PIN_DOMAIN=omp -genv KMP_AFFINITY=verbose,granularity=fine,compact,1,0 -genv KMP_STACKSIZE=512m"
# Define the commands to test by using the mpi variable
export VASP_TESTSUITE_EXE_STD="mpirun $mpi_params vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun $mpi_params vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun $mpi_params vasp_gam"
# Print the number of nrank and threads
printf "Running tests with nrank=%s, nthrd=%s\n" $nrank $nthrd
# Run the test and grep for the SUCCESS status
time VASP_TESTSUITE_TESTS=HEG_333_LW make test | grep SUCCESS
done
;;
*) echo "Invalid argument!"
echo "Please provide a valid feature_set as the argument (for example, 1 for intel or 2 for intel_omp)"
exit 1
;;
esac
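To run it, something like the following should work (the file name run_vasp_heg_tests.sh is only illustrative; use whatever name you saved the script under):
Code:
chmod +x run_vasp_heg_tests.sh
./run_vasp_heg_tests.sh 1   # intel feature_set
./run_vasp_heg_tests.sh 2   # intel_omp feature_set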
- Global Moderator
- Posts: 74
- Joined: Fri Aug 04, 2023 11:07 am
Re: Clarification on number of MPI ranks for VASP tests.
Dear Zhao,
Thank you for sharing the results of your tests and the scripts used to generate them! To expand a bit more on my previous message (and after speaking with my colleagues here at VASP): it appears that the test suite might pass for 2, 4, and 8 ranks, while 6 ranks (and, as you observed, 12) are most likely to fail because of the way NBANDS is chosen based on the parallelization used.
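One rough way to verify this is to compare the NBANDS actually used for a given rank count with the value in the shipped reference; OUTCAR reports it (a sketch only; the exact path depends on where the test was run in your test-suite layout):
Code:
grep NBANDS testsuite/tests/HEG_333_LW/OUTCAR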
Sudarshan
- Full Member
- Posts: 189
- Joined: Tue Oct 13, 2020 11:32 pm
Re: Clarification on number of MPI ranks for VASP tests.
Dear Sudarshan,
You only mentioned even numbers of ranks above. I also tested odd numbers of ranks, and for the example I used there was no problem with any rank count less than or equal to 12.
Regards,
Zhao
- Global Moderator
- Posts: 74
- Joined: Fri Aug 04, 2023 11:07 am
Re: Clarification on number of MPI ranks for VASP tests.
Dear Zhao,
Thanks for running these tests and for the clarification! For future reference, could you please share the makefile.include that you used to compile VASP? Thanks!
Sudarshan