Fatal error in MPI_Allreduce: Other MPI error, error stack
Posted: Sun Apr 23, 2023 3:33 am
Hello everyone,
I'm hoping to get some help with an error I hit when running VASP.
I was doing a structural relaxation of a magnetic molecular crystal. With ALGO = Normal the run completed without errors but converged very slowly, so I switched to ALGO = All, and the job now crashes with the MPI error shown below.
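As far as I can tell, the only INCAR change between the two runs was the ALGO tag:
Code: Select all
ALGO = Normal   # previous run: finished without errors but converged slowly
ALGO = All      # current run: crashes with the MPI error below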
I was running VASP 5.4.4 on NERSC Cori with the following modules loaded:
Code: Select all
Currently Loaded Modulefiles:
1) modules/3.2.11.4 9) pmi/5.0.17 17) atp/3.14.9
2) darshan/3.4.0 10) dmapp/7.1.1-7.0.3.1_3.44__g93a7e9f.ari 18) perftools-base/21.12.0
3) craype-network-aries 11) gni-headers/5.0.12.0-7.0.3.1_3.27__gd0d73fe.ari 19) PrgEnv-intel/6.0.10
4) intel/19.1.2.254 12) xpmem/2.2.27-7.0.3.1_3.28__gada73ac.ari 20) craype-haswell
5) craype/2.7.10 13) job/2.2.4-7.0.3.1_3.35__g36b56f4.ari 21) cray-mpich/7.7.19
6) cray-libsci/20.09.1 14) dvs/2.12_2.2.224-7.0.3.1_3.45__gc77db2af 22) craype-hugepages2M
7) udreg/2.3.2-7.0.3.1_3.45__g5f0d670.ari 15) alps/6.6.67-7.0.3.1_3.43__gb91cd181.ari 23) vasp/5.4.4-hsw
8) ugni/6.0.14.0-7.0.3.1_6.26__g8101a58.ari 16) rca/2.2.20-7.0.3.1_3.48__g8e3fb5b.ari 24) Base-opts/2.4.142-7.0.3.1_3.23__g8f27585.ari
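The job was submitted with a standard Slurm batch script along these lines (reconstructed from memory, so the exact directives are approximate; the node and core counts match the 128 ranks reported in the log):
Code: Select all
#!/bin/bash
#SBATCH -N 4
#SBATCH -C haswell
#SBATCH -q regular
#SBATCH -t 12:00:00

module load vasp/5.4.4-hsw
# 4 Haswell nodes x 32 physical cores = 128 MPI ranks,
# matching "running on 128 total cores" in the log below;
# vasp_gam because the log shows a gamma-only build
srun -n 128 -c 2 --cpu-bind=cores vasp_gam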
The running log is:
Code: Select all
running on 128 total cores
distrk: each k-point on 128 cores, 1 groups
distr: one band on 16 cores, 8 groups
using from now: INCAR
vasp.5.4.4.18Apr17-6-g9f103f2a35 (build Mar 12 2022 03:47:05) gamma-only
POSCAR found type information on POSCAR Mn O N C H I
POSCAR found : 6 types and 286 ions
scaLAPACK will be used
-----------------------------------------------------------------------------
| |
| ADVICE TO THIS USER RUNNING 'VASP/VAMP' (HEAR YOUR MASTER'S VOICE ...): |
| |
| You have a (more or less) 'large supercell' and for larger cells |
| it might be more efficient to use real space projection opertators |
| So try LREAL= Auto in the INCAR file. |
| Mind: For very accurate calculation you might also keep the |
| reciprocal projection scheme (i.e. LREAL=.FALSE.) |
| |
-----------------------------------------------------------------------------
LDA part: xc-table for Pade appr. of Perdew
found WAVECAR, reading the header
POSCAR, INCAR and KPOINTS ok, starting setup
FFT: planning ...
reading WAVECAR
the WAVECAR file was read successfully
charge-density read from file: Mn3
magnetization density read from file 1
initial charge from wavefunction
entering main loop
-----------------------------------------------------------------------------
| |
| W W AA RRRRR N N II N N GGGG !!! |
| W W A A R R NN N II NN N G G !!! |
| W W A A R R N N N II N N N G !!! |
| W WW W AAAAAA RRRRR N N N II N N N G GGG ! |
| WW WW A A R R N NN II N NN G G |
| W W A A R R N N II N N GGGG !!! |
| |
| ALGO=A and IALGO=5X tend to fail with the tetrahedron method |
| (e.g. Bloechls method ISMEAR=-5 is not variational) |
| please switch to IMSEAR=0-n, except for DOS calculations |
| For DOS calculations use IALGO=53 after preconverging with ISMEAR>=0 |
| I HOPE YOU KNOW, WHAT YOU ARE DOING |
| |
-----------------------------------------------------------------------------
N E dE d eps ncg rms ort
gam= 0.000 g(H,U,f)= 0.143E+02 0.000E+00 NaN ort(H,U,f) = 0.000E+00 0.000E+00 NaN
SDA: 1 -0.185963621438E+04 -0.18596E+04 0.00000E+00 736 NaN NaN
gam= 0.000 trial= 0.400 step= NaN mean= 0.400
gam= 0.000 trial= 2.600 step= 2.600 mean= 0.631
Rank 19 [Sat Apr 22 20:13:33 2023] [c1-0c0s7n3] Fatal error in MPI_Recv: Message truncated, error stack:
MPI_Recv(212).....................: MPI_Recv(buf=0x100082ff720, count=0, MPI_BYTE, src=45, tag=9, comm=0xc4000015, status=0x7ffffffdcf00) failed
MPIDI_CH3U_Receive_data_found(144): Message from rank 45 and tag 9 truncated; 2304 bytes received but buffer size is 0
Rank 66 [Sat Apr 22 20:13:33 2023] [c1-0c0s8n1] Fatal error in MPI_Recv: Message truncated, error stack:
MPI_Recv(212).......................: MPI_Recv(buf=0x100082dc860, count=0, MPI_BYTE, src=33, tag=9, comm=0xc4000019, status=0x7ffffffdcf00) failed
MPIDI_CH3U_Request_unpack_uebuf(595): Message truncated; 2304 bytes received but buffer size is 0
Rank 74 [Sat Apr 22 20:13:33 2023] [c1-0c0s8n1] Fatal error in MPI_Recv: Message truncated, error stack:
MPI_Recv(212).......................: MPI_Recv(buf=0x10008300740, count=0, MPI_BYTE, src=37, tag=9, comm=0xc4000017, status=0x7ffffffdcf00) failed
MPIDI_CH3U_Request_unpack_uebuf(595): Message truncated; 2304 bytes received but buffer size is 0
My INCAR is:
Code: Select all
SYSTEM = Mn3
####
ISTART = 1
ICHARG = 1
ALGO = All
####
PREC = Accurate
NCORE = 16
ENCUT = 500
#NBANDS = 768
#NELECT = 1040
#NGX = 192
#NGY = 192
#NGZ = 210
#### electron & strut ####
EDIFF = 1E-7
#NELMIN = 10
NELM = 300
IBRION = 2
#ISIF = 3
EDIFFG = -1E-3
NSW = 50
#### sym ####
ISYM = -1
#### mag ####
ISPIN = 2
MAGMOM = 12*6 274*0
#### CHG & WAV ####
#ICHARG = 11
LMAXMIX = 4
#LWAVE = .F.
#### dos ####
ISMEAR = -2
FERWE = 575*1 0 1 159*0
FERDO = 528*1 0 207*0
NBANDS = 736
#ISMEAR = 0
#SIGMA = 0.05
#NEDOS = 2001
#EMIN = -10
#EMAX = 10
LORBIT = 10
#### vdW ####
IVDW = 11
#### LDA+U ####
LDAU = T
LDAUTYPE = 1
LDAUPRINT = 1
LDAUL = 2 -1 -1 -1 -1 -1
LDAUU = 2.8 0.0 0.0 0.0 0.0 0.0
LDAUJ = 0.9 0.0 0.0 0.0 0.0 0.0
#LASPH = T
One thing I notice is that the very first electronic step already reports NaN in the rms and ort columns before the MPI error appears, so the MPI_Recv failure may be a symptom rather than the root cause.
Can anyone offer any advice or suggestions? I'd really appreciate any help you can provide.
Thanks in advance!