ERROR in EDDIAG

Posted: Mon Sep 11, 2023 10:17 am
by valeria_butera
Hello,
I am running a relaxation job using VASP 6.4.2. The system contains a Ge slab with a YbRh2Si2 cluster.
The job ends after three iterations with the following error:

Code: Select all

   POTLOK:  cpu time      0.0958: real time      0.0958
    SETDIJ:  cpu time      0.0709: real time      0.0709
 -----------------------------------------------------------------------------
|                                                                             |
|     EEEEEEE  RRRRRR   RRRRRR   OOOOOOO  RRRRRR      ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     EEEEE    RRRRRR   RRRRRR   O     O  RRRRRR       #       #       #      |
|     E        R   R    R   R    O     O  R   R                               |
|     E        R    R   R    R   O     O  R    R      ###     ###     ###     |
|     EEEEEEE  R     R  R     R  OOOOOOO  R     R     ###     ###     ###     |
|                                                                             |
|     ERROR in EDDIAG: call to ZHEEV/ZHEEVX/DSYEV/DSYEVX failed! error        |
|     code was 24800
In the previous iterations, there is a warning message saying that the SCF is not converged. However, I am not sure whether the SCF problem is the reason the job ends.

The job runs on GPU-accelerated nodes (8 GPUs per node). The job script I am using includes:

Code: Select all

#PBS -l select=1:ncpus=128:mpiprocs=8,walltime=48:00:00
export OMP_NUM_THREADS=16
In the OUTCAR, I can read:

Code: Select all

running    8 mpi-ranks, on    1 nodes
 distrk:  each k-point on    8 cores,    1 groups
 distr:  one band on NCORE=   1 cores,    8 groups
 OpenACC runtime initialized ...    8 GPUs detected
Could you please help me figure out what the issue is about?

Thanks

Re: ERROR in EDDIAG

Posted: Mon Sep 11, 2023 12:27 pm
by merzuk.kaltak
Dear Valeria Butera,

please upload a full error report that includes the following files:
OUTCAR, INCAR, POTCAR, POSCAR, KPOINTS and stdout.

Re: ERROR in EDDIAG

Posted: Tue Sep 12, 2023 9:28 am
by valeria_butera
Dear Merzuk Kaltak,
I am attaching the folder with all the files. I do not have the stdout because I ran other tests and the file was overwritten.
I also tried running the job on a CPU node, and it is currently running (it has completed 6 iterations). However, the SCF is unconverged in all steps, which suggests there might also be something wrong in the input.

Re: ERROR in EDDIAG

Posted: Wed Sep 13, 2023 12:18 pm
by merzuk.kaltak
Your structure is a prime example of a system where "charge sloshing" appears. That is, "rms" in the following output does not decrease monotonically but oscillates. This is often accompanied by "dE" changing sign while keeping a large magnitude:

Code: Select all

      N       E                     dE             d eps       ncg     rms          ort
DAV:   1    -0.674675263873E+04   -0.67468E+04   -0.18532E+05  1408   0.124E+04
DAV:   2     0.641970417853E+03    0.73887E+04   -0.51762E+04  1152   0.187E+03
DAV:   3    -0.375999241483E+03   -0.10180E+04   -0.93399E+03  1536   0.466E+02
DAV:   4    -0.595569632725E+03   -0.21957E+03   -0.21185E+03  1536   0.101E+02
DAV:   5    -0.504986267212E+03    0.90583E+02   -0.16216E+03  1280   0.579E+01
DAV:   6    -0.573948735957E+03   -0.68962E+02   -0.68124E+02  1536   0.166E+01
DAV:   7    -0.499762978097E+03    0.74186E+02   -0.99068E+02   896   0.377E+01
DAV:   8    -0.536179964928E+03   -0.36417E+02   -0.35934E+02  1536   0.449E+00    0.233E+02
DAV:   9    -0.154398344554E+05   -0.14904E+05   -0.14309E+05  1472   0.239E+03    0.135E+03
DAV:  10    -0.171134675562E+05   -0.16736E+04   -0.10238E+05  1536   0.525E+02    0.743E+02
DAV:  11    -0.322590198541E+03    0.16791E+05   -0.11741E+05  1536   0.208E+03    0.221E+02
DAV:  12    -0.114713536264E+04   -0.82455E+03   -0.22575E+04  1408   0.717E+02    0.169E+02
DAV:  13    -0.371352473535E+04   -0.25664E+04   -0.33844E+04  1152   0.544E+02    0.189E+02
DAV:  14    -0.178004987504E+04    0.19335E+04   -0.20182E+04  1344   0.784E+02    0.104E+02
DAV:  15    -0.761124519163E+03    0.10189E+04   -0.65715E+03  1152   0.288E+02    0.103E+02
DAV:  16    -0.560149987189E+04   -0.48404E+04   -0.14354E+03  1344   0.341E+02    0.125E+02
DAV:  17    -0.660411088100E+03    0.49411E+04   -0.68448E+03   896   0.297E+02    0.613E+01
DAV:  18    -0.216042093994E+04   -0.15000E+04   -0.17281E+04  1344   0.349E+02    0.659E+01
DAV:  19    -0.503925818302E+03    0.16565E+04   -0.46861E+03   960   0.150E+02    0.475E+01
DAV:  20    -0.185831050787E+04   -0.13544E+04   -0.13994E+04  1344   0.106E+02    0.488E+01
DAV:  21    -0.211392801739E+04   -0.25562E+03   -0.21904E+04  1088   0.186E+02    0.426E+01
DAV:  22    -0.405689476247E+04   -0.19430E+04   -0.39483E+04  1216   0.117E+02    0.866E+01
DAV:  23    -0.159872402555E+04    0.24582E+04   -0.68853E+03  1088   0.204E+02    0.367E+01
DAV:  24    -0.513758376284E+03    0.10850E+04   -0.34184E+03  1024   0.215E+02    0.299E+01
DAV:  25    -0.218844280648E+04   -0.16747E+04   -0.57863E+03  1280   0.208E+02    0.449E+01
DAV:  26    -0.394096173978E+03    0.17943E+04   -0.64560E+03   896   0.161E+02    0.249E+01
DAV:  27    -0.135970501039E+04   -0.96561E+03   -0.61444E+02  1408   0.221E+02    0.390E+01
DAV:  28    -0.399318289758E+03    0.96039E+03   -0.39731E+03   960   0.122E+02    0.203E+01
...
This often happens for magnetic systems, slabs, or a combination of the two.
Changing the density mixer settings often remedies this problem, unfortunately not (immediately) in your case.
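To make the two sloshing indicators described above concrete, here is a minimal Python sketch that counts sign changes of "dE" and increases of "rms" between consecutive electronic steps. The function names `parse_steps` and `sloshing_indicators` are made up for this illustration and are not part of VASP or any VASP tool. The sketch assumes the column order shown in the OSZICAR excerpt above (E, dE, d eps, ncg, rms, optionally ort).

```python
import re

# Fortran-style floats such as -0.674675263873E+04. OSZICAR sometimes fuses
# two columns ("0.107E+06-0.973E+05"), so we match numbers instead of
# splitting on whitespace. The integer ncg column has no decimal point and
# is therefore skipped by this pattern.
FLOAT = re.compile(r"[-+]?\d*\.\d+E[-+]?\d+")
STEP = re.compile(r"^\s*(DAV|RMM|SDA|DMP|CGA):\s*(\d+)(.*)$")


def parse_steps(text):
    """Return one dict (algo, n, E, dE, rms) per electronic step line."""
    steps = []
    for line in text.splitlines():
        match = STEP.match(line)
        if match is None:
            continue  # header, "..." or ionic summary line
        nums = [float(x) for x in FLOAT.findall(match.group(3))]
        if len(nums) < 4:
            continue  # incomplete line
        steps.append({
            "algo": match.group(1),
            "n": int(match.group(2)),
            "E": nums[0],
            "dE": nums[1],
            "rms": nums[3],  # nums[2] is "d eps"
        })
    return steps


def sloshing_indicators(steps):
    """Count dE sign flips and rms increases between consecutive steps."""
    pairs = list(zip(steps, steps[1:]))
    flips = sum(1 for a, b in pairs if a["dE"] * b["dE"] < 0)
    rises = sum(1 for a, b in pairs if b["rms"] > a["rms"])
    return flips, rises
```

A well-behaved run typically shows few sign flips and an "rms" that mostly decreases; many flips combined with repeated rises, as in the excerpt above, point towards sloshing. Where to draw the line is a judgment call, so the sketch only reports the counts.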

There are a few other options you can set in the INCAR to help VASP find a solution, though.
First, check in the POSCAR whether the structure seems reasonable. For this purpose, I have shifted the compound that is supposed to be adsorbed on the Yb-surface.
The Yb-Ge distance of ~2 Å seemed too short at first sight. Specifically, I have replaced the last 27 coordinates in your POSCAR with the following starting guess:

Code: Select all

0.381535 0.502732 0.411664 T T T 
0.6286 0.502058 0.575086 T T T 
0.637622 0.261528 0.410269 T T T 
0.622763 0.758758 0.413884 T T T 
0.87885 0.517553 0.412488 T T T 
0.37835 0.486563 0.737684 T T T 
0.634437 0.245359 0.736289 T T T 
0.619578 0.742588 0.739904 T T T 
0.875665 0.501384 0.738509 T T T 
0.379542 0.492613 0.615688 T T T 
0.627408 0.496007 0.697083 T T T 
0.629792 0.508109 0.45309 T T T 
0.380343 0.496682 0.53366 T T T 
0.635629 0.251409 0.614293 T T T 
0.636431 0.255477 0.532265 T T T 
0.62077 0.748639 0.617908 T T T 
0.621571 0.752707 0.53588 T T T 
0.876857 0.507434 0.616512 T T T 
0.877658 0.511503 0.534485 T T T 
0.49976 0.618618 0.657289 T T T 
0.50719 0.370003 0.655482 T T T 
0.508783 0.378088 0.492471 T T T 
0.501353 0.626703 0.494279 T T T 
0.755848 0.377414 0.655894 T T T 
0.75744 0.385498 0.492884 T T T 
0.748418 0.626028 0.657702 T T T 
0.750011 0.634113 0.494691 T T T 
However, the modifications to the POSCAR might not be necessary in the following.
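A quick way to check for suspiciously short contacts such as the ~2 Å Yb-Ge distance mentioned above is to compute minimum-image distances from the fractional coordinates. The following is a minimal sketch in plain Python. The lattice vectors are not quoted in this thread, so `lattice` must be taken from the POSCAR, and the function names and the cutoff are placeholders chosen for illustration.

```python
import itertools
import math


def frac_to_cart(frac, lattice):
    """Fractional -> Cartesian; the rows of lattice are a, b, c in Angstrom."""
    return [sum(frac[i] * lattice[i][k] for i in range(3)) for k in range(3)]


def min_image_distance(f1, f2, lattice):
    """Shortest distance between two sites under periodic boundary conditions.

    The fractional difference is wrapped into [-0.5, 0.5] and the 27
    neighbouring images are checked, which is sufficient for cells that are
    not extremely skewed (an assumption of this sketch).
    """
    delta = [f2[i] - f1[i] for i in range(3)]
    delta = [d - round(d) for d in delta]
    best = math.inf
    for shift in itertools.product((-1, 0, 1), repeat=3):
        cart = frac_to_cart([delta[i] + shift[i] for i in range(3)], lattice)
        best = min(best, math.sqrt(sum(c * c for c in cart)))
    return best


def close_contacts(group_a, group_b, lattice, cutoff):
    """Return (i, j, distance) for all pairs closer than cutoff (Angstrom)."""
    hits = []
    for i, fa in enumerate(group_a):
        for j, fb in enumerate(group_b):
            dist = min_image_distance(fa, fb, lattice)
            if dist < cutoff:
                hits.append((i, j, dist))
    return hits
```

For example, `close_contacts(slab_atoms, cluster_atoms, lattice, cutoff=2.2)` would list every slab-cluster pair closer than 2.2 Å, where `slab_atoms` and `cluster_atoms` are lists of fractional coordinates read from the POSCAR and the cutoff value is only a guess that should be adapted to the elements involved.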

The general guideline to tackle convergence issues is to make the problem simpler and ramp up the difficulty step by step.

This essentially means finding a PBE solution for the non-magnetic problem first, i.e. reducing the number of parameters in the INCAR.
For this purpose, it is usually sufficient to study the convergence at the Gamma point.
Remember to use vasp_std so that the k-point mesh can be increased in a subsequent run.
The following INCAR settings converge to dE < 1E-5 within the NELM = 500 step limit.

Code: Select all

NELM     = 500      
ALGO     = DAMPED ; TIME = 0.01  # strongly damped electronic minimizer
EDIFF    = 1E-05    
ISMEAR   = 0     
SIGMA    = 0.05
ENCUT    = 500
MAXMIX   = 40   # helps sometimes to reach electronic convergence faster
LMAXMIX  = 6    # set this if you have  f-electrons in your system
LREAL    = Auto # faster for large unit cells
IMIX     = 0    # linear density mixer
AMIX     = 0.2
BMIX     = 0.0001 
AMIX_MAG = 0.8
BMIX_MAG = 0.0001 
Here is an excerpt of the corresponding OSZICAR:

Code: Select all

       N       E                     dE             d eps       ncg     rms          ort
DAV:   1     0.771919199043E+04    0.77192E+04   -0.14049E+05   384   0.123E+04
DAV:   2     0.395915183534E+04   -0.37600E+04   -0.33616E+04   384   0.182E+03
DAV:   3     0.163444013484E+04   -0.23247E+04   -0.21988E+04   384   0.480E+02
DAV:   4     0.609520693352E+03   -0.10249E+04   -0.98950E+03   384   0.298E+02
DAV:   5     0.153589071904E+03   -0.45593E+03   -0.44401E+03   384   0.187E+02
SDA:   6     0.111046393773E+04    0.95687E+03   -0.54546E+03   384   0.545E+05 0.000E+00
DMP:   7     0.643080833850E+03   -0.46738E+03   -0.58340E+03   384   0.251E+05 0.383E+05
DMP:   8     0.305845645317E+03   -0.33724E+03   -0.19735E+03   384   0.132E+05 0.755E+04
DMP:   9     0.365760816412E+03    0.59915E+02   -0.22794E+03   384   0.107E+06-0.973E+05

...
DMP: 393    -0.385824339861E+03   -0.10792E-04   -0.10124E-04   384   0.763E-04 0.108E-02
DMP: 394    -0.385824349986E+03   -0.10125E-04   -0.94960E-05   384   0.713E-04 0.101E-02
DMP: 395    -0.385824359483E+03   -0.94971E-05   -0.89054E-05   384   0.667E-04 0.949E-03
   1 F= -.38582436E+03 E0= -.38577599E+03  d E =-.967378E-01
Using the corresponding output (WAVECAR), I considered the "magnetic" PBE problem by adding the following lines to the INCAR:

Code: Select all

MAGMOM   = 64*0.6 9*0.6 10*0.6 8*0.6  # ferromagnetic solution is the default choice, you probably can avoid this line
ISPIN    = 2
ICHARG   = 1 # use CHGCAR as initial density, not WAVECAR
The OSZICAR of this run is given in the following:

Code: Select all

       N       E                     dE             d eps       ncg     rms          ort
SDA:   1    -0.385824368401E+03   -0.38582E+03   -0.14784E-01   384   0.148E+01 0.000E+00
DMP:   2    -0.385839129843E+03   -0.14761E-01   -0.26090E-01   384   0.133E+01 0.147E+01
DMP:   3    -0.385865151565E+03   -0.26022E-01   -0.34484E-01   384   0.120E+01 0.259E+01
DMP:   4    -0.385899516596E+03   -0.34365E-01   -0.40378E-01   384   0.107E+01 0.342E+01
DMP:   5    -0.385939729285E+03   -0.40213E-01   -0.44253E-01   384   0.951E+00 0.400E+01
DMP:   6    -0.385983780473E+03   -0.44051E-01   -0.46529E-01   384   0.848E+00 0.438E+01
...
DMP: 103    -0.386818189427E+03   -0.13809E-04   -0.12667E-04   384   0.756E-04 0.137E-02
DMP: 104    -0.386818202039E+03   -0.12612E-04   -0.11567E-04   384   0.690E-04 0.125E-02
DMP: 105    -0.386818213556E+03   -0.11516E-04   -0.10561E-04   384   0.629E-04 0.114E-02
DMP: 106    -0.386818224070E+03   -0.10515E-04   -0.96400E-05   384   0.573E-04 0.104E-02
DMP: 107    -0.386818233668E+03   -0.95973E-05   -0.87973E-05   384   0.521E-04 0.954E-03
   1 F= -.38681823E+03 E0= -.38676919E+03  d E =-.980935E-01  mag=     0.0000
Next, I added the van der Waals and dipole-correction parameters

Code: Select all

# VDW settings 
LVDW     = .TRUE.
IVDW     = 12
LDIPOL   = .TRUE.
IDIPOL   = 3
to obtain an energy of:

Code: Select all

...
DMP: 144    -0.387616147707E+03   -0.10487E-04   -0.95253E-05   384   0.444E-04 0.105E-02
DMP: 145    -0.387616157224E+03   -0.95167E-05   -0.86341E-05   384   0.395E-04 0.949E-03
   1 F= -.40878865E+03 E0= -.40874032E+03  d E =-.966684E-01  mag=    -0.0000
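Note that in this last excerpt the summary line (F) differs from the energy of the final electronic step by roughly -21.17 eV, whereas in the two earlier runs both values agree. A likely reading, which is an assumption and not confirmed in this thread, is that the correction terms switched on in this step are added only after the electronic loop. The following sketch merely reproduces that arithmetic from the two lines above; `parse_summary` and `first_float` are hypothetical helper names.

```python
import re

# Matches Fortran-style floats such as -0.387616157224E+03 or -.966684E-01.
FLOAT = re.compile(r"[-+]?\d*\.\d+E[-+]?\d+")
# Ionic summary line of OSZICAR; the "mag=" field only appears for ISPIN = 2.
SUMMARY = re.compile(
    r"F=\s*(\S+)\s+E0=\s*(\S+)\s+d E\s*=\s*(\S+)(?:\s+mag=\s*(\S+))?"
)


def parse_summary(line):
    """Extract F, E0, dE and (if present) mag from an ionic summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not an OSZICAR summary line: %r" % line)
    f, e0, de, mag = match.groups()
    return {
        "F": float(f),
        "E0": float(e0),
        "dE": float(de),
        "mag": None if mag is None else float(mag),
    }


def first_float(line):
    """Return the first float on an electronic-step line, i.e. column E."""
    match = FLOAT.search(line)
    if match is None:
        raise ValueError("no energy found in line: %r" % line)
    return float(match.group(0))


last_step = "DMP: 145    -0.387616157224E+03   -0.95167E-05   -0.86341E-05   384   0.395E-04 0.949E-03"
summary = "   1 F= -.40878865E+03 E0= -.40874032E+03  d E =-.966684E-01  mag=    -0.0000"

gap = parse_summary(summary)["F"] - first_float(last_step)
print("F - E(last step) = %.4f eV" % gap)
```

If the interpretation is correct, the same number should appear as a separate energy contribution in the OUTCAR, which is the place to verify it.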
From here on, I recommend running a few ionic steps with the gamma point version to obtain a reasonable structure. Also, if you want to switch to ALGO = FAST, you might need to tighten EDIFF to ~1E-7; otherwise RMM could fail.
Only in the final step would I increase the number of k-points to find the relaxed structure at the precision you are looking for.
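The "start simple and ramp up" procedure used in this post can be scripted so that each stage only adds tags to the previous one. Below is a minimal Python sketch; all tag values are copied from the INCAR snippets above, while the stage numbering and the function name `build_incar` are assumptions made for this illustration.

```python
# Stage 1: non-magnetic PBE with a strongly damped electronic minimizer.
STAGE_1 = {
    "NELM": "500",
    "ALGO": "DAMPED",
    "TIME": "0.01",
    "EDIFF": "1E-05",
    "ISMEAR": "0",
    "SIGMA": "0.05",
    "ENCUT": "500",
    "MAXMIX": "40",
    "LMAXMIX": "6",
    "LREAL": "Auto",
    "IMIX": "0",
    "AMIX": "0.2",
    "BMIX": "0.0001",
    "AMIX_MAG": "0.8",
    "BMIX_MAG": "0.0001",
}

# Stage 2: spin-polarized restart from the stage 1 output.
STAGE_2 = {
    "MAGMOM": "64*0.6 9*0.6 10*0.6 8*0.6",
    "ISPIN": "2",
    "ICHARG": "1",
}

# Stage 3: van der Waals and dipole-correction settings.
STAGE_3 = {
    "LVDW": ".TRUE.",
    "IVDW": "12",
    "LDIPOL": ".TRUE.",
    "IDIPOL": "3",
}

STAGES = [STAGE_1, STAGE_2, STAGE_3]


def build_incar(stage):
    """Return the INCAR text for stage 1, 2 or 3 (each stage is cumulative)."""
    if stage not in (1, 2, 3):
        raise ValueError("stage must be 1, 2 or 3")
    tags = {}
    for extra in STAGES[:stage]:
        tags.update(extra)
    return "\n".join("%-8s = %s" % (key, value) for key, value in tags.items()) + "\n"
```

Writing `build_incar(2)` to the INCAR of a directory that already contains the stage 1 output files would correspond to the second step described above. Copying those files between stages is left out of the sketch on purpose.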

N.B.: Because you study a system with f electrons, you might want to add a U-parameter to Yb and Rh to improve the description of the system.