Nonconverging electronic minimization for specific magnetic systems

alexeydick · #1 Post by **alexeydick** » Sun May 10, 2009 6:16 pm

Dear VASP users and developers,

I decided to re-post the description of my problem (also posted at http://cms.mpi.univie.ac.at/vasp-forum/ ... php?4.5586) to this forum. In the meantime I have found several more structures with similar convergence problems, so the SCF problem appears to be systematic. One of the "bad" systems was also tested with our home-made code, which reached the desired accuracy. Although that test used norm-conserving pseudopotentials rather than PAW, it suggests that the problem lies in the minimizer rather than in the system. Please find the detailed description of the problem below.

I am experiencing a severe convergence problem with a relatively complex magnetic structure (pure iron, 32-atom cubic fcc supercell, a special distribution of 16 up and 16 down magnetic moments, PAW PBE potentials without 3p states, NUPDOWN=0, initial atomic positions on unrelaxed fcc sites). During electronic minimization of the supercell with local magnetic moments smaller than ~1.5-1.7 Bohr magnetons (depending on the supercell volume), the total energy fluctuates with an amplitude of ~10^-4 Hartree/cell and does not converge within 50 steps. Inspection of the total energy reveals that it can even increase smoothly over several (~10) steps, i.e. it shows no stable tendency to decrease. Snapshots of the local magnetic moments show that some of the moments reverse sign during the minimization.
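
The sign reversals described above can be detected automatically once the per-iteration moments are collected. The following is only a minimal sketch: the function names, the noise threshold, and the sample numbers are all made up for illustration and are not taken from the calculation in this thread.

```python
from typing import List, Sequence


def count_sign_flips(moments: Sequence[float], threshold: float = 0.05) -> int:
    """Count sign reversals of one atom's moment across SCF iterations.

    Values with |m| < threshold are treated as noise and skipped
    (the threshold is an assumed, illustrative choice).
    """
    flips = 0
    last_sign = 0
    for m in moments:
        if abs(m) < threshold:
            continue
        sign = 1 if m > 0 else -1
        if last_sign != 0 and sign != last_sign:
            flips += 1
        last_sign = sign
    return flips


def atoms_with_flips(history: Sequence[Sequence[float]],
                     threshold: float = 0.05) -> List[int]:
    """history[i][a] is the moment of atom a at iteration i.

    Returns the indices of atoms whose moment changes sign at least once.
    """
    n_atoms = len(history[0])
    return [a for a in range(n_atoms)
            if count_sign_flips([step[a] for step in history], threshold) > 0]


# Hypothetical snapshots for three atoms over four iterations (Bohr magnetons).
history = [
    [1.6, -1.6, 1.5],
    [1.4, -1.5, 0.6],
    [1.5, -1.4, -0.7],
    [1.5, -1.5, 0.9],
]
print(atoms_with_flips(history))
```

In this made-up data only the third atom (index 2) changes sign, so it is the only one reported.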

Since such a nonconverging energy indicates that the residual density does not vanish during minimization, I have tried to stabilize the electronic minimizer. First I reduced AMIX_MAG to 0.1 to damp spin fluctuations, and used BMIX*=0.00001 to keep the convergence speed high. This stabilizes the magnetic moments (removes their flips) but does not improve the convergence. A closer look at the magnetic moments versus iteration shows that, although most of the moments do not change significantly, they are still "breathing" (e.g., some of the moments, after a smooth decrease over 60 iterations, start to fluctuate with an amplitude of ~0.1 Bohr magneton), which changes the total energy.

To reduce this "breathing" I varied BMIX and BMIX_MAG between 0.0000001 and 1.0, without any visible improvement in convergence.
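
To see what such a wide BMIX range does to the mixing, here is a small sketch. It assumes the Kerker form AMIX * G^2 / (G^2 + BMIX^2) that the VASP documentation gives for the initial mixing; the function name and the sample |G| values are illustrative assumptions, and the real mixer (Broyden/Pulay history, AMIN cutoff) does more than this.

```python
def kerker_factor(g: float, amix: float, bmix: float) -> float:
    """Kerker damping factor AMIX * G^2 / (G^2 + BMIX^2) for |G| = g.

    g and bmix must be in the same inverse-length unit.
    """
    g2 = g * g
    return amix * g2 / (g2 + bmix * bmix)


# Illustrative |G| values (not taken from the actual cell).
sample_g = (0.1, 0.5, 2.0)

for bmix in (1.0, 1e-5):
    factors = [kerker_factor(g, amix=0.2, bmix=bmix) for g in sample_g]
    print(f"BMIX={bmix:g}:", ", ".join(f"{f:.4f}" for f in factors))
```

Under this assumed form, BMIX = 1.0 strongly damps the small-G components, whereas BMIX = 0.00001 gives a factor of essentially AMIX for every G, i.e. close to plain linear mixing.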

Furthermore, I have tried several other options in different combinations:
(*) included LMAXMIX=6
(*) checked that the k-point mesh and energy cut-off are not responsible for this ill behavior
(*) tried the Pulay mixer with different settings of AMIX* and BMIX*
(*) changed the memory of the Broyden mixer, varying MAXMIX from 7 to 45
(*) tried ALGO=Normal
(*) tried increasing NELMDL up to 15
(*) ran calculations without the NUPDOWN=0 tag

None of these setups was successful: the energy kept fluctuating and did not converge to better than ~10^-4 Hartree/cell even when 100 steps were used.

At the same time, the default INCAR settings at larger volumes (and, consequently, larger magnetic moments) give reasonable electronic convergence without flips of the local magnetic moments. In fact, the moment-versus-iteration curves converge rather quickly and therefore do not destabilize the minimization routine. No problems occur at small volumes either, where the magnetic moments disappear.

I would like to ask whether anyone else has experienced such nonconverging magnetic systems, and whether there is a known solution to this issue. Any advice on how to overcome the problem would, of course, also be highly appreciated. Thank you very much for your response.

With best regards,
Alexey Dick

P.S. Parallel/serial VASP v.4.6.31 compiled with the Intel compiler was used.
The typical INCAR file that I have used is provided below:
PREC = accurate
ISMEAR = 1
SIGMA = 0.15
ENCUT = 270
ISPIN = 2
MAGMOM = 16*5.0 16*-5.0
ALGO = NORMAL
EDIFF = 1.0E-06
NELM = 100
NELMIN = 7
NELMDL = -7
ADDGRID = .TRUE.
ISTART = 0
ICHARG = 2
NBANDS = 200
NUPDOWN = 0

LMAXMIX = 6

AMIX = 0.2
AMIX_MAG = 0.1

BMIX = 0.0001
BMIX_MAG = 0.0001
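
As a quick sanity check on the INCAR above, the MAGMOM shorthand can be expanded to confirm that it supplies one moment per atom (32 here) and that the initial moments sum to zero, consistent with NUPDOWN = 0. This is only a sketch; `expand_magmom` is a hypothetical helper, not part of VASP or of any library.

```python
from typing import List


def expand_magmom(value: str) -> List[float]:
    """Expand INCAR-style 'N*x' shorthand into a flat list of moments.

    Example: '16*5.0 16*-5.0' -> sixteen 5.0 followed by sixteen -5.0.
    """
    moments: List[float] = []
    for token in value.split():
        if "*" in token:
            count, moment = token.split("*", 1)
            moments.extend([float(moment)] * int(count))
        else:
            moments.append(float(token))
    return moments


moments = expand_magmom("16*5.0 16*-5.0")
print(len(moments), sum(moments))
```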

alexeydick · #2 Post by **alexeydick** » Wed May 20, 2009 2:06 pm

Dear all,

please find below supplementary information to my previous post - the typical set of input files, the makefile which was used for compilation, as well as the typical energy-versus-iteration-step output.

With best regards,
Alexey Dick

POTCAR - standard PAW potential with 3p-states in the core

INCAR

PREC    = High     # no wrap around errors
ISMEAR  = 1        # k-point smearing
SIGMA   = 0.15     # thermal smearing
ENCUT   = 270      # increased cutoff because of ISIF=3
ISPIN   = 2        #
MAGMOM  = 16*5.0 16*-5.0
ALGO    = NORMAL   # blocked Davidson algorithm (not RMM-DIIS)
EDIFF   = 1.0E-07  # convergence criterion for electronic relaxation
NELM    = 300      # maximum number of iterations in electronic minimizer
NELMIN  = 7        # minimum number of electronic steps
NELMDL  = -7       # non-self-consistent steps at the start (negative: first ionic step only)
ADDGRID = .TRUE.   # increase accuracy of the stresses and forces
ISTART  = 0        # start WF from scratch
ICHARG  = 2        # start density with overlapping atoms
NBANDS  = 200      # number of electronic bands

LMAXMIX = 6

MAXMIX  = 45

AMIX = 0.2
AMIX_MAG = 0.1

BMIX = 0.00001
BMIX_MAG = 0.00001

NUPDOWN = 0

KPOINTS

k-points
0
Gamma
4 4 4
0 0 0

POSCAR

####
3.504313
2.000000000000 0.000000000000 0.000000000000
0.000000000000 2.000000000000 0.000000000000
0.000000000000 0.000000000000 2.000000000000
32
Selective Dynamics
Direct
1.000000000000 0.250000000000 0.750000000000 T T T
1.000000000000 1.000000000000 1.000000000000 T T T
1.000000000000 0.500000000000 0.500000000000 T T T
0.250000000000 0.250000000000 0.500000000000 T T T
0.750000000000 1.000000000000 0.750000000000 T T T
1.000000000000 0.500000000000 1.000000000000 T T T
1.000000000000 0.750000000000 0.250000000000 T T T
0.250000000000 0.250000000000 1.000000000000 T T T
0.250000000000 0.500000000000 0.250000000000 T T T
0.500000000000 0.250000000000 0.250000000000 T T T
0.500000000000 0.500000000000 0.500000000000 T T T
0.500000000000 0.750000000000 0.750000000000 T T T
0.750000000000 0.250000000000 0.500000000000 T T T
0.750000000000 0.500000000000 0.750000000000 T T T
0.250000000000 0.750000000000 1.000000000000 T T T
0.750000000000 0.500000000000 0.250000000000 T T T
1.000000000000 1.000000000000 0.500000000000 T T T
0.250000000000 1.000000000000 0.750000000000 T T T
1.000000000000 0.250000000000 0.250000000000 T T T
1.000000000000 0.750000000000 0.750000000000 T T T
0.250000000000 1.000000000000 0.250000000000 T T T
0.250000000000 0.500000000000 0.750000000000 T T T
0.500000000000 1.000000000000 0.500000000000 T T T
0.500000000000 0.250000000000 0.750000000000 T T T
0.250000000000 0.750000000000 0.500000000000 T T T
0.500000000000 1.000000000000 1.000000000000 T T T
0.750000000000 1.000000000000 0.250000000000 T T T
0.500000000000 0.500000000000 1.000000000000 T T T
0.500000000000 0.750000000000 0.250000000000 T T T
0.750000000000 0.250000000000 1.000000000000 T T T
0.750000000000 0.750000000000 0.500000000000 T T T
0.750000000000 0.750000000000 1.000000000000 T T T

Makefile used to compile VASP

#=======================================================================
# Makefile for the PGI f90 (pgf90) compiler for Athlon Opteron systems
#=======================================================================

#-----------------------------------------------------------------------
# Make uses a special target, named .SUFFIXES to allow you to define 
# your own suffixes and corresponding rules how to treat them.
#-----------------------------------------------------------------------
.SUFFIXES: .inc .f .f90 .F

#-----------------------------------------------------------------------
# all CPP processed fortran files have the extension .f90
#-----------------------------------------------------------------------
SUFFIX=.f90

#-----------------------------------------------------------------------
# fortran compiler and linker
#-----------------------------------------------------------------------
FC=/usr/mpi/intel/openmpi-1.2.8/bin/mpif90
FCL=$(FC)

#-----------------------------------------------------------------------
# whereis CPP ?? (I need CPP, can't use gcc with proper options)
#-----------------------------------------------------------------------
CPP_ = ./preprocess <$*.F | /usr/bin/cpp -P -C -traditional >$*$(SUFFIX)

#-----------------------------------------------------------------------
# possible options for CPP:
# NGXhalf             charge density   reduced in X direction
# wNGXhalf            gamma point only reduced in X direction
# avoidalloc          avoid ALLOCATE if possible
# IFC                 work around some IFC bugs
# CACHE_SIZE          1000 for PII,PIII, 5000 for Athlon, 8000-12000 P4
# RPROMU_DGEMV        use DGEMV instead of DGEMM in RPRO (depends on used BLAS)
# RACCMU_DGEMV        use DGEMV instead of DGEMM in RACC (depends on used BLAS)
#-----------------------------------------------------------------------
CPP     = $(CPP_) -DMPI -DIFC -DHOST=\"LinuxIFC\" \
          -Dkind8 -DNGZhalf -DPGF90 -Davoidalloc \
#          -DscaLAPACK \
#          -DRPROMU_DGEMV  -DRACCMU_DGEMV  -DCACHE_SIZE=2000

#-----------------------------------------------------------------------
# general fortran flags  (there must be a trailing blank on this line)
#-----------------------------------------------------------------------

FFLAGS = -FR -lowercase -assume byterecl 
OFLAG= -O3 -xP -static -no-prec-div  
OFLAG2= -O1 -axW
OFLAG_HIGH = $(OFLAG)
OBJ_HIGH = 
OBJ_NOOPT = 
DEBUG  = 
INLINE = $(OFLAG)

#-----------------------------------------------------------------------
# location of SCALAPACK
# if you do not use SCALAPACK simply uncomment the line SCA
#-----------------------------------------------------------------------
#SCA= -L/opt/intel/Compiler/mkl/lib/em64t/libmkl_scalapack_lp64.a   

MKL= -L/opt/intel/Compiler/mkl/lib/em64t -lmkl_intel_lp64 -lmkl_blacs_lp64 -lmkl_intel_thread -lm -lmkl_core -liomp5 -lpthread
#-----------------------------------------------------------------------
# VASP internal libraries in: ../lib/
# BLAS and LAPACK taken from the pgi compiler
#-----------------------------------------------------------------------
LIB  = /raid/dick/vasp.intel/vasp.4.lib.i4/libdmy.a \
       /raid/dick/vasp.intel/vasp.4.lib.i4/linpack_double.o \
       $(SCA) $(MKL)
#-----------------------------------------------------------------------
# options for linking (for compiler version 6.X, 7.1) nothing is required
#-----------------------------------------------------------------------
LINK    =  

#-----------------------------------------------------------------------
# fft libraries
#-----------------------------------------------------------------------
FFT3D   = fftmpi.o fftmpi_map.o fft3dlib.o

#-----------------------------------------------------------------------
# general rules and compile lines
#-----------------------------------------------------------------------
BASIC=   symmetry.o symlib.o   lattlib.o  random.o   

SOURCE=  base.o     mpi.o      smart_allocate.o      xml.o  \
         constant.o jacobi.o   main_mpi.o  scala.o   \
         asa.o      lattice.o  poscar.o   ini.o      setex.o     radial.o  \
         pseudo.o   mgrid.o    mkpoints.o wave.o      wave_mpi.o  $(BASIC) \
         nonl.o     nonlr.o    dfast.o    choleski2.o    \
         mix.o      charge.o   xcgrad.o   xcspin.o    potex1.o   potex2.o  \
         metagga.o  constrmag.o pot.o      cl_shift.o force.o    dos.o      elf.o      \
         tet.o      hamil.o    steep.o    \
         chain.o    dyna.o     relativistic.o LDApU.o sphpro.o  paw.o   us.o \
         ebs.o      wavpre.o   wavpre_noio.o broyden.o \
         dynbr.o    rmm-diis.o reader.o   writer.o   tutor.o xml_writer.o \
         brent.o    stufak.o   fileio.o   opergrid.o stepver.o  \
         dipol.o    xclib.o    chgloc.o   subrot.o   optreal.o   davidson.o \
         edtest.o   electron.o shm.o      pardens.o  paircorrection.o \
         optics.o   constr_cell_relax.o   stm.o    finite_diff.o \
         elpol.o    setlocalpp.o aedens.o 
 
INC=

vasp: $(SOURCE) $(FFT3D) $(INC) main.o 
	rm -f vasp
	$(FCL) -o vasp $(LINK) main.o  $(SOURCE)   $(FFT3D) $(LIB) 
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC)
	$(FCL) -o makeparam  $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB)
zgemmtest: zgemmtest.o base.o random.o $(INC)
	$(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB)
dgemmtest: dgemmtest.o base.o random.o $(INC)
	$(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB) 
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC)
	$(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB)
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC)
	$(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB)

clean:	
	-rm -f *.g *.f *.o *.L *.mod ; touch *.F

main.o: main$(SUFFIX)
	$(FC) $(FFLAGS)$(DEBUG)  $(INCS) -c main$(SUFFIX)
xcgrad.o: xcgrad$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE)  $(INCS) -c xcgrad$(SUFFIX)
xcspin.o: xcspin$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE)  $(INCS) -c xcspin$(SUFFIX)

makeparam.o: makeparam$(SUFFIX)
	$(FC) $(FFLAGS)$(DEBUG)  $(INCS) -c makeparam$(SUFFIX)

makeparam$(SUFFIX): makeparam.F main.F 
#
# MIND: I do not have a full dependency list for the include
# and MODULES: here are only the minimal basic dependencies
# if one structure is changed then touch_dep must be called
# with the corresponding name of the structure
#
base.o: base.inc base.F
mgrid.o: mgrid.inc mgrid.F
constant.o: constant.inc constant.F
lattice.o: lattice.inc lattice.F
setex.o: setexm.inc setex.F
pseudo.o: pseudo.inc pseudo.F
poscar.o: poscar.inc poscar.F
mkpoints.o: mkpoints.inc mkpoints.F
wave.o: wave.inc wave.F
nonl.o: nonl.inc nonl.F
nonlr.o: nonlr.inc nonlr.F

$(OBJ_HIGH):
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX)
$(OBJ_NOOPT):
	$(CPP)
	$(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX)

fft3dlib_f77.o: fft3dlib_f77.F
	$(CPP)
	$(F77) $(FFLAGS_F77) -c $*$(SUFFIX)

.F.o:
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
.F$(SUFFIX):
	$(CPP)
$(SUFFIX).o:
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)

# special rules
#-----------------------------------------------------------------------
# these special rules are cumulative (that is once failed
#   in one compiler version, stays in the list forever)
# -tpp5|6|7 P, PII-PIII, PIV
# -xW use SIMD (does not pay off on PII, since fft3d uses double prec)
# all other options do not affect the code performance since -O1 is used
#-----------------------------------------------------------------------

#fft3dlib.o : fft3dlib.F
#	$(CPP)
#	$(FC) $(FFLAGS) -FR -lowercase -O1 -xW -prefetch- -unroll0  -vec_report3 -c $*$(SUFFIX)
#fftmpi.o : fftmpi.F
#	$(CPP)
#	$(FC) $(FFLAGS) $(OFLAG2) $(INCS) -c $*$(SUFFIX)  
#fftmpi_map.o : fftmpi_map.F
#	$(CPP)
#	$(FC) $(FFLAGS) $(OFLAG2) $(INCS) -c $*$(SUFFIX)

Energy vs. iteration output:

1702.49369785
-94.70802258
-269.47381601
-274.81393100
-274.86864446
-274.86985661
-274.86989747
-245.11630109
-259.33381830
-245.98860950
-245.89088844
-252.92114047
-258.79684187
-259.73661680
-263.13434578
-262.22093923
-261.96483761
-260.84387531
-260.13601765
-259.97363188
-259.54411246
-259.56982616
-259.85885091
-259.85762382
-259.93896477
-259.99336011
-260.01332603
-260.04649076
-260.06691481
-260.07788723
-260.07152019
-260.06930155
-260.06764757
-260.06933587
-260.06974239
-260.07090306
-260.07150759
-260.07375307
-260.07394955
-260.07447701
-260.07472426
-260.07551102
-260.07562686
-260.07597895
-260.07621745
-260.07676669
-260.07706824
-260.07791609
-260.07807215
-260.07826313
-260.07830609
-260.07845162
-260.07850780
-260.07855236
-260.07861885
-260.07875927
-260.07899851
-260.07880062
-260.07915017
-260.07927037
-260.07933669
-260.07936665
-260.07958846
-260.07943638
-260.07951803
-260.07956368
-260.07953985
-260.07952859
-260.07952811
-260.07963357
-260.07967630
-260.07970355
-260.07981281
-260.07968844
-260.07968955
-260.07975222
-260.07955863
-260.07974496
-260.08072052
-260.08034559
-260.08017048
-260.07976877
-260.07886140
-260.07878050
-260.07854529
-260.07861894
-260.07830461
-260.07782649
-260.07788728
-260.07909022
-260.08136412
-260.08191208
-260.07986494
-260.08102521
-260.08046854
-260.07958607
-260.07918998
-260.07603360
-260.07648395
-260.07628423
-260.07815421
-260.07938171
-260.08110671
-260.08282241
-260.08193140
-260.08013740
-260.07824842
-260.07908690
-260.07889475
-260.07889589
-260.07855745
-260.07870568
-260.07866534
-260.07800754
-260.07761908
-260.07743272
-260.07698364
-260.07639996
-260.07681744
-260.07698786
-260.07700944
-260.07699022
-260.07643673
-260.07644581
-260.07731947
-260.07810952
-260.07888955
-260.07895179
-260.07940000
-260.08009571
-260.08064140
-260.08107076
-260.08155870
-260.08142986
-260.08125789
-260.08092220
-260.07964154
-260.07896366
-260.07860181
-260.07808867
-260.07810386
-260.07787354
-260.07835769
-260.07750126
-260.07858757
-260.07891429
-260.07904712
-260.07856455
-260.07885229
-260.07883959
-260.07871237
-260.07872643
-260.07868297
-260.07890899
-260.07892879
-260.07892522
-260.07879920
-260.07869662
-260.07890128
-260.07902258
-260.07958929
-260.07990172
-260.07909401
-260.07960225
-260.07942220
-260.07930811
-260.07921772
-260.07573588
-260.07703017
-260.07728647
-260.07735253
-260.07756382
-260.07747381
-260.07778831
-260.07773257
-260.07761240
-260.07756689
-260.07784515
-260.07787768
-260.07804796
-260.07797394
-260.07770289
-260.07781086
-260.07785466
-260.07793448
-260.07812907
-260.07819727
-260.07869256
-260.07850807
-260.07849496
-260.07849093
-260.07848975
-260.07818394
-260.07820253
-260.07816699
-260.07821344
-260.07824323
-260.07789638
-260.07809342
-260.07816584
-260.07816303
-260.07818309
-260.07807406
-260.07841768
-260.07841886
-260.07845634
-260.07841459
-260.07803581
-260.07807734
-260.07807865
-260.07825743
-260.07836557
-260.07774881
-260.07795246
-260.07832342
-260.07826304
-260.07810459
-260.07830794
-260.07781055
-260.07766762
-260.07731356
-260.07718879
-260.07745285
-260.07725930
-260.07700784
-260.07709524
-260.07712586
-260.07732154
-260.07718204
-260.07716142
-260.07713171
-260.07707772
-260.07740942
-260.07765026
-260.07789222
-260.07797375
-260.07807112
-260.07796461
-260.07827023
-260.07861297
-260.07906640
-260.07894321
-260.08030037
-260.07828576
-260.07804490
-260.07791818
-260.07796250
-260.07795613
-260.07792257
-260.07760319
-260.07780996
-260.07748049
-260.07852795
-260.07841359
-260.07841137
-260.07885797
-260.07881921
-260.07883587
-260.07919899
-260.07872912
-260.07975170
-260.08103199
-260.08200710
-260.08264936
-260.08208393
-260.08227139
-260.08227893
-260.08294252
-260.08183253
-260.08150698
-260.07924722
-260.08060588
-260.08120958
-260.08162902
-260.08165579
-260.08148403
-260.08157739
-260.08304581
-260.08111449
-260.08118000
-260.08028723
-260.08054748
-260.08084765
-260.07910954
-260.07898609
-260.07835466
-260.07769443
-260.07827011
-260.07741086
-260.07736269
-260.07738850
-260.07703898
-260.07746652
-260.07756358
-260.07804139
-260.07864658
-260.07898450
-260.07994340
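
To put a number on the residual fluctuation in the listing above, one can take the peak-to-peak spread over the last few steps. The sketch below assumes the listed energies are in eV (VASP's usual output unit) and converts the spread to Hartree for comparison with the ~10^-4 Hartree/cell quoted earlier; the helper name and the window of ten steps are arbitrary choices.

```python
from typing import Sequence, Tuple

HARTREE_IN_EV = 27.211386  # 1 Hartree expressed in eV (rounded)


def tail_spread(energies_ev: Sequence[float], window: int) -> Tuple[float, float]:
    """Peak-to-peak spread of the last `window` energies, in eV and in Hartree."""
    tail = energies_ev[-window:]
    spread_ev = max(tail) - min(tail)
    return spread_ev, spread_ev / HARTREE_IN_EV


# Last ten values of the energy listing above.
last_energies = [
    -260.07741086, -260.07736269, -260.07738850, -260.07703898,
    -260.07746652, -260.07756358, -260.07804139, -260.07864658,
    -260.07898450, -260.07994340,
]

spread_ev, spread_ha = tail_spread(last_energies, window=10)
print(f"spread: {spread_ev:.5f} eV = {spread_ha:.2e} Hartree")
```

Under that unit assumption, the spread over the final ten steps is about 3 meV, roughly 1e-4 Hartree, which is far above the EDIFF of 1.0E-07 requested in the INCAR.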

tmueller · #3 Post by **tmueller** » Fri Jun 19, 2009 2:57 pm

We had a similar problem, even when we turned off all compiler optimizations. We were able to fix it by using Goto BLAS and FFTW instead of MKL. The resulting code is about the same speed and much more robust.

On a side note: Is this forum painfully slow for anyone else? Each page takes minutes to load, making the forum almost unusable.

pkroll · #4 Post by **pkroll** » Tue Jun 23, 2009 1:35 am

It is fast at night, though ...