How to accelerate an SOC + HSE06 calculation?
Posted: Fri Jun 04, 2010 7:47 am
Hi, all VASP users and developers,
I'm trying to calculate some semiconductor materials with spin-orbit coupling (SOC) and
the hybrid functional (HSE06) method in VASP 5.2. I'm having trouble with this type of
calculation and am asking for your help.
My calculation process is:
First, a general static DFT calculation.
Second, a standard SOC calculation (LSORBIT=T) to obtain the WAVECAR file.
Third, based on this WAVECAR file, I add LHFCALC = .TRUE. and the other related HF parameters to perform the HF-type calculation.
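The three stages above can be sketched as a small script that builds the INCAR tags for each stage. This is only an illustration: the helper names (stage_tags, render_incar) are hypothetical, the stage 3 tags are copied from the INCAR in this post, and putting LMAXMIX in stage 2 is an assumption, since the post only gives the final INCAR.

```python
# Hypothetical helpers; not part of VASP or of any VASP tooling.
COMMON = {"ENCUT": "350"}
# Assumption: LMAXMIX is already set in the SOC stage.
SOC = {"LSORBIT": ".TRUE.", "LMAXMIX": "6"}
# Tags copied from the SOC + HF INCAR in this post.
HSE = {
    "ISTART": "1",        # restart from the WAVECAR of the SOC stage
    "LHFCALC": ".TRUE.",
    "HFSCREEN": "0.2",
    "ALGO": "Damped",
    "TIME": "0.4",
}


def stage_tags(stage):
    """Return the INCAR tags for stage 1 (DFT), 2 (SOC) or 3 (SOC + HF)."""
    if stage not in (1, 2, 3):
        raise ValueError("stage must be 1, 2 or 3")
    tags = dict(COMMON)
    if stage >= 2:
        tags.update(SOC)
    if stage == 3:
        tags.update(HSE)
    return tags


def render_incar(tags):
    """Format a tag dictionary as INCAR text, one 'TAG = value' per line."""
    return "\n".join(f"{key} = {value}" for key, value in tags.items())


if __name__ == "__main__":
    for stage in (1, 2, 3):
        print(f"--- stage {stage} ---")
        print(render_incar(stage_tags(stage)))
```

Each stage only adds tags to the previous one, so the WAVECAR written by one stage is read by the next with the same ENCUT.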
Here are my SOC + HF input files:
======== INCAR ====================================================
ENCUT = 350
LMAXMIX=6
LSORBIT=T
ISTART = 1
LHFCALC = .TRUE.
HFSCREEN = 0.2
ALGO = Damped
TIME = 0.4
#ENCUTFOCK = 0
#NKRED = 2
======== KPOINTS ==================================================
Auto
0
Gamma
8 8 8
0 0 0
======= POSCAR ====================================================
Cd As Ge
1.00000000000000
-2.9596082483964499 2.9596082483964552 6.0150253167398420
2.9596082483964552 -2.9596082483964525 6.0150253167398402
2.9596082483964539 2.9596082483964539 -6.0150253167398411
2 4 2
Direct
0.0000000000000000 0.0000000000000000 0.0000000000000000
0.7500000000000000 0.2500000000000000 0.5000000000000000
0.3750000000000000 0.3566726093521447 0.4816726093521447
0.8750000000000000 0.8933273906478553 0.5183273906478553
0.6433273906478553 0.1250000000000000 0.0183273906478553
0.1066726093521447 0.6250000000000000 0.9816726093521447
0.5000000000000000 0.5000000000000000 0.0000000000000000
0.2500000000000000 0.7500000000000000 0.5000000000000000
========= POTCAR ==================================================
potpaw_PBE for every element.
My computing environment is a blade-based system:
Each node contains four AMD Opteron quad-core 64-bit processors (16 cores in all) on a
single board, as an SMP unit. Each core runs at 2.3 GHz and supports 4 floating-point
operations per clock period, for a peak performance of 9.2 GFLOPS/core or 128 GFLOPS/node.
Each node contains 32 GB of memory. The memory subsystem has a 1.0 GHz HyperTransport
system bus and 2 channels with 667 MHz DDR2 DIMMs. Each socket has an independent
memory controller connected directly to an L3 cache.
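As a quick arithmetic check of the quoted peak figures, assuming peak = clock (GHz) x floating-point operations per cycle x number of cores (the function name peak_gflops is hypothetical):

```python
def peak_gflops(clock_ghz, flops_per_cycle, cores=1):
    """Theoretical peak in GFLOPS: clock rate x FLOPs per cycle x cores."""
    return clock_ghz * flops_per_cycle * cores


per_core = peak_gflops(2.3, 4)             # one core
per_node = peak_gflops(2.3, 4, cores=16)   # one 16-core node
four_nodes = peak_gflops(2.3, 4, cores=64) # the 4 nodes used here

if __name__ == "__main__":
    print(f"per core:   {per_core:.1f} GFLOPS")
    print(f"per node:   {per_node:.1f} GFLOPS")
    print(f"four nodes: {four_nodes:.1f} GFLOPS")
```

With these inputs the per-core value matches the 9.2 GFLOPS quoted above, but 16 cores give about 147.2 GFLOPS per node rather than 128, so one of the quoted numbers may be slightly off.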
I used 4 nodes (64 cores in total) for this calculation, but a single electronic step had
NOT finished after more than 20 hours. To accelerate the calculation, I also tried
uncommenting ENCUTFOCK = 0 and NKRED = 2, but then I got a memory error:
----------------------------------------------------------------------------------
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
libintlc.so.5 00002B102C3703A6 Unknown Unknown Unknown
libintlc.so.5 00002B102C36F338 Unknown Unknown Unknown
libifcore.so.5 00002B102BD2829E Unknown Unknown Unknown
libifcore.so.5 00002B102BCB2DA4 Unknown Unknown Unknown
libifcore.so.5 00002B102BCEF6EF Unknown Unknown Unknown
libifcore.so.5 00002B102BCEF5DF Unknown Unknown Unknown
vasp5.2 00000000005CC44A Unknown Unknown Unknown
vasp5.2 00000000005CBC26 Unknown Unknown Unknown
vasp5.2 00000000004293D0 Unknown Unknown Unknown
vasp5.2 000000000040A3E2 Unknown Unknown Unknown
libc.so.6 0000003D58A1C3FB Unknown Unknown Unknown
vasp5.2 000000000040A32A Unknown Unknown Unknown
MPI process terminated unexpectedly
----------------------------------------------------------------------------------
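For reference, a rough sketch of what NKRED = 2 does to the 8 8 8 grid above. NKRED reduces the q-point grid used for the exact-exchange term by the given factor along each direction. The function names are hypothetical, and the estimate ignores symmetry and assumes the exchange cost scales with the number of q-points, so it is only an order-of-magnitude guide.

```python
def fock_q_grid(k_grid, nkred):
    """Grid used for the exchange sum when each direction is reduced by nkred."""
    if any(n % nkred != 0 for n in k_grid):
        raise ValueError("each grid dimension must be divisible by nkred")
    return tuple(n // nkred for n in k_grid)


def n_points(grid):
    """Number of points in a full (unsymmetrized) grid."""
    total = 1
    for n in grid:
        total *= n
    return total


k_grid = (8, 8, 8)                 # KPOINTS grid from this post
q_grid = fock_q_grid(k_grid, 2)    # grid for the exchange sum with NKRED = 2
reduction = n_points(k_grid) // n_points(q_grid)

if __name__ == "__main__":
    print("k grid:", k_grid, "points:", n_points(k_grid))
    print("q grid:", q_grid, "points:", n_points(q_grid))
    print("q-point reduction factor:", reduction)
```

Under these assumptions the exchange sum runs over 64 instead of 512 q-points, a factor of 8. This says nothing about the memory demand, which is where the run above failed.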
So, I have two questions:
1. Are there any parameters that I have not set sensibly? Which parameters should be
adjusted to accelerate the calculation and reduce the demand for computing resources?
2. In VASP 5.2, is a calculation that combines SOC and HF stable and reliable?
If so, how much computing resources does this type of calculation need?
I'm looking forward to your reply. Thank you very much!