VASP GFORTRAN

Message

gaelb · #1 Post by **gaelb** » Sun Sep 26, 2010 8:54 pm

Hi everyone
any makefile to compile VASP 5.2 with gfortran?
I have already compilled the vasp lib.

I'm using debian lenny (gfortran 4.3.2) on opteron machines.
I would like to use acml mathlib, is it possible?

Thank you in advance
Gael

imcnab · #2 Post by **imcnab** » Thu Feb 10, 2011 9:17 pm

Gael,
I am also trying to compile vasp 5.2.11 with gfortran.
If you managed a working makefile, please can you post it here?
Thanks... Iain

imcnab · #3 Post by **imcnab** » Fri Feb 11, 2011 6:40 pm

Dear Gael,
in case these are of use to you (or anyone else), here are two Makefiles for vasp on an x86_64 linux machine, running RedHat.
These Makefiles were created by our systems manager, Frank Bures. He comments: "... the original makefile for linux had lots of things missing and some problems with trailing EOL's and spaces. Quite an ordeal to find such errors..."

The Makefiles are for a build of vasp that includes the climbing-image NEB add-ons from University of Austin at Texas theory group (see http://theory.cm.utexas.edu/forum/viewforum.php?f=2).

There are two makefiles.
Serial Build
MPI Build

Frank did a little benchmarking using the Hg job provided with VASP4:

Results:
vasp4 2:19.60
vasp5 2:19.33
vasp4_gamma 1:13.91
vasp5_gamma 1:13.66

(1) Serial Build.
---------------------------------------
.SUFFIXES: .inc .f .f90 .F
#-----------------------------------------------------------------------
# rewritten for gfortran under Redhat, RedHat on x86_64 and gfortran
# by Frank Bures, 11 February 2011
#
# Makefile for Portland Group F90/HPF compiler release 3.0-1, 3.1
# and release 1.7
# (http://www.pgroup.com/ & ftp://ftp.pgroup.com/x86/, you need
# to order the HPF/F90 suite)
# we have found no noticable performance differences between
# any of the releases, even Athlon or PIII optimisation does
# not seem to improve performance
#
# The makefile was tested only under Linux on Intel platforms
# (Suse X,X)
#
# it might be required to change some of library pathes, since
# LINUX installation vary a lot
# Hence check ***ALL**** options in this makefile very carefully
#-----------------------------------------------------------------------
#
# Mind that some Linux distributions (Suse 6.1) have a bug in
# libm causing small errors in the error-function (total energy
# is therefore wrong by about 1meV/atom). The recommended
# solution is to update libc.
#
# Mind that some Linux distributions (Suse 6.1) have a bug in
# libm causing small errors in the error-function (total energy
# is therefore wrong by about 1meV/atom). The recommended
# solution is to update libc.
#
# BLAS must be installed on the machine
# there are several options:
# 1) very slow but works:
# retrieve the lapackage from ftp.netlib.org
# and compile the blas routines (BLAS/SRC directory)
# please use g77 or f77 for the compilation. When I tried to
# use pgf77 or pgf90 for BLAS, VASP hang up when calling
# ZHEEV (however this was with lapack 1.1 now I use lapack 2.0)
# 2) most desirable: get an optimized BLAS
# for a list of optimized BLAS try
# http://www.kachinatech.com/~hjjou/scilib/opt_blas.html
#
# the two most reliable packages around are presently:
# 3a) Intels own optimised BLAS (PIII, P4, Itanium)
# http://developer.intel.com/software/products/mkl/
# this is really excellent when you use Intel CPU's
#
# 3b) or obtain the atlas based BLAS routines
# http://math-atlas.sourceforge.net/
# you certainly need atlas on the Athlon, since the mkl
# routines are not optimal on the Athlon.
#
#-----------------------------------------------------------------------

# all CPP processed fortran files have the extension .f
SUFFIX=.f

#-----------------------------------------------------------------------
# fortran compiler and linker
#-----------------------------------------------------------------------
FC=gfortran
# fortran linker
FCL=$(FC)

#-----------------------------------------------------------------------
# whereis CPP ?? (I need CPP, can't use gcc with proper options)
# that's the location of gcc for SUSE 5.3
#
# CPP_ = /usr/lib/gcc-lib/i486-linux/2.7.2/cpp -P -C
#
# that's probably the right line for some Red Hat distribution:
#
# CPP_ = /usr/lib/gcc-lib/i386-redhat-linux/2.7.2.3/cpp -P -C
#
# SUSE 6.X, maybe some Red Hat distributions:

CPP_ = ./preprocess <$*.F | /usr/bin/cpp -P -C -traditional >$*$(SUFFIX)

#-----------------------------------------------------------------------
# possible options for CPP:
# possible options for CPP:
# NGXhalf charge density reduced in X direction
# wNGXhalf gamma point only reduced in X direction
# avoidalloc avoid ALLOCATE if possible
# IFC work around some IFC bugs
# CACHE_SIZE 1000 for PII,PIII, 5000 for Athlon, 8000 P4
# RPROMU_DGEMV use DGEMV instead of DGEMM in RPRO (usually faster)
# RACCMU_DGEMV use DGEMV instead of DGEMM in RACC (faster on P4)
# **** definitely use -DRACCMU_DGEMV if you use the mkl library
#-----------------------------------------------------------------------

CPP = $(CPP_) -DHOST=\"LinuxGfortran\" \
-Dkind8 -DNGXhalf -DCACHE_SIZE=2000 -DGfortran -Davoidalloc \
-DRPROMU_DGEMV

#-----------------------------------------------------------------------
# general fortran flags (there must a trailing blank on this line)
# the -Mx,119,0x200000 is required if you use older pgf90 versions
# on a more recent LINUX installation
# the option will not do any harm on other 3.X pgf90 distributions
#-----------------------------------------------------------------------

FFLAGS = -ffree-form -ffree-line-length-none

#-----------------------------------------------------------------------
# optimization,
# we have tested whether higher optimisation improves
# the performance, and found no improvements with -O3-5 or -fast
# (even on Athlon system, Athlon specific optimistation worsens performance)
#-----------------------------------------------------------------------

OFLAG = -O2

OFLAG_HIGH = $(OFLAG)
OBJ_HIGH =
OBJ_NOOPT =
DEBUG = -g -O0
INLINE = $(OFLAG)

#-----------------------------------------------------------------------
# the following lines specify the position of BLAS and LAPACK
# what you chose is very system dependent
# P4: VASP works fastest with Intels mkl performance library
# Athlon: Atlas based BLAS are presently the fastest
# P3: no clue
#-----------------------------------------------------------------------

# Atlas based libraries
#ATLASHOME= $(HOME)/archives/BLAS_OPT/ATLAS/lib/Linux_ATHLONXP_SSE1/
#BLAS= -L/usr/lib/blas/atlas -lblas
#BLAS= -L$(ATLASHOME) -lf77blas -latlas
BLAS= -L/usr/lib64 -lblas

# use specific libraries (default library path points to other libraries)
#BLAS= $(ATLASHOME)/libf77blas.a $(ATLASHOME)/libatlas.a

# use the mkl Intel libraries for p4 (www.intel.com)
#BLAS=-L/opt/intel/mkl/lib/32 -lmkl_p4 -lpthread

# LAPACK, simplest use vasp.5.lib/lapack_double
#LAPACK= ../vasp.5.lib/lapack_double.o

# use atlas optimized part of lapack
LAPACK= ../vasp.5.lib/lapack_atlas.o -llapack -lcblas

# use the mkl Intel lapack
#LAPACK= -lmkl_lapack

LAPACK= -L/usr/lib64 -llapack

#-----------------------------------------------------------------------

#LIB = -L../vasp.5.lib -ldmy \
../vasp.5.lib/linpack_double.o $(LAPACK) \
$(BLAS)

LIB = -L../vasp.5.lib -ldmy ../vasp.5.lib/linpack_double.o -L/usr/lib64 -llapack

# options for linking (none required)
#LINK = -q64 -static -bnoquiet

Link =

#-----------------------------------------------------------------------
# fft libraries:
# VASP.4.5 can use FFTW (http://www.fftw.org)
# since the FFTW is very slow for radices 2^n the fft3dlib is used
# in these cases
# if you use fftw3d you need to insert -lfftw in the LIB line as well
# please do not send us any querries reltated to FFTW (no support)
# if it fails, use fft3dlib
#-----------------------------------------------------------------------

FFT3D = fft3dfurth.o fft3dlib.o
#FFT3D = fftw3d+furth.o fft3dlib.o

#=======================================================================
# MPI section, uncomment the following lines
#
# one comment for users of mpich or lam:
# You must *not* compile mpi with g77/f77, because f77/g77
# appends *two* underscores to symbols that contain already an
# underscore (i.e. MPI_SEND becomes mpi_send__). The pgf90
# compiler however appends only one underscore.
# Precompiled mpi version will also not work !!!
#
# We found that mpich.1.2.1 and lam-6.5.X are stable
# mpich.1.2.1 was configured with
# ./configure -prefix=/usr/local/mpich_nodvdbg -fc="pgf77 -Mx,119,0x200000" \
# -f90="pgf90 -Mx,119,0x200000" \
# --without-romio --without-mpe -opt=-O \
#
# lam was configured with the line
# ./configure -prefix /usr/local/lam-6.5.X --with-cflags=-O -with-fc=pgf90 \
# --with-f77flags=-O --without-romio
#
# lam was generally faster and we found an average communication
# band with of roughly 160 MBit/s (full duplex)
#
# please note that you might be able to use a lam or mpich version
# compiled with f77/g77, but then you need to add the following
# options: -Msecond_underscore (compilation) and -g77libs (linking)
#
# !!! Please do not send me any queries on how to install MPI, I will
# certainly not answer them !!!!
#=======================================================================
#-----------------------------------------------------------------------
# fortran linker for mpi: if you use LAM and compiled it with the options
# suggested above, you can use the following lines
#-----------------------------------------------------------------------

#FC=mpif77
#FCL=$(FC)

#-----------------------------------------------------------------------
# additional options for CPP in parallel version (see also above):
# NGZhalf charge density reduced in Z direction
# wNGZhalf gamma point only reduced in Z direction
# scaLAPACK use scaLAPACK (usually slower on 100 Mbit Net)
#-----------------------------------------------------------------------

#CPP = $(CPP_) -DMPI -DHOST=\"LinuxPgi\" \
# -Dkind8 -DNGZhalf -DCACHE_SIZE=2000 -DPGF90 -Davoidalloc -DRPROMU_DGEMV

#-----------------------------------------------------------------------
# location of SCALAPACK
# if you do not use SCALAPACK simply uncomment the line SCA
#-----------------------------------------------------------------------

BLACS=/usr/local/BLACS_lam
SCA_= /usr/local/SCALAPACK_lam

SCA= $(SCA_)/scalapack_LINUX.a $(SCA_)/pblas_LINUX.a $(SCA_)/tools_LINUX.a \
$(BLACS)/LIB/blacsF77init_MPI-LINUX-0.a $(BLACS)/LIB/blacs_MPI-LINUX-0.a $(BLACS)/LIB/blacsF77init_MPI-LINUX-0.a

SCA=

#-----------------------------------------------------------------------
# libraries for mpi
#-----------------------------------------------------------------------

#LIB = -L../vasp.5.lib -ldmy \
# ../vasp.5.lib/linpack_double.o $(LAPACK) \
# $(SCA) $(BLAS)

# FFT: only option fftmpi.o with fft3dlib of Juergen Furthmueller

#FFT3D = fftmpi.o fftmpi_map.o fft3dfurth.o fft3dlib.o

#-----------------------------------------------------------------------
# general rules and compile lines
#-----------------------------------------------------------------------
BASIC= symmetry.o symlib.o lattlib.o random.o

SOURCE= base.o mpi.o smart_allocate.o xml.o \
constant.o jacobi.o main_mpi.o scala.o \
asa.o lattice.o poscar.o ini.o xclib.o xclib_grad.o \
radial.o pseudo.o mgrid.o gridq.o ebs.o \
mkpoints.o wave.o wave_mpi.o wave_high.o \
$(BASIC) nonl.o nonlr.o nonl_high.o dfast.o choleski2.o \
mix.o hamil.o xcgrad.o xcspin.o potex1.o potex2.o \
constrmag.o cl_shift.o relativistic.o LDApU.o \
paw_base.o metagga.o egrad.o pawsym.o pawfock.o pawlhf.o rhfatm.o paw.o \
mkpoints_full.o charge.o Lebedev-Laikov.o stockholder.o dipol.o pot.o \
dos.o elf.o tet.o tetweight.o hamil_rot.o steep.o \
dimer.o dynmat.o neb.o lanczos.o instanton.o \
sd.o cg.o qm.o lbfgs.o bfgs.o fire.o opt.o \
chain.o dyna.o sphpro.o us.o core_rel.o \
aedens.o wavpre.o wavpre_noio.o broyden.o \
dynbr.o rmm-diis.o reader.o writer.o tutor.o xml_writer.o \
brent.o stufak.o fileio.o opergrid.o stepver.o \
chgloc.o fast_aug.o fock.o mkpoints_change.o sym_grad.o \
mymath.o optengines.o internals.o hessian.o gadget.o dynconstr.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o \
hamil_high.o nmr.o force.o \
pead.o mlwf.o subrot.o subrot_scf.o pwlhf.o gw_model.o optreal.o davidson.o david_inner.o \
electron.o rot.o electron_all.o shm.o pardens.o paircorrection.o \
optics.o constr_cell_relax.o stm.o finite_diff.o elpol.o \
hamil_lr.o rmm-diis_lr.o subrot_cluster.o subrot_lr.o \
lr_helper.o hamil_lrf.o elinear_response.o ilinear_response.o \
linear_optics.o linear_response.o \
setlocalpp.o wannier.o electron_OEP.o electron_lhf.o twoelectron4o.o \
ratpol.o screened_2e.o wave_cacher.o chi_base.o wpot.o local_field.o \
ump2.o bse.o acfdt.o chi.o sydmat.o

vasp: $(SOURCE) $(FFT3D) $(INC) main.o
rm -f vasp
$(FCL) -o vasp main.o $(SOURCE) $(FFT3D) $(LIB) $(LINK)
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC)
$(FCL) -o makeparam $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB)
zgemmtest: zgemmtest.o base.o random.o $(INC)
$(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB)
dgemmtest: dgemmtest.o base.o random.o $(INC)
$(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB)
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC)
$(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB)
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC)
$(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB)

clean:
-rm -f *.g *.f *.o *.L *.mod ; touch *.F

main.o: main$(SUFFIX)
$(FC) $(FFLAGS)$(DEBUG) $(INCS) -c main$(SUFFIX)
xcgrad.o: xcgrad$(SUFFIX)
$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcgrad$(SUFFIX)
xcspin.o: xcspin$(SUFFIX)
$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcspin$(SUFFIX)

makeparam.o: makeparam$(SUFFIX)
$(FC) $(FFLAGS)$(DEBUG) $(INCS) -c makeparam$(SUFFIX)

makeparam$(SUFFIX): makeparam.F main.F
#
# MIND: I do not have a full dependency list for the include
# and MODULES: here are only the minimal basic dependencies
# if one strucuture is changed then touch_dep must be called
# with the corresponding name of the structure
#
base.o: base.inc base.F
mgrid.o: mgrid.inc mgrid.F
constant.o: constant.inc constant.F
lattice.o: lattice.inc lattice.F
setex.o: setexm.inc setex.F
pseudo.o: pseudo.inc pseudo.F
poscar.o: poscar.inc poscar.F
mkpoints.o: mkpoints.inc mkpoints.F
wave.o: wave.F
nonl.o: nonl.inc nonl.F
nonlr.o: nonlr.inc nonlr.F

$(OBJ_HIGH):
$(CPP)
$(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX)
$(OBJ_NOOPT):
$(CPP)
$(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX)

fft3dlib_f77.o: fft3dlib_f77.F
$(CPP)
$(F77) $(FFLAGS_F77) -c $*$(SUFFIX)

.F.o:
$(CPP)
$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
.F$(SUFFIX):
$(CPP)
$(SUFFIX).o:
$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)

---------------------------------------

(2) MPI Build
=--------------------------------------
.SUFFIXES: .inc .f .f90 .F
#-----------------------------------------------------------------------
# rewritten for gfortran under Redhat, RedHat on x86_64 and gfortran
# by Frank Bures, 11 February 2011
#
# Makefile for Portland Group F90/HPF compiler release 3.0-1, 3.1
# and release 1.7
# (http://www.pgroup.com/ & ftp://ftp.pgroup.com/x86/, you need
# to order the HPF/F90 suite)
# we have found no noticable performance differences between
# any of the releases, even Athlon or PIII optimisation does
# not seem to improve performance
#
# The makefile was tested only under Linux on Intel platforms
# (Suse X,X)
#
# it might be required to change some of library pathes, since
# LINUX installation vary a lot
# Hence check ***ALL**** options in this makefile very carefully
#-----------------------------------------------------------------------
#
# Mind that some Linux distributions (Suse 6.1) have a bug in
# libm causing small errors in the error-function (total energy
# is therefore wrong by about 1meV/atom). The recommended
# solution is to update libc.
#
# Mind that some Linux distributions (Suse 6.1) have a bug in
# libm causing small errors in the error-function (total energy
# is therefore wrong by about 1meV/atom). The recommended
# solution is to update libc.
#
# BLAS must be installed on the machine
# there are several options:
# 1) very slow but works:
# retrieve the lapackage from ftp.netlib.org
# and compile the blas routines (BLAS/SRC directory)
# please use g77 or f77 for the compilation. When I tried to
# use pgf77 or pgf90 for BLAS, VASP hang up when calling
# ZHEEV (however this was with lapack 1.1 now I use lapack 2.0)
# 2) most desirable: get an optimized BLAS
# for a list of optimized BLAS try
# http://www.kachinatech.com/~hjjou/scilib/opt_blas.html
#
# the two most reliable packages around are presently:
# 3a) Intels own optimised BLAS (PIII, P4, Itanium)
# http://developer.intel.com/software/products/mkl/
# this is really excellent when you use Intel CPU's
#
# 3b) or obtain the atlas based BLAS routines
# http://math-atlas.sourceforge.net/
# you certainly need atlas on the Athlon, since the mkl
# routines are not optimal on the Athlon.
#
#-----------------------------------------------------------------------

# all CPP processed fortran files have the extension .f
SUFFIX=.f

#-----------------------------------------------------------------------
# fortran compiler and linker
#-----------------------------------------------------------------------
FC=gfortran
# fortran linker
FCL=$(FC)

#-----------------------------------------------------------------------
# whereis CPP ?? (I need CPP, can't use gcc with proper options)
# that's the location of gcc for SUSE 5.3
#
# CPP_ = /usr/lib/gcc-lib/i486-linux/2.7.2/cpp -P -C
#
# that's probably the right line for some Red Hat distribution:
#
# CPP_ = /usr/lib/gcc-lib/i386-redhat-linux/2.7.2.3/cpp -P -C
#
# SUSE 6.X, maybe some Red Hat distributions:

CPP_ = ./preprocess <$*.F | /usr/bin/cpp -P -C -traditional >$*$(SUFFIX)

#-----------------------------------------------------------------------
# possible options for CPP:
# possible options for CPP:
# NGXhalf charge density reduced in X direction
# wNGXhalf gamma point only reduced in X direction
# avoidalloc avoid ALLOCATE if possible
# IFC work around some IFC bugs
# CACHE_SIZE 1000 for PII,PIII, 5000 for Athlon, 8000 P4
# RPROMU_DGEMV use DGEMV instead of DGEMM in RPRO (usually faster)
# RACCMU_DGEMV use DGEMV instead of DGEMM in RACC (faster on P4)
# **** definitely use -DRACCMU_DGEMV if you use the mkl library
#-----------------------------------------------------------------------

CPP = $(CPP_) -DHOST=\"LinuxGfortran\" \
-Dkind8 -DNGXhalf -DCACHE_SIZE=2000 -DGfortran -Davoidalloc \
-DRPROMU_DGEMV

#-----------------------------------------------------------------------
# general fortran flags (there must a trailing blank on this line)
# the -Mx,119,0x200000 is required if you use older pgf90 versions
# on a more recent LINUX installation
# the option will not do any harm on other 3.X pgf90 distributions
#-----------------------------------------------------------------------

FFLAGS = -ffree-form -ffree-line-length-none

#-----------------------------------------------------------------------
# optimization,
# we have tested whether higher optimisation improves
# the performance, and found no improvements with -O3-5 or -fast
# (even on Athlon system, Athlon specific optimistation worsens performance)
#-----------------------------------------------------------------------

OFLAG = -O2

OFLAG_HIGH = $(OFLAG)
OBJ_HIGH =
OBJ_NOOPT =
DEBUG = -g -O0
INLINE = $(OFLAG)

#-----------------------------------------------------------------------
# the following lines specify the position of BLAS and LAPACK
# what you chose is very system dependent
# P4: VASP works fastest with Intels mkl performance library
# Athlon: Atlas based BLAS are presently the fastest
# P3: no clue
#-----------------------------------------------------------------------

# Atlas based libraries
#ATLASHOME= $(HOME)/archives/BLAS_OPT/ATLAS/lib/Linux_ATHLONXP_SSE1/
#BLAS= -L/usr/lib/blas/atlas -lblas
#BLAS= -L$(ATLASHOME) -lf77blas -latlas
BLAS= -L/usr/lib64 -lblas

# use specific libraries (default library path points to other libraries)
#BLAS= $(ATLASHOME)/libf77blas.a $(ATLASHOME)/libatlas.a

# use the mkl Intel libraries for p4 (www.intel.com)
#BLAS=-L/opt/intel/mkl/lib/32 -lmkl_p4 -lpthread

# LAPACK, simplest use vasp.5.lib/lapack_double
#LAPACK= ../vasp.5.lib/lapack_double.o

# use atlas optimized part of lapack
LAPACK= ../vasp.5.lib/lapack_atlas.o -llapack -lcblas

# use the mkl Intel lapack
#LAPACK= -lmkl_lapack

LAPACK= -L/usr/lib64 -llapack

#-----------------------------------------------------------------------

#LIB = -L../vasp.5.lib -ldmy \
../vasp.5.lib/linpack_double.o $(LAPACK) \
$(BLAS)

LIB = -L../vasp.5.lib -ldmy ../vasp.5.lib/linpack_double.o -L/usr/lib64 -llapack

# options for linking (none required)
#LINK = -q64 -static -bnoquiet

Link =

#-----------------------------------------------------------------------
# fft libraries:
# VASP.4.5 can use FFTW (http://www.fftw.org)
# since the FFTW is very slow for radices 2^n the fft3dlib is used
# in these cases
# if you use fftw3d you need to insert -lfftw in the LIB line as well
# please do not send us any querries reltated to FFTW (no support)
# if it fails, use fft3dlib
#-----------------------------------------------------------------------

FFT3D = fft3dfurth.o fft3dlib.o
#FFT3D = fftw3d+furth.o fft3dlib.o

#=======================================================================
# MPI section, uncomment the following lines
#
# one comment for users of mpich or lam:
# You must *not* compile mpi with g77/f77, because f77/g77
# appends *two* underscores to symbols that contain already an
# underscore (i.e. MPI_SEND becomes mpi_send__). The pgf90
# compiler however appends only one underscore.
# Precompiled mpi version will also not work !!!
#
# We found that mpich.1.2.1 and lam-6.5.X are stable
# mpich.1.2.1 was configured with
# ./configure -prefix=/usr/local/mpich_nodvdbg -fc="pgf77 -Mx,119,0x200000" \
# -f90="pgf90 -Mx,119,0x200000" \
# --without-romio --without-mpe -opt=-O \
#
# lam was configured with the line
# ./configure -prefix /usr/local/lam-6.5.X --with-cflags=-O -with-fc=pgf90 \
# --with-f77flags=-O --without-romio
#
# lam was generally faster and we found an average communication
# band with of roughly 160 MBit/s (full duplex)
#
# please note that you might be able to use a lam or mpich version
# compiled with f77/g77, but then you need to add the following
# options: -Msecond_underscore (compilation) and -g77libs (linking)
#
# !!! Please do not send me any queries on how to install MPI, I will
# certainly not answer them !!!!
#=======================================================================
#-----------------------------------------------------------------------
# fortran linker for mpi: if you use LAM and compiled it with the options
# suggested above, you can use the following lines
#-----------------------------------------------------------------------

FC=mpif77
FCL=$(FC) -g77libs

#-----------------------------------------------------------------------
# additional options for CPP in parallel version (see also above):
# NGZhalf charge density reduced in Z direction
# wNGZhalf gamma point only reduced in Z direction
# scaLAPACK use scaLAPACK (usually slower on 100 Mbit Net)
#-----------------------------------------------------------------------

CPP = $(CPP_) -DMPI -DHOST=\"LinuxPgi\" \
-Dkind8 -DNGZhalf -DCACHE_SIZE=2000 -DPGF90 -Davoidalloc -DRPROMU_DGEMV

#-----------------------------------------------------------------------
# location of SCALAPACK
# if you do not use SCALAPACK simply uncomment the line SCA
#-----------------------------------------------------------------------

BLACS=/usr/local/BLACS_lam
SCA_= /usr/local/SCALAPACK_lam

#SCA= $(SCA_)/scalapack_LINUX.a $(SCA_)/pblas_LINUX.a $(SCA_)/tools_LINUX.a \
$(BLACS)/LIB/blacsF77init_MPI-LINUX-0.a $(BLACS)/LIB/blacs_MPI-LINUX-0.a $(BLACS)/LIB/blacsF77init_MPI-LINUX-0.a

SCA=

#-----------------------------------------------------------------------
# libraries for mpi
#-----------------------------------------------------------------------

LIB = -L../vasp.5.lib -ldmy \
../vasp.5.lib/linpack_double.o $(LAPACK) \
$(SCA) $(BLAS)

# FFT: only option fftmpi.o with fft3dlib of Juergen Furthmueller

FFT3D = fftmpi.o fftmpi_map.o fft3dfurth.o fft3dlib.o

#-----------------------------------------------------------------------
# general rules and compile lines
#-----------------------------------------------------------------------
BASIC= symmetry.o symlib.o lattlib.o random.o

SOURCE= base.o mpi.o smart_allocate.o xml.o \
constant.o jacobi.o main_mpi.o scala.o \
asa.o lattice.o poscar.o ini.o xclib.o xclib_grad.o \
radial.o pseudo.o mgrid.o gridq.o ebs.o \
mkpoints.o wave.o wave_mpi.o wave_high.o \
$(BASIC) nonl.o nonlr.o nonl_high.o dfast.o choleski2.o \
mix.o hamil.o xcgrad.o xcspin.o potex1.o potex2.o \
constrmag.o cl_shift.o relativistic.o LDApU.o \
paw_base.o metagga.o egrad.o pawsym.o pawfock.o pawlhf.o rhfatm.o paw.o \
mkpoints_full.o charge.o Lebedev-Laikov.o stockholder.o dipol.o pot.o \
dos.o elf.o tet.o tetweight.o hamil_rot.o steep.o \
dimer.o dynmat.o neb.o lanczos.o instanton.o \
sd.o cg.o qm.o lbfgs.o bfgs.o fire.o opt.o \
chain.o dyna.o sphpro.o us.o core_rel.o \
aedens.o wavpre.o wavpre_noio.o broyden.o \
dynbr.o rmm-diis.o reader.o writer.o tutor.o xml_writer.o \
brent.o stufak.o fileio.o opergrid.o stepver.o \
chgloc.o fast_aug.o fock.o mkpoints_change.o sym_grad.o \
mymath.o optengines.o internals.o hessian.o gadget.o dynconstr.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o \
hamil_high.o nmr.o force.o \
pead.o mlwf.o subrot.o subrot_scf.o pwlhf.o gw_model.o optreal.o davidson.o david_inner.o \
electron.o rot.o electron_all.o shm.o pardens.o paircorrection.o \
optics.o constr_cell_relax.o stm.o finite_diff.o elpol.o \
hamil_lr.o rmm-diis_lr.o subrot_cluster.o subrot_lr.o \
lr_helper.o hamil_lrf.o elinear_response.o ilinear_response.o \
linear_optics.o linear_response.o \
setlocalpp.o wannier.o electron_OEP.o electron_lhf.o twoelectron4o.o \
ratpol.o screened_2e.o wave_cacher.o chi_base.o wpot.o local_field.o \
ump2.o bse.o acfdt.o chi.o sydmat.o

vasp: $(SOURCE) $(FFT3D) $(INC) main.o
rm -f vasp
$(FCL) -o vasp main.o $(SOURCE) $(FFT3D) $(LIB) $(LINK)
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC)
$(FCL) -o makeparam $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB)
zgemmtest: zgemmtest.o base.o random.o $(INC)
$(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB)
dgemmtest: dgemmtest.o base.o random.o $(INC)
$(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB)
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC)
$(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB)
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC)
$(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB)

clean:
-rm -f *.g *.f *.o *.L *.mod ; touch *.F

main.o: main$(SUFFIX)
$(FC) $(FFLAGS)$(DEBUG) $(INCS) -c main$(SUFFIX)
xcgrad.o: xcgrad$(SUFFIX)
$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcgrad$(SUFFIX)
xcspin.o: xcspin$(SUFFIX)
$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcspin$(SUFFIX)

makeparam.o: makeparam$(SUFFIX)
$(FC) $(FFLAGS)$(DEBUG) $(INCS) -c makeparam$(SUFFIX)

makeparam$(SUFFIX): makeparam.F main.F
#
# MIND: I do not have a full dependency list for the include
# and MODULES: here are only the minimal basic dependencies
# if one strucuture is changed then touch_dep must be called
# with the corresponding name of the structure
#
base.o: base.inc base.F
mgrid.o: mgrid.inc mgrid.F
constant.o: constant.inc constant.F
lattice.o: lattice.inc lattice.F
setex.o: setexm.inc setex.F
pseudo.o: pseudo.inc pseudo.F
poscar.o: poscar.inc poscar.F
mkpoints.o: mkpoints.inc mkpoints.F
wave.o: wave.F
nonl.o: nonl.inc nonl.F
nonlr.o: nonlr.inc nonlr.F

$(OBJ_HIGH):
$(CPP)
$(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX)
$(OBJ_NOOPT):
$(CPP)
$(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX)

fft3dlib_f77.o: fft3dlib_f77.F
$(CPP)
$(F77) $(FFLAGS_F77) -c $*$(SUFFIX)

.F.o:
$(CPP)
$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
.F$(SUFFIX):
$(CPP)
$(SUFFIX).o:
$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)

hipertrofia · #4 Post by **hipertrofia** » Thu Mar 10, 2011 6:09 pm

Hi all,

I tried to compile the serial version of VASP5.2 with gfortran and the Makefile given above by Iain but there were some little problems. Finally amending some stuff I managed to compile it. The following steps worked for me, so I hope somebody else can make use of it:

(1). The library can be compiled with no problems with the Makefile provided.

(2). To compile the program itself I edited the Makefile given for the ifc compiler. In particular, these lines:

Write

FC=gfortran -ffree-form -ffree-line-length-none

instead of

FC=ifc

Then, comment the FFLAGS variable:

#FFLAGS = -FR -lowercase -assume byterecl

(3) Depending on your machine it might compile after this typing "make". For me, another step was required because I didn't have the BLAS library or the link wasn't pointing to it. So I downloaded it (http://www.netlib.org/blas/blas.tgz) and compiled it on my machine. Then changed in the Makefile the line

BLAS= /opt/libs/libgoto/libgoto.so

to point instead to

BLAS= "wherever_you_installed_blas"/blas_LINUX.a

Note that the extension and name of the file changed, not only the location.

After doing this I ran "make" and the program compiled just fine. I haven't been using it long but I have already reproduced published results so I'm confident it's working just fine.

Remember this is for the serial implementation, I haven't tried for the MPI one. If I try and get it working will come back here and post it.

I hope it helps!
Miguel

imcnab · #5 Post by **imcnab** » Sat Sep 03, 2011 6:19 pm

ERRATUM
The comments given in the Makefiles I posted are obviously NOT up to date - please disregard info about "Makefile for Portland Group F90/HPF compiler release 3.0-1, 3.1 and release 1.7"

Dr_Nick · #6 Post by **Dr_Nick** » Mon Oct 17, 2011 2:16 am

Basically what you can do is to take all the FLAGS and compiler settings from the vasp.4.6/makefile.linux_gfortran.

I have used a Makefile similar to the one posted here and was able to compile VASP 5.2.12. However, only gcc 4.3.3. worked - I had problems with other versions.

Optimization exceeding "-O2 -msse3" (or -msse4a), depending on the CPU, did not yield any improvement. Machine-specific optimization such as "-mtune=opteron" worsened the performance. You can gain a little, maybe 1 or 2 %, by specifying -Duse_cray_ptr in the CPP line.

On an Opteron machine I obtained best performance with gotoblas2 and FFTW v3.3. ACML is significantly slower, ATLAS was not tested. For Furthm?ller's FFTW you need to play around with the -DCACHE_SIZE setting, for me 10000 was best. Then you can obtain performance close to FFTW3. However, I did not do any intensive benchmarking, so please test yourself.

And do not forget to play around with NPAR etc. (see the VASP guide) to obtain best performance when running the parallel version.