How to improve parallelised calculation?


vasp16888
Newbie
Posts: 44
Joined: Fri Apr 23, 2010 3:09 am

#1 Post by vasp16888 » Sun Apr 25, 2010 8:41 pm

Dear VASP users,
I am running VASP 5.2 on a SUSE Linux operating system with an InfiniBand network card (20 GB/s).
My question is: if I want to run a calculation on 2 or more nodes (each node has 12 processors), how should I set NPAR, LPLANE, and the other parallelisation parameters?
Thanks a lot in advance :)
[align=center]
AB INITIO STUDY OF MAGNETIC MATERIALS
[/align]

vasp16888
Newbie
Posts: 44
Joined: Fri Apr 23, 2010 3:09 am

#2 Post by vasp16888 » Mon Apr 26, 2010 6:44 am

Can somebody give me some tips? Thanks a lot :)

alex
Hero Member
Posts: 585
Joined: Tue Nov 16, 2004 2:21 pm
License Nr.: 5-67
Location: Germany

#3 Post by alex » Mon Apr 26, 2010 1:01 pm

Nobody will answer faster if you scream around. Answering in this part of the forum is voluntary.

About your question: you have to test it yourself. The best settings depend on the number of atoms, the cutoff, the number of k-points, the speed of your memory interface, the speed of your CPU, and so on. I hope you get the idea ...
Start with NPAR = 1 and remove LPLANE from INCAR.

Hth

alex

vasp16888
Newbie
Posts: 44
Joined: Fri Apr 23, 2010 3:09 am

#4 Post by vasp16888 » Mon Apr 26, 2010 6:51 pm

[quote="alex"]Nobody will answer faster if you scream around. Answering in this part of the forum is voluntary.

About your question: you have to test it yourself. The best settings depend on the number of atoms, the cutoff, the number of k-points, the speed of your memory interface, the speed of your CPU, and so on. I hope you get the idea ...
Start with NPAR = 1 and remove LPLANE from INCAR.

Hth

alex

[/quote]First, I have to say sorry. I am a newcomer to the VASP forum and there are a lot of things to learn here. Thank you for all of your suggestions.
Second, regarding the factors you listed (number of atoms, cutoff, number of k-points, speed of the memory interface, speed of the CPU): I already knew about them and had tested some of them before I posted the thread. But thanks anyway.
Third, I read the user guide, which recommends:
LPLANE = .TRUE.
NPAR = number of nodes
LSCALU = .FALSE.
NSIM = 4
but the improvement in parallel performance is not obvious.
My supercomputer's network card is InfiniBand (20 GB/s) and all of the hardware is the latest generation. I just want to know how to handle large systems efficiently (I have tested many times, but without success).
If your time permits, any suggestion will be greatly appreciated. Sorry to bother you :)

Hui

Danny
Full Member
Posts: 201
Joined: Thu Nov 02, 2006 4:35 pm
License Nr.: 5-532
Location: Ghent, Belgium

#5 Post by Danny » Tue Apr 27, 2010 1:46 pm

The only way to do this is the hard way.
Step one: find a well-behaved calculation containing ~100 atoms that runs ~5-20 h on a single CPU.

Run it on multiple nodes/CPUs, once with LPLANE=.TRUE. and once with LPLANE=.FALSE.
You should see a clear difference in time (sometimes 50%). (Note: you need to use exactly the same CPU configuration for both runs.)
After this, choose the better LPLANE value.
Step two: NSIM and NPAR. These two parameters seem to be connected, and their behaviour seems quite system dependent.
Choose a set of values for NPAR and for NSIM (choose wisely, so that they have some physical meaning with respect to your CPUs and nodes),
loop over all of them with your test calculation (i.e. #NPAR x #NSIM calculations), and look for a trend.
Default values are NSIM=1, NPAR=#CPUs.

The machine I currently run on needs NSIM=8-16, NPAR=#nodes/2
(a former machine needed NPAR=#cores, while another one needed NPAR=#nodes).

It takes time but it is worth it.

Cheers
Danny
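
As a minimal sketch of the loop described above, the INCAR variants for an NPAR x NSIM scan could be generated along these lines. The helper name `incar_variants`, the label scheme, and the `ENCUT = 400` line in the base file are placeholders for illustration and not part of VASP or of the setup discussed here; only the tags NPAR, NSIM, and LPLANE come from the thread.

```python
from itertools import product


def incar_variants(base_incar, npar_values, nsim_values):
    """Yield (label, incar_text) for every NPAR x NSIM combination.

    Any NPAR/NSIM line already present in base_incar is dropped and
    replaced by the values of the current combination.
    """
    for npar, nsim in product(npar_values, nsim_values):
        lines = [
            line
            for line in base_incar.splitlines()
            if line.split("=")[0].strip().upper() not in ("NPAR", "NSIM")
        ]
        lines += [f"NPAR = {npar}", f"NSIM = {nsim}"]
        yield f"npar{npar}_nsim{nsim}", "\n".join(lines) + "\n"


# Placeholder base INCAR; ENCUT = 400 is an arbitrary illustrative value.
base = "ENCUT = 400\nLPLANE = .TRUE.\nNPAR = 1\n"
variants = dict(incar_variants(base, [1, 2, 4], [1, 4]))

for label in variants:
    print(label)
```

Each variant would then be written to its own run directory and timed on exactly the same set of cores, so that the timings are comparable.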

vasp16888
Newbie
Posts: 44
Joined: Fri Apr 23, 2010 3:09 am

#6 Post by vasp16888 » Wed Apr 28, 2010 3:14 am

[quote="Danny"]The only way to do this is the hard way.
Step one: find a well-behaved calculation containing ~100 atoms that runs ~5-20 h on a single CPU.

Run it on multiple nodes/CPUs, once with LPLANE=.TRUE. and once with LPLANE=.FALSE.
You should see a clear difference in time (sometimes 50%). (Note: you need to use exactly the same CPU configuration for both runs.)
After this, choose the better LPLANE value.
Step two: NSIM and NPAR. These two parameters seem to be connected, and their behaviour seems quite system dependent.
Choose a set of values for NPAR and for NSIM (choose wisely, so that they have some physical meaning with respect to your CPUs and nodes),
loop over all of them with your test calculation (i.e. #NPAR x #NSIM calculations), and look for a trend.
Default values are NSIM=1, NPAR=#CPUs.

The machine I currently run on needs NSIM=8-16, NPAR=#nodes/2
(a former machine needed NPAR=#cores, while another one needed NPAR=#nodes).

It takes time but it is worth it.

Cheers
Danny[/quote]


Hi Danny,
I am a little confused about the concepts of node, CPU, core, and processor.
As I understand it, for instance: we have 6 nodes connected by InfiniBand cards, each node has 2 CPUs on the motherboard, and each CPU has 6 cores, which means each node has 12 cores. I think processor = CPU.
Please correct me if I am wrong.
Thanks :)

Yours sincerely,
Hui

Danny
Full Member
Posts: 201
Joined: Thu Nov 02, 2006 4:35 pm
License Nr.: 5-532
Location: Ghent, Belgium

#7 Post by Danny » Wed Apr 28, 2010 9:14 am

Yes, you are right: present-day machinery is confusing, since node, CPU, and core are often used interchangeably.
In the VASP manual, "node" actually means core. I try to refer only to
1) nodes = nodes
2) CPU/core/processor = the smallest part that does the calculation, i.e. in your case I would refer to the 12 cores as 12 CPUs (I know that is technically wrong)

In your case, for a 2-node calculation (= 24 cores = 4 CPUs), I would suggest trying
NPAR=1 (each band on the entire system)
NPAR=2 (one band per node)
NPAR=4 (one band per CPU)
NPAR=24 (one band per core)
NPAR=8 & NPAR=12 (one band per 3 or 2 cores, respectively)
combined with NSIM=1,2,4,6,12,24

Danny
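
The mapping between NPAR and the hardware in the list above can be checked with a few lines of arithmetic. This is only a sketch: it assumes NPAR divides the total core count evenly, so that each band is shared by (total cores / NPAR) cores, and the function name `npar_candidates` is made up for illustration.

```python
def npar_candidates(nodes, cpus_per_node, cores_per_cpu):
    """Return (NPAR, cores per band) pairs for every NPAR that divides the core count."""
    total = nodes * cpus_per_node * cores_per_cpu
    return [(npar, total // npar) for npar in range(1, total + 1) if total % npar == 0]


# 2 nodes, each with 2 CPUs of 6 cores, as in the example above (24 cores in total).
for npar, cores_per_band in npar_candidates(2, 2, 6):
    print(f"NPAR={npar:2d}: each band is shared by {cores_per_band:2d} cores")
```

For 24 cores the divisors also include NPAR=3 and NPAR=6 (8 and 4 cores per band). Those groups do not line up with a node or CPU boundary on this hardware, which may be why they are not among the suggested values.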

vasp16888
Newbie
Posts: 44
Joined: Fri Apr 23, 2010 3:09 am

#8 Post by vasp16888 » Thu Apr 29, 2010 1:11 am

[quote="Danny"]Yes, you are right: present-day machinery is confusing, since node, CPU, and core are often used interchangeably.
In the VASP manual, "node" actually means core. I try to refer only to
1) nodes = nodes
2) CPU/core/processor = the smallest part that does the calculation, i.e. in your case I would refer to the 12 cores as 12 CPUs (I know that is technically wrong)

In your case, for a 2-node calculation (= 24 cores = 4 CPUs), I would suggest trying
NPAR=1 (each band on the entire system)
NPAR=2 (one band per node)
NPAR=4 (one band per CPU)
NPAR=24 (one band per core)
NPAR=8 & NPAR=12 (one band per 3 or 2 cores, respectively)
combined with NSIM=1,2,4,6,12,24

Danny[/quote]


Thanks, Danny, it's clearer now :)
Are the combinations you talked about last time the following?
NPAR=1, NSIM=1
NPAR=2, NSIM=2
NPAR=4, NSIM=4
NPAR=8, NSIM=6
NPAR=12, NSIM=12
NPAR=24, NSIM=24
If this is the case, I think it is relatively easy.

But if the values are not paired up like this, there will be 36 combinations, which is really hard work for our limited computer resources :(

Thanks in advance :)

alex
Hero Member
Posts: 585
Joined: Tue Nov 16, 2004 2:21 pm
License Nr.: 5-67
Location: Germany

#9 Post by alex » Thu Apr 29, 2010 7:40 am

Hi there again,

some hints:
You've got a fast machine with a fast network, so start with NPAR = 1 and NSIM = 1, using the number of tasks you are most likely to use most often.
Next: NPAR = # of tasks / 4, NSIM unchanged.
Then: NPAR = # of tasks, NSIM unchanged.

Then take the two fastest and optimize NPAR further. I'd guess you'll end up at NPAR = 2 or 4.

Then touch NSIM, same game.

Gotcha. Hth

alex
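
The coarse first pass described above could be sketched as follows. Everything here is an assumption made for illustration: `run_time` stands for whatever you use to launch the test job with a given NPAR and measure its wall time, and the numbers in `fake_timings` are invented so that the example runs; they are not measurements.

```python
def coarse_npar_scan(n_tasks, run_time):
    """Time NPAR = 1, n_tasks / 4, and n_tasks; return the two fastest NPAR values."""
    candidates = sorted({1, max(1, n_tasks // 4), n_tasks})
    timings = {npar: run_time(npar) for npar in candidates}
    # Sort the NPAR values by measured time, fastest first.
    return sorted(timings, key=timings.get)[:2]


# Invented wall times in seconds, for illustration only.
fake_timings = {1: 120.0, 6: 95.0, 24: 140.0}
best_two = coarse_npar_scan(24, fake_timings.__getitem__)
print(best_two)
```

The two values returned would be the starting points for the finer NPAR search, after which the same procedure can be repeated for NSIM with NPAR fixed.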

Danny
Full Member
Posts: 201
Joined: Thu Nov 02, 2006 4:35 pm
License Nr.: 5-532
Location: Ghent, Belgium

#10 Post by Danny » Fri Apr 30, 2010 8:33 am

[quote="Danny"]Yes, you are right: present-day machinery is confusing, since node, CPU, and core are often used interchangeably.
In the VASP manual, "node" actually means core. I try to refer only to
1) nodes = nodes
2) CPU/core/processor = the smallest part that does the calculation, i.e. in your case I would refer to the 12 cores as 12 CPUs (I know that is technically wrong)

In your case, for a 2-node calculation (= 24 cores = 4 CPUs), I would suggest trying
NPAR=1 (each band on the entire system)
NPAR=2 (one band per node)
NPAR=4 (one band per CPU)
NPAR=24 (one band per core)
NPAR=8 & NPAR=12 (one band per 3 or 2 cores, respectively)
combined with NSIM=1,2,4,6,12,24

Danny[/quote]
Nope, I'm afraid it's the 36. Then again, if you have a job that takes 10 h on 1 core, this job might take 30 to 60 minutes on 24 cores, so you will need <36 hours on 24 cores = <1000 CPU hours. That is still reasonable, knowing that you will probably face relaxations that take twice that time for a single calculation. Plus, if you can gain 25-30% compared to the normal settings, those 1000 hours are recovered quite quickly ;-)

Danny
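
The estimate above can be written out as a short calculation. This is a sketch under the stated assumptions only: 6 NPAR values times 6 NSIM values, 24 cores, and one hour per run as the upper end of the 30 to 60 minute guess. The function name `scan_cost` and the break-even figure are illustrative additions, with the break-even taken as the amount of production work at default settings after which a 25% saving has repaid the scan.

```python
def scan_cost(n_npar, n_nsim, cores, hours_per_run):
    """Return (number of runs, total core hours) for a full NPAR x NSIM grid scan."""
    runs = n_npar * n_nsim
    return runs, runs * cores * hours_per_run


runs, core_hours = scan_cost(6, 6, 24, 1.0)

# Core hours of production work (at default settings) after which a 25% gain
# has saved as much as the scan cost.
break_even = core_hours / 0.25

print(runs, core_hours, break_even)
```

With these inputs the scan needs 36 runs and 864 core hours, which is consistent with the "<1000 CPU hours" figure above.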

vasp16888
Newbie
Posts: 44
Joined: Fri Apr 23, 2010 3:09 am

#11 Post by vasp16888 » Sat May 01, 2010 2:46 am

OK, I am going to do it :)

vasp16888
Newbie
Posts: 44
Joined: Fri Apr 23, 2010 3:09 am

#12 Post by vasp16888 » Mon May 03, 2010 11:03 pm

[quote="alex"]Hi there again,

some hints:
You've got a fast machine with a fast network, so start with NPAR = 1 and NSIM = 1, using the number of tasks you are most likely to use most often.
Next: NPAR = # of tasks / 4, NSIM unchanged.
Then: NPAR = # of tasks, NSIM unchanged.

Then take the two fastest and optimize NPAR further. I'd guess you'll end up at NPAR = 2 or 4.

Then touch NSIM, same game.

Gotcha. Hth

alex

[/quote]

vasp16888
Newbie
Posts: 44
Joined: Fri Apr 23, 2010 3:09 am

#13 Post by vasp16888 » Mon May 03, 2010 11:10 pm

[quote="vasp16888"][quote="Danny"]Yes, you are right: present-day machinery is confusing, since node, CPU, and core are often used interchangeably.
In the VASP manual, "node" actually means core. I try to refer only to
1) nodes = nodes
2) CPU/core/processor = the smallest part that does the calculation, i.e. in your case I would refer to the 12 cores as 12 CPUs (I know that is technically wrong)

In your case, for a 2-node calculation (= 24 cores = 4 CPUs), I would suggest trying
NPAR=1 (each band on the entire system)
NPAR=2 (one band per node)
NPAR=4 (one band per CPU)
NPAR=24 (one band per core)
NPAR=8 & NPAR=12 (one band per 3 or 2 cores, respectively)
combined with NSIM=1,2,4,6,12,24

Danny[/quote]
Nope, I'm afraid it's the 36. Then again, if you have a job that takes 10 h on 1 core, this job might take 30 to 60 minutes on 24 cores, so you will need <36 hours on 24 cores = <1000 CPU hours. That is still reasonable, knowing that you will probably face relaxations that take twice that time for a single calculation. Plus, if you can gain 25-30% compared to the normal settings, those 1000 hours are recovered quite quickly ;-)

Danny[/quote]I have tested NPAR, NSIM, and LPLANE, which may improve the efficiency. The results are posted in a new thread: http://cms.mpi.univie.ac.at/vasp-forum/ ... php?4.7257

Please take a look. I have some questions about the test results and am waiting for your suggestions. Thanks in advance :)