ScaLAPACK .VS. LAPACK

Queries about input and output files, running specific calculations, etc.

zoowe
Newbie
Posts: 26
Joined: Thu Jan 21, 2010 11:41 pm
License Nr.: 5-100
Location: USA

ScaLAPACK .VS. LAPACK

#1 Post by zoowe » Wed Jan 27, 2010 3:44 pm

Hi vaspers!

I am wondering when we should use ScaLAPACK rather than LAPACK. In other words, what do we gain or lose by choosing one over the other?

I got more or less the same CPU time with ScaLAPACK and with LAPACK (for various numbers of CPUs).

I used MKL 10, Intel compiler 10.1, and fftw-3.2.1.

D.

panda

ScaLAPACK .VS. LAPACK

#2 Post by panda » Wed Jan 27, 2010 4:20 pm

The original goal of the LAPACK project was to make the widely used EISPACK and LINPACK libraries run efficiently on shared-memory vector and parallel processors. On these machines, LINPACK and EISPACK are inefficient because their memory access patterns disregard the multi-layered memory hierarchies of the machines, thereby spending too much time moving data instead of doing useful floating-point operations. LAPACK addresses this problem by reorganizing the algorithms to use block matrix operations, such as matrix multiplication, in the innermost loops. These block operations can be optimized for each architecture to account for the memory hierarchy, and so provide a transportable way to achieve high efficiency on diverse modern machines. We use the term "transportable" instead of "portable" because, for fastest possible performance, LAPACK requires that highly optimized block matrix operations be already implemented on each machine.

Highly efficient machine-specific implementations of the BLAS are available for many modern high-performance computers. For details of known vendor- or ISV-provided BLAS, consult the BLAS FAQ. Alternatively, the user can download ATLAS to automatically generate an optimized BLAS library for the architecture. A Fortran77 reference implementation of the BLAS is available from netlib; however, its use is discouraged as it will not perform as well as a specially tuned implementation.
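
To make the "block matrix operations" idea above concrete, here is a minimal pure-Python sketch of a blocked matrix multiply next to the plain triple loop. It only illustrates the loop reordering; it is not how LAPACK or any tuned BLAS is actually implemented, and the function names and the `block` parameter are made up for this example.

```python
def matmul_naive(a, b):
    """Plain triple-loop product of two matrices given as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_blocked(a, b, block=2):
    """Same product, computed tile by tile.

    `block` is a hypothetical tile size. In a tuned library the tile is
    chosen so that the working set fits in cache; here it only changes
    the order in which the partial products are accumulated.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                # Multiply one tile of a by one tile of b and add the
                # result into the matching tile of c.
                for i in range(i0, min(i0 + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + block, m)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(j0, min(j0 + block, p)):
                            row_c[j] += aik * row_b[j]
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    b = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
    print(matmul_naive(a, b))
    print(matmul_blocked(a, b, block=2))
```

Both functions return the same matrix; the blocked version just touches memory one small tile at a time, which is the access pattern an optimized BLAS exploits.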

also see:
http://cms.mpi.univie.ac.at/vasp-forum/ ... php?2.4050

have you tried BLAS and ATLAS as well?

panda

ScaLAPACK .VS. LAPACK

#3 Post by panda » Wed Jan 27, 2010 4:24 pm

I think you answered your own question, though: you get more or less the same performance. It may be that your system is already well optimized for both libraries, and if so, that's great! I have not seen much difference between BLAS, ATLAS, or LAPACK, or between MPI and OpenMP, for VASP or otherwise.

zoowe
Newbie
Posts: 26
Joined: Thu Jan 21, 2010 11:41 pm
License Nr.: 5-100
Location: USA

ScaLAPACK .VS. LAPACK

#4 Post by zoowe » Wed Jan 27, 2010 7:29 pm

Thanks, Panda, but maybe you misunderstood my question. I want to compare ScaLAPACK with LAPACK.

[quote="panda"]
also see:
http://cms.mpi.univie.ac.at/vasp-forum/ ... php?2.4050
[/quote]
Yeah, I read this thread weeks ago, but it does not contain the information I need.

[quote="panda"]
have you tried BLAS and ATLAS as well?
[/quote]
Maybe I don't get your point here. ATLAS contains BLAS and only a few LAPACK routines. Usually, if we want to use ATLAS, we take only BLAS from ATLAS and LAPACK from another source (netlib, ../vasp.X.lib, or others).

forsdan
Sr. Member
Posts: 339
Joined: Mon Apr 24, 2006 9:07 am
License Nr.: 173
Location: Gothenburg, Sweden

ScaLAPACK .VS. LAPACK

#5 Post by forsdan » Wed Jan 27, 2010 8:03 pm

For clusters with an InfiniBand interconnect, ScaLAPACK can be crucial for running large systems on many cores. On our InfiniBand clusters, tests with between 64 and 256 cores show that LAPACK scales rather poorly, whereas the use of ScaLAPACK can yield next-to-linear scaling. So this is what you gain.

On GBit interconnect clusters we don't see any direct difference between LAPACK and ScaLAPACK, so we tend to never use ScaLAPACK on those clusters.
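
For anyone who wants to put a number on "poor" versus "next to linear" scaling, a minimal sketch for turning wall-clock timings into speedup and parallel efficiency could look like the following. The function name is hypothetical, and the timings in the example are placeholder values chosen only to show the arithmetic; they are not measurements from any cluster.

```python
def scaling_table(timings, base_cores=None):
    """Return (cores, speedup, efficiency) tuples, sorted by core count.

    timings    : dict mapping core count -> wall-clock time in seconds
    base_cores : reference core count (defaults to the smallest one)

    speedup    = t(base) / t(n)
    efficiency = speedup / (n / base), so 1.0 means perfectly linear
                 scaling relative to the reference run.
    """
    cores_sorted = sorted(timings)
    base = base_cores if base_cores is not None else cores_sorted[0]
    t_base = timings[base]
    rows = []
    for n in cores_sorted:
        speedup = t_base / timings[n]
        efficiency = speedup / (n / base)
        rows.append((n, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings (seconds), NOT real benchmark data.
    example = {64: 1000.0, 128: 500.0, 256: 400.0}
    for cores, speedup, eff in scaling_table(example):
        print(f"{cores:4d} cores  speedup {speedup:5.2f}  efficiency {eff:5.2f}")
```

Running the same job at each core count once with LAPACK and once with ScaLAPACK, and comparing the efficiency columns, is one simple way to see whether the interconnect lets ScaLAPACK pay off.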

Best regards,
/Dan
[ Edited Wed Jan 27 2010, 09:05PM ]

zoowe
Newbie
Posts: 26
Joined: Thu Jan 21, 2010 11:41 pm
License Nr.: 5-100
Location: USA

ScaLAPACK .VS. LAPACK

#6 Post by zoowe » Wed Jan 27, 2010 11:00 pm

Thanks, Dan,

I ran a benchmark test with VASP/LAPACK and got a similar result: scaling is very poor above 48 cores on our cluster.

I haven't done any tests with ScaLAPACK using that many cores.

Thank you for your input.
D.
[ Edited Thu Jan 28 2010, 02:47AM ]

panda

ScaLAPACK .VS. LAPACK

#7 Post by panda » Fri Jan 29, 2010 5:48 pm

My point was that I have tested ScaLAPACK, LAPACK, ATLAS, and BLAS and don't see any significant performance differences.
