Running GAMMA on a HPC-Cluster
Added by Jacqueline Tema Salzer over 9 years ago
Hi all,
we are considering installing GAMMA on an HPC cluster. Does anyone have experience with this? I realize GAMMA is parallelized internally, but how does it work with multithreading? Does it make sense to run it on a cluster at all?
Thanks,
Jackie
Replies (3)
RE: Running GAMMA on a HPC-Cluster - Added by Thorsten Seehaus over 9 years ago
Hi, we are testing GAMMA on an HPC cluster right now, but we still need to adjust our scripts a bit. I should be able to tell you next week whether it works well and, hopefully, runs more efficiently.
RE: Running GAMMA on a HPC-Cluster - Added by Peter Friedl over 9 years ago
Hi, we are also trying to get GAMMA running on an HPC cluster. As we are only just getting started, we are always glad to hear about other people's experience. If we find a proper way to enable multithreading within the next few days, I will definitely post it here.
RE: Running GAMMA on a HPC-Cluster - Added by Jacqueline Tema Salzer over 9 years ago
Peter Friedl wrote:
I am currently working on an HPC cluster with 16 available threads, so I put the command export OMP_NUM_THREADS=16 at the top of my bash script, hoping that this sets the number of threads for all subsequent programs that use OpenMP. Unfortunately, this did not work (e.g. when running offset_pwr): the default of 4 threads is still used. Does anybody have any suggestions?
Peter: The problem may be that the variable is not forwarded through the cluster scheduler. Check whether it is still set when the script runs, e.g. using echo $OMP_NUM_THREADS. If not, set it again in your wrapper script (the one that actually calls the GAMMA programs), rather than just in the shell.
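In case it helps, here is a minimal sketch of what I mean by checking and setting the variable inside the wrapper script. The name NTHREADS_DEFAULT is made up for illustration, and 16 is just the thread count from your post; I am assuming a bash wrapper.

```shell
#!/bin/bash
# Wrapper-script sketch: make sure OMP_NUM_THREADS is set inside the job itself.
# NTHREADS_DEFAULT is a hypothetical name; 16 is the thread count from the post above.
NTHREADS_DEFAULT=16

# Keep a value forwarded by the scheduler if there is one, otherwise fall back.
export OMP_NUM_THREADS="${OMP_NUM_THREADS:-$NTHREADS_DEFAULT}"

# Print the value so it shows up in the job's output file.
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# The GAMMA programs (e.g. offset_pwr with its usual arguments) would be called from here on.
```

If the echo in the job output shows a value you did not expect, replace the fallback with a plain export OMP_NUM_THREADS=16 so the wrapper always overrides whatever the scheduler passes in.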
Apart from a few programs that need the latest C libraries, GAMMA seems to be running fine on the cluster here now (a CentOS 6.6 system). Since the programs are parallelized internally, I don't use additional calls such as “mpirun”. However, OMP_NUM_THREADS needs to be set. This allows processing on all the processors of a single node.
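To use all the processors of a node without hard-coding the number, something like the following should work. This is only a sketch: it assumes a Linux node where getconf is available, and it counts the processors online on the whole node, not what the scheduler has actually allocated to your job.

```shell
#!/bin/bash
# Sketch: use one OpenMP thread per processor that is online on this node.
# Note: this is the node-wide count; if your scheduler gives the job only part
# of a node, use the allocated core count instead.
ncores=$(getconf _NPROCESSORS_ONLN)

export OMP_NUM_THREADS="$ncores"
echo "Using $OMP_NUM_THREADS threads"
```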
As far as I understand, processing a single job across nodes is not possible because OpenMP threads need shared memory, which separate nodes don't have, but this may depend on your cluster setup. Of course, you can still use multiple nodes by splitting your list of interferograms and submitting separate jobs.
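The splitting can be sketched roughly as below. Everything named here is hypothetical: the function split_and_submit, the per-chunk script process_chunk.sh (which would be your own wrapper that calls the GAMMA programs), and the assumption that the list has one interferogram per line. The submit command is passed in as an argument because it depends on your scheduler.

```shell
#!/bin/bash
# Sketch: split a list of interferograms into chunks and submit one job per chunk.
# Usage: split_and_submit LIST NJOBS SUBMIT_CMD
#   LIST        text file with one interferogram per line (assumed format)
#   NJOBS       maximum number of separate jobs to create
#   SUBMIT_CMD  your scheduler's submit command; use "echo" for a dry run
split_and_submit() {
    local list="$1" njobs="$2" submit_cmd="$3"
    local total per_chunk chunk

    total=$(wc -l < "$list")
    if [ "$total" -eq 0 ]; then
        echo "empty list: $list" >&2
        return 1
    fi

    # Ceiling division, so that at most NJOBS chunks are produced.
    per_chunk=$(( (total + njobs - 1) / njobs ))

    # Writes LIST.part_aa, LIST.part_ab, ... next to the original list.
    split -l "$per_chunk" "$list" "${list}.part_"

    for chunk in "${list}".part_*; do
        # process_chunk.sh is a hypothetical wrapper that processes one chunk.
        # $submit_cmd is left unquoted on purpose so it may carry options.
        $submit_cmd process_chunk.sh "$chunk"
    done
}
```

Calling split_and_submit ifg_list.txt 4 echo (with a hypothetical ifg_list.txt) just prints the commands that would be submitted, which is a handy check before swapping in the real submit command.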