Cluster allegro
allegro is a hybrid x86 and GPU cluster consisting of a total of ~1000 cores and roughly 3.5 TB of RAM spread across different node types.
The nodes are interconnected via InfiniBand for parallel computation jobs (MPI) and equipped with 15 TB of cluster-wide hard disk storage.
Job Management
Calculations are scheduled and automatically distributed via the TORQUE resource manager.
Getting started
For the impatient, you will find a quick run-through here.
Getting an account
You need an account at Freie Universität Berlin which has been enabled at the Department of Mathematics and Computer Science.
Logging In
The cluster is only reachable from within the Department of Mathematics and Computer Science.
If you want to access the cluster from outside the department, please log in on one of our SSH remote login nodes and then jump to allegro, or use an SSH tunnel (see the sketch below).
To log in to the cluster, you need an SSH client of some sort. If you are using a Linux or Unix based system, one is most likely already available to you in a shell, and you can get to your account very quickly. For Microsoft Windows, we recommend PuTTY.
$ ssh <username>@allegro.imp.fu-berlin.de
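With a reasonably recent OpenSSH client the jump from outside the department can be done in one step via ProxyJump. This is only a sketch; <login-node> is a placeholder for one of the department's SSH remote login nodes:
$ ssh -J <username>@<login-node> <username>@allegro.imp.fu-berlin.de
Alternatively, log in to the remote login node first and run the ssh command above from there.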
Submitting jobs
Save the following as job_script.sh and replace the <USER> and <EMAIL ADDRESS> placeholders.
#!/bin/bash
#PBS -N testjob
#PBS -d /storage/mi/<USER>
#PBS -o testjob.out
#PBS -e testjob.err
#PBS -M <EMAIL ADDRESS>
#PBS -l walltime=00:01:00
#PBS -l nodes=1:ppn=1
#PBS -l pmem=10mb
hostname
date
Then, it's time to start the job via qsub (QueueSUBmit).
$ qsub job_script.sh
You can see your currently running jobs with qstat (QueueSTATus).
$ qstat
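Since qsub prints the ID of the newly created job, you can also capture it and query that job directly. A small sketch (the variable name JOBID is just for illustration):
$ JOBID=$(qsub job_script.sh)
$ qstat $JOBID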
Queue Limits
If you submit a job, the system places it in the 'smallest possible' queue.
Short description of the queue limits (long/short refers to runtime, big/small to memory):
Usage        | Queue   | Runtime  | CPUs    | Memory | Count
all others   | 'large' | > 2 days | > 1 cpu | --     | 10
short big    | 'big'   | < 2 days | --      | --     | 400
very little  | 'micro' | < 1 day  | 1 cpu   | < 6G   | 750
small        | 'small' | < 2 days | 1 cpu   | --     | 759
long small   | 'long'  | > 2 days | 1 cpu   | < 3G   | 400
All undeclared settings are assumed to be at the largest limit.
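If you do not want to rely on the automatic choice, you can request one of the queues from the table explicitly, either on the command line or in the job script. A minimal sketch using the 'micro' queue:
$ qsub -q micro job_script.sh
or, equivalently, inside the job script:
#PBS -q micro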
Cook Book
- Stdout/Stderr Munging: common, useful operations regarding stdout/stderr redirection.
- Selecting Node Classes: selecting node classes, required for consistent running time results.
- Selecting Queues: selecting queues for short/long running jobs.
- Job arrays: allow you to submit a sequence of similar job scripts that only differ by one environment variable (${PBS_ARRAYID}) (advanced; see the sketch after this list).
- Temporary Files and Cleanup: managing temporary files and cleaning up after your jobs (advanced).
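A minimal job array sketch; the program name my_program and the input files input.1 … input.10 are only placeholders. Each array task reads ${PBS_ARRAYID} to pick its own input. Save the script e.g. as array_script.sh:
#!/bin/bash
#PBS -N array-test
#PBS -l walltime=00:10:00
#PBS -l nodes=1:ppn=1
#PBS -l pmem=100mb
# process one input file per array task
my_program input.${PBS_ARRAYID}
Submit the whole array with the -t flag:
$ qsub -t 1-10 array_script.sh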
Notes
- The qstat command's output is not in real time. In particular, the Time Use (CPU time!) field is only updated once every several seconds.
- Use qstat -f and qstat -f - to get detailed info about a job.
- The stdout and stderr log files are created during the run. If they are written to $HOME, you can follow them with 'tail' etc.
- Useful: the -t flag for array execution. Use PBS_ARRAYID=1 bash job_script.sh to simulate one array job locally.
- If you specify -l nodes=1, then you will NOT get the node exclusively. Use -l nodes=1:ppn=24 instead (see the sketch after these notes).
- The setting of ppn works as a multiplier on the pmem restrictions!
- If you call another script from your job script, that script is not cached. Do not modify included scripts while jobs are queued or running, or be aware of the side effects!
- Since September 12 you can run X11 programs (ssh -X allegro…) and see the window on your workstation.
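To illustrate the notes on nodes/ppn and pmem above, a sketch of the resource lines for a job that reserves a complete 24-core node; the numbers are only an example:
#PBS -l nodes=1:ppn=24
#PBS -l pmem=2gb
With ppn=24, the pmem value applies per process, so the job may use up to 24 x 2 GB = 48 GB in total.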
File System Paths
The following file system locations are interesting:
- /home/$username: An extra home directory for each user on allegro, on fast InfiniBand-connected hard drives.
- /nfs/$normal-path: Things that are normally available through the network, for example /nfs/group, /nfs/home/mi/..., … Data can be copied from the /nfs/... paths to the home directory on allegro.
- /data/scratch: Scratch directory for temporary data. It is a good idea to set your TMPDIR environment variable to /data/scratch/$USER/ after creating this directory (see the sketch after this list).
- /data/scratch.local: Local scratch directory for temporary data in case you need the speed of a local disk -- beware the limited space.
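A sketch of the scratch setup mentioned above, e.g. near the top of a job script:
# create a personal scratch directory and point TMPDIR at it
mkdir -p /data/scratch/$USER
export TMPDIR=/data/scratch/$USER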
Cluster Queue Commands
The cluster queue is managed by the TORQUE cluster management system.
The system knows the following client commands (besides others):
- checkjob: Display information about a job, e.g. reason for being deferred.
- qsub: Submit jobs to the work queue.
- qstat: Display job queue status.
- qdel: Remove/cancel jobs from queue.
- showq: Show queue status.
- qnodes: Show information about available nodes.
- qalter: Change attributes of running jobs like time and memory consumption estimates.
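Two of these in action, with 12345 standing in for a real job ID (only a sketch):
$ checkjob 12345
$ qalter -l walltime=04:00:00 12345
The first line shows details such as the reason a job is deferred; the second changes the job's walltime estimate to four hours.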
Cluster Resource Policy
- The time limit that you give to your jobs is 'hard'. Jobs will be killed when their requested walltime is used up, even if they are still running.
- The memory limit that you give to your jobs is 'hard', too, same as with the time limit. If you allow the job 1024 MB but it uses 1034 MB, it will be killed immediately. The message that you will receive for such an event reads like this: 'job violates resource utilization policies'.
- Both values, walltime and memory, have no intrinsic limit; you can set them to whatever you want. There is, however, a policy to give low-time and low-memory jobs a preference: jobs with walltime < 2 h and memory < 2 GB will be preferred.
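For example, a job that falls under this preference could request its resources like this (the values are chosen only to stay below the thresholds):
#PBS -l walltime=01:30:00
#PBS -l pmem=1gb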