In a few cases users have seen their applications crash at seemingly random points. In these cases, adding the extra parameters "--mca orte_base_help_aggregate 0 --verbose" to the actual mpirun or mpiexec call forces an error message like this one to be printed: "=>> PBS: job killed: mem job total 276564 kb exceeded limit 204800 kb". PBS may print this message instead of its standard memory error message. Be aware of the difference between pmem (memory per CPU) and mem (total memory of the application).
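
To make the numbers in the kill message quoted above easier to read, here is a minimal shell sketch that extracts the used and allowed memory and works out the overshoot. The process count (nprocs=4), the application name in the comment, and the even split across processes are assumptions for illustration only; they are not taken from a real job.

```shell
#!/bin/sh
# The kill message can be provoked with a call such as the following
# (./my_app and -np 4 are placeholders, not a real job):
#   mpirun --mca orte_base_help_aggregate 0 --verbose -np 4 ./my_app

# Kill message copied verbatim from the text above.
msg='=>> PBS: job killed: mem job total 276564 kb exceeded limit 204800 kb'

# Pull the two numbers out of the message.
used_kb=$(printf '%s\n' "$msg" | sed -n 's/.*total \([0-9][0-9]*\) kb.*/\1/p')
limit_kb=$(printf '%s\n' "$msg" | sed -n 's/.*limit \([0-9][0-9]*\) kb.*/\1/p')

# How far the whole job overshot its mem limit.
over_kb=$((used_kb - limit_kb))

# Hypothetical process count; substitute the value of your own job.
nprocs=4

# Rough per-process share (rounded up), assuming memory is spread evenly
# across processes. This is an assumption; real jobs may be unbalanced.
per_proc_kb=$(( (used_kb + nprocs - 1) / nprocs ))

echo "used=${used_kb} kb limit=${limit_kb} kb over=${over_kb} kb"
echo "per-process share for ${nprocs} processes: ${per_proc_kb} kb"
```

The total (used_kb) is the figure to compare against the mem request, while the per-process share only gives a rough idea of what a pmem request would have to cover.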

In general, the ompi_info command is a useful tool when something does not work as it should.

-- Cluster.salzmann - 01 Nov 2013
Topic revision: r2 - 05 Nov 2013, sgerber