MPI problems on a DEC Alpha cluster


Subject: MPI problems on a DEC Alpha cluster
From: Chris Godfred (chris@earthsci.unimelb.edu.au)
Date: Wed Dec 20 2000 - 21:15:37 MST


Hello --

I (and my sysadmin) have been having a hard time trying
to get the SPMD version of CCM 3.3.6 to run on a
cluster of DEC machines. I was able to compile and
run it OK on a single CPU.

I have compiled CCM with these options:

   mpif90 $(cpp_path) -r8 -i4 -arch host -extend_source

$(cpp_path) includes the $MPI_INC include files, and the
$(MPI_LIB) libraries are used for linking.

Compiling is not a problem, but the process falls over
as soon as I try to execute it, either from the command
line or using the queuing system (PBS):

   mpirun -nolocal -np 4 -machinefile mf atm.sav < atm.parm

The file 'mf' simply specifies the nodes I want to run
the job on.
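
For illustration only, a file of this kind might look like
the sketch below. The host names alpha01 to alpha04 and the
file name mf.example are placeholders, not the real contents
of 'mf'; the one-host-per-line layout is the usual MPICH
machinefile convention and is assumed here.

```shell
# Hypothetical host names; the real 'mf' lists the actual cluster nodes.
# One host per line is the assumed machinefile layout.
cat > mf.example <<'EOF'
alpha01
alpha02
alpha03
alpha04
EOF

# Count the non-empty lines so the number of listed hosts
# can be compared with the value passed to -np.
nhosts=$(grep -c . mf.example)
echo "hosts listed: $nhosts"
```

The count is only a sanity check that the machinefile and
the -np argument are consistent with each other.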

The output is as follows:

    p0_31391: p4_error: fork_p4: fork failed: -1
        p4_error: latest msg from perror: Operation would block
    bm_list_31550: p4_error: interrupt SIGINT: 2
    p0_31391: p4_error: interrupt SIGINT: 2

Any help or advice on what I'm doing wrong would be most
appreciated!

Chris Godfred-Spenning - chris@earthsci.unimelb.edu.au
School of Earth Sciences - Uni. of Melbourne, Australia
                    ---------
                    HowzStat:
http://www.earthsci.unimelb.edu.au/howzstat/
      http://howzstat.iac.iafrica.com/howzstat.htm


