SP Parallel Programming Workshop
Parallel Virtual Machine (PVM)
© Copyright Statement
- This tutorial assumes that the reader is already familiar with the
concepts covered in the following tutorials:
- PVM stands for Parallel Virtual Machine.
- It enables a collection of different computer systems to be viewed as a
single parallel virtual machine.
- Operates on a collection of homogeneous or heterogeneous Unix computers
connected by one or more networks.
- All communication is accomplished by message passing.
- History:
"The first version of PVM was written during Summer of 1989 at Oak Ridge
National Laboratory. It was not released, but was used in applications at the
lab."
"Version 2 was written from scratch during February 1991 at the University of
Tennessee, Knoxville, and released in March of that year. It was intended to
clean up and stabilize the system so that external users could benefit."
"After a year and a half of continuing research we felt that we had reached a
limit in being able to add to PVM version 2. Version 3 was redesigned from
scratch, and a complete rewrite started in September 1992, with first release
of the software in March 1993. While similar in spirit to version 2, version 3
includes features that didn't fit the old framework - most importantly fault
tolerance, better portability and scalability."
Bob Manchek, 1995
Advantages
- Portability: Probably the most portable
message passing library available
- Scalable Parallelism
- Fault tolerance through dynamic addition and deletion of hosts/processes.
- Easy to install and use
- Public domain software from
NETLIB
- Popular - a widely used parallel message passing library
- Just about any UNIX computer (where you have an account) on the Internet
can become part of your parallel virtual machine.
- Flexible
- Easy definition and modification of your parallel virtual machine
- Arbitrary control and dependency structures - your application
"decides":
- where and when tasks are spawned and terminated
- which machines to add or delete from your parallel virtual machine
- how tasks communicate and/or synchronize with each other
Disadvantages
- Performance: Depending upon the architecture and implementation, PVM may
be slower than native message passing libraries.
- PVM is not a standard (unlike MPI)
- It lacks some message passing functionality.
Components are the parts of your Parallel Virtual Machine responsible
for things such as:
- communication
- process control
- the programming interface for the user.
There are two - the PVM Daemon and the PVM Libraries.
-
The PVM daemon pvmd3 is a Unix process which oversees the
operation of user processes within a PVM application and coordinates
inter-machine PVM communications.
-
One and only one daemon runs on each machine configured into your parallel
virtual machine. However, other users, with their own parallel virtual
machines, will have their own pvmd3s running.
-
The "master" PVM daemon is the first PVM daemon started and is
responsible for starting all other PVM daemons in your parallel virtual
machine. Typically started on your login machine.
-
Each daemon maintains a table of configuration and process information
relative to your parallel virtual machine.
-
User processes communicate with each other through the daemons:
-
they first talk to their local daemon via the library interface routines.
-
local daemon then sends/receives messages to/from remote host daemons.
-
The PVM daemon software must reside on every machine which you will be
using:
-
each machine must have its own architecture dependent version of
pvmd3 built and installed.
-
the default location is
$HOME/pvm3/lib/PVM_ARCH,
but most multi-user systems will have one copy installed by the system
administrators in a globally accessible location, such as in
/usr/local/bin. Check your local system for
details.
-
The three PVM libraries are:
-
libpvm3.a - library of C language interface routines. Always required.
-
libfpvm3.a - additionally required for Fortran codes
-
libgpvm3.a - required for use with dynamic groups
-
The libraries contain simple subroutine calls that the application
programmer embeds in concurrent or parallel application code. They provide
the ability to:
-
initiate and terminate processes
-
pack, send, receive and broadcast messages
-
synchronize via barriers
-
query and dynamically change configuration of the parallel virtual
machine
-
Library routines do not directly communicate to other processes.
Instead, they send commands to the local daemon and receive status
information back.
-
Can be installed in user filespace - default location is
$HOME/pvm3/lib.
-
External Data Representation (XDR) format conversion is performed
automatically between hosts of different architectures. This is the
default.
-
pvm_mytid
PVMFMYTID
- Enrolls the calling process into PVM and generates a unique task
identifier (tid) if this process is not already enrolled in
PVM. If the calling process is already enrolled in PVM, this routine
simply returns the process's tid. The tid is a 32 bit positive integer
created by the local pvmd with no obvious sequence or relation to
other tids.
tid = pvm_mytid ();
PVMFMYTID (TID)
-
pvm_spawn
PVMFSPAWN
- Starts new PVM processes. Typically executed by a "master" process
to start the "worker" processes. The programmer can specify the
machine architecture and/or machine name where processes are to be spawned.
If successful or partially successful, returns the actual number of
processes spawned; if an error occurs, an integer less than zero is
returned.
numt = pvm_spawn ("worker",NULL,PvmTaskDefault,"",1,&tids[i]);
numt = pvm_spawn ("worker",NULL,PvmTaskArch,"RS6K",1,&tids[i]);
numt = pvm_spawn ("gen1",NULL,PvmTaskHost,"node1.mhpcc.edu",1,&tids[0]);
PVMFSPAWN ('worker',PVMDEFAULT,' ',1,TIDS(I),INFO)
PVMFSPAWN ('node',PVMARCH,'SUN4',1,TID,INFO)
PVMFSPAWN ('gen1',PVMHOST,'sun2a.mhpcc.edu',32,TIDS,INFO)
-
pvm_exit
PVMFEXIT
- Tells the local pvmd that this process is leaving PVM. This routine
should be called by all PVM processes before they stop or exit.
pvm_exit ();
PVMFEXIT (INFO)
-
pvm_kill
PVMFKILL
- Terminates a specified PVM process. This routine is not designed for
use by the calling process to kill itself.
pvm_kill (tid);
PVMFKILL (TID,INFO)
-
pvm_halt
PVMFHALT
- Shuts down the entire PVM system.
pvm_halt ();
PVMFHALT (INFO)
-
pvm_addhosts
PVMFADDHOST
- Add hosts to the virtual machine. The names should have the same
syntax as lines of a pvmd hostfile. The C routine may be used to
add multiple hosts per call; the Fortran routine may add only one host
per call. Useful for designing fault tolerant applications.
pvm_addhosts (hostarray,4,infoarray);
PVMFADDHOST ('azure.chem.edu',INFO)
-
pvm_delhosts
PVMFDELHOST
- Deletes hosts from the virtual machine. The C routine may be used to
delete multiple hosts per call; the Fortran routine may delete only one
host per call.
pvm_delhosts (hostarray,4,infoarray);
PVMFDELHOST ('azure.chem.edu',INFO)
-
pvm_getopt
PVMFGETOPT
- Returns the value of libpvm options. See the man page for details.
val = pvm_getopt (PvmFragSize);
PVMFGETOPT (PVMROUTE,VAL)
-
pvm_setopt
PVMFSETOPT
- Sets the value of libpvm options. The most useful option increases
message passing performance by specifying "Direct Routing". See the man
page for details.
pvm_setopt (PvmRoute,PvmRouteDirect);
PVMFSETOPT (PVMROUTE,PVMROUTEDIRECT,INFO)
-
pvm_catchout
PVMFCATCHOUT
- Causes the calling task (the parent) to catch output from tasks spawned
after the call to pvm_catchout. Characters printed on stdout or stderr
in child tasks are collected by the pvmds and sent in control
messages to the parent task, which tags each line and appends it to
the specified file (or console screen).
pvm_catchout (stdout);
PVMFCATCHOUT (1,INFO)
-
pvm_sendsig
PVMFSENDSIG
- Sends a signal to another PVM process. Should only be used by
programmers with Unix signal handling experience. See the man page
for details.
pvm_sendsig (tid,SIGKILL);
PVMFSENDSIG (TID,SIGNUM,INFO)
-
pvm_start_pvmd
- Starts up a pvmd3 process, the master of a new virtual machine.
The routine can be set to block until startup is complete or to return
immediately.
pvm_start_pvmd (argc,argv,block);
Programming PVM point-to-point communications is typically done as follows:
- Sending Process
- Initializes default send buffer
- Packs data (variables) to be sent into send buffer
- Sets message tag
- Sends data in send buffer across network to receiving task(s)
- Receiving Process
- Receives message with specified message tag into default
receive buffer
- Unpacks data from receive buffer into corresponding variables
-
pvm_initsend
PVMFINITSEND
- Clears the send buffer and specifies the type of data format encoding
to be used. By default, PVM assumes a heterogeneous machine
environment and messages are encoded (converted) to XDR format.
Specify PvmDataRaw to turn off XDR format conversion, or PvmDataInPlace
to both turn off XDR and skip copying message into send buffer before
sending. If successful, returns the message buffer identifier;
if an error, then an integer < 0 is returned.
pvm_initsend (PvmDataDefault)
PVMFINITSEND (PVMDATARAW,INFO)
-
pvm_pkdatatype
PVMFPACK( datatype ...)
- Pack the specified data type into the active send buffer. Should match
a corresponding unpack routine in the receive process. C language
uses a separate routine for each data type. Fortran uses a single
routine, but specifies the data type within the routine's
calling parameters. Note that structure data types must be packed by
their individual data element. Valid routines/data types are:
C              Fortran
-------------  -------------------------
pvm_pkbyte     PVMFPACK (BYTE1 ...)
pvm_pkstr      PVMFPACK (STRING ...)
pvm_pkfloat    PVMFPACK (REAL4 ...)
pvm_pkdouble   PVMFPACK (REAL8 ...)
pvm_pkcplx     PVMFPACK (COMPLEX8 ...)
pvm_pkdcplx    PVMFPACK (COMPLEX16 ...)
pvm_pkint      PVMFPACK (INTEGER4 ...)
pvm_pkshort    PVMFPACK (INTEGER2 ...)
pvm_pklong
pvm_pkuint
pvm_pkushort
pvm_pkulong
pvm_pkint (&array[offset],nitem,stride);
pvm_pkstr ("helloworld");
PVMFPACK (INTEGER4,ARRAY(OFFSET),NITEM,STRIDE,INFO)
PVMFPACK (STRING,'helloworld',NITEM,STRIDE,INFO)
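In the typed pack routines, the argument after the data pointer is a count of items (not bytes), and the last argument is the stride between the items taken from the array. The standalone C sketch below models only that element selection. It is not PVM code: pack_ints_strided is a hypothetical helper invented for illustration, and the real routines also encode the data and append it to the send buffer.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of PVM. Copies nitem ints from src into
 * dst, taking every stride-th element, which mirrors how the count and
 * stride arguments of pvm_pkint select elements of an array.
 * Returns the number of ints copied, or -1 on a bad argument.
 */
int pack_ints_strided(const int *src, int nitem, int stride, int *dst)
{
    int i;

    if (src == NULL || dst == NULL || nitem < 0 || stride < 1)
        return -1;

    for (i = 0; i < nitem; i++)
        dst[i] = src[(size_t)i * (size_t)stride];

    return nitem;
}
```

With a ten element array holding 0 through 9, calling pack_ints_strided(&a[1], 3, 3, out) selects elements 1, 4 and 7. A stride of 1 selects a contiguous run.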
-
pvm_upkdatatype
PVMFUNPACK( datatype ...)
- Unpack the specified data type from the active receive buffer.
Should match a corresponding pack routine in the sending process.
C language uses a separate routine for each data type. Fortran uses a
single routine, but specifies the data type within the
routine's calling parameters. Note that structure data types must be
unpacked by their individual data element. Valid routines/data types are:
C              Fortran
-------------  -------------------------
pvm_upkbyte    PVMFUNPACK (BYTE1 ...)
pvm_upkstr     PVMFUNPACK (STRING ...)
pvm_upkfloat   PVMFUNPACK (REAL4 ...)
pvm_upkdouble  PVMFUNPACK (REAL8 ...)
pvm_upkcplx    PVMFUNPACK (COMPLEX8 ...)
pvm_upkdcplx   PVMFUNPACK (COMPLEX16 ...)
pvm_upkint     PVMFUNPACK (INTEGER4 ...)
pvm_upkshort   PVMFUNPACK (INTEGER2 ...)
pvm_upklong
pvm_upkuint
pvm_upkushort
pvm_upkulong
pvm_upkint (&array[offset],nitem,stride);
pvm_upkstr (stringvar);
PVMFUNPACK (INTEGER4,ARRAY(OFFSET),NITEM,STRIDE,INFO)
PVMFUNPACK (STRING,STRINGVAR,NITEM,STRIDE,INFO)
-
pvm_send
PVMFSEND
- Sends the data in the active message buffer to the specified
destination task. The send is asynchronous: the call returns as soon as
the send buffer can safely be reused, without waiting for the receiving
task to post a matching receive. Returns 0 if successful, < 0 otherwise.
pvm_send (tids[1],MSGTAG);
PVMFSEND (TIDS(1),MSGTAG,INFO)
-
pvm_psend
PVMFPSEND
- Both packs and sends message with a single call. Syntax requires
specification of a valid data type for both C and Fortran. Valid
types are:
DATA TYPE           C           FORTRAN
------------------  ----------  ----------
byte                PVM_BYTE    BYTE1
string              PVM_STR     STRING
real                PVM_FLOAT   REAL4
double              PVM_DOUBLE  REAL8
complex             PVM_CPLX    COMPLEX8
double complex      PVM_DCPLX   COMPLEX16
int                 PVM_INT     INTEGER4
short               PVM_SHORT   INTEGER2
long integer        PVM_LONG
unsigned short int  PVM_USHORT
unsigned int        PVM_UINT
unsigned long int   PVM_ULONG
pvm_psend (tid,msgtag,array,1000,PVM_FLOAT);
PVMFPSEND (TID,MSGTAG,BUF,CNT,REAL4,INFO)
-
pvm_mcast
PVMFMCAST
- Multicasts a message stored in the active send buffer to ntask tasks
specified in the tids array. The message is not sent to the caller even
if listed in the array of tids. The receiving processes can call
either pvm_recv or pvm_nrecv to receive their copy of the multicast
message.
pvm_mcast (tids,ntask,msgtag);
PVMFMCAST (NPROC,TIDS,MSGTAG,INFO)
-
pvm_recv
PVMFRECV
- Blocks the receiving process until a message with the specified tag
has arrived from the specified tid. The message is then placed in a
new active receive buffer, which also clears the current receive buffer.
Using -1 as either the tag or tid argument causes a message with any
tag and/or from any source to be received. If the routine is successful,
it returns the buffer id of the new receive buffer; if an error occurs,
then an integer < 0 is returned.
pvm_recv (tid,msgtag);
PVMFRECV (-1,MSGTAG,BUFID)
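The wildcard rule described for pvm_recv can be written down as a small predicate. The sketch below is a standalone model of that rule only, assuming nothing beyond the -1 wildcard described above. The names ANY_VALUE and message_matches are hypothetical and are not PVM symbols.

```c
/* Wildcard value accepted for either the tid or the tag argument. */
#define ANY_VALUE (-1)

/*
 * Hypothetical helper, not part of PVM. Returns 1 if a message that came
 * from msg_tid with tag msg_tag would satisfy a receive posted for
 * (want_tid, want_tag), and 0 otherwise.
 */
int message_matches(int want_tid, int want_tag, int msg_tid, int msg_tag)
{
    int tid_ok = (want_tid == ANY_VALUE) || (want_tid == msg_tid);
    int tag_ok = (want_tag == ANY_VALUE) || (want_tag == msg_tag);

    return tid_ok && tag_ok;
}
```

For example, a receive posted as (-1, 4) accepts a tag 4 message from any task, while (tid, -1) accepts any tag from one particular task.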
-
pvm_nrecv
PVMFNRECV
- Same as pvm_recv, except a non-blocking receive operation is
performed. If the specified message has arrived, this routine returns
the buffer id of the new receive buffer. If the message has not
arrived, it returns 0. If an error occurs, then an integer < 0 is
returned.
pvm_nrecv (tid,msgtag);
PVMFNRECV (-1,MSGTAG,BUFID)
-
pvm_precv
PVMFPRECV
- Both receives and unpacks a message with a single call. Accepts -1 as
wildcard for either source tid or tag and returns actual source tid,
actual message tag and actual message length. Syntax requires
specification of a valid data type for both C and Fortran. Valid
types are:
DATA TYPE           C           FORTRAN
------------------  ----------  ----------
byte                PVM_BYTE    BYTE1
string              PVM_STR     STRING
real                PVM_FLOAT   REAL4
double              PVM_DOUBLE  REAL8
complex             PVM_CPLX    COMPLEX8
double complex      PVM_DCPLX   COMPLEX16
int                 PVM_INT     INTEGER4
short               PVM_SHORT   INTEGER2
long integer        PVM_LONG
unsigned short int  PVM_USHORT
unsigned int        PVM_UINT
unsigned long int   PVM_ULONG
pvm_precv (tid,msgtag,array,cnt,PVM_FLOAT,&asrc,&atag,&alen);
CALL PVMFPRECV (-1,4,BUF,CNT,REAL4,ASRC,ATAG,ACNT,INFO)
-
pvm_trecv
PVMFTRECV
- Same as pvm_recv, except the receive will time-out after the
specified number of seconds and microseconds. If the specified message
arrives before the time-out, this routine returns the buffer id of the
new receive buffer. If the message does not arrive, it returns 0. If
an error occurs, then an integer < 0 is returned.
bufid = pvm_trecv (source,msgtag,&tmout)
PVMFTRECV (SOURCE,MSGTAG,SEC,USEC,BUFID)
-
pvm_probe
PVMFPROBE
- Checks to see if a message with specified msgtag has arrived from
specified tid. If a matching message has arrived, pvm_probe returns a
buffer identifier; otherwise it returns 0. The buffer identifier can be
used in a pvm_bufinfo call to determine information about the message
such as its source and length. Accepts -1 as wildcard specification for
msgtag and tid.
bufid = pvm_probe (tid,msgtag);
PVMFPROBE (-1,4,ARRIVED)
-
pvm_recvf
- This routine defines the comparison function to be used by
the pvm_recv, pvm_nrecv, and pvm_probe functions. It is available as a
means to customize PVM message passing. See the man page for details.
Example PVM Code
Process Control and Point-to-Point Message Passing
C Language
#include <stdio.h>
#include "pvm3.h"

#define NTASKS        6
#define HELLO_MSGTYPE 1

int main(void) {
    int mytid, parent_tid, tids[NTASKS], msgtype, i, rc;
    char helloworld[13] = "HELLO WORLD!";

    /* Enroll in PVM and find out whether this task was spawned */
    mytid = pvm_mytid();
    parent_tid = pvm_parent();
    pvm_catchout(stdout);
    pvm_setopt(PvmRoute, PvmRouteDirect);

    if (parent_tid == PvmNoParent) {
        /* Parent: spawn the children one at a time */
        printf("Parent task id= %d\n",mytid);
        printf("Spawning child tasks ...\n");
        for (i=0; i < NTASKS; i++) {
            rc = pvm_spawn("hello", NULL, PvmTaskDefault, "", 1, &tids[i]);
            printf(" spawned child tid = %d\n", tids[i]);
        }

        /* Pack the greeting once, then send it to every child */
        printf("Saying hello to all child tasks...\n");
        msgtype = HELLO_MSGTYPE;
        rc = pvm_initsend(PvmDataDefault);
        rc = pvm_pkstr(helloworld);
        for (i=0; i < NTASKS; i++)
            rc = pvm_send(tids[i], msgtype);
        printf("Parent task done.\n");
    } else {
        /* Child: wait for the greeting from any task, then report */
        printf("Child task id= %d\n",mytid);
        msgtype = HELLO_MSGTYPE;
        rc = pvm_recv(-1, msgtype);
        rc = pvm_upkstr(helloworld);
        printf(" ***Reply to: %d : HELLO back from %d!\n",parent_tid, mytid);
    }

    rc = pvm_exit();
    return 0;
}
Example PVM Code
Process Control and Point-to-Point Message Passing
Fortran
      program hello
      include 'fpvm3.h'
      integer NTASKS, HELLO_MSGTYPE
      parameter(NTASKS = 6)
      parameter(HELLO_MSGTYPE = 1)
      integer mytid, parent_tid, tids(NTASKS), msgtype, i, info
      character*12 helloworld/'HELLO WORLD!'/

      call pvmfmytid(mytid)
      call pvmfparent(parent_tid)
      call pvmfcatchout(1,info)
      call pvmfsetopt(PVMROUTE, PVMROUTEDIRECT, info)

      if (parent_tid .eq. PvmNoParent) then
         print *,'Parent task id= ', mytid
         print *,'Spawning child tasks...'
         do 10 i=1,NTASKS
            call pvmfspawn('hello', PVMDEFAULT, ' ', 1, tids(i), info)
            print *,' spawned child tid = ', tids(i)
 10      continue
         print *,'Saying hello to all child tasks...'
         msgtype = HELLO_MSGTYPE
         call pvmfinitsend(PVMDEFAULT, info)
         call pvmfpack(STRING, helloworld, 12, 1, info)
         do 20 i=1,NTASKS
            call pvmfsend(tids(i), msgtype, info)
 20      continue
         print *, 'Parent task done.'
      endif

      if (parent_tid .ne. PvmNoParent) then
         print *, 'Child task id= ', mytid
         msgtype = HELLO_MSGTYPE
         call pvmfrecv (-1, msgtype, info)
         call pvmfunpack(STRING, helloworld, 12, 1, info)
         print *,' ***Reply to: ',parent_tid, ' : HELLO back from ',
     &           mytid,'!'
      endif

      call pvmfexit(info)
      end
-
pvm_barrier
PVMFBARRIER
- Blocks the calling process until all processes in a group have called
pvm_barrier().
pvm_barrier ("worker",5 );
PVMFBARRIER ('worker',COUNT,INFO)
-
pvm_bcast
PVMFBCAST
- Asynchronously broadcasts the data in the active send buffer to a
group of processes. The broadcast message is not sent back to the sender.
Receiving tasks can use any PVM receive routine to receive the message.
Note that any PVM task can call pvm_bcast(), it need not be a member
of the group.
pvm_bcast ("worker",msgtag);
PVMFBCAST ('worker',5,INFO)
-
pvm_gather
PVMFGATHER
- A specified member of the group receives messages from each member of
the group and gathers these messages into a single array. All group
members must call pvm_gather(). Note: pvm_gather() does not block. If
a task calls pvm_gather and then leaves the group before the root
has called pvm_gather() an error may occur.
pvm_gather (&getmatrix,&myrow,10,PVM_INT,msgtag,"workers",root);
PVMFGATHER (GETMATRIX,MYCOLUMN,COUNT,INTEGER4,MTAG,'workers',ROOT,INFO)
-
pvm_scatter
PVMFSCATTER
- Performs a scatter of data from the specified root member of the group
to each of the members of the group, including itself. All group members
must call pvm_scatter(). Each receives a portion of the data array from
the root in their local result array.
pvm_scatter (&getmyrow,&matrix,10,PVM_INT,msgtag,"workers",root);
PVMFSCATTER (GETMYCOLUMN,MATRIX,COUNT,INTEGER4,MTAG,'workers',ROOT,INFO)
-
pvm_reduce
PVMFREDUCE
- Performs a reduce operation over members of the specified group.
All group members call pvm_reduce() with their local data, and the
result of the reduction operation appears on the user specified root
task. Users can define their own reduction functions or use one of
the following predefined PVM reduction functions: PvmMin PvmMax
PvmSum PvmProduct. NOTE: Fortran users must be certain to
declare the predefined reduction functions as external. For example:
External PvmMin.
pvm_reduce (PvmMax,&myvals,10,PVM_INT,msgtag,"workers",root);
PVMFREDUCE (PvmMax,MYVALS,COUNT,INTEGER4,MTAG,'workers',ROOT,INFO)
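To picture what a reduction produces, the following standalone C sketch combines the local arrays of several group members element by element, which is how the predefined reductions are understood to operate. It runs in a single process and does not call PVM. The names int_op, op_sum, op_max and reduce_elementwise are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* A binary operation applied pairwise, e.g. sum or max. */
typedef int (*int_op)(int, int);

int op_sum(int a, int b) { return a + b; }
int op_max(int a, int b) { return a > b ? a : b; }

/*
 * Hypothetical model, not part of PVM. data holds nmember rows of count
 * ints each (row m is the local array of group member m). result[k]
 * receives op applied across all members' k-th elements, which is the
 * value the root task would end up with.
 * Returns 0 on success, -1 on a bad argument.
 */
int reduce_elementwise(int_op op, const int *data, int nmember, int count,
                       int *result)
{
    int m, k;

    if (op == NULL || data == NULL || result == NULL ||
        nmember < 1 || count < 1)
        return -1;

    for (k = 0; k < count; k++) {
        result[k] = data[k];                    /* start with member 0 */
        for (m = 1; m < nmember; m++)
            result[k] = op(result[k], data[(size_t)m * count + k]);
    }
    return 0;
}
```

With four members that each contribute their own instance number (0, 1, 2, 3) and a count of 1, the sum is 6 and the maximum is 3.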
-
pvm_joingroup
PVMFJOINGROUP
- Enrolls the calling task in the named group and returns the instance
number of this task in this group. Instance numbers start at 0 and
count up; a return value less than zero indicates an error. Tasks that
leave and rejoin a group may or may not be assigned the same instance
number - PVM attempts to "fill gaps" in the instance number sequence
left by departing group members.
rc = pvm_joingroup ("worker");
CALL PVMFJOINGROUP ('group2',INUM)
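One way to picture the "fill gaps" behaviour is a table of instance numbers in which a joining task takes the lowest number that is currently free. The standalone C sketch below models that policy. The lowest-free-number rule is an assumption made for illustration: the text above only says that PVM attempts to fill gaps, so a real group server may choose differently. The names group_model, model_join, model_leave and MAX_MEMBERS are hypothetical.

```c
#include <stddef.h>

#define MAX_MEMBERS 8   /* arbitrary limit for this model */

/* in_use[i] is nonzero while instance number i is held by some task. */
struct group_model {
    int in_use[MAX_MEMBERS];
};

/* Assigns the lowest free instance number; returns it, or -1 if full. */
int model_join(struct group_model *g)
{
    int i;

    if (g == NULL)
        return -1;
    for (i = 0; i < MAX_MEMBERS; i++) {
        if (!g->in_use[i]) {
            g->in_use[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Releases instance number inum; returns 0, or -1 if it was not held. */
int model_leave(struct group_model *g, int inum)
{
    if (g == NULL || inum < 0 || inum >= MAX_MEMBERS || !g->in_use[inum])
        return -1;
    g->in_use[inum] = 0;
    return 0;
}
```

In this model, three joins yield 0, 1 and 2. If the task holding 1 leaves, the next task to join receives 1 again, and the one after that receives 3.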
-
pvm_lvgroup
PVMFLVGROUP
- Unenrolls the calling process from a named group.
rc = pvm_lvgroup ("worker");
CALL PVMFLVGROUP ('group2',INFO)
-
pvm_gsize
PVMFGSIZE
- Returns the number of members presently in the named group.
size = pvm_gsize ("worker");
CALL PVMFGSIZE ('group2',SIZE)
-
pvm_gettid
PVMFGETTID
- Returns the tid of the process identified by a group name and instance
number.
tid = pvm_gettid ("worker",0);
PVMFGETTID ('worker',5,TID)
-
pvm_getinst
PVMFGETINST
- Returns the instance number in a group of a PVM process.
inum = pvm_getinst ("worker",tid[i]);
PVMFGETINST ('GROUP3',TID,INUM)
-
pvm_freezegroup
PVMFFREEZEGROUP
- Freezes dynamic group membership and caches
info locally by all group members, making the group static. This
is a synchronizing routine and must be called by all group
members to complete.
info = pvm_freezegroup("worker",size);
CALL PVMFFREEZEGROUP('group2',size,info)
Example PVM Code
Group Management and Collective Communications
C Language
#include <stdio.h>
#include "pvm3.h"

#define FIRST  0
#define NTASKS 4

int main(void) {
    int mytid, tids[NTASKS-1], rank, tag1=1, tag2=2, sum, max, info;

    /* Startup - get taskid and then join group */
    mytid = pvm_mytid();
    rank = pvm_joingroup("summax");
    sum = rank;
    max = rank;

    /* First task spawns the rest */
    if (rank == FIRST) {
        info = pvm_spawn("summax", NULL, PvmTaskDefault, "", NTASKS-1, tids);
        printf("Rank= %d spawned %d tasks\n",rank,info);
    }

    /* Causes a barrier until NTASKS have joined, then freezes group */
    pvm_freezegroup("summax",NTASKS);

    /* Use collective communications calls to compute sum and max across group */
    pvm_reduce(PvmSum,&sum,1,PVM_INT,tag1,"summax",FIRST);
    pvm_reduce(PvmMax,&max,1,PVM_INT,tag2,"summax",FIRST);

    /* First task prints results */
    if (rank == FIRST) {
        printf("Done. Sum= %d Max= %d\n",sum, max);
    }

    /* Make sure everybody is done before leaving group and quitting */
    pvm_barrier("summax",NTASKS);
    pvm_lvgroup("summax");
    pvm_exit();
    return 0;
}
Example PVM Code
Group Management and Collective Communications
Fortran
      program summax
      include 'fpvm3.h'
      integer FIRST, NTASKS, TAG1, TAG2
      parameter(FIRST=0)
      parameter(NTASKS=4)
      parameter(TAG1=1)
      parameter(TAG2=2)
      integer mytid, rank, tids(NTASKS-1), info, sum, max

C Important - Fortran needs to declare these as external
      external PvmSum
      external PvmMax

C Startup - get taskid and then join group.
      call pvmfmytid(mytid)
      call pvmfjoingroup('summax',rank)
      sum = rank
      max = rank

C First task spawns the rest
      if (rank .eq. FIRST) then
         call pvmfspawn('summax',PVMDEFAULT,' ',NTASKS-1,tids,info)
         print *,'Rank= ',rank,'spawned',info,'tasks'
      endif

C Causes a barrier until NTASKS have joined, then freezes group
      call pvmffreezegroup('summax',NTASKS,info)

C Use collective communications calls to compute sum and max across group
      call pvmfreduce(PvmSum,sum,1,INTEGER4,tag1,'summax',FIRST,info)
      call pvmfreduce(PvmMax,max,1,INTEGER4,tag2,'summax',FIRST,info)

C First task prints results
      if (rank .eq. FIRST) then
         print *,'Done. Sum= ',sum,' Max=',max
      endif

C Make sure everybody is done before leaving group and quitting
      call pvmfbarrier('summax',NTASKS,info)
      call pvmflvgroup('summax',info)
      call pvmfexit(info)
      end
-
pvm_parent
PVMFPARENT
- Returns the tid of the process that spawned the calling process. If
the calling process was not created with pvm_spawn, then tid is set
to PvmNoParent. Typically used by the "master" process to determine
that it is the first active process and needs to spawn the "worker"
processes.
tid = pvm_parent ();
PVMFPARENT (TID)
-
pvm_bufinfo
PVMFBUFINFO
- Returns information about the specified message buffer. Typically used
after a "wildcard" receive operation to determine facts about the
last received message such as its size or source.
pvm_bufinfo (bufid,&bytes,&type,&source);
PVMFBUFINFO (BUFID,BYTES,TYPE,SOURCE,INFO)
-
pvm_pstat
PVMFPSTAT
- Returns the status of the specified PVM process. Values
are PvmOk if the task is running, PvmNoTask if not, and PvmBadParam if
the tid is bad.
status = pvm_pstat (tid);
PVMFPSTAT (TID,STATUS)
-
pvm_mstat
PVMFMSTAT
- Returns the status of a host in the virtual machine. Values are
PvmOk if the host is OK, PvmNoHost if the host is not in the virtual
machine, and PvmHostFail if the host is unreachable (and thus possibly
failed).
mstat = pvm_mstat ("msr.ornl.gov");
PVMFMSTAT ('msr.ornl.gov',MSTAT)
-
pvm_config
PVMFCONFIG
- Returns information about the present virtual machine configuration.
The information returned is similar to that available from the console
command "conf". The C function returns information about the entire
virtual machine in one call. The Fortran function returns information
about one host per call and cycles through all the hosts. See the
man page for details.
pvm_config(&nhost,&narch,&hostp);
PVMFCONFIG (NHOST,NARCH,DTID(i),HOST(i),ARCH(i),SPEED(i),INFO)
-
pvm_tasks
PVMFTASKS
- Returns information about the tasks running on the virtual machine.
The information returned is the same as that available from the console
command "ps". The C function returns information about the entire
virtual machine in one call. The Fortran function returns information
about one task per call and cycles through all the tasks. See the man
page for details.
pvm_tasks (0,&ntask,&taskp);
PVMFTASKS (DTID,NTASK,TID(i),PTID(i),DTID(i),FLAG(i),AOUT(i),INFO)
-
pvm_hostsync
PVMFHOSTSYNC
- Samples the time-of-day clock of a host in the virtual machine and
returns both the clock value and the difference between local and remote
clocks. See the man page for details.
-
pvm_tidtohost
- Returns the integer host id on which the specified process tid is located.
See the man page for details.
- PVM includes a number of miscellaneous routines used for various purposes.
See the man page for details about any of these less often used routines.
- Process notification of certain PVM events:
- Error message setting and displaying:
- Using additional message buffers:
- Using simple, global name/value database pairs:
- Setting trace masks for event tracing:
Assuming that you are already connected to a functional network (Internet
protocol) of Unix processors, follow the steps below to get started with PVM.
-
Acquire, build and install the PVM software (optional)
-
Design your application and prepare to execute your PVM session.
-
Compile your application components
-
Create your PVM hostfile
-
Create your $HOME/.rhosts file
-
Start the master PVM daemon
-
Execute your application
-
Quit PVM
- This is important only if your system does not already run PVM.
- However, there are instructions for obtaining documentation available in
this section, including the PVM user's guide and quick reference card.
-
Always need libpvm3.a library. Also need libfpvm3.a library for
components coded in Fortran.
% cc -o myprog myprog.c -I$PVM_ROOT/include \
     -L$PVM_ROOT/lib/$PVM_ARCH -lpvm3
% xlf -o myprog myprog.f -I$PVM_ROOT/include \
     -L$PVM_ROOT/lib/$PVM_ARCH -lfpvm3 -lpvm3
-
For Dynamic Groups, also need libgpvm3.a added before libpvm3.a
% cc -o myprog myprog.c -I$PVM_ROOT/include \
     -L$PVM_ROOT/lib/$PVM_ARCH -lgpvm3 -lpvm3
% xlf -o myprog myprog.f -I$PVM_ROOT/include \
     -L$PVM_ROOT/lib/$PVM_ARCH -lfpvm3 -lgpvm3 -lpvm3
-
Make sure that executable components are located in
~/pvm3/bin/PVM_ARCH
on each machine as required.
-
Don't forget to link other libraries (essl, nag, imsl) if your
application requires them.
Example Makefiles:
###############################################################################
# FILE: make.hello.c
# DESCRIPTION: Makefile for trivial PVM example - C Language
###############################################################################
CC = cc
OBJ = hello
SRC = hello.c
PVMDIR = /source/pd/pvm3.3/pvm3
INCLUDE = -I${PVMDIR}/include
LIBS = -L${PVMDIR}/lib/RS6K -lpvm3
${OBJ}: ${SRC}
	${CC} ${SRC} ${INCLUDE} ${LIBS} -o ${OBJ}
###############################################################################
# FILE: make.hello.f
# DESCRIPTION: Makefile for trivial PVM example - Fortran
###############################################################################
F77 = xlf
OBJ = hello
SRC = hello.f
PVMDIR = /source/pd/pvm3.3/pvm3
INCLUDE = -I${PVMDIR}/include
LIBS = -L${PVMDIR}/lib/RS6K -lfpvm3 -lpvm3
${OBJ}: ${SRC}
	${F77} ${SRC} ${INCLUDE} ${LIBS} -o ${OBJ}
-
Your PVM hostfile defines your parallel virtual machine. It contains the
names of all desired machines, one per line
-
Only needs to reside on the machine where you start up PVM
-
The filename can be whatever you like
-
Comment lines start with "#"
-
Options include:
-
& = must precede hostname if the host is to be added dynamically later
-
ep = executable program component paths if not using default
-
dx = daemon path if not using default
~/pvm3/bin/PVM_ARCH
-
lo = login userid
-
pw = password entry required
-
* = apply to all following hosts in hostfile
Example Hostfiles:
########################################################
# FILENAME: hostfile.1
# DESCRIPTION: Simple PVM hostfile with no options
########################################################
fr29s01.mhpcc.edu
fr29s02.mhpcc.edu
fr29s03.mhpcc.edu
fr29s04.mhpcc.edu
fr29s05.mhpcc.edu
fr29s06.mhpcc.edu
fr29s07.mhpcc.edu
fr29s08.mhpcc.edu
########################################################
# FILENAME: hostfile.2
# DESCRIPTION: PVM hostfile with a few options
########################################################
fr29s01.mhpcc.edu
fr29s02.mhpcc.edu
fr29s03.mhpcc.edu
fr29s04.mhpcc.edu
#
# next two hosts may be added dynamically later
#
&fr10n12.mhpcc.edu
&fr12n14.mhpcc.edu
#
# next host has different path for its PVM executables
#
beech.tc.cornell.edu ep=/u/user02/pvm/testing/bin
#
# next line applies to all hosts which follow it
#
* lo=user02 pw
kanaha.mhpcc.edu
makena.mhpcc.edu
littleb.mhpcc.edu
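The hostfile conventions above (comment lines, the & prefix, the * defaults line) are simple enough to recognize with a few lines of C. The sketch below classifies one hostfile line and extracts its host name. It is an illustrative reader written for this tutorial's examples, not the parser used by pvmd3, and it ignores the option fields after the host name. The names line_kind, classify_hostfile_line and hostfile_hostname are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum line_kind {
    LINE_BLANK,          /* empty or whitespace only                 */
    LINE_COMMENT,        /* starts with '#'                          */
    LINE_DEFAULTS,       /* starts with '*': options for later hosts */
    LINE_DEFERRED_HOST,  /* starts with '&': host to be added later  */
    LINE_HOST            /* ordinary host entry                      */
};

/* Classifies a single hostfile line by its first non-blank character. */
enum line_kind classify_hostfile_line(const char *line)
{
    if (line == NULL)
        return LINE_BLANK;
    while (*line != '\0' && isspace((unsigned char)*line))
        line++;
    if (*line == '\0') return LINE_BLANK;
    if (*line == '#')  return LINE_COMMENT;
    if (*line == '*')  return LINE_DEFAULTS;
    if (*line == '&')  return LINE_DEFERRED_HOST;
    return LINE_HOST;
}

/*
 * Copies the host name (first word, without a leading '&') into out.
 * Returns 0 on success, -1 if the line names no host or out is too small.
 */
int hostfile_hostname(const char *line, char *out, size_t outlen)
{
    enum line_kind kind = classify_hostfile_line(line);
    size_t n;

    if (out == NULL || outlen == 0)
        return -1;
    if (kind != LINE_HOST && kind != LINE_DEFERRED_HOST)
        return -1;

    while (isspace((unsigned char)*line))
        line++;
    if (*line == '&')
        line++;

    n = strcspn(line, " \t\r\n");   /* length of the first word */
    if (n == 0 || n >= outlen)
        return -1;
    memcpy(out, line, n);
    out[n] = '\0';
    return 0;
}
```

Applied to hostfile.2, the line "&fr10n12.mhpcc.edu" is classified as a deferred host named fr10n12.mhpcc.edu, and "* lo=user02 pw" as a defaults line.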
-
Start up the master (first) daemon.
% pvmd3 hostfile &
-
The master daemon will be started on your local machine
-
It automatically starts up daemons on all other (remote) machines specified
in your hostfile.
-
Do not run in the background if using the password (pw) specification in
your hostfile.
-
Should only have one pvmd3 running on each machine in your virtual machine!
-
PVM console can be started after pvmd3 by typing "pvm". PVM console
commands can then be issued.
-
Start your parallel application by launching the first instance of your
program on your login machine. Note: PVM will start this instance on
your login machine even if it isn't in your hostfile.
% myprog
-
What happens from here depends upon your application. For example:
-
If you spawn (via PvmTaskDefault) more tasks than there are machines in
your virtual machine, PVM will assign multiple tasks to machines - it
simply starts over from the top of the hostfile. Generally, this is
not recommended, since your tasks will be competing with each other
for CPU resources.
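The wrap-around placement described above can be modelled with modular arithmetic. The standalone C sketch below assumes that task number i (counting from 0) goes to host i modulo the number of hosts. That is a simplification for illustration: it does not model which host receives the first task or any load information PVM may use. The names host_for_task and tasks_on_host are hypothetical.

```c
/*
 * Hypothetical model, not part of PVM. Returns the 0-based index of the
 * host that task number `task` lands on under simple wrap-around
 * placement, or -1 on a bad argument.
 */
int host_for_task(int task, int nhosts)
{
    if (task < 0 || nhosts < 1)
        return -1;
    return task % nhosts;
}

/* Counts how many of ntasks tasks land on the given host in this model. */
int tasks_on_host(int host, int ntasks, int nhosts)
{
    int i, n = 0;

    if (host < 0 || host >= nhosts || ntasks < 0)
        return -1;
    for (i = 0; i < ntasks; i++) {
        if (host_for_task(i, nhosts) == host)
            n++;
    }
    return n;
}
```

With six tasks and three hosts, each host receives two tasks. With seven tasks, the first host receives three and starts competing with itself for CPU time.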
-
Make sure all application components include a PVM library call of
"pvmfexit(info)" or "pvm_exit()".
-
Halting the master pvmd3 will automatically kill all other pvmd3s and
all processes enrolled in this PVM.
-
Running in pvm console mode: use "halt" command
-
Running in the background: enter console mode by
typing "pvm" and then use the halt command.
-
Running in the foreground: suspend PVM (Control-Z) and put it in the
background (bg). Then enter console mode by typing "pvm" and use the
halt command.
-
Situations (problems) may arise when you need to manually kill
processes on either the local or remote hosts - or both.
% rsh spn22 ps x
PID TTY STAT TIME COMMAND
27438 - S 0:00 rshd -v
28208 - R 0:00 ps x
31023 - S 0:00 csh -c ps x
65581 - S 0:00 /source/pd/msg_pass/pvm3/lib/RS6K/pvmd3
65594 - S 0:00 /u1/user02/pvm3/bin/RS6K/mypvmproc
% rsh spn22 kill 65581 65594
-
Abnormal terminations of PVM may leave files in /tmp which prevent
you from restarting (known bug). You should delete all /tmp/pvm*.<uid>
files on all machines before restarting the master pvmd3.
% id -u
10045
% rm /tmp/pvm*.10045
-
Local PVM cleanup utility
syntax: pvmcleanup hostfile < -u user components list >
fr2n02% pvmcleanup hostfile
pvmcleanup will use your PVM hostfile: hostfile
Cleaning up on host: fr2n01.mhpcc.edu
Removing /tmp/pvmd.1336 /tmp/pvml.1336
Killed pvmd process 5046
Cleaning up on host: fr2n02.mhpcc.edu
Removing /tmp/pvmd.1336 /tmp/pvml.1336
Killed pvmd process 20650
Cleaning up on host: fr2n03.mhpcc.edu
Removing /tmp/pvmd.1336 /tmp/pvml.1336
Killed pvmd process 18710
Cleaning up on host: fr2n04.mhpcc.edu
Removing /tmp/pvmd.1336 /tmp/pvml.1336
Killed pvmd process 4890
Done.
fr2n02%
kanaha% pvmd3 hosts3 &
[1] 15683
kanaha% pvm
pvmd already running.
pvm> conf
3 hosts, 1 data format
HOST DTID ARCH SPEED
kanaha.mhpcc.edu 40000 RS6K 1000
makena.mhpcc.edu 80000 RS6K 1000
littleb.mhpcc.edu c0000 RS6K 1000
pvm> quit
pvmd still running.
kanaha% hello
Parent task id= 262146
Spawning child tasks ...
spawned child tid = 524289
spawned child tid = 786433
spawned child tid = 262147
spawned child tid = 524290
spawned child tid = 786434
spawned child tid = 262148
Saying hello to all child tasks...
Parent task done.
kanaha% cat /tmp/pvml.288
[t80040000] ready Wed Mar 9 17:05:02 1994
[t80040000] [t40003] Child task id= 262147
[t80040000] [t40003] ***Reply to: 262146 : HELLO back from 262147!
[t80040000] [t40004] Child task id= 262148
[t80040000] [t40004] ***Reply to: 262146 : HELLO back from 262148!
[t80040000] [t80001] Child task id= 524289
[t80040000] [t80001] ***Reply to: 262146 : HELLO back from 524289!
[t80040000] [tc0001] Child task id= 786433
[t80040000] [tc0001] ***Reply to: 262146 : HELLO back from 786433!
[t80040000] [tc0002] Child task id= 786434
[t80040000] [tc0002] ***Reply to: 262146 : HELLO back from 786434!
[t80040000] [t80002] Child task id= 524290
[t80040000] [t80002] ***Reply to: 262146 : HELLO back from 524290!
kanaha% pvm
pvmd already running.
pvm> halt
[1] Done pvmd3 hosts2
kanaha%
- The MHPCC will have multiple versions of PVM available at any given
time. The current PVM components are installed in /source/pd/msg_pass/pvm3/.
To find the exact version number you are using, start the PVM console and
type "version". For example:
fr2n11.mhpcc.edu% pvm
pvmd already running.
pvm> version
3.3.10
- To run PVM, it is necessary to set a few environment variables in
your .cshrc or .profile file:
For csh:
setenv PVM_ROOT /source/pd/msg_pass/pvm3
setenv PVM_ARCH RS6K
set path=($path $PVM_ROOT/lib/$PVM_ARCH)
For ksh:
export PVM_ROOT=/source/pd/msg_pass/pvm3
export PVM_ARCH=RS6K
export PATH=$PATH:$PVM_ROOT/lib/$PVM_ARCH
- As pointed out in Step 2, you'll need to create two symbolic links to
run pvm:
ln -s $PVM_ROOT/lib ~/pvm3/lib
ln -s $PVM_ROOT/bin/RS6K/pvmgs ~/pvm3/bin/RS6K/pvmgs
- Ways to increase Performance
- Running PVM over the High Performance Switch (HPS)
- Make sure to use Direct Routing:
pvm_setopt(PvmRoute, PvmRouteDirect);
call pvmfsetopt(PVMROUTE, PVMROUTEDIRECT, info)
- Make sure to avoid XDR conversion on a homogeneous system:
pvm_initsend(PvmDataRaw);
call pvmfinitsend(PVMRAW, bufid)
- Any version of PVM 3.3.9 or greater offers an MPI port for User Space
(US) communications.
- PVMe is IBM's enhanced PVM designed to run using user space over the
switch. As a result, it is possible to get better performance.
- Common Problems and Debugging with PVM
- The most common problem in starting PVM daemons is probably due
to daemons already running on the machine(s), or the existence of a
/tmp/pvmd.<uid> file. You'll need to kill old daemons and/or remove any
/tmp/pvmd.<uid> files, which can be done with pvmcleanup.
- The second most common problem is probably due to incorrect
.rhosts files.
- Another, less frequent startup problem is due to .cshrc files that
require interactive input or which display messages.
- Use the pvm_catchout routine to redirect all standard
output from worker nodes to the master's standard out.
- Watch out for code maintenance. On a shared file system, this isn't
usually too much trouble. On independent file systems, you must maintain
all of the different versions of your executable on multiple machines.
- Make sure all pvmd3s and PVM libraries are the same PVM version.
PVM will not work if there's a mismatch. This usually happens when
someone installs a new version of PVM and you try to run an executable
compiled with the older version.
References and Acknowledgements
- "PVM 3 User's Guide and Reference Manual", Oak Ridge National
Laboratory, Oak Ridge, TN
- "The PVM talk", Bob Manchek, University of Tennessee
- We gratefully acknowledge the Cornell Theory Center, Ithaca, New York
for providing some original material included in this document.
© Copyright 1995
Maui High Performance Computing Center. All rights reserved.
Documents located on the Maui High Performance Computing Center's WWW server
are copyrighted by the MHPCC. Educational institutions are encouraged to
reproduce and distribute these materials for educational use as long as
credit and notification are provided. Please retain this copyright notice
and include this statement with any copies that you make. Also, the MHPCC
requests that you send notification of their use to help@mail.mhpcc.edu.
Commercial use of these materials is prohibited without prior written
permission.
Last revised: 26 September 1996 Blaise Barney