Homework Four - Elementary MPI programming
Due: Thursday, December 3, 2015
Assignment:
Parallelize the program described below using MPI. To do this, follow the instructions in
Using OpenMPI on Gollum
hostfile for all nodes and cores using gigabit ethernet for spawning
hostfile for all nodes but only 2 cores per node using gigabit ethernet for spawning
When compiling and running, you must NOT use gollum itself but one of the nodes node1 to node12.
Present the results as a table of run time and speedup.
Speedup is defined as T1/Tp where T1 is the time on 1 process and Tp is the time on p processes.
Comment on the results.
The preferred way to time the code is to use MPI_Wtime() around the relevant part, e.g.:
double starttime, endtime;
starttime = MPI_Wtime();
/* ... code to be timed ... */
endtime = MPI_Wtime();
printf("That took %f seconds\n", endtime - starttime);
-
Implement and time a matrix-matrix product of a 1500x900
matrix A with a 900x1200 matrix B of doubles.
As in the first OpenMP exercise, define Aij = (i+1)*(j+1) and Bij = 1/((double)(i+1)*(double)(j+1)).
The result matrix C = A*B should be Cij = 900*(double)(i+1)/(double)(j+1). You should check that the result is correct in each case by comparing A*B with a matrix C holding these values.
-
The matrices A and B should initially be distributed by block rows over the processes used.
The matrix-multiply can then be accomplished by a variation of the
matrix-vector multiply code such as
    for each column x of B
        Compute the parallel matrix-vector product Ax
- Note that you should not read in the matrices but calculate them using the formula given. You should also not write them out but verify that the results are correct in a manner similar to the OpenMP homework.
-
You should run the code using 1, 2, 3, 4, 5, 6, 7 and 8 processes on gollum in two configurations: using up to 8 processes on one node, and using at most 2 processes on each node.
-
List the run times and speedups in separate tables for each case,
and comment on the speedup in each.
Note that to implement this program you may need to increase the default stack size.
If you do not, you will get a segmentation fault. To do this, execute a shell command similar to:
ulimit -s unlimited
on each node.
Note: do not print anything inside the section of code being timed.
This assignment is an individual assignment, to be done on your own without help from other
students in the class. However, you may use any
materials from any written resource, including web resources.
Instructions for submitting the homework using svn are contained in
this file.