Cluster Computing Homework 3 - Elementary MPI Programming

The goal of this assignment is learn basic message programming and to develop a program using simple partitioning. For this assignment you are to write two programs in either C or C++ using the MPI message passing library. Your programs must compile without warnings and execute correctly on for full credit. Also, for full credit use good programming style, including the use of an appropriate amount of comments. In addition to the source code, your submission must also include the answers, in a plain text file, to the questions found at the end of the page.
Do not - I repeat DO NOT - submit binaries!!


  1. For this assignment,implement the adaptive version of the trapezoidal rule as described in class from Pacheco, Chapter 4. Implement both a serial version and a parallel version. The interval should be input from the command line. The serial program should execute with the command line interface:

          trapezoid < left > < right >

    The parallel version should execute under MPI with the command line interface:

          mpirun -np <number of processors> trapezoid < left > < right >

    You may assume that in the serial version that you start with 2 trapezoids and double it each time until the result is "close enough", for example until the change is less than .0001. For the parallel version, you may assume that each process starts with 2 trapezoids and iterates independently until its portion is "close enough". In order for the total error to be the same as that of the sequential case, note that you must adapt the stopping criteria appropriately. You may hardcode the function.

    For now, you may use the function

    y=x^3 - 9x^2 + 26x - 24

    Also, for convenience, you may use the following sample command line tests. You may also want to develop some test calls of your own in order to thouroughly test the program :

    The program should output the interval of integration, the value of the integral, the number of trapezoids used in the final calculation. For the parallel version you should print the local version of this information for each process, as well as the total integral.

    For both programs answer the following summary questions in a plain text file:

    1. Were there any features of the assignment that did you not successfully implement?
    2. How did the input you used test your program thoroughly for its correct operation?
    3. How did you analyze the output of your program to prove to yourself that the output of your program shows your program is working correctly?
    4. How long did you spend designing, coding, and debugging this assignment?

  2. Using LAM, test the times to broadcast a vector of 100, 1000 and 10000 floating point entries over 2,4, and 8 nodes and the different algorithms Comment on the results, identify the most efficient in each case.
  3. Implement and check the time to do a matrix-matrix product of a 1000x500 matrix with a 500x1000 matrix of floats on 2,4, and 8 nodes. Assume that the two matrices A and B have been distributed by block rows then one can use the matrix-vector multiplication code from those online to compute the matrix-matrix product AB by:
    for each column of B {Compute the parallel matrix-vector product Ax}.

    Due Date : Friday April 13, 2007.