Updated on 2022-05-09 GMT+08:00

Open MPI Delivered with the IB Driver

Scenarios

This section describes how to run the Open MPI (version 3.0.0rc6) that is delivered with the IB driver on a configured ECS.

Prerequisites

  • An ECS equipped with InfiniBand NICs has been created, and an EIP has been bound to it.
  • Multiple ECSs have been created using a private image.

Procedure

  1. Use PuTTY and a key pair to log in to the ECS.

    Log in with the username that was specified when the ECS was created.

  2. Run the following command to prevent the session from being logged out automatically when it times out:

    # TMOUT=0

  3. Run the following command on each ECS to verify that the ECSs to be tested can log in to one another without a password:

    $ ssh Username@SERVER_IP
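
Checking every node by hand becomes tedious as the cluster grows. The following is a minimal sketch, not part of the official procedure, that tests several hosts non-interactively. The function name check_passwordless_ssh and the example addresses are hypothetical. The BatchMode=yes option makes ssh fail instead of prompting, so a host that still asks for a password (or for host key confirmation) is reported as FAIL.

```shell
# Sketch: report which hosts accept key-based (passwordless) SSH login.
# check_passwordless_ssh is a hypothetical helper name, not a tool
# shipped with the IB driver.
check_passwordless_ssh() {
    local failed=0 target
    for target in "$@"; do
        # BatchMode=yes disables interactive prompts, so ssh exits with a
        # non-zero status when key-based login is not possible.
        # -n keeps ssh from reading stdin inside the loop.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$target" true 2>/dev/null; then
            echo "OK   $target"
        else
            echo "FAIL $target"
            failed=1
        fi
    done
    return "$failed"
}

# Example (placeholder user and addresses):
# check_passwordless_ssh Username@192.168.0.1 Username@192.168.0.2
```

The function returns a non-zero status if any host fails, so it can gate the later mpirun steps in a script.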

  4. Run the following commands to disable the firewall of the ECS:

    # iptables -F

    # service firewalld stop

  5. Run the following command to log out of the root account:

    # exit

  6. Run the following command to create the hostfile file:

    # vi hostfile

    Add the IP addresses or hostnames of the ECSs to the file. (The mapping between the IP addresses and hostnames of the ECSs is stored in the /etc/hosts file.) For example, add the following IP addresses:

    # cat hostfile

    192.168.0.1

    192.168.0.2

    ...
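
The number of entries in hostfile is the value that the -np option needs when one process is started per node. As a minimal sketch, assuming one host per line and that blank lines and lines starting with # should be ignored, the entries can be counted as follows. The name count_hosts is a hypothetical helper used only for illustration.

```shell
# Sketch: count the usable entries in a hostfile (one host per line).
# count_hosts is a hypothetical helper name.
count_hosts() {
    # -v inverts the match, so blank lines and lines starting with '#'
    # are skipped; -c prints the number of remaining lines.
    grep -Ecv '^[[:space:]]*(#|$)' "$1"
}

# Example:
# count_hosts hostfile
```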

  7. Run the following command to execute the hostname command on every node in the cluster, replacing <hostfile_node_number> with the number of nodes listed in hostfile:

    # mpirun --allow-run-as-root -np <hostfile_node_number> -pernode --hostfile hostfile hostname

    Figure 1 Running the hostname command in the cluster
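
Because -pernode starts one process on each node, the command is expected to print one hostname per node. The following is a minimal sketch for checking that automatically, assuming every ECS has a distinct hostname. The name verify_node_output is hypothetical; the function reads the mpirun output from standard input.

```shell
# Sketch: compare the number of distinct hostnames printed by the run
# with the expected node count.
# verify_node_output is a hypothetical helper name.
verify_node_output() {
    local expected=$1 actual
    # sort -u removes duplicate hostnames; grep -c . counts non-empty lines.
    # "|| true" keeps the count (0) when no line matches.
    actual=$(sort -u | grep -c . || true)
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $actual distinct hosts responded"
    else
        echo "MISMATCH: expected $expected hosts, got $actual"
        return 1
    fi
}

# Example for a two-node cluster:
# mpirun --allow-run-as-root -np 2 -pernode --hostfile hostfile hostname | verify_node_output 2
```
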
  8. Modify hostfile as required and run the MPI benchmark, specifying the path of hostfile.

    For example, to run the MPI benchmark on two ECSs, run the following command:

    # mpirun --allow-run-as-root -np 2 -pernode --hostfile /root/hostfile /usr/mpi/gcc/openmpi-3.0.0rc6/tests/imb/IMB-MPI1 PingPong

    The following is sample output of the Intel MPI Benchmarks PingPong test on a cluster containing two nodes. Over the RDMA network, the minimum latency in this sample run is 1.77 us (at a message size of 8 bytes).

    #------------------------------------------------------------
    #    Intel (R) MPI Benchmarks 4.1, MPI-1 part
    #------------------------------------------------------------
    # Date                  : Mon Jul 16 10:12:51 2018
    # Machine               : x86_64
    # System                : Linux
    # Release               : 3.10.0-514.10.2.el7.x86_64
    # Version               : #1 SMP Fri Mar 3 00:04:05 UTC 2017
    # MPI Version           : 3.1
    # MPI Thread Environment:
    
    # New default behavior from Version 3.2 on:
    
    # the number of iterations per message size is cut down
    # dynamically when a certain run time (per message size sample)
    # is expected to be exceeded. Time limit is defined by variable
    # "SECS_PER_SAMPLE" (=> IMB_settings.h)
    # or through the flag => -time
    
    # Calling sequence was:
    
    # /usr/mpi/gcc/openmpi-3.0.0rc6/tests/imb/IMB-MPI1 PingPong
    
    # Minimum message length in bytes:   0
    # Maximum message length in bytes:   4194304
    #
    # MPI_Datatype                   :   MPI_BYTE
    # MPI_Datatype for reductions    :   MPI_FLOAT
    # MPI_Op                         :   MPI_SUM
    #
    #
    
    # List of Benchmarks to run:
    
    # PingPong
    
    #---------------------------------------------------
    # Benchmarking PingPong
    # #processes = 2
    #---------------------------------------------------
           #bytes #repetitions      t[usec]   Mbytes/sec
                0         1000         1.87         0.00
                1         1000         1.93         0.49
                2         1000         1.78         1.07
                4         1000         1.79         2.13
                8         1000         1.77         4.31
               16         1000         1.78         8.57
               32         1000         1.79        17.09
               64         1000         1.85        33.02
              128         1000         1.90        64.12
              256         1000         2.40       101.58
              512         1000         2.53       192.90
             1024         1000         2.85       342.61
             2048         1000         3.23       604.14
             4096         1000         4.32       904.98
             8192         1000         5.89      1325.65
            16384         1000         8.48      1842.47
            32768         1000        12.50      2500.57
            65536          640        21.79      2867.89
           131072          320        34.28      3646.50
           262144          160        42.19      5925.52
           524288           80        66.55      7513.14
          1048576           40       114.95      8699.54
          2097152           20       213.71      9358.48
          4194304           10       402.59      9935.78
    
    
    # All processes entering MPI_Finalize
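
To read the headline figures from a long result table, the output can be summarized with awk. The following is a minimal sketch, assuming the four-column PingPong layout shown above (bytes, repetitions, t[usec], Mbytes/sec). The name summarize_pingpong is a hypothetical helper, not part of the Intel MPI Benchmarks.

```shell
# Sketch: summarize IMB PingPong output read from standard input.
# summarize_pingpong is a hypothetical helper name.
summarize_pingpong() {
    awk '
        # Data rows have exactly four fields and start with two integers
        # (bytes, repetitions). Header lines start with "#" and are skipped.
        NF == 4 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            if (!seen || $3 + 0 < min_t)  { min_t = $3 + 0; min_bytes = $1 }
            if (!seen || $4 + 0 > max_bw) { max_bw = $4 + 0; max_bytes = $1 }
            seen = 1
        }
        END {
            if (!seen) { print "no PingPong data rows found"; exit 1 }
            printf "min latency: %.2f usec at %d bytes\n", min_t, min_bytes
            printf "max bandwidth: %.2f Mbytes/sec at %d bytes\n", max_bw, max_bytes
        }
    '
}

# Example:
# mpirun --allow-run-as-root -np 2 -pernode --hostfile /root/hostfile \
#     /usr/mpi/gcc/openmpi-3.0.0rc6/tests/imb/IMB-MPI1 PingPong | summarize_pingpong
```

Applied to the sample output above, the sketch reports a minimum latency of 1.77 usec at 8 bytes and a maximum bandwidth of 9935.78 Mbytes/sec at 4194304 bytes.
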
  9. Deploy your MPI application in the Linux cluster and run it using the preceding method.