Updated on 2023-04-12 GMT+08:00

Multi-Active DR for Multi-Cloud Clusters

Scenario

To tackle single points of failure (SPOFs), UCS allows instances of an application to run on multiple clouds. When one of the clouds is down, cluster federation will migrate instances to other clouds and switch over traffic within seconds, significantly improving service reliability.

Figure 1 shows the multi-active DR solution in UCS. Under DNS policies, instances of an application are distributed to three Kubernetes clusters: two Huawei Cloud CCE clusters (deployed in different regions) and one third-party cloud cluster.

Figure 1 Multi-active DR for multi-cloud clusters

Preparations

  • You have created the clusters. This example uses two CCE clusters, one in CN South-Guangzhou and one in CN East-Shanghai1 (see Buying a CCE Cluster). The Kubernetes version must be 1.19 or later, and each cluster must have at least one available node.

    In your production environment, you can deploy clusters in different regions, AZs, or even clouds to implement multi-active DR.

  • You have created a public zone in Huawei Cloud DNS. For details, see Routing Internet Traffic to a Website.
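The version requirement in the preparations above can also be checked from a script. The following is a minimal sketch and not part of UCS: the function name version_at_least_1_19 is hypothetical, and it assumes GNU sort is available for its -V (version ordering) option.

```shell
# Hypothetical helper: succeeds when the given Kubernetes version is 1.19 or later.
# Assumes GNU sort, whose -V option orders version strings numerically.
version_at_least_1_19() {
  version=${1#v}   # accept both "1.23.4" and "v1.23.4"
  # The smaller of the two versions sorts first; the check passes if that is 1.19.
  lowest=$(printf '%s\n%s\n' "1.19" "$version" | sort -V | head -n 1)
  [ "$lowest" = "1.19" ]
}
```

For example, `version_at_least_1_19 v1.25.3 && echo "supported"` prints the message only when the version you pass meets the requirement.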

Setting Up the Basic Environment

  1. Register clusters to UCS and configure cluster access. For details, see Registering a Cluster.

    For example, register clusters ccecluster01 and ccecluster02 to the container fleet ucs-group of UCS and check whether the clusters are running properly.

  2. Enable cluster federation for the fleet where the clusters are located and ensure that the clusters have been connected to the cluster federation. For details, see Cluster Federation.

    Figure 2 Clusters

  3. Create a federated Deployment.

    To make the traffic switchover visible, the two clusters in this section run different container image versions. (In a real production environment, the clusters would run the same image.)

    • Cluster ccecluster01: The example application uses the image nginx:gz and returns the message "ccecluster01 is in Guangzhou."
    • Cluster ccecluster02: The example application uses the image nginx:sh and returns the message "ccecluster02 is in Shanghai."

    Before the operation, upload the image of the example application to the SWR image repository in the region where the cluster is located. That is, upload the image nginx:gz to CN South-Guangzhou and the image nginx:sh to CN East-Shanghai1. Otherwise, the federated workload will malfunction because it cannot pull the image.
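The image addresses used later in this section follow the pattern swr.{region ID}.myhuaweicloud.com/{organization}/{image}:{tag}. As a rough sketch of the upload step, the commands could be wrapped as shown below. This assumes that Docker is installed, that you have already logged in to SWR in each region, and that the organization k8s-test2 exists there. The function names are hypothetical.

```shell
# Hypothetical helper: builds an SWR image address from region ID, organization, and image:tag.
swr_image_ref() {
  printf 'swr.%s.myhuaweicloud.com/%s/%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: tags a local image with its SWR address and pushes it.
# Assumes "docker login" to the SWR registry of that region has already succeeded.
push_to_swr() {
  target=$(swr_image_ref "$1" "$2" "$3")
  docker tag "$3" "$target" && docker push "$target"
}
```

With these definitions, `push_to_swr cn-south-1 k8s-test2 nginx:gz` would upload the Guangzhou image and `push_to_swr cn-east-3 k8s-test2 nginx:sh` the Shanghai image.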

    The clusters and workloads in this section are examples only. The solution does not restrict the cloud service provider, the region, or the number of clusters and workloads.

    1. Log in to the Huawei Cloud UCS console. In the navigation pane, choose Fleets.
    2. Click the name of the fleet for which cluster federation has been enabled. The details page is displayed.
    3. In the navigation pane, choose Federation > Workloads, and click Create from Image in the upper right corner.
    4. Enter the basic information and configure container parameters. The image name can be customized. Click Next: Scheduling and Differentiation.
    5. Configure the cluster scheduling policy, complete differentiated cluster configuration, and click Create Workload.
      • Scheduling Mode: Select Cluster weight and set the weight of the two clusters to 1:1.
      • Differentiated Settings: Click the icon on the right of the cluster to enable differentiated settings. Set the image name of ccecluster01 to swr.cn-south-1.myhuaweicloud.com/k8s-test2/nginx:gz (the address of the image nginx:gz in the SWR image repository) and that of ccecluster02 to swr.cn-east-3.myhuaweicloud.com/k8s-test2/nginx:sh.
    Figure 3 Scheduling and differentiation

  4. Create a LoadBalancer Service.

    1. Log in to the Huawei Cloud UCS console. In the navigation pane, choose Fleets.
    2. Click the name of the fleet for which cluster federation has been enabled. The details page is displayed.
    3. In the navigation pane, choose Federation > Services and Ingresses, and click Create Service in the upper right corner.
    4. Configure the parameters and click OK.
      • Service Type: Select LoadBalancer.
      • Port: Select TCP for Protocol, and enter the service port and container port, for example, 8800 and 80.
      • Cluster: Add clusters ccecluster01 and ccecluster02 in sequence. For each cluster, select a shared ELB instance for LoadBalancer. The ELB instance must be in the VPC of the cluster. If no ELB instance is available in the list, click Create Load Balancer to create one on the ELB console. Retain the default values for other parameters.
      • Selector: A Service is associated with a workload through a label selector. In this example, reference the labels of the workload to add the selector.
      Figure 4 Creating a Service

  5. Create a DNS policy.

    1. Log in to the Huawei Cloud UCS console. In the navigation pane, choose Fleets.
    2. Click the name of the fleet for which cluster federation has been enabled. The details page is displayed.
    3. In the navigation pane, choose Federation > DNS Policies and add a root domain name.
    4. Click Create DNS Policy in the upper right corner and configure parameters.
      • Target Service: Select the Service created in step 4.
      • Distribution Mode: Select Adaptive. Traffic is automatically distributed based on the number of pods in each cluster. In this example, both ccecluster01 and ccecluster02 contain one pod. The two clusters receive traffic at the ratio of 1:1, as shown in Figure 6.
    Figure 5 Configuring the traffic ratio
    Figure 6 Traffic ratio topology
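The Adaptive mode described in this step distributes traffic in proportion to the number of pods in each cluster. The sketch below only illustrates that proportion for two clusters. It is not how UCS computes the ratio internally, which this document does not describe, and the function name traffic_share and the integer rounding are assumptions.

```shell
# Hypothetical helper: prints the traffic percentage for two clusters,
# proportional to their pod counts (integer percentages that sum to 100).
traffic_share() {
  pods_a=$1; pods_b=$2
  total=$((pods_a + pods_b))
  if [ "$total" -eq 0 ]; then
    echo "0 0"          # no pods anywhere, so no cluster can serve traffic
    return 0
  fi
  share_a=$((pods_a * 100 / total))
  share_b=$((100 - share_a))   # assumption: rounding remainder goes to the second cluster
  echo "$share_a $share_b"
}
```

Calling `traffic_share 1 1` corresponds to the 1:1 ratio in this example, and `traffic_share 0 1` corresponds to the case where the pod in the first cluster is unavailable.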

Verifying Multi-Active DR Scenarios

You have deployed applications in clusters ccecluster01 and ccecluster02 and exposed them externally through LoadBalancer Services. After the DNS policy in step 5 is created, the system automatically adds a resolution record for the selected root domain name and generates a unified external access path (domain name address) on UCS. You can therefore access the domain name address to verify traffic distribution.

  1. Obtain the domain name address.

    1. Log in to the Huawei Cloud UCS console. In the navigation pane, choose Fleets.
    2. Click the name of the fleet for which cluster federation has been enabled. The details page is displayed.
    3. In the navigation pane, choose Federation > DNS Policies. The value of Domain Name Address in the list is the domain name address.
    Figure 7 Domain name address

  2. On a host with public network access, run the following command to access the domain name address continuously and observe which cluster handles each request:

    • Under normal conditions, applications in both clusters receive traffic, and each cluster processes about 50% of it.
      while true; do wget -q -O- helloworld.default.mcp-xxx.svc.xxx.co:8800; done
      ccecluster01 is in Guangzhou.
      ccecluster02 is in Shanghai.
      ccecluster01 is in Guangzhou.
      ccecluster02 is in Shanghai.
      ccecluster01 is in Guangzhou.
      ccecluster02 is in Shanghai.
      ...
    • When the application on ccecluster01 becomes abnormal (simulated here by shutting down a cluster node), the system routes all traffic to ccecluster02, so users are unaware of the exception.
      while true; do wget -q -O- helloworld.default.mcp-xxx.svc.xxx.co:8800; done
      ccecluster02 is in Shanghai.
      ccecluster02 is in Shanghai.
      ccecluster02 is in Shanghai.
      ccecluster02 is in Shanghai.
      ccecluster02 is in Shanghai.
      ccecluster02 is in Shanghai.
      ...

      Return to the UCS console. The cluster traffic ratio in the domain name list has changed: ccecluster02 now takes 100% of the traffic, which is consistent with the Adaptive distribution mode and with the command output above.

      Figure 8 Viewing domain names
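To quantify the split instead of reading the scrolling output, the responses can be sampled a fixed number of times and tallied. The following is a minimal sketch that assumes wget and awk are available. The function names are hypothetical, and the -T 5 -t 1 options (5-second timeout, single try) are added here so that a failed request does not stall the loop.

```shell
# Hypothetical helper: requests URL a fixed number of times and prints each response body.
sample_requests() {
  url=$1; requests=$2; i=0
  while [ "$i" -lt "$requests" ]; do
    wget -q -T 5 -t 1 -O- "$url"
    echo                      # make sure each response ends its line
    i=$((i + 1))
  done
}

# Hypothetical helper: reads responses on stdin and prints, per distinct response,
# the text, the number of occurrences, and the integer percentage (tab-separated).
# Blank lines are ignored.
tally_responses() {
  awk 'NF { count[$0]++; total++ }
       END { for (line in count)
               printf "%s\t%d\t%d%%\n", line, count[line], count[line] * 100 / total }' | sort
}
```

For example, `sample_requests helloworld.default.mcp-xxx.svc.xxx.co:8800 100 | tally_responses` summarizes 100 requests. With both clusters healthy, the two messages should each account for roughly half of the responses. After the failover, only the ccecluster02 message should remain.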