Cluster Evaluation

Migrating applications from one environment to another is challenging and requires careful planning and preparation. kspider is a tool that collects information about the source cluster, such as the Kubernetes version, scale, workload quantity, storage, and in-use images. This data helps you understand the current status of the cluster, evaluate migration risks, and select a suitable version and scale for the destination cluster.

How kspider Works

Figure 1 shows the architecture of kspider, which consists of three modules: collection, connection management, and analysis. The collection module collects data from the source cluster, including namespaces, workloads, nodes, and networks. The connection management module establishes connections with the API Server of the source cluster. The analysis module outputs the collected data of the source cluster (in the cluster-*.json file) and, after evaluation, the recommendations for the destination cluster (in the preferred-*.json file).

Figure 1 kspider architecture

Usage of kspider

kspider can run on Linux (x86 and Arm) and Windows. The usage is similar in all of these environments. This section uses the Linux (x86) environment as an example.

If Linux (Arm) or Windows is used, replace kspider-linux-amd64 in the following command with kspider-linux-arm64 or kspider-windows-amd64.exe.
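If kspider is launched from a script, the binary name can be chosen automatically. The following is a minimal sketch, not part of kspider itself: the function name kspider_binary and the mapping keys are assumptions about what Python's platform module typically reports, while the binary names are the ones listed above.

```python
import platform

# Binary names come from this section. The (system, machine) keys are
# assumptions about typical platform.system()/platform.machine() values.
KSPIDER_BINARIES = {
    ("linux", "x86_64"): "kspider-linux-amd64",
    ("linux", "amd64"): "kspider-linux-amd64",
    ("linux", "aarch64"): "kspider-linux-arm64",
    ("linux", "arm64"): "kspider-linux-arm64",
    ("windows", "amd64"): "kspider-windows-amd64.exe",
    ("windows", "x86_64"): "kspider-windows-amd64.exe",
}


def kspider_binary(system=None, machine=None):
    """Return the kspider binary name for the given (or current) platform."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    try:
        return KSPIDER_BINARIES[(system, machine)]
    except KeyError:
        raise ValueError(f"no kspider binary documented for {system}/{machine}")
```

Calling kspider_binary() without arguments uses the platform of the machine running the script. Unsupported platforms raise ValueError instead of guessing a name.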

Prepare a server, upload kspider to the server, and decompress the tool package. For details, see Preparations. Run ./kspider-linux-amd64 -h in the directory where kspider is located to learn about its usage.

-k, --kubeconfig: specifies the location of the kubeconfig file used by kubectl. The default value is $HOME/.kube/config. The kubeconfig file configures access to the Kubernetes cluster and contains the authentication credentials and endpoints (access addresses) required for accessing and registering the cluster. For details, see the Kubernetes documentation.
-n, --namespaces: specifies the namespaces to collect data from. Separate multiple namespaces with commas (,), for example, ns1,ns2. By default, system namespaces such as kube-system, kube-public, and kube-node-lease are excluded.
-q, --quiet: runs the command silently.
-s, --serial: specifies the unique sequence number used in the names of the output aggregation file (cluster-{serial}.json) and recommendation file (preferred-{serial}.json). The default value is the time when kspider is started.

$ ./kspider-linux-amd64 -h
A cluster information collection and recommendation tool implement by Go.

Usage:
  kspider [flags]

Aliases:
  kspider, kspider

Flags:
  -h, --help                help for kspider
  -k, --kubeconfig string   The kubeconfig of k8s cluster's. Default is the $HOME/.kube/config. (default "$HOME/.kube/config")
  -n, --namespaces string   Specify a namespace for information collection. If multiple namespaces are specified, separate them with commas (,), such as ns1,ns2. default("") is all namespaces
  -q, --quiet               command to execute silently
  -s, --serial string       User-defined sequence number of the execution. The default value is the time when the kspider is started. (default "1673853404")
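
When kspider is called from automation, the command line can be assembled from the flags shown in the help output. The following is a minimal sketch: build_kspider_command is a hypothetical helper, and only the -k, -n, -s, and -q flags listed above are used.

```python
def build_kspider_command(binary="./kspider-linux-amd64", kubeconfig=None,
                          namespaces=None, serial=None, quiet=False):
    """Build an argument list for kspider from the documented flags."""
    cmd = [binary]
    if kubeconfig:
        cmd += ["-k", kubeconfig]            # kubeconfig file location
    if namespaces:
        cmd += ["-n", ",".join(namespaces)]  # comma-separated namespaces
    if serial:
        cmd += ["-s", str(serial)]           # serial used in output file names
    if quiet:
        cmd.append("-q")                     # silent execution
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True). Building a list rather than a shell string avoids quoting problems in namespace names and paths.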

Step 1: Collect Data from the Source Cluster

Connect the source cluster to the cloud. For details, see Registering an Attached Cluster (Public Network Access).

To collect data from all namespaces in the cluster with the default parameter settings, run the ./kspider-linux-amd64 command.

Command output:

[~]# ./kspider-linux-amd64
The Cluster version is v1.15.6-r1-CCE2.0.30.B001
There are 5 Namespaces
There are 2 Nodes
	Name	 CPU	 Memory	 IP	 Arch	 OS	 Kernel	 MachineID
	10.1.18.64	 4	 8008284Ki	 [10.1.18.64 10.1.18.64]	 amd64	 linux	 3.10.0-1127.19.1.el7.x86_64	 ef9270ed-7eb3-4ce6-a2d8-f1450f85489a
	10.1.19.13	 4	 8008284Ki	 [10.1.19.13 10.1.19.13]	 amd64	 linux	 3.10.0-1127.19.1.el7.x86_64	 2d889590-9a32-47e5-b947-09c5bda81849
There are 9 Pods
There are 0 LonePods: 
There are 2 StatefulSets: 
	Name	 Namespace	 NodeAffinity
	minio	 default	 false
	minio	 minio	 false
There are 3 Deployments: 
	Name	 Namespace	 NodeAffinity
	rctest	 default	 true
	flink-operator-controller-manager	 flink-operator-system	 false
	rctest	 minio	 false
There are 1 DaemonSets: 
	Name	 Namespace	 NodeAffinity
	ds-nginx	 minio	 false
There are 0 Jobs: 
There are 0 CronJobs: 
There are 4 PersistentVolumeClaims: 
	Namespace/Name	 Pods
	default/pvc-data-minio-0	 default/minio-0
	minio/obs-testing	 minio/ds-nginx-9hmds,minio/ds-nginx-4jsfg
	minio/pvc-data-minio-0	 minio/minio-0
There are 5 PersistentVolumes: 
	Name	 Namespace	 pvcName	 scName	 size	 key
	pvc-bd36c70f-75bf-4000-b85c-f9fb169a14a8	 minio-pv	 obs-testing	 csi-obs	 1Gi	 pvc-bd36c70f-75bf-4000-b85c-f9fb169a14a8
	pvc-c7c768aa-373a-4c52-abea-e8b486d23b47	 minio-pv	 pvc-data-minio-0	 csi-disk-sata	 10Gi	 1bcf3d00-a524-45b1-a773-7efbca58f36a
	pvc-4f52462b-3b4c-4191-a63b-5a36a8748c05	 minio	 obs-testing	 csi-obs	 1Gi	 pvc-4f52462b-3b4c-4191-a63b-5a36a8748c05
	pvc-9fd92c99-805a-4e65-9f22-e238130983c8	 default	 pvc-data-minio-0	 csi-disk	 10Gi	 590afd05-fc68-4c10-a598-877100ca7b3f
	pvc-a22fd877-f98d-4c3d-a04e-191d79883f97	 minio	 pvc-data-minio-0	 csi-disk-sata	 10Gi	 48874130-df77-451b-9b43-d435ac5a11d5
There are 7 Services: 
	Name	 Namespace	 ServiceType
	headless-lxprus	 default	 ClusterIP
	kubernetes	 default	 ClusterIP
	minio	 default	 NodePort
	flink-operator-controller-manager-metrics-service	 flink-operator-system	 ClusterIP
	flink-operator-webhook-service	 flink-operator-system	 ClusterIP
	headless-lxprus	 minio	 ClusterIP
	minio	 minio	 NodePort
There are 0 Ingresses: 
There are 6 Images: 
	Name
	gcr.io/flink-operator/flink-operator:v1beta1-6
	flink:1.8.2
	swr.cn-north-4.myhuaweicloud.com/paas/minio:latest
	nginx:stable-alpine-perl
	swr.cn-north-4.myhuaweicloud.com/everest/minio:latest
	gcr.io/kubebuilder/kube-rbac-proxy:v0.4.0
There are 2 Extra Secrets: 
	SecretType
	cfe/secure-opaque
	helm.sh/release.v1

After the kspider command is executed, the following files are generated in the current directory:

cluster-*.json: This file contains data collected from the source cluster and applications. The data can be used to analyze and plan the migration.
preferred-*.json: This file contains information about the recommended destination cluster. A preliminary evaluation is performed for the source cluster according to its scale and node specifications. The file provides suggestions on the version and scale of the destination cluster.
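
If kspider has been run several times in the same directory, the two files of each run can be paired by serial. The following is a minimal sketch that relies only on the cluster-{serial}.json and preferred-{serial}.json naming described above. The function name find_kspider_outputs is hypothetical.

```python
import re
from pathlib import Path

# File naming follows the cluster-{serial}.json / preferred-{serial}.json
# convention described in this section.
_OUTPUT_NAME = re.compile(r"^(cluster|preferred)-(.+)\.json$")


def find_kspider_outputs(directory="."):
    """Map each serial to the cluster/preferred files found for it."""
    outputs = {}
    for path in sorted(Path(directory).glob("*.json")):
        match = _OUTPUT_NAME.match(path.name)
        if match:
            kind, serial = match.groups()
            outputs.setdefault(serial, {})[kind] = path
    return outputs
```

Each entry in the result maps a serial to a dictionary with the keys "cluster" and "preferred" (when both files exist), which makes it easy to spot a run whose recommendation file is missing.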

View the data collected from the source cluster and applications.

You can use a text editor or JSON viewer to open the cluster-*.json file to view the data. Replace the asterisk (*) in the file name with the actual timestamp or serial number to find and open the correct file.

Description of the cluster-*.json file:

{
  K8sVersion: Kubernetes version. The value is a string.
  Namespaces: number of namespaces. The value is an integer.
  Pods: total number of pods. The value is an integer.
  Nodes: node information. The IP address is used as the key to display node information.
    IP addresses
      CPU: CPU. The value is a string.
      Arch: CPU architecture. The value is a string.
      Memory: memory. The value is a string.
      HugePages1Gi: 1-GB hugepage memory. The value is a string.
      HugePages2Mi: 2-MB hugepage memory. The value is a string.
      OS: node OS. The value is a string.
      KernelVersion: OS kernel version. The value is a string.
      RuntimeVersion: container runtime and its version on the node. The value is a string.
      InternalIP: internal IP address. The value is a string.
      ExternalIP: external IP address. The value is a string.
      MachineID: node ID. The value is a string. Ensure that the CCE ID is the same as the ECS ID.
  Workloads: workload
    Deployment: workload type. The value can be Deployment, StatefulSet, DaemonSet, CronJob, Job, or LonePod.
      default: namespace name
        Count: quantity. The value is an integer.
        Items: details. The value is an array.
          Name: workload name. The value is a string.
          Namespace: namespace name. The value is a string.
          NodeAffinity: node affinity. The value is of the Boolean type.
          Replicas: number of replicas. The value is an integer.
  Storage: storage
    PersistentVolumes: persistent volume
      pv-name: The PV name is used as the key.
        VolumeID: volume ID. The value is a string.
        Namespace: namespace. The value is a string.
        PvcName: name of the bound PVC. The value is a string.
        ScName: storage class name. The value is a string.
        Size: size of the space to request. The value is a string.
        Pods: name of the pod that uses the PV. The value is a string.
        NodeIP: IP address of the node where the pod is located. The value is a string.
        VolumePath: path on the node where the volume is mounted for the pod. The value is a string.
    OtherVolumes: volumes of other types
      Type: AzureFile, AzureDisk, GCEPersistentDisk, AWSElasticBlockStore, Cinder, GlusterFS, NFS, CephFS, FlexVolume, or DownwardAPI
        The volume ID, volume name, and volume shared path are keys.
          Pods: name of the pod. The value is a string.
          NodeIP: IP address of the node where the pod is located. The value is a string.
          Information that uniquely identifies a volume, such as the volume ID, volume name, and volume shared path. The value is a string.
  Networks: network
    LoadBalancer: load balancer type
      service: network type, which can be Service or ingress.
        Name: name. The value is a string.
        Namespace: namespace name. The value is a string.
        Type: type. The value is a string.
  ExtraSecrets: extended secret type
    Secret type. The value is a string.
  Images: image
    Image repo. The value is a string.
}

Example:

{
  "K8sVersion": "v1.19.10-r0-CCE22.3.1.B009",
  "Namespaces": 12,
  "Pods": 33,
  "Nodes": {
    "10.1.17.219": {
      "CPU": "4",
      "Memory": "7622944Ki",
      "HugePages1Gi": "0",
      "HugePages2Mi": "0",
      "Arch": "amd64",
      "OS": "EulerOS 2.0 (SP9x86_64)",
      "KernelVersion": "4.18.0-147.5.1.6.h687.eulerosv2r9.x86_64",
      "RuntimeVersion": "docker://18.9.0",
      "InternalIP": "10.1.17.219",
      "ExternalIP": "",
      "MachineID": "0c745e03-2802-44c2-8977-0a9fd081a5ba"
    },
    "10.1.18.182": {
      "CPU": "4",
      "Memory": "7992628Ki",
      "HugePages1Gi": "0",
      "HugePages2Mi": "0",
      "Arch": "amd64",
      "OS": "EulerOS 2.0 (SP5)",
      "KernelVersion": "3.10.0-862.14.1.5.h520.eulerosv2r7.x86_64",
      "RuntimeVersion": "docker://18.9.0",
      "InternalIP": "10.1.18.182",
      "ExternalIP": "100.85.xxx.xxx",
      "MachineID": "2bff3d15-b565-496a-817c-063a37eaf1bf"
    }
  },
  "Workloads": {
    "CronJob": {},
    "DaemonSet": {
      "default": {
        "Count": 1,
        "Items": [
          {
            "Name": "kubecost-prometheus-node-exporter",
            "Namespace": "default",
            "NodeAffinity": false,
            "Replicas": 3
          }
        ]
      }
    },
    "Deployment": {
      "default": {
        "Count": 1,
        "Items": [
          {
            "Name": "kubecost-cost-analyzer",
            "Namespace": "default",
            "NodeAffinity": false,
            "Replicas": 1
          }
        ]
      },
      "kubecost": {
        "Count": 1,
        "Items": [
          {
            "Name": "kubecost-kube-state-metrics",
            "Namespace": "kubecost",
            "NodeAffinity": false,
            "Replicas": 1
          }
        ]
      }
    },
    "Job": {},
    "LonePod": {},
    "StatefulSet": {
      "minio-all": {
        "Count": 1,
        "Items": [
          {
            "Name": "minio",
            "Namespace": "minio-all",
            "NodeAffinity": false,
            "Replicas": 1
          }
        ]
      }
    }
  },
  "Storage": {
    "PersistentVolumes": {
      "demo": {
        "VolumeID": "demo",
        "Namespace": "fluid-demo-test",
        "PvcName": "demo",
        "ScName": "fluid",
        "Size": "100Gi",
        "Pods": "",
        "NodeIP": "",
        "VolumePath": ""
      },
      "pvc-fd3a5bb3-119a-44fb-b02e-96b2cf9bb36c": {
        "VolumeID": "82365752-89b6-4609-9df0-007d964b7fe4",
        "Namespace": "minio-all",
        "PvcName": "pvc-data-minio-0",
        "ScName": "csi-disk",
        "Size": "10Gi",
        "Pods": "minio-all/minio-0",
        "NodeIP": "10.1.23.159",
        "VolumePath": "/var/lib/kubelet/pods/5fc47c82-7cbd-4643-98cd-cea41de28ff2/volumes/kubernetes.io~csi/pvc-fd3a5bb3-119a-44fb-b02e-96b2cf9bb36c/mount"
      }
    },
    "OtherVolumes": {}
  },
  "Networks": {
    "LoadBalancer": {}
  },
  "ExtraSecrets": [
    "cfe/secure-opaque",
    "helm.sh/release.v1"
  ],
  "Images": [
    "nginx:stable-alpine-perl",
    "ghcr.io/koordinator-sh/koord-manager:0.6.2",
    "swr.cn-north-4.myhuaweicloud.com/paas/minio:latest",
    "swr.cn-north-4.myhuaweicloud.com/everest/e-backup-test:v1.0.0",
    "gcr.io/kubecost1/cost-model:prod-1.91.0",
    "gcr.io/kubecost1/frontend:prod-1.91.0"
  ]
}
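
For a quick overview, the file can also be summarized programmatically instead of being read by hand. The following is a minimal sketch that assumes the structure shown in the description and example above. The helpers load_cluster and summarize_cluster are hypothetical and not part of kspider.

```python
import json


def load_cluster(path):
    """Parse a cluster-*.json file into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_cluster(cluster):
    """Return headline figures from parsed cluster-*.json data."""
    workloads = {}
    # Workloads are grouped by type, then by namespace; each namespace
    # entry carries a Count field.
    for kind, namespaces in (cluster.get("Workloads") or {}).items():
        workloads[kind] = sum(ns.get("Count", 0) for ns in namespaces.values())
    return {
        "K8sVersion": cluster.get("K8sVersion"),
        "Nodes": len(cluster.get("Nodes") or {}),
        "Pods": cluster.get("Pods"),
        "Workloads": workloads,
        "Images": len(cluster.get("Images") or []),
    }
```

Applied to the example above, the summary would report two nodes and two Deployments (one each in the default and kubecost namespaces). Missing or empty sections are treated as empty rather than raising an error.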

Step 2: Evaluate the Destination Cluster

In addition to the cluster-*.json file, kspider generates the preferred-*.json file in the current directory. This file is based on a preliminary evaluation of the scale and node specifications of the source cluster and provides the recommended version and scale of the destination cluster, which helps you plan and prepare for the migration.

Description of the preferred-*.json file:

{
  K8sVersion: Kubernetes version. The value is a string.
  Scale: cluster scale (number of nodes). The value is an integer.
  Nodes: node information
    CPU: CPU. The value is a string.
    Memory: memory. The value is a string.
    Arch: CPU architecture. The value is a string.
    KernelVersion: OS kernel version. The value is a string.
    ProxyMode: cluster proxy mode. The value is a string.
  ELB: whether ELB is required. The value is of the Boolean type.
}

Evaluation rules for each field in the preceding file:

**Table 1** Evaluation rules
Field	Evaluation Rule
K8sVersion	If the source version is earlier than 1.21, the main release version of the UCS cluster (for example, 1.21; this changes over time) is recommended. If the source version is later than the main release version, the latest version of the UCS cluster is recommended.
Scale	Fewer than 25 nodes in the source cluster: a destination cluster of 50 nodes is recommended. 25 to 99 nodes: 200 nodes. 100 to 499 nodes: 1,000 nodes. 500 or more nodes: 2,000 nodes.
CPU/Memory	The specification used by the largest number of source cluster nodes is selected.
Arch	The architecture used by the largest number of source cluster nodes is selected.
KernelVersion	The kernel version used by the largest number of source cluster nodes is selected.
ProxyMode	Set according to the cluster scale. For a cluster with more than 1,000 nodes, ipvs is recommended. For a cluster with fewer than 1,000 nodes, iptables is recommended.
ELB	Determined by whether the source cluster has a LoadBalancer Service.
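
The rules in Table 1 can be sketched in code to check a recommendation by hand. The following is an illustration under stated assumptions, not kspider's actual implementation: all function names are hypothetical, ties between equally common specifications are resolved by first occurrence, the proxy mode is derived from the recommended scale with iptables assumed at exactly 1,000 nodes (Table 1 does not cover that case), and the K8sVersion rule is omitted because it depends on the current UCS release.

```python
from collections import Counter


def recommend_scale(node_count):
    """Scale rule from Table 1, based on the source cluster node count."""
    if node_count < 25:
        return 50
    if node_count < 100:
        return 200
    if node_count < 500:
        return 1000
    return 2000


def recommend_proxy_mode(scale):
    # Assumption: exactly 1,000 nodes falls back to iptables.
    return "ipvs" if scale > 1000 else "iptables"


def most_common(values):
    """Return the most frequent value, or None for an empty input."""
    values = list(values)
    if not values:
        return None
    return Counter(values).most_common(1)[0][0]


def needs_elb(cluster):
    # Assumption: a non-empty Networks.LoadBalancer section means the
    # source cluster has a LoadBalancer Service.
    return bool((cluster.get("Networks") or {}).get("LoadBalancer"))


def evaluate(cluster):
    """Apply the Table 1 rules (except K8sVersion) to cluster-*.json data."""
    nodes = list((cluster.get("Nodes") or {}).values())
    cpu, memory = most_common(
        (n.get("CPU"), n.get("Memory")) for n in nodes) or (None, None)
    scale = recommend_scale(len(nodes))
    return {
        "Scale": scale,
        "Nodes": {
            "CPU": cpu,
            "Memory": memory,
            "Arch": most_common(n.get("Arch") for n in nodes),
            "KernelVersion": most_common(n.get("KernelVersion") for n in nodes),
        },
        "ELB": needs_elb(cluster),
        "ProxyMode": recommend_proxy_mode(scale),
    }
```

CPU and memory are counted as a pair so that the result always corresponds to a node specification that actually exists in the source cluster.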

Example:

{
  "K8sVersion": "v1.21",
  "Scale": 50,
  "Nodes": {
    "CPU": "4",
    "Memory": "7622952Ki",
    "Arch": "amd64",
    "KernelVersion": "3.10.0-862.14.1.5.h520.eulerosv2r7.x86_64"
  },
  "ELB": false,
  "ProxyMode": "iptables"
}