tf-operator

Introduction

tf-operator is a Kubernetes operator that manages the lifecycle of TensorFlow applications on Kubernetes. It aims to make specifying and running TensorFlow applications (workloads) as easy as running other types of workloads on Kubernetes.

Official project introduction and documentation: https://github.com/kubeflow/tf-operator
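
As an illustration of what "specifying" a TensorFlow workload can look like, the following minimal sketch builds a TFJob manifest as a plain Python dictionary and prints it as JSON. The helper name build_tfjob, the job name, and the image are hypothetical placeholders. The apiVersion shown (kubeflow.org/v1) is an assumption: the API version actually served depends on the tf-operator version installed in the cluster, so check the official documentation before using it.

```python
import json


def build_tfjob(name, image, workers=2, namespace="default"):
    """Return a TFJob manifest as a plain dict (hypothetical helper)."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return {
        # Assumption: the served API version depends on the installed
        # tf-operator version; older releases used beta versions.
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            "containers": [
                                {
                                    # tf-operator expects the training
                                    # container to be named "tensorflow".
                                    "name": "tensorflow",
                                    # Placeholder image, not a real one.
                                    "image": image,
                                }
                            ]
                        }
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    manifest = build_tfjob("tfjob-example", "example.com/my-tf-image:latest")
    print(json.dumps(manifest, indent=2))
```

If the printed JSON is saved to a file, it could then be submitted with a standard command such as kubectl apply -f <file>, assuming the add-on is installed and the cluster serves the API version used in the manifest.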

Notes and Constraints

  • This add-on can be installed only in CCE clusters of v1.13 or v1.15.
  • Some regions do not support this add-on. For details, refer to the console.

Installing the Add-on

  1. Log in to the CCE console. In the navigation pane, choose Add-ons. On the Add-on Marketplace tab page, click Install Add-on under tf-operator.
  2. On the Install Add-on page, select the cluster and the add-on version, and click Next: Configuration.
  3. The tf-operator add-on currently has no configurable parameters. Click Install to install it directly.

    After the add-on is installed, click Go Back to Previous Page. On the Add-on Instance tab page, select the corresponding cluster to view the running instance. A running instance indicates that the add-on has been installed on each node in the cluster.

Upgrading the Add-on

  1. Log in to the CCE console. In the navigation pane, choose Add-ons. On the Add-on Instance tab page, click Upgrade under tf-operator.

    • If the Upgrade button is not available, the current add-on is already up to date and no upgrade is required.
    • During the upgrade, the original version of the tf-operator add-on on cluster nodes is removed and the target version is installed.

  2. On the Basic Information page, select the add-on version and click Next.
  3. Click Upgrade.

Uninstalling the Add-on

  1. Log in to the CCE console. In the navigation pane, choose Add-ons. On the Add-on Instance tab page, click Uninstall under tf-operator.
  2. In the dialog box displayed, click Yes to uninstall the add-on.