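Creating a TFJob
Function
This API is used to create a TFJob.
A TFJob is a TensorFlow job: a Kubernetes custom resource type based on the open-source TensorFlow framework. It supports multiple configurable roles and simplifies single-node and distributed TensorFlow training. For details about TensorFlow, see https://www.tensorflow.org.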
URI
POST /apis/kubeflow.org/v1/namespaces/{namespace}/tfjobs
Path parameters

| Parameter | Mandatory | Description |
|---|---|---|
| namespace | Yes | Object name and auth scope, such as for teams and projects. |

Query parameters

| Parameter | Mandatory | Description |
|---|---|---|
| pretty | No | If 'true', then the output is pretty printed. |
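As a rough illustration of how the path and query parameters combine, the following Python sketch builds the request URL. The endpoint `https://cci.example.com` and the helper name `build_tfjob_url` are hypothetical placeholders and are not part of the API.

```python
from urllib.parse import quote, urlencode


def build_tfjob_url(endpoint: str, namespace: str, pretty: bool = False) -> str:
    """Build the URL for POST /apis/kubeflow.org/v1/namespaces/{namespace}/tfjobs."""
    # namespace is a mandatory path parameter; percent-encode it defensively.
    path = f"/apis/kubeflow.org/v1/namespaces/{quote(namespace, safe='')}/tfjobs"
    url = endpoint.rstrip("/") + path
    # pretty is an optional query parameter; send it only when requested.
    if pretty:
        url += "?" + urlencode({"pretty": "true"})
    return url


if __name__ == "__main__":
    # Hypothetical endpoint, used only for illustration.
    print(build_tfjob_url("https://cci.example.com", "kube-test", pretty=True))
```

Leaving `pretty` out of the URL when it is not needed keeps the request identical to the bare URI shown above.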
Request
Request parameters:
For details about the request parameters, see Table 154.
Example request:
{
  "apiVersion": "kubeflow.org/v1",
  "kind": "TFJob",
  "metadata": {
    "name": "tfjob-test"
  },
  "spec": {
    "backoffLimit": 6,
    "tfReplicaSpecs": {
      "Ps": {
        "replicas": 1,
        "template": {
          "spec": {
            "containers": [
              {
                "args": [
                  "python",
                  "/opt/tf-benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py",
                  "--batch_size=1",
                  "--model=resnet50",
                  "--variable_update=parameter_server",
                  "--flush_stdout=true",
                  "--num_gpus=1",
                  "--local_parameter_device=cpu",
                  "--device=cpu",
                  "--data_format=NHWC"
                ],
                "image": "*.*.*.215:20202/cci/tf-benchmarks-cpu:v1",
                "name": "tensorflow",
                "ports": [
                  {
                    "containerPort": 2222,
                    "name": "tfjob-port"
                  }
                ],
                "resources": {
                  "limits": {
                    "cpu": "2",
                    "memory": "4Gi"
                  },
                  "requests": {
                    "cpu": "2",
                    "memory": "4Gi"
                  }
                }
              }
            ],
            "restartPolicy": "OnFailure",
            "imagePullSecrets": [
              {
                "name": "imagepull-secret"
              }
            ]
          }
        }
      },
      "Worker": {
        "replicas": 1,
        "template": {
          "spec": {
            "containers": [
              {
                "args": [
                  "python",
                  "/opt/tf-benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py",
                  "--batch_size=1",
                  "--model=resnet50",
                  "--variable_update=parameter_server",
                  "--flush_stdout=true",
                  "--local_parameter_device=cpu",
                  "--device=cpu",
                  "--data_format=NHWC"
                ],
                "image": "*.*.*.215:20202/cci/tf-benchmarks-cpu:v1",
                "name": "tensorflow",
                "ports": [
                  {
                    "containerPort": 2222,
                    "name": "tfjob-port"
                  }
                ],
                "resources": {
                  "limits": {
                    "cpu": "2",
                    "memory": "4Gi"
                  },
                  "requests": {
                    "cpu": "2",
                    "memory": "4Gi"
                  }
                }
              }
            ],
            "restartPolicy": "OnFailure",
            "imagePullSecrets": [
              {
                "name": "imagepull-secret"
              }
            ]
          }
        }
      }
    }
  }
}
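The Ps and Worker entries in the example request differ only in one argument (`--num_gpus=1`), so the body can also be assembled programmatically. The following Python sketch reproduces the example request under that observation. The helper names `replica_spec` and `build_tfjob` are hypothetical, and the masked image address is copied as-is from the example. Sending the request (and any authentication header the service requires) is deliberately left out.

```python
import json

BENCHMARK_SCRIPT = "/opt/tf-benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py"
IMAGE = "*.*.*.215:20202/cci/tf-benchmarks-cpu:v1"  # masked address, as in the example


def replica_spec(extra_args, replicas=1):
    """Return one entry of tfReplicaSpecs (hypothetical helper)."""
    args = [
        "python", BENCHMARK_SCRIPT,
        "--batch_size=1", "--model=resnet50",
        "--variable_update=parameter_server", "--flush_stdout=true",
        *extra_args,  # role-specific arguments go here
        "--local_parameter_device=cpu", "--device=cpu", "--data_format=NHWC",
    ]
    resources = {"cpu": "2", "memory": "4Gi"}
    container = {
        "args": args,
        "image": IMAGE,
        "name": "tensorflow",
        "ports": [{"containerPort": 2222, "name": "tfjob-port"}],
        # limits and requests are equal in the example; copy to avoid aliasing.
        "resources": {"limits": dict(resources), "requests": dict(resources)},
    }
    return {
        "replicas": replicas,
        "template": {
            "spec": {
                "containers": [container],
                "restartPolicy": "OnFailure",
                "imagePullSecrets": [{"name": "imagepull-secret"}],
            }
        },
    }


def build_tfjob(name):
    """Return a request body equivalent to the example request."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 6,
            "tfReplicaSpecs": {
                "Ps": replica_spec(["--num_gpus=1"]),
                "Worker": replica_spec([]),
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_tfjob("tfjob-test"), indent=2))
```

Generating both roles from one helper keeps the container settings in sync when the image or resource values change.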
Response
Response parameters:
For details about the response parameters, see Table 154.
Example response:
{
  "apiVersion": "kubeflow.org/v1",
  "kind": "TFJob",
  "metadata": {
    "creationTimestamp": "2019-07-23T12:39:47Z",
    "generation": 1,
    "name": "tfjob-test",
    "namespace": "kube-test",
    "resourceVersion": "72050567",
    "selfLink": "/apis/kubeflow.org/v1/namespaces/kube-test/tfjobs/tfjob-test",
    "uid": "f461f966-ad46-11e9-aaa4-340a9837e413"
  },
  "spec": {
    "backoffLimit": 6,
    "tfReplicaSpecs": {
      "Ps": {
        "replicas": 1,
        "template": {
          "spec": {
            "containers": [
              {
                "args": [
                  "python",
                  "/opt/tf-benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py",
                  "--batch_size=1",
                  "--model=resnet50",
                  "--variable_update=parameter_server",
                  "--flush_stdout=true",
                  "--num_gpus=1",
                  "--local_parameter_device=cpu",
                  "--device=cpu",
                  "--data_format=NHWC"
                ],
                "image": "*.*.*.215:20202/cci/tf-benchmarks-cpu:v1",
                "name": "tensorflow",
                "ports": [
                  {
                    "containerPort": 2222,
                    "name": "tfjob-port"
                  }
                ],
                "resources": {
                  "limits": {
                    "cpu": "2",
                    "memory": "4Gi"
                  },
                  "requests": {
                    "cpu": "2",
                    "memory": "4Gi"
                  }
                }
              }
            ],
            "imagePullSecrets": [
              {
                "name": "imagepull-secret"
              }
            ],
            "restartPolicy": "OnFailure"
          }
        }
      },
      "Worker": {
        "replicas": 1,
        "template": {
          "spec": {
            "containers": [
              {
                "args": [
                  "python",
                  "/opt/tf-benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py",
                  "--batch_size=1",
                  "--model=resnet50",
                  "--variable_update=parameter_server",
                  "--flush_stdout=true",
                  "--local_parameter_device=cpu",
                  "--device=cpu",
                  "--data_format=NHWC"
                ],
                "image": "*.*.*.215:20202/cci/tf-benchmarks-cpu:v1",
                "name": "tensorflow",
                "ports": [
                  {
                    "containerPort": 2222,
                    "name": "tfjob-port"
                  }
                ],
                "resources": {
                  "limits": {
                    "cpu": "2",
                    "memory": "4Gi"
                  },
                  "requests": {
                    "cpu": "2",
                    "memory": "4Gi"
                  }
                }
              }
            ],
            "imagePullSecrets": [
              {
                "name": "imagepull-secret"
              }
            ],
            "restartPolicy": "OnFailure"
          }
        }
      }
    }
  },
  "status": {}
}
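The example response echoes the submitted spec and adds server-generated metadata, while `status` is still empty right after creation. A minimal Python sketch of reading those fields follows. The helper name `summarize_tfjob` and the trimmed `SAMPLE` payload are illustrative assumptions; the field names are taken from the example response.

```python
import json

# Trimmed copy of the example response, kept short for illustration.
SAMPLE = """
{
  "apiVersion": "kubeflow.org/v1",
  "kind": "TFJob",
  "metadata": {
    "creationTimestamp": "2019-07-23T12:39:47Z",
    "name": "tfjob-test",
    "namespace": "kube-test",
    "uid": "f461f966-ad46-11e9-aaa4-340a9837e413"
  },
  "spec": {
    "backoffLimit": 6,
    "tfReplicaSpecs": {
      "Ps": {"replicas": 1},
      "Worker": {"replicas": 1}
    }
  },
  "status": {}
}
"""


def summarize_tfjob(response_text):
    """Extract identifying metadata and replica counts from a TFJob response."""
    job = json.loads(response_text)
    meta = job.get("metadata", {})
    replica_specs = job.get("spec", {}).get("tfReplicaSpecs", {})
    return {
        "name": meta.get("name"),
        "namespace": meta.get("namespace"),
        "uid": meta.get("uid"),
        "created": meta.get("creationTimestamp"),
        # Map each role (for example Ps, Worker) to its replica count.
        "replicas": {role: spec.get("replicas") for role, spec in replica_specs.items()},
        # An empty status object means no status has been reported yet.
        "has_status": bool(job.get("status")),
    }


if __name__ == "__main__":
    print(summarize_tfjob(SAMPLE))
```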
Status Codes

| Status Code | Description |
|---|---|
| 200 | OK |
| 201 | Created |
| 202 | Accepted |
| 401 | Unauthorized |
| 400 | Bad Request |
| 500 | Internal error |
| 403 | Forbidden |
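A client typically branches on these codes after the call returns. The Python sketch below shows one possible way to do so. The grouping into success, client error, and server error follows general HTTP conventions rather than anything specific to this API, and the name `classify_status` is hypothetical.

```python
# Descriptions taken from the status code table for this API.
STATUS_DESCRIPTIONS = {
    200: "OK",
    201: "Created",
    202: "Accepted",
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    500: "Internal error",
}


def classify_status(code):
    """Return (outcome, description) for an HTTP status code."""
    description = STATUS_DESCRIPTIONS.get(code, "Unknown")
    if 200 <= code < 300:
        outcome = "success"
    elif 400 <= code < 500:
        # The request itself must change (body, credentials, or permissions).
        outcome = "client_error"
    elif 500 <= code < 600:
        outcome = "server_error"
    else:
        outcome = "unexpected"
    return outcome, description


if __name__ == "__main__":
    for code in (201, 400, 403, 500):
        print(code, classify_status(code))
```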