Updated on 2024-06-11 GMT+08:00

Failed to Start a Service

Symptom

After a service is started, the system displays a message indicating that the container failed to start.

Figure 1 Service startup failure

Faulty AI Application

If the image used for creating an AI application is faulty, recreate the image by following the instructions provided in Creating a Custom Image and Using It to Create an AI Application. Ensure that the image starts properly and that a curl request sent on the local host returns the expected data.
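The local check described above can also be scripted. The following is a minimal sketch, assuming the container built from the image is already running on the local host. The function name probe and the address http://127.0.0.1:8080/ are illustrative assumptions, not part of ModelArts. Adjust the port and path to match your image.

```python
import urllib.error
import urllib.request

# Ignore any proxy settings so that the request goes straight to the local host.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=5.0):
    """Send a GET request and return (status_code, body_text).

    Returns (None, error_message) if no HTTP response is received.
    """
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx or 5xx status code.
        return err.code, err.read().decode("utf-8", errors="replace")
    except OSError as err:
        # Nothing answered: the container is not running, the port is wrong, etc.
        return None, str(err)


# Example call (hypothetical address):
#     status, body = probe("http://127.0.0.1:8080/")
```

A status of None suggests that nothing is listening at that address, while a 4xx or 5xx status suggests that the container is running but the request path or the service code needs attention.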

Incorrect Port in the Image

The port enabled in the image is not 8080, or the port enabled in the image is different from the port configured during AI application creation. As a result, the register-agent cannot communicate with the AI application during service deployment. After a certain period (20 minutes at most), the AI application is considered to have failed to start.

If this fault occurs, check the port enabled in the custom image code and the port configured during AI application creation. Ensure that the two ports are the same. If you do not specify a port during AI application creation, ModelArts listens on port 8080 by default. In this case, the port enabled in the custom image code must be 8080.
Figure 2 Port enabled in the custom image code
Figure 3 Port configured during AI application creation
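The port rule above can be expressed as a short check. The following is a sketch only. The helper names effective_port, ports_match, and is_listening are hypothetical and are not ModelArts APIs. The value 8080 is the default stated in this section.

```python
import socket

DEFAULT_PORT = 8080  # used when no port is specified during AI application creation


def effective_port(configured_port=None):
    """Return the port that will be contacted for the AI application."""
    return DEFAULT_PORT if configured_port is None else configured_port


def ports_match(image_port, configured_port=None):
    """Return True if the port opened in the image code is the one contacted."""
    return image_port == effective_port(configured_port)


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, ports_match(8000) is False because the default port 8080 applies when no port is configured, whereas ports_match(8000, 8000) is True. With the container running locally, is_listening("127.0.0.1", 8080) shows whether anything accepts connections on that port.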

Incorrect Health Check Configuration

If health check is enabled in the image, perform the following operations to locate the fault:

  • Check whether the health check address configured during AI application creation is the same as the actual one.

    If the AI application is created using a base image provided by ModelArts, the health check URL must be /health, which is the default value.

    Figure 4 Configuring the health check URL
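For a custom image, the service code must answer at the same path that is configured as the health check URL. The following is a minimal sketch using only the Python standard library. The names HealthHandler and make_server and the response body are illustrative assumptions. The source does not state what response body, if any, the health check expects.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_PATH = "/health"  # must equal the health check URL configured for the AI application


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == HEALTH_PATH:
            status, body = 200, b'{"status": "ok"}'  # hypothetical body
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "application/json" if status == 200 else "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Suppress per-request logging to keep the output quiet.
        pass


def make_server(port=8080):
    """Create a server bound to all interfaces on the given port."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)


# To run: make_server().serve_forever()
```

If the configured health check URL differs from HEALTH_PATH, a handler like this returns 404 for the health check request, which illustrates the mismatch described above.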

Incorrect customize_service.py

Check the service runtime logs to locate and rectify the fault.

Pulling an Image Failed

If the service fails to start and a message is displayed indicating that the image failed to be pulled, see What Do I Do If an Image Fails to Be Pulled When a Service Is Deployed, Started, Upgraded, or Modified?

Scheduling Failed Due to Insufficient Resources

The service fails to start, and a message is displayed indicating that resources are insufficient and service scheduling failed. For details, see What Do I Do If Resources Are Insufficient When a Service Is Deployed, Started, Upgraded, or Modified?