Updated on 2024-12-30 GMT+08:00

Failed to Start a Service

Symptom

After you start a service, the system displays a message indicating that the container failed to start.

Figure 1 Service startup failure

Faulty Model

If the image used for creating a model is faulty, recreate the image by following the instructions provided in Creating a Custom Image and Using It to Create an AI Application. Ensure that the image can be started properly and the expected data can be returned through curl on the local host.

Incorrect Port in the Image

This fault occurs when the port enabled in the image is not 8080, or differs from the port configured during model creation. In either case, the register-agent cannot communicate with the model during service deployment. After a timeout (20 minutes at most), the model is considered to have failed to start.

If this fault occurs, compare the port enabled in the custom image code with the port configured during model creation, and ensure that the two are the same. If you do not specify a port during model creation, ModelArts uses port 8080 by default. In this case, the port enabled in the custom image code must be 8080.

Figure 2 Port enabled in the custom image code

Figure 3 Port configured during model creation
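
Before redeploying, it can help to confirm locally that the container is listening on the expected port. The following is a minimal sketch, not an official ModelArts tool: it assumes the image has been started on the local host with its port published, and the helper name is_port_open is hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Assumption: the container runs locally and publishes port 8080,
    # the default described above. Use the port you configured otherwise.
    port = 8080
    state = "open" if is_port_open("127.0.0.1", port) else "closed"
    print(f"Port {port} is {state}")
```

If the port is reported as closed while the container is running, the service inside the image is likely bound to a different port than the one configured for the model.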

Incorrect Health Check Configuration

If health check is enabled in the image, perform the following operations to locate the fault:

  • Check whether the health check address configured during model creation is the same as the actual one.

    If the model is created using a base image provided by ModelArts, the health check URL must be /health, which is the default value.

    Figure 4 Configuring the health check URL
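
The health check address can also be probed locally before the model is created. The sketch below is illustrative only: it assumes the container runs on the local host at port 8080 and that a healthy service answers a GET request with HTTP 200. The function name check_health and the base URL are hypothetical.

```python
import urllib.error
import urllib.request


def check_health(base_url: str, path: str = "/health", timeout: float = 5.0) -> bool:
    """Return True if GET base_url + path answers with HTTP 200."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    # Assumption: the image was started locally and publishes port 8080.
    healthy = check_health("http://127.0.0.1:8080")
    print("health check passed" if healthy else "health check failed")
```

If the probe fails on the configured path but succeeds on another one, the health check URL entered during model creation probably does not match the one implemented in the image.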

Incorrect customize_service.py

If customize_service.py contains errors, check the service runtime logs to locate the fault and rectify it.

Pulling an Image Failed

If the service fails to be started and a message is displayed indicating that the image fails to be pulled, see What Do I Do If an Image Fails to Be Pulled When a Service Is Deployed, Started, Upgraded, or Modified?

Scheduling Failed Due to Insufficient Resources

The service fails to be started, and a message is displayed indicating that resources are insufficient and service scheduling fails. For details, see Resources Are Insufficient When a Service Is Deployed, Started, Upgraded, or Modified.

Insufficient Memory

The service fails to be started, and a message is displayed indicating that the memory is insufficient. For details, see What Can I Do if the Memory Is Insufficient?