How Do I Check Whether an Error Is Caused by a Model When a Real-Time Service Is Running But Prediction Failed?
Symptom
A prediction request is sent to a running real-time service, but the response does not meet expectations, and it is difficult to determine whether the issue is caused by the model.
Possible Cause
After a real-time service is started, either of the following methods can be used for prediction:
- Method 1: Perform prediction on the Prediction tab of the service details page.
- Method 2: Obtain the API URL on the Usage Guides tab of the service details page, and use cURL or Postman for prediction.
This issue may occur after an inference request is initiated, regardless of whether method 1 or 2 is used.
An inference request is ultimately sent to the model, so the issue may be caused by an error that occurred while the model was processing the request. Determining whether the model caused the issue enables rapid fault locating.
Solution
Regardless of whether method 1 or 2 is used, obtain the response header and body of the inference request.
- If method 1 is used, obtain the response to the inference request through the browser's developer tool. For example, in Google Chrome, press F12 to open the developer tool, click the Network tab, and then click Predict. The response to the inference request is displayed on the Network tab.
Figure 1 Response to an inference request
Find the inference request in the Name pane. The URL of the inference request contains the keyword /v1/infers. View the complete URL in the Headers pane, then obtain the response header under Headers and the response body under Response.
- If method 2 is used, the way to obtain the response header and body depends on the tool. For example, with cURL, use the -I option to obtain the response header.
If Server in the obtained response header is ModelArts and the response body does not contain a ModelArts.XXXX error code, the response was returned by the model. In this case, if the response is not as expected, the issue is caused by the model.
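The check above can be sketched in Python, assuming the response header and body have already been captured (for example, with cURL or the browser's developer tool). The Server header value and the ModelArts.XXXX error-code pattern come from this article; the helper function itself is purely illustrative:

```python
import re

def is_model_response(headers, body):
    """Return True if the response was returned by the model itself.

    Per the rule above: the Server header is ModelArts, and the body
    carries no platform error code of the form ModelArts.XXXX.
    """
    served_by_platform = headers.get("Server", "") == "ModelArts"
    has_platform_error = re.search(r"ModelArts\.\d+", body) is not None
    return served_by_platform and not has_platform_error

# A body without a ModelArts.XXXX code points to the model itself.
print(is_model_response({"Server": "ModelArts"}, '{"prediction": "unexpected"}'))  # True
```

If this returns True but the prediction is still wrong, focus troubleshooting on the model rather than on the ModelArts platform.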
Summary and Suggestions
A model can be imported from a container image, OBS, or AI Gallery. The following provides common troubleshooting methods for each model source:
- For a model imported from a container image, the cause of the issue varies with how the image was customized. Check the model logs to identify the cause.
- For a model imported from OBS, if the response you received contains an MR error code, for example, MR.0105, view logs on the Logs tab of the real-time service details page to identify the cause.
- For a model imported from AI Gallery, consult the publisher of the model for the cause.
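As a rough illustration, the triage rules above could be encoded as follows. The MR.XXXX error-code prefix comes from this article; the function name, source labels, and hint strings are hypothetical:

```python
def suggest_next_step(model_source, response_body):
    """Map a model source and response body to a troubleshooting hint.

    Purely illustrative; the hints paraphrase the guidance above.
    """
    if model_source == "container_image":
        return "Check the model logs; the cause varies with the custom image."
    if model_source == "obs":
        if "MR." in response_body:  # e.g. an MR.0105 error code
            return "View the Logs tab of the real-time service details page."
        return "Check the model logs for other causes."
    if model_source == "ai_gallery":
        return "Consult the model publisher."
    raise ValueError(f"unknown model source: {model_source}")
```

For example, an OBS-imported model returning a body that contains MR.0105 maps to checking the Logs tab of the service details page.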