Updated on 2026-05-18 GMT+08:00

Viewing Notebook Events

Instance statuses and key operations, such as creating, starting, and stopping an instance and changing the instance flavor, are recorded in the backend. You can view these events on the notebook instance details page to monitor the instance status. You can refresh events manually on the right of the Event tab, or set them to refresh automatically every 30 seconds, 1 minute, or 5 minutes.

Viewing Events of a Notebook Instance

To view the event details of a notebook, click the notebook name. On the displayed notebook details page, click the Event tab.

Notebook Instance Events

Table 1 Events during instance creation

| Event | Description | Severity | Solution |
| --- | --- | --- | --- |
| Scheduled | The instance has been scheduled. | Warning | Normal event, no action required. |
| PullingImage | The image is being pulled. | Warning | Normal event, no action required. |
| PulledImage | The image has been pulled. | Warning | Normal event, no action required. |
| NotebookHealthy | The instance is running and healthy. | Major | Normal event, no action required. |
| CreateNotebookFailed | Creating an instance failed. | Critical | Internal service error. Submit a service ticket to contact O&M engineers. |
| PullImageFailed | Pulling the image failed. | Critical | Check whether the image selected during instance creation exists. If the image does not exist, select another image to create an instance. If the image exists, submit a service ticket to contact O&M engineers. |
| FailedCreate | Failed to create notebook container. Please contact SRE to check node {node_name} | Critical | Internal service error. Submit a service ticket to contact O&M engineers. |
| CreateContainerError | Failed to create container. Please contact SRE to check node {node_name} | Critical | Internal service error. Submit a service ticket to contact O&M engineers. |
| FailedAttachVolume | Failed to attach volume. Please contact SRE to check node {node_name} | Major | Internal service error. Submit a service ticket to contact O&M engineers. |
| MountVolumeFailed | Mount volume failed; Check whether the DEW secret is correct if the instance cannot change to running in five minutes | Critical | Wait for 5 to 10 minutes and check whether the instance status changes to Running. If yes, no action is required. If the status is not changed, check whether the authentication information selected when OBS is used is correct. |
| MountVolumeFailed | Mount volume failed; Check if vpc of sfs-turbo is interconnected if the instance cannot change to running in five minutes | Critical | Wait for 5 to 10 minutes and check whether the instance status changes to Running. If yes, no action is required. If the status remains unchanged, verify that SFS has been properly connected to the VPC of your dedicated resource pool. For details, see Configuring the Dedicated Resource Pool to Access the Internet. |
| MountVolumeFailed | Mount volume failed; Please contact SRE to check node {node_name} if the instance cannot change to running in five minutes | Critical | Wait for 5 to 10 minutes and check whether the instance status changes to Running. If yes, no action is required. If the status is not changed, submit a service ticket to contact O&M engineers. |
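
The MountVolumeFailed solutions ask you to wait 5 to 10 minutes and then check whether the instance status has changed to Running. The following Python code is a minimal sketch of that wait-and-check loop. The get_status callable is hypothetical: this page does not describe an interface for reading the instance status, so you must supply your own way of obtaining it. The default timeout and polling interval are illustrative values only.

```python
import time
from typing import Callable


def wait_for_running(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns "Running" or timeout_s has been waited.

    get_status is a hypothetical, caller-supplied function; it is not an
    interface documented on this page.
    """
    waited = 0.0
    while True:
        if get_status() == "Running":
            return True
        if waited >= timeout_s:
            # Still not Running after the full wait: follow the table's
            # next step (check credentials, VPC connectivity, or submit a ticket).
            return False
        sleep(interval_s)
        waited += interval_s
```

If the function returns False, continue with the second half of the corresponding solution in Table 1.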

Table 2 Events during instance stopping

| Event | Description | Severity | Solution |
| --- | --- | --- | --- |
| StopNotebook | The instance has been stopped. | Major | Normal event, no action required. |
| StopNotebookResourceIdle | The notebook instance will automatically stop or has automatically stopped because resources are idle. | Major | Normal event, no action required. |

Table 3 Events during instance update

| Event | Description | Severity | Solution |
| --- | --- | --- | --- |
| UpdateName | Updating the instance name | Warning | Normal event, no action required. |
| UpdateDescription | Updating the instance description | Warning | Normal event, no action required. |
| UpdateFlavor | Updating the instance flavor | Major | Normal event, no action required. |
| UpdateImage | Updating the instance image | Major | Normal event, no action required. |
| UpdateStorageSize | The instance storage size is being updated. (User %s is updating storage size from %s GB to %s GB.) | Major | Normal event, no action required. |
| UpdateStorageSize | The instance storage size has been updated. (User %s updated the storage size.) | Major | Normal event, no action required. |
| UpdateKeyPair | Configured the instance key pair. (User %s updated the instance key pair to {%s}.) | Major | Normal event, no action required. |
| UpdateKeyPair | Updating the instance key pair (User %s updated the instance key pair from %s to %s.) | Major | Normal event, no action required. |
| UpdateHook | Updating a custom script | Major | Normal event, no action required. |
| UpdateStorageSizeFailed | Updating the storage size failed because the resources are sold out. (EVS disks are sold out.) | Critical | Go to the details page of the instance to be scaled out, click the Storage tab, and add dynamic storage or expand the storage capacity. |
| UpdateStorageSizeFailed | Updating the storage size failed due to an internal error. (Updating the EVS disk size failed. The O&M engineers are handling the fault.) | Critical | Internal service error. Submit a service ticket to contact O&M engineers. |

Table 4 Events during image saving

| Event | Description | Severity | Solution |
| --- | --- | --- | --- |
| SaveImage | The image has been saved. | Major | Normal event, no action required. |
| SavedImageFailed | Saving the image failed due to processes in D status. (There are processes in 'D' status. Check process status using 'ps -aux' and kill all the processes in 'D' status.) | Critical | Run ps -aux to query all processes in the D state, run kill -9 <PID> to stop all processes in the D state, and save the image again. |
| SavedImageFailed | Saving the image failed because the image is too large. (The container size (%dG) is greater than the threshold (%dG).) | Critical | Delete unnecessary directories and files except those in the /home/ma-user/work/ directory of the instance. Reduce the container image size to the threshold specified in the event description and try again. |
| SavedImageFailed | Saving the image failed due to the limit on the number of layers. (There are too many layers in your image.) | Critical | The number of image layers used for starting the instance exceeds 125. Create an instance startup image. During the creation, you can reduce the number of image layers by combining commands and building the image by phase. |
| SavedImageFailed | Saving the image failed due to task timeout. (The O&M engineers are handling the fault.) | Critical | The task timed out due to a network or dependent service exception. Submit a service ticket to contact O&M engineers. |
| SavedImageFailed | Saving the image failed due to SWR service issues. | Critical | SWR service error. Submit a service ticket to contact O&M engineers. |
| CheckImageSize | The notebook container image size is {image_size}G. {image_size} indicates the image size, which is a variable. | Warning | Normal event, no action required. |
| CheckImageLayer | The number of original notebook image layers is {layer_number}. {layer_number} indicates the number of image layers, which is a variable. | Warning | Normal event, no action required. |
| ContainerCommitStarted | Start to commit notebook container. | Warning | Normal event, no action required. |
| ContainerCommitSuccess | Notebook container commit successfully. | Warning | Normal event, no action required. |
| ImagePushStarted | Start to push notebook image. | Warning | Normal event, no action required. |
| ImagePushSuccess | Notebook image push successfully. | Warning | Normal event, no action required. |
| ContainerCommitFailed | Failed to commit notebook container. Please contact SRE to check node {node_name}. {node_name} indicates the node name, which is a variable and is generally in the format of an IP address, for example, 192.168.225.161. | Warning | Node error or internal service error. Submit a service ticket to contact O&M engineers. |
| ImagePushFailed | Failed to push Notebook image. Please contact SRE to check node {node_name}. | Warning | Failed to push the image. Try again. If the fault persists, submit a service ticket to contact O&M engineers. |
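
For the SavedImageFailed event caused by processes in the D state, the solution is to run ps -aux and find the affected processes. The following Python code is a minimal sketch of that lookup. It assumes the usual Linux ps column layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND) and treats any STAT value starting with D as the D state. The function names are illustrative and are not part of ModelArts.

```python
import subprocess
from typing import List, Tuple


def find_d_state(ps_output: str) -> List[Tuple[int, str]]:
    """Return (PID, command) for each process whose STAT column starts with 'D'."""
    lines = ps_output.strip().splitlines()
    result = []
    for line in lines[1:]:  # skip the header row
        # Assumed layout: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # malformed or truncated row
        if fields[7].startswith("D"):
            result.append((int(fields[1]), fields[10]))
    return result


def list_d_state_processes() -> List[Tuple[int, str]]:
    """Run `ps -aux` (the command named in the event text) and filter its output."""
    out = subprocess.run(
        ["ps", "-aux"], capture_output=True, text=True, check=True
    ).stdout
    return find_d_state(out)
```

Calling list_d_state_processes() in a terminal of the instance returns the PIDs to pass to kill -9 <PID>, as described in the solution, before you save the image again.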

Table 5 Events during instance running

| Event | Description | Severity | Solution |
| --- | --- | --- | --- |
| NotebookUnhealthy | The instance is unhealthy. | Critical | This event may be triggered when a task started in the instance, such as a debugging task, occupies too many CPU, memory, or I/O resources. The event is cleared automatically after the instance load decreases. Wait for a while and refresh the page. If a NotebookHealthy event is added, the instance status is normal and no action is required. If the fault persists for a long time, submit a service ticket to contact O&M engineers for assistance. |
| OutOfMemory | The instance is evicted because the memory usage exceeds the upper limit. | Critical | When a process in the instance uses more memory than the instance specifications allow, Kubernetes triggers this event and the instance is restarted. After the restart, the instance status changes to Normal. To prevent recurrence, avoid running tasks with high memory usage. |
| JupyterProcessKilled | The Jupyter process stops abnormally. | Critical | This event may be triggered if the Jupyter process is stopped by mistake or an unknown error occurs in the instance container. The instance will automatically restart. After the restart, the instance status changes to Normal. |
| CacheVolumeExceedQuota | The /cache file size has exceeded the upper limit. | Critical | This event is triggered when the size of the files in the /cache directory exceeds the maximum allowed by the instance specifications. The instance will automatically restart. After the restart, the instance status changes to Normal. To prevent recurrence, monitor the size of the /cache directory. For details about the mapping between the space allocated to the directory and the instance specifications, see What Are the Sizes of the /cache Directories for Resources with Varying Specifications on ModelArts Notebook Instances? |
| NotebookHealthy | The instance recovers from an abnormal state to a normal state. | Major | Normal event, no action required. |
| EVSSoldOut | EVS disks are sold out. | Critical | This event may be triggered when you create a notebook instance and select EVS as the storage type, but EVS disks are sold out. In this case, use OBS or PFS storage instead. If you still want to use EVS, submit a service ticket to contact O&M engineers for capacity expansion. |
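
To avoid the CacheVolumeExceedQuota event, you can check the size of the /cache directory from time to time. The following Python code is a minimal sketch of such a check. The limit_gb parameter is a value you supply: the actual limit depends on the instance specifications and is not listed on this page. The function names are illustrative.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of regular files under path, without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # The file was removed or is unreadable; skip it.
                continue
    return total


def usage_ratio(path: str, limit_gb: float) -> float:
    """Return used space as a fraction of limit_gb (1.0 means at the limit)."""
    return dir_size_bytes(path) / (limit_gb * 1024 ** 3)
```

For example, usage_ratio("/cache", limit_gb) returns a value close to 1.0 when the directory is about to reach the limit you passed in, which is a signal to delete files before the instance restarts.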

Table 6 Events for dynamic OBS mounting

| Event | Description | Severity | Solution |
| --- | --- | --- | --- |
| DynamicMountStorage | The OBS storage is mounted. | Major | Normal event, no action required. |
| DynamicUnmountStorage | The OBS storage is unmounted. | Major | Normal event, no action required. |

Table 7 Events triggered on the user side

| Event | Description | Severity | Solution |
| --- | --- | --- | --- |
| RefreshCredentialsFailed | Authentication failed. | Critical | Normal event, no action required. |
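
If you copy events from the tables on this page into your own records, you may want to separate the events that need follow-up from the normal ones. The following Python code is a minimal sketch of that triage. The record layout (a dictionary with event, severity, and solution keys) is an assumption made for illustration; this page does not define an export format for events.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Solution text that this page uses for events needing no follow-up.
NO_ACTION = "Normal event, no action required."


def group_actionable(events: Iterable[dict]) -> Dict[str, List[str]]:
    """Group event names by severity, skipping events that need no action."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ev in events:
        if ev["solution"] != NO_ACTION:
            grouped[ev["severity"]].append(ev["event"])
    return dict(grouped)


# Sample rows copied from Table 1 and Table 2 on this page.
TICKET = "Internal service error. Submit a service ticket to contact O&M engineers."
rows = [
    {"event": "Scheduled", "severity": "Warning", "solution": NO_ACTION},
    {"event": "CreateNotebookFailed", "severity": "Critical", "solution": TICKET},
    {"event": "FailedAttachVolume", "severity": "Major", "solution": TICKET},
    {"event": "StopNotebook", "severity": "Major", "solution": NO_ACTION},
]
print(group_actionable(rows))
# prints {'Critical': ['CreateNotebookFailed'], 'Major': ['FailedAttachVolume']}
```

The sketch filters on the Solution column instead of the Severity column because some normal events on this page, such as NotebookHealthy, are recorded with a Major severity.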