Viewing Notebook Events

The backend records instance statuses and key operations, such as creating, starting, and stopping an instance and changing the instance flavor. You can view these events on the notebook instance details page to monitor the instance status. To refresh the event list manually, use the refresh control on the right of the Event tab. You can also set events to refresh automatically every 30 seconds, 1 minute, or 5 minutes.

Viewing Events of a Notebook Instance

To view the event details of a notebook, click the notebook name. On the displayed notebook details page, click the Event tab.

Notebook Instance Events

**Table 1** Events during instance creation
Event	Description	Severity	Solution
Scheduled	The instance has been scheduled.	Warning	Normal event, no action required.
PullingImage	The image is being pulled.	Warning	Normal event, no action required.
PulledImage	The image has been pulled.	Warning	Normal event, no action required.
NotebookHealthy	The instance is running and healthy.	Major	Normal event, no action required.
CreateNotebookFailed	Creating an instance failed.	Critical	Internal service error. Submit a service ticket to contact O&M engineers.
PullImageFailed	Pulling the image failed.	Critical	Check whether the image selected during instance creation exists. If the image does not exist, select another image to create an instance. If the image exists, submit a service ticket to contact O&M engineers.
FailedCreate	Failed to create notebook container. Please contact SRE to check node {node_name}	Critical	Internal service error. Submit a service ticket to contact O&M engineers.
CreateContainerError	Failed to create container. Please contact SRE to check node {node_name}	Critical	Internal service error. Submit a service ticket to contact O&M engineers.
FailedAttachVolume	Failed to attach volume. Please contact SRE to check node {node_name}	Major	Internal service error. Submit a service ticket to contact O&M engineers.
MountVolumeFailed	Mount volume failed; Check whether the DEW secret is correct if the instance cannot change to running in five minutes	Critical	Wait for 5 to 10 minutes and check whether the instance status changes to Running. If yes, no action is required. If the status is not changed, check whether the authentication information selected when OBS is used is correct.
	Mount volume failed; Check if vpc of sfs-turbo is interconnected if the instance cannot change to running in five minutes	Critical	Wait for 5 to 10 minutes and check whether the instance status changes to Running. If yes, no action is required. If the status is not changed, check whether the VPC of the dedicated resource pool has been connected to SFS. For details, see Accessing a Real-Time Service (VPC High-Speed Channel).
	Mount volume failed; Please contact SRE to check node {node_name} if the instance cannot change to running in five minutes	Critical	Wait for 5 to 10 minutes and check whether the instance status changes to Running. If yes, no action is required. If the status is not changed, submit a service ticket to contact O&M engineers.
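
The "wait for 5 to 10 minutes and check whether the instance status changes to Running" guidance for MountVolumeFailed can be scripted as a simple polling loop. The following is a minimal sketch, not an official tool: `get_status` is a hypothetical callable that you supply (for example, a wrapper around however you query the instance status), and the 30-second polling interval is an assumption chosen to match the smallest refresh interval on the Event tab.

```python
import time
from typing import Callable


def wait_for_running(
    get_status: Callable[[], str],      # hypothetical: returns the current instance status
    timeout_s: float = 600.0,           # upper end of the 5-to-10-minute window
    interval_s: float = 30.0,           # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the status is "Running", False if the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        if get_status() == "Running":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

If the function returns False, continue with the check listed in the Solution column for the specific MountVolumeFailed description (authentication information, VPC connectivity, or a service ticket).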

**Table 2** Events during instance stopping
Event	Description	Severity	Solution
StopNotebook	The instance has been stopped.	Major	Normal event, no action required.
StopNotebookResourceIdle	The notebook instance will automatically stop or has automatically stopped because resources are idle.	Major	Normal event, no action required.

**Table 3** Events during instance update
Event	Description	Severity	Solution
UpdateName	Updating the instance name	Warning	Normal event, no action required.
UpdateDescription	Updating the instance description	Warning	Normal event, no action required.
UpdateFlavor	Updating the instance flavor	Major	Normal event, no action required.
UpdateImage	Updating the instance image	Major	Normal event, no action required.
UpdateStorageSize	The instance storage size is being updated. (User %s is updating storage size from %s GB to %s GB.)	Major	Normal event, no action required.
UpdateStorageSize	The instance storage size has been updated. (User %s updated the storage size.)	Major	Normal event, no action required.
UpdateKeyPair	Configured the instance key pair. (User %s updated the instance key pair to {%s}.)	Major	Normal event, no action required.
UpdateKeyPair	Updating the instance key pair (User %s updated the instance key pair from %s to %s.)	Major	Normal event, no action required.
UpdateHook	Updating a custom script	Major	Normal event, no action required.
UpdateStorageSizeFailed	Updating the storage size failed because the resources are sold out. (EVS disks are sold out.)	Critical	Go to the details page of the instance to be scaled out, click the Storage tab, and add dynamic storage or expand the storage capacity.
UpdateStorageSizeFailed	Updating the storage size failed due to an internal error. (Updating the EVS disk size failed. The O&M engineers are handling the fault.)	Critical	Internal service error. Submit a service ticket to contact O&M engineers.

**Table 4** Events during image saving
Event	Description	Severity	Solution
SaveImage	The image has been saved.	Major	Normal event, no action required.
SavedImageFailed	Saving the image failed due to processes in D status. (There are processes in 'D' status. Check process status using 'ps -aux' and kill all the processes in 'D' status.)	Critical	Run ps -aux to query all processes in the D state, run kill -9 <PID> to stop all processes in the D state, and save the image again.
	Saving the image failed because the image is too large. (The container size (%dG) is greater than the threshold (%dG).)	Critical	Delete unnecessary directories and files except those in the /home/ma-user/work/ directory of the instance. Reduce the container image size to the threshold specified in the event description and try again.
	Saving the image failed due to the limit on the number of layers. (There are too many layers in your image.)	Critical	The image used for starting the instance has more than 125 layers. Create a new instance startup image with fewer layers, for example, by combining commands and building the image in stages.
	Saving the image failed due to task timeout. (Image saving failed due to task timeout)	Critical	The task timed out due to a network or dependent service exception. Submit a service ticket to contact O&M engineers.
	Saving the image failed due to SWR service issues.	Critical	SWR service error. Submit a service ticket to contact O&M engineers.
CheckImageSize	The notebook container image size is {image_size}G. {image_size} indicates the image size, which is a variable.	Warning	Normal event, no action required.
CheckImageLayer	The number of original notebook image layers is {layer_number}. {layer_number} indicates the number of image layers, which is a variable.	Warning	Normal event, no action required.
ContainerCommitStarted	Start to commit notebook container.	Warning	Normal event, no action required.
ContainerCommitSuccess	Notebook container commit successfully.	Warning	Normal event, no action required.
ImagePushStarted	Start to push notebook image.	Warning	Normal event, no action required.
ImagePushSuccess	Notebook image push successfully.	Warning	Normal event, no action required.
ContainerCommitFailed	Failed to commit notebook container. Please contact SRE to check node {node_name}. {node_name} indicates the node name, which is a variable and is generally in the format of an IP address, for example, 192.168.225.161.	Warning	Node error or internal service error. Submit a service ticket to contact O&M engineers.
ImagePushFailed	Failed to push Notebook image. Please contact SRE to check node {node_name}.	Warning	Failed to push the image. Try again. If the fault persists, submit a service ticket to contact O&M engineers.
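
For the SavedImageFailed event caused by processes in the D state, the `ps -aux` output can be filtered programmatically. The sketch below assumes the standard `ps aux` column layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND); the function name `find_d_state` is illustrative only. It lists matching processes and leaves stopping them as a manual step.

```python
import subprocess
from typing import List, Tuple


def find_d_state(ps_aux_output: str) -> List[Tuple[int, str]]:
    """Return (pid, command) for every process whose STAT column starts with 'D'."""
    hits = []
    for line in ps_aux_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)            # COMMAND may contain spaces
        if len(fields) < 11:
            continue                             # ignore malformed or blank lines
        pid, stat, command = fields[1], fields[7], fields[10]
        if stat.startswith("D"):
            hits.append((int(pid), command))
    return hits


if __name__ == "__main__":
    # Run the same command that the event description recommends.
    output = subprocess.run(
        ["ps", "-aux"], capture_output=True, text=True, check=True
    ).stdout
    for pid, command in find_d_state(output):
        print(pid, command)
```

Each printed PID can then be passed to `kill -9 <PID>` as described in the table before you save the image again.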

**Table 5** Events during instance running
Event	Description	Severity	Solution
NotebookUnhealthy	The instance is unhealthy.	Critical	This event may be triggered when a task running in the instance, such as a debugging task, occupies too many CPU, memory, or I/O resources. The event clears automatically after the instance load decreases. Wait for a while and refresh the page. If a NotebookHealthy event appears, the instance status is normal and no action is required. If the fault persists for a long time, submit a service ticket to contact O&M engineers for assistance.
OutOfMemory	The instance is evicted because the memory usage exceeds the upper limit.	Critical	When an instance process uses more memory than the instance specifications allow, Kubernetes triggers this event and the instance is restarted. After the restart, the instance status changes to Normal. To prevent recurrence, avoid running tasks whose memory usage exceeds the instance specifications.
JupyterProcessKilled	The Jupyter process stops abnormally.	Critical	This event may be triggered if the Jupyter process is stopped by mistake or an unknown error occurs in the instance container. The instance will automatically restart. After the restart, the instance status changes to Normal.
CacheVolumeExceedQuota	The /cache file size has exceeded the upper limit.	Critical	This event is triggered when the /cache directory file size exceeds the maximum limit allowed by the instance specifications. The instance will automatically restart. After the restart, the instance status changes to Normal. In future use, pay attention to the size of the /cache directory. For details about the mapping between the space allocated to the directory and the instance specifications, see What Are the Sizes of the /cache Directories for Resources with Varying Specifications on ModelArts Notebook Instances?
NotebookHealthy	The instance recovers from an abnormal state to a normal state.	Major	Normal event, no action required.
EVSSoldOut	EVS disks are sold out.	Critical	This event may be triggered when you create a notebook instance and select EVS as the storage type, but EVS disks are sold out. You are advised to switch the storage type to OBS parallel file system or OBS bucket. If you still want to use EVS, submit a service ticket to contact O&M engineers for capacity expansion.
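
To keep an eye on the /cache directory before CacheVolumeExceedQuota is triggered, you can measure its size from inside the instance. The following is a rough sketch: the helper names are illustrative, and `limit_bytes` is a value you must supply yourself from the /cache size that applies to your instance specifications, because this page does not list those limits.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of regular files under root, skipping symbolic links."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                continue  # the file disappeared or is unreadable; skip it
    return total


def cache_usage_ratio(root: str, limit_bytes: int) -> float:
    """Return the used size as a fraction of limit_bytes (an assumed, user-supplied limit)."""
    return dir_size_bytes(root) / limit_bytes
```

For example, `cache_usage_ratio("/cache", limit_bytes)` returning a value close to 1.0 suggests that files should be cleaned up before the instance is restarted.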

**Table 6** Events for dynamic OBS mounting
Event	Description	Severity	Solution
DynamicMountStorage	The OBS storage is mounted.	Major	Normal event, no action required.
DynamicUnmountStorage	The OBS storage is unmounted.	Major	Normal event, no action required.

**Table 7** Events triggered on the user side
Event	Description	Severity	Solution
RefreshCredentialsFailed	Authentication failed.	Critical	Normal event, no action required.
