Updated on 2023-11-22 GMT+08:00

Why Is the Job Still Queued When Resources Are Sufficient?

  • If a public resource pool is used, its resources may currently be occupied by other users. Wait for them to be released, or see "Why Is a Training Job Always Queuing?" for solutions.
  • If a dedicated resource pool is used, perform the following operations:
    1. Check whether other jobs (including inference jobs, training jobs, and development environment jobs) are running in the dedicated resource pool.

      On the Dashboard page, open the details page of each running job or instance to check whether it uses the dedicated resource pool. Stop any that you no longer need to release their resources.

      Figure 1 Dashboard
    2. Go to the details page of the dedicated resource pool to check whether there are other queuing jobs.

      If there are, the new job must wait in the queue behind them.

      Figure 2 Queuing jobs
    3. Check whether resources are fragmented.

      For example, the cluster has two nodes, and each node has four idle cards. Your job, however, requires eight cards on a single node. Although eight cards are idle in total, no single node has eight idle cards, so the idle resources cannot be allocated to your job.
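
The fragmentation check in step 3 can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a ModelArts API: the function names and the list of idle cards per node are hypothetical, and the real values would come from the details page of the dedicated resource pool.

```python
from typing import List


def fits_on_single_node(idle_cards_per_node: List[int], cards_needed: int) -> bool:
    """Return True if at least one node has enough idle cards for the job."""
    return any(idle >= cards_needed for idle in idle_cards_per_node)


def is_fragmented(idle_cards_per_node: List[int], cards_needed: int) -> bool:
    """Return True if the pool has enough idle cards in total,
    but no single node can host the job on its own."""
    enough_in_total = sum(idle_cards_per_node) >= cards_needed
    return enough_in_total and not fits_on_single_node(idle_cards_per_node, cards_needed)


# Hypothetical pool from the example above: two nodes, four idle cards each.
idle_cards = [4, 4]

# The job needs eight cards on one node, so the idle cards are fragmented.
print(is_fragmented(idle_cards, 8))  # True
```

If the check reports fragmentation, the job stays queued until enough cards are released on a single node, for example by stopping a job that occupies part of one node.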
