How Do I Use Multiple Ascend Cards for Debugging in a Notebook Instance?
An Ascend multi-card training job runs in multi-process mode, with one Python process per card. The Ascend software stack reads the environment variable RANK_TABLE_FILE, which is preconfigured in the development environment, so no manual configuration is required. For example, to run a job on eight cards, use the following script:
export RANK_SIZE=8
current_exec_path=$(pwd)
echo 'start training'
for((i=0;i<=$RANK_SIZE-1;i++));
do
    echo 'start rank '$i
    mkdir ${current_exec_path}/device$i
    cd ${current_exec_path}/device$i
    echo $i
    export RANK_ID=$i
    dev=`expr $i + 0`
    echo $dev
    export DEVICE_ID=$dev
    python train.py > train.log 2>&1 &
done
In train.py, read the DEVICE_ID environment variable and use it to set the device in the MindSpore context:
import os
from mindspore import context

devid = int(os.getenv('DEVICE_ID'))
context.set_context(mode=context.GRAPH_MODE,
                    device_target="Ascend",
                    device_id=devid)
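Note that int(os.getenv('DEVICE_ID')) raises a TypeError when DEVICE_ID is not set, for example when the script is run directly for single-card debugging instead of through the launch loop. A minimal sketch of a more tolerant lookup (get_device_id is a hypothetical helper, not part of the documented script):

```python
import os

def get_device_id(default=0):
    """Return the Ascend device ID assigned by the launch script.

    DEVICE_ID is exported per process by the multi-card launch loop;
    fall back to `default` (device 0) when the variable is absent,
    e.g. during single-card debugging.
    """
    value = os.getenv("DEVICE_ID")
    return int(value) if value is not None else default
```

The returned value can then be passed as device_id to context.set_context as shown above.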
Other FAQs
- How Do I Use Multiple Ascend Cards for Debugging in a Notebook Instance?
- Why Is the Training Speed Similar When Different Notebook Flavors Are Used?
- How Do I Perform Incremental Training When Using MoXing?
- How Do I View GPU Usage on the Notebook?
- How Can I Obtain GPU Usage Through Code?
- Which Real-Time Performance Indicators of an Ascend Chip Can I View?
- What Are the Relationships Between Files Stored in JupyterLab, Terminal, and OBS?
- How Do I Migrate Data from an Old-Version Notebook Instance to a New-Version One?
- How Do I Use the Datasets Created on ModelArts in a Notebook Instance?
- pip and Common Commands
- What Are Sizes of the /cache Directories for Different Notebook Specifications in DevEnviron?