Updated on 2025-11-04 GMT+08:00

MindSpeed-MM

This section describes the YAML configuration file and its training parameters. Set the parameters as required.

Configuring Parameters in the YAML File

Modify the YAML file.

  1. Set the dataset and model path.

    Parameter: backend_config.data.dataset_param.preprocess_parameters.model_name_or_path
    Example value: /home/ma-user/AscendFactory/ckpts/hf_path/Qwen2.5-VL-7B-Instruct
    Description: [Mandatory] Weight path before conversion. Change it based on the actual situation.

    Parameter: backend_config.data.dataset_param.basic_parameters.dataset_dir
    Example value: /home/ma-user/AscendFactory/data
    Description: [Mandatory] Dataset path. Change it based on the actual situation.

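    Assembled as a YAML fragment, the two settings above might look as follows. The values are the example values from the table, and the surrounding file layout is an assumption; adjust both to your actual configuration file.

    ```yaml
    backend_config:
      data:
        dataset_param:
          preprocess_parameters:
            # [Mandatory] Hugging Face weight path before conversion
            model_name_or_path: /home/ma-user/AscendFactory/ckpts/hf_path/Qwen2.5-VL-7B-Instruct
          basic_parameters:
            # [Mandatory] Dataset path
            dataset_dir: /home/ma-user/AscendFactory/data
    ```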
  2. Set the weight conversion.

    Parameter: backend_config.convert_ckpt_hf2mg.cfg.mm_dir
    Example value: /home/ma-user/AscendFactory/ckpts/mm_path/converted_weight_TP${backend_config.training.tensor-model-parallel-size}_PP${backend_config.training.pipeline-model-parallel-size}
    Description: [Mandatory] Directory for saving the converted weights. Change it based on the actual situation.

    Parameter: backend_config.convert_ckpt_hf2mg.cfg.hf_config.hf_dir
    Example value: ${backend_config.data.dataset_param.preprocess_parameters.model_name_or_path}
    Description: Hugging Face weight directory.

    Parameter: backend_config.convert_ckpt_hf2mg.cfg.parallel_config.llm_pp_layers
    Example value: [1, 10, 10, 7]
    Description: Number of LLM layers assigned to each PP stage. It must match the value configured for pipeline_num_layers in backend_config.model during fine-tuning.

    Parameter: backend_config.convert_ckpt_hf2mg.cfg.parallel_config.vit_pp_layers
    Example value: [32, 0, 0, 0]
    Description: Number of ViT layers assigned to each PP stage. It must match the value configured for pipeline_num_layers in backend_config.model during fine-tuning.

    Parameter: backend_config.convert_ckpt_hf2mg.cfg.parallel_config.tp_size
    Example value: 1
    Description: TP parallelism size. Ensure that it matches the configuration used in training.

    Parameter: backend_config.convert_ckpt_mg2hf.cfg.save_hf_dir
    Example value: ${af_output_dir}/ckpt_converted_mg2hf
    Description: Directory for storing the converted Hugging Face model after MindSpeed-MM fine-tuning.

    Parameter: backend_config.convert_ckpt_mg2hf.cfg.parallel_config.llm_pp_layers
    Example value: [1, 10, 10, 7]
    Description: Number of LLM layers assigned to each PP stage. It must match the value configured for pipeline_num_layers in backend_config.model during fine-tuning.

    Parameter: backend_config.convert_ckpt_mg2hf.cfg.parallel_config.vit_pp_layers
    Example value: [32, 0, 0, 0]
    Description: Number of ViT layers assigned to each PP stage. It must match the value configured for pipeline_num_layers in backend_config.model during fine-tuning.

    Parameter: backend_config.convert_ckpt_mg2hf.cfg.parallel_config.tp_size
    Example value: 1
    Description: TP parallelism size. Ensure that it matches the configuration used in training.

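    The weight-conversion settings above can be sketched as a YAML fragment. The example values come from the table; the nesting and the per-stage layer split (four PP stages, with 1 + 10 + 10 + 7 = 28 LLM layers and all 32 ViT layers on the first stage, matching Qwen2.5-VL-7B) are assumptions to be adjusted to your model and parallel layout.

    ```yaml
    backend_config:
      convert_ckpt_hf2mg:
        cfg:
          mm_dir: /home/ma-user/AscendFactory/ckpts/mm_path/converted_weight_TP${backend_config.training.tensor-model-parallel-size}_PP${backend_config.training.pipeline-model-parallel-size}
          hf_config:
            hf_dir: ${backend_config.data.dataset_param.preprocess_parameters.model_name_or_path}
          parallel_config:
            # One entry per PP stage; entries must sum to the model's layer counts
            llm_pp_layers: [1, 10, 10, 7]   # 28 LLM layers in total
            vit_pp_layers: [32, 0, 0, 0]    # all ViT layers on the first stage
            tp_size: 1
      convert_ckpt_mg2hf:
        cfg:
          save_hf_dir: ${af_output_dir}/ckpt_converted_mg2hf
          parallel_config:
            # Must mirror the layout used during training
            llm_pp_layers: [1, 10, 10, 7]
            vit_pp_layers: [32, 0, 0, 0]
            tp_size: 1
    ```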
  3. Set the model saving, loading, and log information.

    Parameter: backend_config.training.load
    Example value: ${..convert_ckpt_hf2mg.cfg.mm_dir}
    Description: Model loading path. Change it based on the actual situation.

    Parameter: backend_config.training.save
    Example value: ${af_output_dir}/saved_checkpoints
    Description: Model save path. Change it based on the actual situation.

    Parameter: backend_config.training.no-load-optim
    Example value: true
    Description: Specifies whether to skip loading the optimizer state. Set this parameter to false if the optimizer state should be loaded.

    Parameter: backend_config.training.no-load-rng
    Example value: true
    Description: Specifies whether to skip loading the random number generator state. Set this parameter to false if the RNG state should be loaded.

    Parameter: backend_config.training.no-save-optim
    Example value: true
    Description: Specifies whether to skip saving the optimizer state. Set this parameter to false if the optimizer state should be saved.

    Parameter: backend_config.training.no-save-rng
    Example value: true
    Description: Specifies whether to skip saving the random number generator state. Set this parameter to false if the RNG state should be saved.

    Parameter: backend_config.training.log-interval
    Example value: 1
    Description: Logging interval, in training steps.

    Parameter: backend_config.training.save-interval
    Example value: 5000
    Description: Checkpoint save interval, in training steps.

    For parameters not described here, see the feature document.