Updated: 2024-12-30 GMT+08:00
Frequently Asked Questions
- Error: RuntimeError: Default process group has not been initialized, please make sure to call init_process_group
- Training fails with AttributeError: 'torch_npu._C._NPUDeviceProperties' object has no attribute 'multi_processor_count'
- DeepSpeed multi-card training fails with TypeError: deepspeed_init() got an unexpected keyword argument 'resume_from_checkpoint'
- Hugging Face cache directory runs out of space, causing OSError: [Errno 122] Disk quota exceeded
- Calling transformers raises ImportError: Using the `Trainer` with `PyTorch` requires `accelerate`: Run `pip install --upgrade accelerate`
- Calling transformers raises ImportError: libcblas.so.3: cannot open shared object file: No such file or directory
- transformers calls CUDA operations, or execution hangs
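
For the Hugging Face cache item above, one common workaround (not necessarily the fix this FAQ describes) is to point the cache at a directory with more room. The sketch below assumes the standard `HF_HOME` environment variable, which Hugging Face libraries read at import time; the helper names `pick_cache_dir` and `redirect_hf_cache` are hypothetical and used only for illustration. `shutil.disk_usage` reports filesystem free space and may not reflect a per-user quota, so treat the check as a rough filter.

```python
import os
import shutil


def pick_cache_dir(candidates, min_free_bytes):
    """Return the first existing directory with at least min_free_bytes free.

    Hypothetical helper: returns None when no candidate qualifies.
    Note: free space is measured per filesystem, not per user quota.
    """
    for path in candidates:
        if os.path.isdir(path) and shutil.disk_usage(path).free >= min_free_bytes:
            return path
    return None


def redirect_hf_cache(path):
    """Point the Hugging Face cache root at `path` via HF_HOME.

    Must run before `transformers` or `huggingface_hub` is imported,
    because those libraries resolve the cache location at import time.
    """
    os.environ["HF_HOME"] = path
```

A typical use would be to call `redirect_hf_cache(pick_cache_dir([...], 50 * 1024**3))` at the very top of the training script, after checking that the result is not `None`; the candidate paths and the size threshold are placeholders to adapt to the environment.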