Why Is the Performance of the Fine-Tuned Pangu Model in Actual Scenarios Worse Than That During Evaluation?
Updated on 2025-11-04 GMT+08:00
The model achieves a good evaluation result on the target task during fine-tuning, but after deployment it cannot give satisfactory answers to questions from that same target task. Troubleshoot as follows:
- Test set quality: Check whether the target task and data distribution of the test set match those of the actual scenario. An evaluation on a low-quality or unrepresentative test set cannot reflect the model's real-world performance.
- Training data quality: Check the quality of the training data. If the training samples or their distribution do not align with the target task, the gap between evaluation and deployment performance widens. In addition, if the actual scenario is expected to change over time, periodically update the training data and fine-tune the model again.
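As a starting point for the distribution checks above, you can compare a simple statistic of the evaluation set against prompts sampled from the live scenario. The sketch below flags drift when the average prompt length in production deviates strongly from the test set; the function name, threshold, and sample prompts are all illustrative, not part of any Pangu API.

```python
# Hypothetical drift check: compare the prompt-length distribution of the
# evaluation (test) set with prompts observed in the live scenario.
# Names and the z-score threshold are illustrative assumptions.
from statistics import mean, stdev

def length_drift(test_prompts, live_prompts, z_threshold=2.0):
    """Return True if the live prompts' average word count deviates from
    the test set's mean by more than z_threshold standard deviations."""
    test_lens = [len(p.split()) for p in test_prompts]
    live_lens = [len(p.split()) for p in live_prompts]
    mu, sigma = mean(test_lens), stdev(test_lens)
    if sigma == 0:  # degenerate test set: all prompts the same length
        return mean(live_lens) != mu
    z = abs(mean(live_lens) - mu) / sigma
    return z > z_threshold

# Short, uniform evaluation prompts vs. long production prompts:
test_set = [
    "what is the capital of france",
    "translate hello to german",
    "summarize this sentence briefly",
]
live_set = [
    "please read the following multi-paragraph contract and summarize "
    "every obligation of each party in detail",
] * 50
print(length_drift(test_set, live_set))  # → True (drift detected)
```

Prompt length is only one easy-to-compute proxy; the same pattern applies to label frequencies, topic mix, or language, and a statistical two-sample test gives a more principled comparison when enough production samples are available.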
Parent topic: FAQs Related to LLM Fine-Tuning and Training