Why is the Mengzi-T5-base-MT model only half the size of Mengzi-T5-base, and why does it go back to the same size as base after I load the model and save it again?
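Mengzi-T5-base-MT was trained in fp16, so its saved weights are also fp16; this does not affect loading and using the model directly. Mengzi-T5-base was trained in fp32, so its saved weights are fp32. You can check the torch_dtype parameter in config.json, where the value will be float16 or float32 respectively.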
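To make the size difference concrete, here is a minimal sketch of the check described above: it reads torch_dtype from a config.json and estimates the weight size from the bytes per parameter. The helper names, the inline config strings, and the parameter count are all hypothetical and chosen only for illustration; they are not taken from the Mengzi checkpoints.

```python
import json

# Bytes stored per parameter for the two dtypes mentioned in the reply.
BYTES_PER_PARAM = {"float16": 2, "float32": 4}


def read_torch_dtype(config_text):
    """Return the torch_dtype recorded in a config.json string, or None if absent."""
    return json.loads(config_text).get("torch_dtype")


def estimated_size_mb(num_params, dtype):
    """Estimate the weight size in MiB for a given parameter count and dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**2


# Hypothetical stand-ins for the two config.json files (assumption: only
# the torch_dtype field matters for this comparison).
fp16_config = '{"torch_dtype": "float16"}'
fp32_config = '{"torch_dtype": "float32"}'

# Hypothetical parameter count, for illustration only.
NUM_PARAMS = 250_000_000

half = estimated_size_mb(NUM_PARAMS, read_torch_dtype(fp16_config))
full = estimated_size_mb(NUM_PARAMS, read_torch_dtype(fp32_config))
print(f"fp16: {half:.0f} MiB, fp32: {full:.0f} MiB")
```

As for the file growing back after a round trip: in Hugging Face transformers, from_pretrained loads weights as float32 by default unless a torch_dtype argument is passed, so a plain load followed by save_pretrained would likely write fp32 weights again. Passing torch_dtype=torch.float16 (or "auto") when loading should keep the saved file at the smaller size.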