size mismatch for classifier.bias: copying a param with shape torch.Size #20

Jorigorn · 2020-07-20T14:17:25Z

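Can I run the classification task directly, or do I have to fine-tune first?
I downloaded all the data, and running the following command directly raises an error: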

python run_sequence_level_classification.py \
  --task_name ChnSentiCorp \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir /path/to/dataset/ChnSentiCorp \
  --bert_model /path/to/zen_model \
  --max_seq_length 512 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 30.0

07/20/2020 22:14:06 - INFO - ZEN.tokenization - loading vocabulary file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/vocab.txt
07/20/2020 22:14:06 - INFO - ZEN.ngram_utils - loading ngram frequency file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/ngram.txt
07/20/2020 22:14:08 - INFO - ZEN.modeling - loading weights file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/pytorch_model.bin
07/20/2020 22:14:08 - INFO - ZEN.modeling - loading configuration file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/config.json
07/20/2020 22:14:08 - INFO - ZEN.modeling - Model config {
"attention_probs_dropout_prob": 0.1,
"directionality": "bidi",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"num_hidden_word_layers": 6,
"pooler_fc_size": 768,
"pooler_num_attention_heads": 12,
"pooler_num_fc_layers": 3,
"pooler_size_per_head": 128,
"pooler_type": "first_token_transform",
"type_vocab_size": 2,
"vocab_size": 21128,
"word_size": 104089
}

Traceback (most recent call last):
File "examples/run_sequence_level_classification.py", line 396, in <module>
main()
File "examples/run_sequence_level_classification.py", line 361, in main
if task_name not in processors:
File "/data/anaconda3/lib/python3.6/site-packages/ZEN-0.1.0-py3.6.egg/ZEN/modeling.py", line 839, in from_pretrained
RuntimeError: Error(s) in loading state_dict for ZenForSequenceClassification:
size mismatch for classifier.weight: copying a param with shape torch.Size([3, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]).
size mismatch for classifier.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([2]).
sh-4.2$

Thanks a lot.

shizhediao · 2022-09-27T00:29:49Z

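It seems that you used the wrong checkpoint for your task.
Your task is ChnSentiCorp, but you are loading ZEN_ft_NLI_v0.1.0. As the error shows, that checkpoint's classifier has 3 outputs, while the model built for ChnSentiCorp has 2.
Please try [ZEN_ft_SA](http://zen.chuangxin.com/ZEN/models/ZEN_ft_SA_v0.1.0.zip) instead.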
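
To catch this kind of mismatch before training starts, one option is to inspect the classifier shape stored in the checkpoint. The following is only a minimal sketch, not part of ZEN: `head_label_count` and `check_checkpoint` are hypothetical helper names, and the stand-in objects below merely mimic the `.shape` attribute of the tensors reported in the traceback.

```python
from types import SimpleNamespace


def head_label_count(state_dict, key="classifier.weight"):
    """Return how many labels the checkpoint's classifier head has.

    A linear classifier weight has shape (num_labels, hidden_size),
    so the first dimension is the label count.
    """
    return tuple(state_dict[key].shape)[0]


def check_checkpoint(state_dict, task_num_labels):
    """Raise a readable error when the checkpoint head does not fit the task."""
    found = head_label_count(state_dict)
    if found != task_num_labels:
        raise ValueError(
            f"checkpoint classifier has {found} labels, "
            f"but the task expects {task_num_labels}"
        )
    return found


# Stand-ins for tensors so the sketch runs without PyTorch; only .shape is used.
# The shapes are the ones quoted in the error message above.
nli_checkpoint = {"classifier.weight": SimpleNamespace(shape=(3, 768))}
sa_checkpoint = {"classifier.weight": SimpleNamespace(shape=(2, 768))}

print(check_checkpoint(sa_checkpoint, task_num_labels=2))

try:
    check_checkpoint(nli_checkpoint, task_num_labels=2)
except ValueError as err:
    print(err)
```

With a real checkpoint, and assuming `pytorch_model.bin` is a plain state dict, the mapping could be obtained with `torch.load("/path/to/zen_model/pytorch_model.bin", map_location="cpu")` and passed to `check_checkpoint` in the same way.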
