The reproduction results on the Overall metric are not good #10
Comments
I cannot reproduce the reported performance either. I hope the authors can provide more detailed information. Same issue.
We have checked and updated the instructions with more detail. In general, for the model with a PLM, after obtaining the "base" model, we load it and freeze the PLM encoder (simply add .detach() after the encoder output). The final stage is fine-tuning the full model; remember to perform a grid search to make sure it achieves the best performance. In our experiments, we use this checkpoint for MixATIS and this checkpoint for MixSNIPS as the base model. For MixATIS, you could try learning rate 3e-5 (while freezing) and 3e-6 (after freezing).
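As a concrete illustration of the freezing step described above, here is a minimal PyTorch sketch, assuming a joint model whose PLM encoder output feeds the intent/slot decoders. The class and attribute names are illustrative and not taken from this repository:

```python
# Minimal sketch of the freezing stage (illustrative names, not the repo's actual code).
import torch.nn as nn
from transformers import RobertaModel

class JointModelSketch(nn.Module):
    def __init__(self, model_name="roberta-base", freeze_plm=True):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained(model_name)  # PLM encoder
        self.freeze_plm = freeze_plm
        # ... intent/slot heads, label attention, CRF, etc. would go here ...

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        sequence_output = outputs.last_hidden_state
        if self.freeze_plm:
            # Freezing stage: cut gradients to the PLM encoder so that only the
            # task-specific layers on top are updated.
            sequence_output = sequence_output.detach()
        # ... feed sequence_output into the intent/slot decoders ...
        return sequence_output
```

In the final fine-tuning stage, the same model would be run with freeze_plm=False so gradients flow back into the encoder.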
Thank you for the update. May I ask which graphics card the experiment was conducted on? Thanks, and have a nice day!
The reproduction of the results on Overall is not very good. I ran it on a V100; here are my parameter settings and experimental results. May I ask what the reason might be, or how I should reproduce it correctly? Thank you!
```
python main.py --token_level word-level \
  --model_type roberta \
  --model_dir dir_base \
  --task mixatis \
  --data_dir data \
  --attention_mode label \
  --do_train \
  --do_eval \
  --num_train_epochs 100 \
  --intent_loss_coef 0.5 \
  --learning_rate 1e-5 \
  --train_batch_size 32 \
  --num_intent_detection \
  --use_crf
```
```
python main.py --token_level word-level \
  --model_type roberta \
  --model_dir misca \
  --task mixatis \
  --data_dir data \
  --attention_mode label \
  --do_train \
  --do_eval \
  --num_train_epochs 100 \
  --intent_loss_coef 0.5 \
  --learning_rate 1e-5 \
  --num_intent_detection \
  --use_crf \
  --base_model dir_base \
  --intent_slot_attn_type coattention
```

(Screenshot attached: "not_good_overall", showing the reproduced results on the Overall metric.)
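For the grid search suggested earlier in the thread, one option is a simple sweep over the fine-tuning learning rate. The sketch below is hypothetical: the CLI flags mirror the commands above, but the candidate values, the model_dir naming, and the subprocess wrapper are assumptions, not the authors' procedure:

```python
# Hypothetical learning-rate sweep for the fine-tuning (MISCA) stage.
# Flags are copied from the commands above; the swept values are an assumption.
import subprocess

for lr in ["1e-6", "3e-6", "5e-6", "1e-5"]:
    subprocess.run(
        [
            "python", "main.py",
            "--token_level", "word-level",
            "--model_type", "roberta",
            "--model_dir", f"misca_lr{lr}",  # one output dir per setting
            "--task", "mixatis",
            "--data_dir", "data",
            "--attention_mode", "label",
            "--do_train",
            "--do_eval",
            "--num_train_epochs", "100",
            "--intent_loss_coef", "0.5",
            "--learning_rate", lr,
            "--num_intent_detection",
            "--use_crf",
            "--base_model", "dir_base",
            "--intent_slot_attn_type", "coattention",
        ],
        check=True,
    )
```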