Dataset and evaluation protocol #1

Closed
RParedesPalacios opened this issue Mar 22, 2020 · 4 comments

Comments

@RParedesPalacios
Contributor

RParedesPalacios commented Mar 22, 2020

I have several questions:

1- How many images are available in total?

2- If the protocol is 10-fold cross-validation, why are there only 9 files in the balanced-tsv folder? I assume the files in this folder are the ones used with the suggested command:

python3 pneumo_cnn_classifier_training.py «FILE_TSV_BALANCED»

3- If there are enough images, I recommend using a hold-out protocol instead of cross-validation, since hold-out is easier to manage and faster to train than cross-validation.
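
To make the hold-out suggestion concrete, here is a minimal sketch of a single stratified train/validation/test split. The function name `stratified_holdout`, the 80/10/10 fractions, and the example labels are assumptions for illustration only; they are not taken from the repository or its training script.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, val_frac=0.1, test_frac=0.1, seed=0):
    """Split sample indices into train/val/test, class by class.

    Hypothetical helper: fractions and seed are illustrative defaults.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        # Shuffle within each class so every split keeps the class ratio.
        rng.shuffle(indices)
        n_val = int(len(indices) * val_frac)
        n_test = int(len(indices) * test_frac)
        val.extend(indices[:n_val])
        test.extend(indices[n_val:n_val + n_test])
        train.extend(indices[n_val + n_test:])
    return train, val, test


# Made-up labels; in practice they would be read from a label column in a TSV.
labels = ["pneumonia"] * 50 + ["control"] * 50
train, val, test = stratified_holdout(labels)
print(len(train), len(val), len(test))
```

Unlike k-fold cross-validation, this trains one model on one fixed split, which is what makes it cheaper to run and simpler to compare across submissions.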

@maigva
Contributor

maigva commented Mar 22, 2020

Thank you Roberto for your feedback.
1.- The dataset is organized into 5 groups; you can see the number of images per label here: https://github.com/BIMCV-CSUSP/BIMCV-COVID-19/blob/master/padchest-covid/datasets.ipynb
2.- Yes, this is just an early example. We are waiting for feedback from AI experts to improve it, including with other proposals.
3.- Thank you so much, I appreciate this comment.

@RParedesPalacios
Contributor Author

Thank you Mariam,
With such figures, I recommend a hold-out evaluation procedure with a single training, validation, and test set. The 95% confidence interval is going to be quite tight, so a cross-validation protocol is not necessary, imho.
I also think it could be interesting to have a table in the README with the model names and the results people are obtaining, like a leaderboard.
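
On the confidence-interval point, a rough sketch of why a large test set gives a tight interval: under the usual normal approximation to the binomial, the half-width of the 95% interval for an accuracy `p` measured on `n` test images is about `1.96 * sqrt(p * (1 - p) / n)`. The function name `accuracy_ci`, the accuracy of 0.90, and the test-set sizes below are made-up values for illustration, not results from this dataset.

```python
import math


def accuracy_ci(accuracy, n_test, z=1.96):
    """Normal-approximation confidence interval for a measured accuracy.

    Hypothetical helper; z=1.96 corresponds to a 95% interval.
    """
    half_width = z * math.sqrt(accuracy * (1.0 - accuracy) / n_test)
    # Clamp to [0, 1], since accuracy cannot leave that range.
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# Illustrative only: the same accuracy measured on test sets of growing size.
for n in (100, 1000, 10000):
    low, high = accuracy_ci(0.90, n)
    print(f"n={n}: [{low:.4f}, {high:.4f}]")
```

The interval width shrinks with the square root of the test-set size, so a single sufficiently large held-out test set can already give a precise estimate.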

@maigva
Contributor

maigva commented Mar 23, 2020

FYI
I just uploaded "neumo_dataset_balanced_9.tsv".

@RParedesPalacios
Contributor Author

Thanks!
