
Code for Identifying Incorrect Labels #42

Open
GSidiropoulos opened this issue Sep 21, 2022 · 3 comments


@GSidiropoulos

Dear authors, thank you for sharing your work. I was wondering if you could also provide the code for identifying the incorrect labels. From what I understand, the label corrections used to produce a corrected version of the corpus are given; however, the code to reproduce them is not available.

Best regards,
Georgios

@xuhdev
Collaborator

xuhdev commented Sep 21, 2022

We identified the incorrect labels manually, though we used some tools to speed up the human review. I believe @frreiss has those tools available, but I don't think they changed the manual nature of identifying labels in any way.

@GSidiropoulos
Author

Thank you for your reply. My question mostly refers to the code you have here: https://github.com/CODAIT/text-extensions-for-pandas/tree/master/tutorials/corpus. How can I obtain the 1054 flagged training samples? Do I have to combine the results from the CoNLL_4.ipynb and CoNLL_3.ipynb notebooks?

@GSidiropoulos
Author

GSidiropoulos commented Nov 29, 2022

Can you please clarify how we are supposed to combine the results of the CoNLL_2.ipynb, CoNLL_3.ipynb, and CoNLL_4.ipynb notebooks in order to obtain the results you report in the paper? In the paper you mention that "We considered any label where fewer than 7 models agreed with the corpus label to be 'flagged'". Is this the only condition you use in order to flag labels?
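For reference, a minimal sketch of my reading of the quoted criterion: a token's corpus label is flagged when fewer than 7 of the ensemble's models predict that same label. This is not the authors' actual code; the column names, the 8-model toy ensemble, and the `flag_labels` helper are all illustrative assumptions, not the schema used in the CoNLL_*.ipynb notebooks.

```python
import pandas as pd

# Agreement cutoff quoted from the paper: fewer than 7 agreeing models
# means the corpus label gets flagged for human review.
AGREEMENT_THRESHOLD = 7

def flag_labels(df: pd.DataFrame, model_cols: list) -> pd.DataFrame:
    """Add 'num_agree' and 'flagged' columns (hypothetical helper).

    `df` is assumed to have a 'corpus_label' column plus one column of
    predicted labels per model, listed in `model_cols`.
    """
    df = df.copy()
    # Compare each model's prediction column against the corpus label,
    # row by row, and count how many models agree per token.
    df["num_agree"] = df[model_cols].eq(df["corpus_label"], axis=0).sum(axis=1)
    df["flagged"] = df["num_agree"] < AGREEMENT_THRESHOLD
    return df

# Toy example: 3 tokens, 8 models. Only the middle token (5 of 8 agree)
# falls below the threshold.
models = [f"model_{i}" for i in range(8)]
data = {"corpus_label": ["B-PER", "O", "B-LOC"]}
for i, m in enumerate(models):
    data[m] = ["B-PER", "O" if i < 5 else "B-ORG", "B-LOC"]
df = flag_labels(pd.DataFrame(data), models)
print(df[["corpus_label", "num_agree", "flagged"]])
```

Under this reading, combining the notebooks would amount to collecting the per-model predictions into one table like the above and applying the threshold, but whether additional conditions were applied is exactly the open question in this thread.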
