Folia file parsing error #192
I can't seem to access that FoLiA XML file; https://gitlab.com/parseme/annotations/ gives a 404. Perhaps it is a private repository that I (https://gitlab.com/proycon) can't access? I suspect that the word is referenced before it appears. I have seen something like that in the past, but it would indeed be a bug in FLAT or the folia library: the entity layer not being inserted at the proper place.
Possibly related issue:
I added you (as proycon) to the parseme project on GitLab. All repos in this project will be available to you. The XenophonAnabasis4REANNOTATION.folia.xml file should be accessible now.
I looked into the file and I think that one of the problems is that an annotation (containing 2-3 tokens) spans tokens of two different sentences.
Yes, the first entity layer in that file, for sentence The bug is of course somewhere in FLAT's underlying libraries, as it should never have written the entity in that layer for that sentence. I don't suppose you remember the exact steps that replicate such a mis-annotation? I'll first expand an existing fix (
…erences is also set Ref: proycon/flat#192
I published a new version of the FLAT container image on Docker Hub that contains this fix/workaround. The actual root cause remains to be found and fixed though.
I ran some tests myself, though, and noticed that it is possible to select two tokens in two different sentences and group them in one annotation. When we then try to delete such a cross-sentence annotation, the deletion is no longer possible. I also noticed that in the file I sent you, and in other files with the same parsing error, this precise situation occurred: an annotation covered two tokens from two different sentences. Could that be the issue?
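For anyone who wants to check their own files for this situation, here is a rough sketch that lists entities whose word references point into more than one sentence. It uses only the Python standard library, not FLAT or the folia library. It assumes the usual FoLiA layout (namespace `http://ilk.uvt.nl/folia`, `<s>` and `<w>` elements with `xml:id`, `<entity>` elements containing `<wref id="...">`). The function name `cross_sentence_entities` and the `SAMPLE` document are made up for illustration.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"                      # FoLiA XML namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"       # the xml:id attribute

# Tiny made-up document: entity e.1 spans words of sentences s.1 and s.2.
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="doc">
  <text xml:id="doc.text">
    <s xml:id="s.1">
      <w xml:id="s.1.w.1"><t>a</t></w>
      <w xml:id="s.1.w.2"><t>b</t></w>
      <entities>
        <entity xml:id="e.1" class="x">
          <wref id="s.1.w.2"/>
          <wref id="s.2.w.1"/>
        </entity>
        <entity xml:id="e.2" class="x">
          <wref id="s.1.w.1"/>
          <wref id="s.1.w.2"/>
        </entity>
      </entities>
    </s>
    <s xml:id="s.2">
      <w xml:id="s.2.w.1"><t>c</t></w>
    </s>
  </text>
</FoLiA>"""


def cross_sentence_entities(root):
    """Return (entity id, sorted sentence ids) for entities spanning several sentences."""
    # Map every word id to the id of the sentence that contains it.
    word_to_sentence = {}
    for sentence in root.iter(f"{{{FOLIA_NS}}}s"):
        for word in sentence.iter(f"{{{FOLIA_NS}}}w"):
            word_to_sentence[word.get(XML_ID)] = sentence.get(XML_ID)

    suspicious = []
    for entity in root.iter(f"{{{FOLIA_NS}}}entity"):
        sentences = {
            word_to_sentence.get(ref.get("id"))
            for ref in entity.iter(f"{{{FOLIA_NS}}}wref")
        }
        sentences.discard(None)  # ignore references to unknown words here
        if len(sentences) > 1:
            suspicious.append((entity.get(XML_ID), sorted(sentences)))
    return suspicious


if __name__ == "__main__":
    for entity_id, sentence_ids in cross_sentence_entities(ET.fromstring(SAMPLE)):
        print(entity_id, "spans sentences", sentence_ids)
```

To run it on a real file, replace `ET.fromstring(SAMPLE)` with `ET.parse(path).getroot()`. This only reports the cross-sentence annotations; it does not repair them.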
Several of our FLAT users have an issue with their working files. They work on annotations, and sometimes when they come back to the same file, FLAT cannot open it again and displays an error message:

I downloaded the file for which the message was displayed. It is the XenophonAnabasis4REANNOTATION.folia.xml file. I checked that the reference in line 209 (XenophonAnabasis4REANNOTATION.text.1.S.326.W.8) does exist in the file, so I do not understand the problem, especially because the file was not manipulated outside FLAT.
I also tried to download this file and convert it to another format (.cup for PARSEME) using the FoLiA libraries, and the same error message occurred for this file.
Several other files are also affected by the same problem, for instance: XenophonAnabasis3GSclean.
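The manual check mentioned above (that the referenced ID exists in the file) can be automated for all affected files. The sketch below also tests the hypothesis raised in this thread that a word may be referenced before it appears in the document. It uses only the Python standard library and assumes the usual FoLiA layout, where `<wref id="...">` points at an element carrying that `xml:id`. The name `classify_wrefs` and the `SAMPLE` document are hypothetical, and this is not how FLAT or the folia library resolves references.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"                      # FoLiA XML namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"       # the xml:id attribute

# Made-up document: s.2.w.1 is referenced before it is defined,
# and s.9.w.9 is not defined anywhere.
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="doc">
  <text xml:id="doc.text">
    <s xml:id="s.1">
      <w xml:id="s.1.w.1"><t>a</t></w>
      <entities>
        <entity xml:id="e.1" class="x">
          <wref id="s.1.w.1"/>
          <wref id="s.2.w.1"/>
          <wref id="s.9.w.9"/>
        </entity>
      </entities>
    </s>
    <s xml:id="s.2">
      <w xml:id="s.2.w.1"><t>b</t></w>
    </s>
  </text>
</FoLiA>"""


def classify_wrefs(root):
    """Return (forward, missing) lists of wref targets.

    forward: the target exists, but only later in document order.
    missing: the target does not exist anywhere in the document.
    """
    all_ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    seen = set()
    forward, missing = [], []
    for el in root.iter():  # root.iter() walks elements in document order
        el_id = el.get(XML_ID)
        if el_id:
            seen.add(el_id)
        if el.tag == f"{{{FOLIA_NS}}}wref":
            target = el.get("id")
            if target not in all_ids:
                missing.append(target)
            elif target not in seen:
                forward.append(target)
    return forward, missing


if __name__ == "__main__":
    forward, missing = classify_wrefs(ET.fromstring(SAMPLE))
    print("referenced before definition:", forward)
    print("never defined:", missing)
```

If a file that fails to open shows entries under "referenced before definition" but none under "never defined", that would be consistent with the report: the ID exists in the file, yet a parser that resolves references as it reads could still fail on it.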