master: how to annotate no sense #154

Open
arademaker opened this issue Aug 16, 2019 · 3 comments

@arademaker
Member

While annotating "rival", I found that its adjective sense is missing.

image

But the system does not let me mark the token as "no sense, but surely an adjective". The menu with the 0 option is not displayed when there is no sense for the target PoS.
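
For illustration only, here is a minimal sketch of the requested behavior: the 0 ("no sense") entry is always offered, even when the list of senses for the target PoS is empty. The function name, the menu representation and the label text are all assumptions made for this sketch, not the tool's actual code.

```python
from typing import List, Tuple

# Hypothetical key for the "no sense" entry, mirroring the 0 option in the issue.
NO_SENSE_KEY = "0"


def build_sense_menu(senses: List[str]) -> List[Tuple[str, str]]:
    """Return (key, label) pairs; the no-sense entry is always present."""
    menu = [(NO_SENSE_KEY, "no sense in the target PoS")]
    # Number the real senses from 1 so that 0 stays reserved for "no sense".
    menu += [(str(i), gloss) for i, gloss in enumerate(senses, start=1)]
    return menu


# With no candidate senses the menu still contains the 0 option.
empty_menu = build_sense_menu([])
```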

@arademaker arademaker changed the title from "how to annotated no sense" to "master: how to annotated no sense" Aug 16, 2019
@odanoburu
Contributor

> no sense but surely an adjective.

how would this be represented in the data?

although it definitely is a problem that you can't pick no sense like this.

@arademaker
Member Author

arademaker commented Aug 16, 2019

It seems to me that we have two options:

  1. Make "no sense" a command outside the menu of senses. You are right that we currently have no way to encode a confirmed PoS tag without choosing a sense: in the data, the pos field contains only the automatically assigned PoS tag, and the selected senses are the confirmation (or not) of that tag.

  2. Change the data format to something more general, such as having the user confirm the PoS tag directly or indirectly (by selecting senses). In that case, we have the additional problem of which tagset to use. We can be conservative and add another field that allows values such as (a, n, v, r). Alternatively, we could change the tagset to http://moin.delph-in.net/ErgLeTypes, https://talp-upc.gitbook.io/freeling-4-0-user-manual/tagsets/tagset-en or https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html.
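
To make the conservative variant of option 2 concrete, here is a rough sketch of a token record with a separate confirmed-PoS field restricted to (a, n, v, r). Every name in it (`Token`, `auto_pos`, `confirmed_pos`, `is_missing_sense`) is hypothetical, and the "NN" tag in the example is made up for illustration. It is not the current data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Coarse tagset mentioned in the discussion: adjective, noun, verb, adverb.
COARSE_POS = {"a", "n", "v", "r"}


@dataclass
class Token:
    form: str
    # Automatically assigned tag (e.g. a Penn Treebank tag); may be absent.
    auto_pos: Optional[str] = None
    # Hypothetical new field: the PoS confirmed by the annotator.
    confirmed_pos: Optional[str] = None
    # Selected sense keys; an empty list means no sense was chosen.
    senses: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.confirmed_pos is not None and self.confirmed_pos not in COARSE_POS:
            raise ValueError(f"unknown coarse PoS: {self.confirmed_pos!r}")


def is_missing_sense(token: Token) -> bool:
    """True when a PoS was confirmed but no sense fits (the 'rival' case)."""
    return token.confirmed_pos is not None and not token.senses


# Illustrative record: confirmed as an adjective, with no sense selected.
rival = Token(form="rival", auto_pos="NN", confirmed_pos="a")
```

With a field like this, "no sense, but surely an adjective" becomes representable without overloading the sense list. The cost is that the confirmed tag and the selected senses can disagree, so some consistency check would be needed.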

In the glosstag data, the single largest group of tokens (about 39% of those counted below, so not an outright majority) currently has no value (_) for the pos field:

$ awk '$0 ~ /^[0-9]/ {print $6}' ar.data | sort | uniq -c | sort -nr
868496 _
264353 NN
174798 IN
148860 DT
143368 JJ
136720 :
69350 CC
62924 NNS
47344 NNP
42179 VBN
38710 VBG
35383 VB
34321 RB
21968 TO
20823 VBZ
16004 CD
15549 WDT
14024 )
14024 (
7859 PRP
6215 WP
5270 ,
4468 PRP$
4391 VBP
2482 VBD
2211 MD
2118 WRB
1966 JJR
1089 RP
1071 RBR
1069 JJS
 863 WP$
 363 RBS
 175 PDT
  60 SYM
   5 FW
   4 .
   2 UH
   2 ...
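
The same tally can be sketched in Python, which also makes it easy to compute the share of untagged tokens. This assumes, as the awk command implies, that token lines start with a digit and that pos is the sixth whitespace-separated column. The function names and the sample lines are made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

TOKEN_LINE = re.compile(r"[0-9]")  # same test as awk's /^[0-9]/ when used with match()


def pos_counts(lines: Iterable[str]) -> Counter:
    """Count the values of the sixth column on lines that start with a digit."""
    counts: Counter = Counter()
    for line in lines:
        if TOKEN_LINE.match(line):
            fields = line.split()
            # awk prints an empty string when a line has fewer than six fields.
            counts[fields[5] if len(fields) > 5 else ""] += 1
    return counts


def untagged_share(counts: Counter) -> float:
    """Fraction of counted tokens whose pos value is the placeholder '_'."""
    total = sum(counts.values())
    return counts["_"] / total if total else 0.0


# Made-up sample lines; with the real file one would pass open("ar.data") instead.
sample = ["1 a b c d NN x", "2 a b c d _ x", "# not a token line", "3 a b c d _"]
counts = pos_counts(sample)
share = untagged_share(counts)
```

`counts.most_common()` lists the tags by descending count, like `sort -nr` above.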

@odanoburu
Contributor

Option 1 might be solved by the custom sense menu implementation in #156.

@odanoburu odanoburu changed the title from "master: how to annotated no sense" to "master: how to annotate no sense" Aug 16, 2019

2 participants