-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bcpc_query: Parse two (sub)activities correctly #349
Conversation
Can you give an example of what the new output looks like? (i.e. instead of the results shown in #348) |
Sure: webchem::bcpc_query("2,4-D")[[1]][c("activity", "subactivity")]
#> $activity
#> [1] "herbicides" "plant growth regulators"
#>
#> $subactivity
#> [1] "phenoxyacetic herbicides" "auxins" Created on 2022-01-31 by the reprex package (v2.0.1) |
Is there always one sub-activity for every activity? If so, then this seems potentially reasonable. If some activities have multiple sub-activities, then it will become difficult to determine which sub-activity is nested under which activity with this format. Don't get me wrong, this is an improvement over the previous behavior (which was obviously a bug)! I'm just wondering if maybe the output needs a re-work as a data frame or further nested list. I'll let @stitam or others decide whether to accept this PR as-is or try to figure out an even better fix before merging. |
This commit makes bcpc_query support more than two activities and empty subactivities
This gave errors on my local system when testing with `devtools::test()`
Related? #295 |
Thanks @jranke for opening this PR and apologies for my delay in responding to it. I agree with Eric that this is an improvement over the previous version, but I think we should think a bit more about some special cases. For example: webchem::bcpc_query("thiodicarb")[[1]][c("activity", "subactivity")]
#> $activity
#> [1] "insecticidesmolluscicides"
#>
#> $subactivity
#> [1] "oxime carbamate insecticidesmolluscicides" Created on 2022-02-15 by the reprex package (v2.0.1) This one is different from the website output which is 1. insecticides (oxime carbamate insecticides), 2. molluscicides Here is another one: webchem::bcpc_query("methiocarb")[[1]][c("activity", "subactivity")]
#> $activity
#> [1] "acaricides (carbamate acaricides)bird repellentsinsecticidesmolluscicides"
#>
#> $subactivity
#> [1] "phenyl methylcarbamate insecticidesmolluscicides" Created on 2022-02-15 by the reprex package (v2.0.1) Regarding @Aariq's concern, I think returning a data frame with columns "activity" and "subactivity" would be great, it would remove any ambiguity around interpreting the output. I personally don't like nested lists but that's also a solution. Yet another alternative, activity and subactivity could contain vectors of equal length. This is not explicit but maybe enough for now to merge the PR and then we can think elsewhere about output structure for the function in general. What do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jranke for opening this PR! I've added my comments within the conversation.
With the changes in this PR I get webchem::bcpc_query("thiodicarb")[[1]][c("activity", "subactivity")]
#> $activity
#> [1] "insecticides" "molluscicides"
#>
#> $subactivity
#> [1] "oxime carbamate insecticides" NA Created on 2022-02-15 by the reprex package (v2.0.1)
As you can see my PR returns NA for the empty subactivity.
Here I get webchem::bcpc_query("methiocarb")[[1]][c("activity", "subactivity")]
#> $activity
#> [1] "acaricides" "bird repellents" "insecticides" "molluscicides"
#>
#> $subactivity
#> [1] "carbamate acaricides" NA
#> [3] "phenyl methylcarbamate insecticides" NA Created on 2022-02-15 by the reprex package (v2.0.1)
This is what this PR does.
Yes, I would propose to take it as it is for the moment. |
Ok, let's merge for now, then take a look for edge cases where activities might have more than one sub-activity as a separate issue (which may be fixed by changing output to a tibble) |
OK, thanks for reviewing! |
For some reason the function didn't update properly on my end when I pulled this branch for reviewing. After merging I tested it again and I could reproduce your output. Thanks @jranke for the clarification. |
No problem - maybe you pulled the first version before I updated the PR. |
Closes #348
PR task list:
devtools::document()