First off, thank you for this package, it's really useful.
I've run into an interesting scenario where the argument include_text = TRUE fails for a Word document.
Here are two near-identical Word documents:
works.docx
does not work.docx
Both just have the text: "Manuscript text" with the comment "comment text"
However, with include_text = TRUE, word_src comes back empty for "does not work.docx", apparently due to the introduction of a tab character.
"does not work.docx" |>
  docxtractr::read_docx() |>
  docxtractr::docx_extract_all_cmnts(include_text = TRUE)
#> # A tibble: 1 x 6
#>   id    author          date                 initials comment_text word_src
#> * <chr> <chr>           <chr>                <chr>    <chr>        <chr>
#> 1 0     James Conigrave 2022-01-18T02:08:00Z ""       Comment text ""
"works.docx" |>
  docxtractr::read_docx() |>
  docxtractr::docx_extract_all_cmnts(include_text = TRUE)
#> # A tibble: 1 x 6
#>   id    author          date                 initials comment_text word_src
#> * <chr> <chr>           <chr>                <chr>    <chr>        <chr>
#> 1 0     James Conigrave 2022-01-18T02:08:00Z ""       Comment text Manuscript t~
It appears that in the file "does not work.docx" there are small changes to the XML which break the functionality. I'm not quite sure how they were caused, but would love a fix if you have time!
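To help narrow down which XML changes matter, here is a rough sketch for comparing the two files. It assumes the xml2 package is installed and that the difference is visible in word/document.xml. The helpers read_document_xml() and extra_elements() are hypothetical names made up for this example and are not part of docxtractr. The sketch only lists element names that occur in one document but not the other (for instance a tab element); it does not show why include_text = TRUE stops working.

```r
library(xml2)

# Hypothetical helper: a .docx file is a zip archive, so extract
# word/document.xml into a temporary directory and parse it.
read_document_xml <- function(path) {
  tmp <- tempfile()
  dir.create(tmp)
  utils::unzip(path, files = "word/document.xml", exdir = tmp)
  xml2::read_xml(file.path(tmp, "word", "document.xml"))
}

# Hypothetical helper: element names (without namespace prefix) that
# appear somewhere in doc_a but nowhere in doc_b.
extra_elements <- function(doc_a, doc_b) {
  names_a <- unique(xml2::xml_name(xml2::xml_find_all(doc_a, "//*")))
  names_b <- unique(xml2::xml_name(xml2::xml_find_all(doc_b, "//*")))
  setdiff(names_a, names_b)
}

# Only runs when both attachments are in the working directory.
if (file.exists("works.docx") && file.exists("does not work.docx")) {
  broken  <- read_document_xml("does not work.docx")
  working <- read_document_xml("works.docx")
  print(extra_elements(broken, working))
}
```

Anything this prints is only a candidate; the two files may also differ in attributes or element order, which this comparison does not look at.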