Refactor pdfminer
pprados committed Jan 31, 2025
1 parent ceda8bc commit 278c6d2
Showing 9 changed files with 2,706 additions and 620 deletions.
2,032 changes: 1,977 additions & 55 deletions docs/docs/integrations/document_loaders/pdfminer.ipynb

Large diffs are not rendered by default.

660 changes: 262 additions & 398 deletions docs/docs/integrations/document_loaders/pymupdf.ipynb

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion libs/community/extended_testing_deps.txt
@@ -59,7 +59,7 @@ openapi-pydantic>=0.3.2,<0.4
oracle-ads>=2.9.1,<3
oracledb>=2.2.0,<3
pandas>=2.0.1,<3
-pdfminer-six>=20221105,<20240706
+pdfminer-six==20231228
pdfplumber>=0.11
pgvector>=0.1.6,<0.2
playwright>=1.48.0,<2
@@ -104,3 +104,4 @@ mlflow[genai]>=2.14.0
databricks-sdk>=0.30.0
websocket>=0.2.1,<1
writer-sdk>=1.2.0
+unstructured[pdf]>=0.15
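
The pdfminer-six change in the first hunk replaces a version range with an exact pin. As a rough illustration of what that narrows, the sketch below compares the two constraints by hand. It assumes pdfminer-six's date-style version strings (YYYYMMDD) can be compared as integers; the helper names are hypothetical and are not part of this commit or of any packaging tool.

```python
# Illustrative sketch only: these helpers are hypothetical and not from the commit.
# Assumption: pdfminer-six versions are date-based (YYYYMMDD), so comparing
# them as integers is sufficient here.

def satisfies_old_range(version: str) -> bool:
    """Old constraint: pdfminer-six>=20221105,<20240706."""
    return 20221105 <= int(version) < 20240706


def satisfies_new_pin(version: str) -> bool:
    """New constraint: pdfminer-six==20231228."""
    return int(version) == 20231228


# Candidate versions taken from the bounds that appear in the diff.
for candidate in ["20221105", "20231228", "20240706"]:
    print(candidate, satisfies_old_range(candidate), satisfies_new_pin(candidate))
```

Only the pinned version passes both checks, so an environment built from the updated file should resolve to exactly one pdfminer-six release, where the old range allowed any release inside it.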