v1.4.1

@adbar released this 19 Jan 17:02

Extraction:

  • extraction bugs fixed (#263, #266), more robust HTML doctype parsing
  • XML output improvements by @knit-bee (#273, #274)
  • adjusted thresholds for link density in paragraphs

Metadata:

  • improved title and sitename detection (#284)
  • faster author, categories, domain name, and tags extraction
  • fixes to author emoji regexes by @felipehertzer (#269)

Command-line interface:

  • reviewed argument consistency and added deprecation warnings (#261)

Setup:

  • download timeout is now configurable (#263)
  • updated dependencies, switched to faust-cchardet for Python 3.11
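
As a rough illustration of the configurable download timeout, the sketch below reads a timeout value from an INI-style settings fragment using only the standard library. The `DOWNLOAD_TIMEOUT` key, the `[DEFAULT]` section, the 30-second fallback, and the `load_timeout` helper are assumptions made for this example, not confirmed by these notes; check the settings file shipped with the package for the actual names.

```python
import configparser

# Hypothetical settings fragment: the section and key names are assumptions
# for illustration, not taken from the release notes.
SETTINGS = """
[DEFAULT]
DOWNLOAD_TIMEOUT = 10
"""


def load_timeout(text, fallback=30):
    """Return the download timeout in seconds, or `fallback` if it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getint("DEFAULT", "DOWNLOAD_TIMEOUT", fallback=fallback)


timeout = load_timeout(SETTINGS)
print(timeout)
```

A value loaded this way could then be handed to whatever download call is in use; keeping it in a settings file means the timeout can change without touching code.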

Full Changelog: v1.4.0...v1.4.1