trafilatura-1.3.0
- fast and robust `html2txt()` function added (#221)
- more robust parsing (#228)
- fixed bugs in metadata extraction, with @felipehertzer in #213 & #226
- extraction about 10-20% faster, slightly better recall
- partial fixes for memory leaks (#216)
- docs extended and updated (#217, #225)
- prepared deprecation of old `process_record()` function
- more stable processing with updated dependencies
Full Changelog: v1.2.2...v1.3.0