Scrapinghub

Turn web content into useful data

Pinned

  1. splash Public

    Lightweight, scriptable browser as a service with an HTTP API

    Python, 4.1k stars, 512 forks

  2. dateparser Public

    Python parser for human-readable dates

    Python, 2.6k stars, 468 forks

  3. python-scrapinghub Public

    A client interface for Scrapinghub's API

    Python, 203 stars, 63 forks

  4. extruct Public

    Extract embedded metadata from HTML markup

    Python, 873 stars, 113 forks

  5. spidermon Public

    Scrapy extension for monitoring spider execution.

    Python, 535 stars, 100 forks

  6. python-crfsuite Public

    A Python binding for crfsuite

    Python, 770 stars, 222 forks

Repositories

Showing 10 of 183 repositories
  • hcf-backend Public

    Crawl Frontier HCF backend

    Python, 7 stars, BSD-3-Clause, 5 forks, 2 issues, 2 pull requests, updated Jan 30, 2025
  • dateparser Public

    Python parser for human-readable dates

    Python, 2,590 stars, BSD-3-Clause, 468 forks, 289 issues (6 issues need help), 51 pull requests, updated Jan 30, 2025
  • web-poet Public

    Web scraping Page Objects core library

    Python, 96 stars, BSD-3-Clause, 15 forks, 16 issues (1 issue needs help), 13 pull requests, updated Jan 30, 2025
  • andi Public

    Library for annotation-based dependency injection

    Python, 22 stars, BSD-3-Clause, 5 forks, 3 issues, 1 pull request, updated Jan 30, 2025
  • scrapy-poet Public

    Page Object pattern for Scrapy

    Python, 119 stars, BSD-3-Clause, 28 forks, 10 issues (1 issue needs help), 4 pull requests, updated Jan 29, 2025
  • shub-workflow Public
    Python, 13 stars, BSD-3-Clause, 15 forks, 3 issues, 2 pull requests, updated Jan 24, 2025
  • scrapinghub-stack-scrapy Public

    Software stack with latest Scrapy and updated deps

    Dockerfile, 63 stars, BSD-3-Clause, 20 forks, 2 issues, 1 pull request, updated Jan 6, 2025
  • scrapinghub-entrypoint-scrapy Public

    Scrapy entrypoint for Scrapinghub job runner

    Python, 25 stars, BSD-3-Clause, 16 forks, 8 issues, 1 pull request, updated Jan 6, 2025
  • python-scrapinghub Public

    A client interface for Scrapinghub's API

    Python, 203 stars, BSD-3-Clause, 63 forks, 23 issues, 2 pull requests, updated Dec 16, 2024
  • spidermon Public

    Scrapy extension for monitoring spider execution.

    Python, 535 stars, BSD-3-Clause, 100 forks, 42 issues (2 issues need help), 7 pull requests, updated Dec 10, 2024