Commit

update contributing and history files
adbar committed Dec 3, 2024
1 parent 1410645 commit c7214b6
Showing 2 changed files with 16 additions and 13 deletions.
26 changes: 14 additions & 12 deletions CONTRIBUTING.md
@@ -1,39 +1,41 @@
## How to contribute

Thank you for considering contributing to Trafilatura! Your contributions make the software and its documentation better.
Your contributions make the software and its documentation better. A special thanks to all the [contributors](https://github.com/adbar/trafilatura/graphs/contributors) who have played a part in Trafilatura.


There are many ways to contribute, you could:

* Improve the documentation: Write tutorials and guides, correct mistakes, or translate existing content.
* Find bugs and submit bug reports: Help make Trafilatura a robust and versatile tool.
* Find bugs and submit bug reports: Help make Trafilatura an even more robust tool.
* Submit feature requests: Share your feedback and suggestions.
* Write code: Fix bugs or add new features.


Here are some important resources:

* [List of currently open issues](https://github.com/adbar/trafilatura/issues) (not exhaustive)
* [Roadmap and milestones](https://github.com/adbar/trafilatura/milestones)
* [How to Contribute to Open Source](https://opensource.guide/how-to-contribute/)
* [How to contribute to open source](https://opensource.guide/how-to-contribute/)


## Submitting changes
## Testing and evaluating the code

Please send a [GitHub Pull Request to trafilatura](https://github.com/adbar/trafilatura/pull/new/master) with a clear list of what you have done (read more about [pull requests](http://help.github.com/pull-requests/)).
Here is how you can run the tests and code quality checks:

**Working on your first Pull Request?** See this tutorial: [How To Create a Pull Request on GitHub](https://www.digitalocean.com/community/tutorials/how-to-create-a-pull-request-on-github)
- Install the necessary packages with `pip install trafilatura[dev]`
- Run `pytest` from trafilatura's directory, or select a particular test suite, for example `realworld_tests.py`, and run `pytest realworld_tests.py` or simply `python3 realworld_tests.py`
- Run `mypy` on the directory: `mypy trafilatura/`
- See also the [tests Readme](tests/README.rst) for information on the evaluation benchmark

Pull requests will only be accepted if there are no errors in pytest and mypy.
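
Since both checks have to pass before a pull request is accepted, it can be convenient to chain them in a small script. The following is a minimal sketch, not part of the repository: the helper name `run_checks` and the `CHECKS` list are hypothetical, and invoking the tools as `python -m pytest` and `python -m mypy` is an assumption made here so that the same interpreter is used for both.

```python
import subprocess
import sys

# Commands mirror the checklist above. Running them via "-m" with the
# current interpreter is an assumption of this sketch, not a project rule.
CHECKS = [
    [sys.executable, "-m", "pytest"],
    [sys.executable, "-m", "mypy", "trafilatura/"],
]


def run_checks(commands):
    """Run each command in order; return True only if all exit with code 0."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Stop at the first failing check and report which one it was.
            print("failed:", " ".join(command))
            return False
    return True
```

Calling `run_checks(CHECKS)` from the repository root, after installing the `dev` extras, would run the test suite first and the type check second, stopping at the first failure.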

A special thanks to all the [contributors](https://github.com/adbar/trafilatura/graphs/contributors) who have played a part in Trafilatura.
If you work on text extraction it is useful to check if performance is equal or better on the benchmark.


## Testing and evaluating the code
## Submitting changes

Here is how you can run the tests if you wish to correct the errors and further improve the code:
Please send a pull request to Trafilatura with a list of what you have done (read more about [pull requests](http://help.github.com/pull-requests/)).

- Run `pytest` from trafilatura's directory, or select a particular test suite, for example `realworld_tests.py`, and run `pytest realworld_tests.py` or simply `python3 realworld_tests.py`
- See also the [tests Readme](tests/README.rst) for information on the evaluation
**Working on your first Pull Request?** See this tutorial: [How To Create a Pull Request on GitHub](https://www.digitalocean.com/community/tutorials/how-to-create-a-pull-request-on-github)



3 changes: 2 additions & 1 deletion HISTORY.md
@@ -21,13 +21,14 @@ Fixes:
- more robust mapping for conversion to HTML (#721)
- CLI downloads: use all information in settings file (#734)
- downloads: cleaner urllib3 code (#736)
- CLI: print URLs early for feeds and sitemaps with `--list` with @gremid (#744)
- refine table markdown output by @unsleepy22 (#752)
- extraction fix: images in text nodes by @unsleepy22 (#757)

Metadata:
- more robust URL extraction (#710)

Command-line interface:
- CLI: print URLs early for feeds and sitemaps with `--list` with @gremid (#744)
- CLI: add 126 exit code for high error ratio (#747)

Maintenance: