---
title: Documentation
format:
html: default
revealjs:
output-file: 06_documentation_slides.html
slide-number: true
footer: Python package development
logo: academy_logo.png
---
## Why document your code?
![](https://imgs.xkcd.com/comics/manuals.png)
::: {.incremental}
* Make it easier for others to use your code
* Make it easier for **you** to use your code
:::
## Readme.md
* A readme file is a text file that introduces and explains a project.
* Always include a readme file in your project.
* You can put readme files in any directory, and you can have more than one in a single project.
## Requirements
* Mention the requirements for your package
    - Operating system
    - Python version
    - Other non-Python dependencies, e.g. VC++ redistributables
* Include information on how to install your package
    - `pip install my_package`
    - `pip install https://github.com/DHI/{repo}/archive/main.zip`
## Notebooks
* Jupyter notebooks are a great way to document your code
* Good for prototyping
* At a later stage, notebooks can be used to demonstrate how to use your code
* Not a replacement for documentation for a professional package
## Docstrings
```{.python code-line-numbers="|1|4-18"}
"""K-means clustering."""

class KMeans(_BaseKMeans):
    """K-Means clustering.

    Parameters
    ----------
    n_clusters : int, default=8
        The number of clusters to form as well as the number of
        centroids to generate.

    Examples
    --------
    >>> X = np.array([[1, 2], [1, 4], [1, 0],
    ...               [10, 2], [10, 4], [10, 0]])
    >>> kmeans = KMeans(n_clusters=2, random_state=0, n_init="auto").fit(X)
    >>> kmeans.labels_
    array([1, 1, 1, 0, 0, 0], dtype=int32)
    """
```
[sklearn.KMeans](https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/cluster/_kmeans.py)
---
```python
>>> from sklearn.cluster import KMeans
>>> help(KMeans)
class KMeans(_BaseKMeans)
 |  KMeans(n_clusters=8, *, init='k-means++', n_init='warn')
 |
 |  K-Means clustering.
 |
 |  Parameters
 |  ----------
 |  n_clusters : int, default=8
```
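
The text that `help()` prints is read from the object itself. A minimal sketch of how to reach it at runtime, using a hypothetical `scale` function that is not part of any library:

```python
import inspect


def scale(values, factor=2.0):
    """Multiply each value by a factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float, default=2.0
        Multiplier applied to each value.
    """
    return [v * factor for v in values]


# The raw docstring is stored on the function object.
raw = scale.__doc__

# inspect.getdoc() returns the same text with the indentation cleaned up,
# similar to what help() displays.
clean = inspect.getdoc(scale)
summary = clean.splitlines()[0]
print(summary)  # Multiply each value by a factor.
```

Editors and documentation generators read the same attribute, which is why a docstring only has to be written once.
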
. . .
![](images/vs_code_docstring.png)
---
![](images/api_docs.png)
::: {.notes}
Write once, read anywhere!
:::
## Docstring - Numpy format {.smaller}
```python
def function_name(param1, param2, param3):
    """Short summary.

    Long description.

    Parameters
    ----------
    param1 : int
        Description of `param1`.
    param2 : str
        Description of `param2`.
    param3 : list of str
        Description of `param3`.

    Returns
    -------
    bool
        Description of return value.
    """
    pass
```
```
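
As an illustration, the template could be filled in as follows. The function `count_above` and its parameters are invented for this sketch:

```python
def count_above(values, threshold, strict=True):
    """Count the values that exceed a threshold.

    Parameters
    ----------
    values : list of float
        Numbers to check.
    threshold : float
        Limit to compare each value against.
    strict : bool, default=True
        If True, count values greater than `threshold`;
        otherwise also count values equal to it.

    Returns
    -------
    int
        Number of values above the threshold.

    Examples
    --------
    >>> count_above([1.0, 2.5, 4.0], threshold=2.5)
    1
    >>> count_above([1.0, 2.5, 4.0], threshold=2.5, strict=False)
    2
    """
    if strict:
        return sum(1 for v in values if v > threshold)
    return sum(1 for v in values if v >= threshold)
```
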
::: {.notes}
There are several docstring formats. A very common one in the scientific Python ecosystem is the NumPy format, used by scikit-learn, pandas, NumPy, SciPy, etc.
:::
## Type hints {.smaller}
Since Python 3.5 (PEP 484), type hints can be used in addition to the types in the docstring.
```{.python code-line-numbers="|1"}
def remove_outlier(data: pd.DataFrame, column: str, threshold: float = 3) -> pd.DataFrame:
    """Remove outliers from a dataframe.

    Parameters
    ----------
    data : pd.DataFrame
        Dataframe to remove outliers from.
    column : str
        Column to remove outliers from.
    threshold : float, optional
        Number of standard deviations to use as threshold, by default 3
    """
```
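
Type hints are stored on the function and can be inspected, but Python does not enforce them at runtime. A small standard-library sketch, using a hypothetical `clip` function (the built-in generic `list[float]` assumes Python 3.9 or later):

```python
from typing import get_type_hints


def clip(values: list[float], lower: float = 0.0, upper: float = 1.0) -> list[float]:
    """Limit each value to the range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


# The annotations are available as a dictionary keyed by parameter name.
hints = get_type_hints(clip)
print(hints["lower"])   # <class 'float'>
print(hints["return"])  # list[float]

# Nothing checks the hints when the function is called; a static type
# checker such as mypy is needed to catch mismatched argument types.
print(clip([-1.0, 0.5, 2.0]))  # [0.0, 0.5, 1.0]
```
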
## doctest
Using code without documentation is hard, but using code with wrong documentation is even harder.
How can you make sure that the documentation is correct?
. . .
One answer is the `doctest` module, which is part of the Python standard library.
. . .
::: {.callout-tip}
The extensive standard library is why Python is described as a language with *"batteries included!"*
:::
---
Input/output examples in docstrings are run as tests.
```python
def add(a, b):
    """Add two numbers.

    >>> add(1, 2)
    3
    >>> add(1, 3)
    5
    """
    return a + b
```
. . .
```bash
$ python -m doctest -v add.py
Failed example:
    add(1, 3)
Expected:
    5
Got:
    4
**********************************************************************
1 items had failures:
   1 of   2 in add.add
***Test Failed*** 1 failures.
```
::: {.notes}
Doctest can pick up anything that looks like a Python session and run it as a test.
:::
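
The examples can also be run from Python instead of the command line. A minimal sketch, assuming the expected value in the docstring has been corrected to `4`, that collects the examples with `doctest.DocTestFinder` and runs them with `doctest.DocTestRunner`:

```python
import doctest


def add(a, b):
    """Add two numbers.

    >>> add(1, 2)
    3
    >>> add(1, 3)
    4
    """
    return a + b


# Collect the examples found in the docstring of add and run them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(add, name="add", globs={"add": add}):
    runner.run(test)

# summarize() returns the number of failed and attempted examples.
results = runner.summarize(verbose=False)
print(results.failed, results.attempted)  # 0 2
```

In a real package it is usually simpler to call `doctest.testmod()` or to run `python -m doctest` on the module, as shown above.
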
## Documentation generators
* Sphinx
* mkdocs
::: {.notes}
Sphinx has been around for a long time and has lots of functionality, but it is based on reStructuredText. mkdocs is the new kid on the block; it is based on Markdown and also has a lot of functionality.
:::
## {background-iframe="https://www.sphinx-doc.org/en/master/"}
## {background-iframe="https://www.mkdocs.org/"}
## mkdocs
* Text is written in markdown
* Easy to use
* API documentation can be generated with mkdocstrings
* The end result is a static website that can be hosted on e.g. GitHub pages
## Configuration
```{.yaml filename="mkdocs.yml"}
site_name: my_library

theme: "material" # or readthedocs, mkdocs, etc.

plugins:
- mkdocstrings:
    handlers:
      python:
        options:
          show_source: false # set to true to show the source code
          heading_level: 2
          docstring_style: "numpy" # important, since the default is google
```
## API docs {.smaller}
::: {.incremental}
1. Install mkdocstrings
    `$ pip install mkdocstrings[python]`
2. Install a theme, e.g. material
    `$ pip install mkdocs-material`
3. Add the plugin to `mkdocs.yml` (see above)
4. Create `index.md` in the docs folder
5. Run `mkdocs serve` to view locally
:::
. . .
`docs/index.md`
```markdown
# Reference
::: my_library.simulation
```
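
For the `::: my_library.simulation` directive to render anything useful, that module needs docstrings. A hypothetical sketch of what `my_library/simulation.py` could contain; the function `run_simulation` and its parameters are invented for illustration:

```python
"""Tools for running simple simulations."""


def run_simulation(initial_value: float, growth_rate: float, n_steps: int) -> list[float]:
    """Simulate exponential growth over a number of steps.

    Parameters
    ----------
    initial_value : float
        Value at step 0.
    growth_rate : float
        Relative growth per step, e.g. 0.1 for 10 %.
    n_steps : int
        Number of steps to simulate.

    Returns
    -------
    list of float
        Values at steps 0 to `n_steps`, inclusive.

    Examples
    --------
    >>> run_simulation(100.0, 0.5, 2)
    [100.0, 150.0, 225.0]
    """
    values = [initial_value]
    for _ in range(n_steps):
        # Each step grows the previous value by the relative rate.
        values.append(values[-1] * (1 + growth_rate))
    return values
```

With `docstring_style: "numpy"` set in `mkdocs.yml`, the module docstring and the sections of the function docstring are what mkdocstrings would turn into the reference page.
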
##
![](images/mkdocs_api.png)
## GitHub pages
::: {.incremental}
* Once you have a static website, you need to share it with the world
* GitHub pages allows you to easily host a static website on GitHub
* The website is available at `https://dhi.github.io/<repository>/`
* The website can be created locally by manually editing HTML pages.
* For use as documentation, it is easier to use a documentation generator like mkdocs.
:::
## GitHub pages
![](images/github_pages.png)
## "Private" website
* A GitHub repository can be made private
* The website published from it is still publicly available
* In order to "hide" it from search engines, add a `robots.txt` file to the root of the website
* This is **not** a secure way to hide a website, but it is a simple way to hide it from search engines.
```{.txt filename="robots.txt"}
User-agent: *
Disallow: /
```
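
To see how a well-behaved crawler would interpret these two rules, they can be fed to `urllib.robotparser` from the standard library. This is only a sketch; the URL is a made-up example, and crawlers that ignore `robots.txt` are not affected at all:

```python
from urllib.robotparser import RobotFileParser

# The same two rules as in robots.txt above.
rules = [
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# Hypothetical page on the site; any path is disallowed for any crawler.
url = "https://example.github.io/my_library/index.html"
allowed = parser.can_fetch("Googlebot", url)
print(allowed)  # False
```
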
## Additional resources
* https://realpython.com/python-project-documentation-with-mkdocs/
## Summary
::: {.incremental}
* Documentation is important
* Use a README file
* Use docstrings
* Use type hints
* Use `mkdocs` to generate API documentation
:::