---
# jupyter: julia-1.10
engine: julia
---
# Operations on columns
```{julia}
using DataFrames, PalmerPenguins
using Tidier
import DataFramesMeta as DFM
penguins = PalmerPenguins.load() |> DataFrame;
@slice_head(penguins, n = 10)
```
## Selecting (or: throwing columns away)
### Selecting `n` columns
**Problem:** Select only some columns.
::: {.panel-tabset}
## Tidier
```{julia}
@select penguins species body_mass_g
```
## DataFramesMeta
```{julia}
DFM.@select penguins :species :body_mass_g
```
## DataFrames
```{julia}
DataFrames.select(penguins, [:species, :body_mass_g])
```
:::
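Since the heading also mentions throwing columns away, here is a minimal sketch of the inverse operation in plain DataFrames: negating a selection with `Not`, which DataFrames re-exports. The `toy` data frame and its values are made up for illustration and are not part of the penguins data.

```{julia}
using DataFrames

# Hypothetical toy data, used only for this illustration
toy = DataFrame(species = ["Adelie", "Gentoo"],
                island = ["Torgersen", "Biscoe"],
                body_mass_g = [3750, 5000])

# Keep every column except `island`; `toy` itself is left unchanged
toy_without_island = DataFrames.select(toy, DataFrames.Not(:island))
```

The calls are qualified with `DataFrames.` only to make clear which package they come from while several data-manipulation packages are loaded.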
### Selecting columns from a variable
**Problem:** Select only some columns whose names are stored in a variable.
```{julia}
my_columns = [:species, :body_mass_g];
```
::: {.panel-tabset}
## Tidier
```{julia}
@eval @select penguins $my_columns...
```
## DataFramesMeta
```{julia}
DFM.@select penguins $my_columns
```
## DataFrames
```{julia}
DataFrames.select(penguins, my_columns)
```
:::
## Mutating (or: creating columns)
### Creating one column based on another one
**Problem:** Create the column `body_mass_kg` by dividing `body_mass_g` by 1000.
::: {.panel-tabset}
## Tidier
```{julia}
@mutate penguins body_mass_kg = body_mass_g / 1000
```
## DataFramesMeta
```{julia}
DFM.@rtransform penguins :body_mass_kg = :body_mass_g / 1000
```
## DataFrames
```{julia}
penguins2 = copy(penguins);
penguins2.body_mass_kg = penguins2.body_mass_g ./ 1000;
penguins2
```
:::
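The DataFrames tab copies the data frame and then adds the column in place. As an alternative, the sketch below uses `transform` with `ByRow`, which returns a new data frame and leaves the input untouched. The `toy_mass` data and its values are hypothetical; the `missing` entry is there only to show that it propagates through the division.

```{julia}
using DataFrames

# Hypothetical toy data with one missing measurement
toy_mass = DataFrame(body_mass_g = [3750, missing, 5000])

# source column => row-wise function => name of the new column
toy_kg = DataFrames.transform(
    toy_mass,
    :body_mass_g => DataFrames.ByRow(x -> x / 1000) => :body_mass_kg,
)
```

Because `transform` does not modify its argument, no explicit `copy` is needed here.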
## Conditionally mutating columns