forked from itsleeds/rrsrr
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path03-rstudio.Rmd
275 lines (203 loc) · 17.6 KB
/
03-rstudio.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
# Using RStudio {#rstudio}
The previous section taught the basics of the R language.
We entered and ran commands directly in the console.
In this section we will learn how to write R scripts in RStudio’s source editor.
We will also take a step back and considers how R code fits into the wider context of scripts, projects, and getting help in RStudio.
RStudio is an integrated development environment (IDE) for R that makes it easy to create and run scripts, explore R objects and functions, plot results and get help.
The first exercise is to open RStudio, take a look around, identify and explore the main components, shown in Figure 3.1.
Click on different buttons in RStudio’s GUI and try changing the Global Settings (in the Tools menu) and see RStudio’s shortcuts by pressing `Alt-Shift-K` (or `Option+Shift+K` on Mac).
```{r rstudioui, echo=FALSE, out.width="100%", fig.cap="The RStudio user interface showing the four main 'panes'."}
# download.file("https://raw.githubusercontent.com/ITSLeeds/TDS/master/courses/2day/images/rstudio-ui.png", "figures/rstudio-ui.png")
knitr::include_graphics("figures/rstudio-ui.png")
```
## Projects and scripts
Projects organise files and settings in RStudio into folders.
Each project has its own folder and Rproj file.
**When using RStudio, always ensure you are working in a named project to organise your work.**
Start a new project with by clicking on `File > New Project` in RStudio's top menu.^[
You can also create new projects from the R command line, for example with the command `rstudioapi::openProject("lrrsrr")`.
]
You create projects either in a new or existing directory (folder).
**Make a new project called 'lrrsrr' (short for 'learning reproducible road safety research with R') or a name of your choice and save it in a sensible place on your computer.**
The name of the project will appear in the top right of RStudio.
**'Scripts'** are files where R code are stored, and these can be edited in the **Source Editor** panel (the top left panel in Figure \@ref(fig:rstudioui)).
**Keeping your code in sensibly named, well organised and reproducible scripts will make your life easier.**
We could continue typing all our code into the console, as we did in Section \@ref(basics).
However, that approach is limited when working on anything more complicated than a few simple commands.
Code that you want to keep and share should be saved script files, i.e. plain text files that have the `.R` extension.
Make a new script by typing and running this command in the R console:^[
You can also create new files by clicking on File > New File > R Script with the mouse, or with the keyboard shortcut by pressing `Ctrl+Shift+N`. You can save the script and give it a sensible name like `chapter3.R` with File > Save or `Ctrl+S`.
We recommend using the command `file.edit()` to create files, however, because it is faster and develops your typing skills.
]
```{r edit, eval=FALSE}
file.edit("section3.R")
```
This will open the Source Editor and place your cursor there.
Try jumping between the Source Editor and the Console by pressing `Ctl+1` and `Ctl+2`.
Keeping scripts and other files associated with a project in a single folder per project (in an RStudio project) will help you to locate things you need and develop an efficient workflow.
Next, to check that your project is saved, close RStudio.
## Writing and running code
Re-open RStudio and ensure that you have an empty file open in the Source Editor.
We will type some basic commands into this file.
Type the following lines of code into your new `section3.R` R script and execute the result line-by-line by pressing `Ctrl+Enter` (`Command+Enter` on Mac):
```{r, eval=FALSE}
x = 1:5
y = c(0, 1, 3, 9, 18)
plot(x, y)
```
<!-- todo: add a gif? -->
When the code has been sent to the console, two objects are created, both of which are vectors of 5 elements (**Bonus:** check their length using the `length()` function).
The third line of the code chunk plots them.
Save the script by pressing `Ctrl+S`.
There are several ways to run code within a script and it is worth becoming familiar with each.
Try running the code you saved in the previous section using each of these methods:
1. Place the cursor in different places on each line of code and press `Ctrl+Enter` to run that line of code.
2. Highlight a block of code or part of a line of code and press `Ctrl+Enter` to run the highlighted code.
3. Press `Ctrl+Shift+Enter` to run all the code in a script.
4. Select `Run` on the toolbar to run all the code in a script.
5. Bonus: Use the function `source()` to run all the code in a script e.g. `source("section3.R")`
<!-- (but don't create an infinite loop!) -->
Practice alternating between the console and the source editor by pressing `Ctl+1` and `Ctl+2`.
## Viewing Objects
To practice typing code into scripts, rather than into the console, we will re-create the objects we created in Section \@ref(basics).
Create a new script called `objects.R` and type the following commands, character-for-character, including spaces in the right places. Typing rather than copy-pasting will help develop good coding style and speed:^[
Note: The `set.seed()` function is used to set the starting number in the random number generator used in R. Setting the seed to a fixed value in your code will ensure reproducability if you are using random number generation or sampling.
]
```{r}
vehicle_type = c("car", "bus", "tank")
casualty_type = c("pedestrian", "cyclist", "cat")
casualty_age = seq(from = 20, to = 60, by = 20)
set.seed(1)
dark = sample(x = c(TRUE, FALSE), size = 3, replace = TRUE)
small_matrix = matrix(1:24, nrow = 12)
crashes = data.frame(vehicle_type, casualty_type, casualty_age, dark)
```
Run the code line-by-line by pressing `Ctl+Enter` multiple times, as described in the previous section.
Try viewing the objects in the following ways:
1. Type the name of the object into the console, e.g. `crashes` and `small_matrix`, and run that code. Scroll up to see the numbers that didn't fit on the screen.
2. Use the `head()` function to view just the first 6 rows e.g. `head(small_matrix)`
3. Bonus: use the `n` argument in the previous function call to show only the first 2 rows of `small_matrix`
4. Click on the `crashes` object in the environment tab to View it in a spreadsheet.
5. Run the command `View(vehicle_type)`. What just happened?
We can also get an overview of an object using a range of functions, including:
- `summary()`
- `class()`
- `typeof()`
- `dim()`
- `length()`
View a summary of the `casualty_age` variable by running the following line of code (you should see the same output as shown below):
```{r summary}
summary(casualty_age)
```
**Exercise:** use the functions listed above (`class()` to `length()`) to test the basic R functions and extract key information about the object `vehicle_type`.
What does the output tell us about the object?
```{r summary-answers, echo=FALSE, eval=FALSE}
summary(vehicle_type)
class(vehicle_type)
typeof(vehicle_type)
dim(vehicle_type)
length(vehicle_type)
```
<!-- **Bonus**: Find out the class of the column `vehicle_type` in the data frame `crashes` with the command `class(crashes$vehicle_type)`. -->
<!-- Why has it changed? -->
<!-- Create a new object called `crashes_char` that keeps the class of the character vectors intact by using the function `tibble::tibble()` (see [tibble.tidyverse.org](https://tibble.tidyverse.org/) and Section 4 for details). -->
```{r tibble1, echo=FALSE, eval=FALSE}
tibble::tibble(
vehicle_type,
casualty_type,
casualty_age,
dark
)
```
## Autocompletion
RStudio can help you write code by autocompleting it. RStudio will look for similar objects and functions after typing the first three letters of a name.
```{r autocomp, echo=FALSE}
# knitr::include_graphics("https://raw.githubusercontent.com/ITSLeeds/TDS/master/courses/2day/images/autocomplete.jpg")
# download.file("https://raw.githubusercontent.com/ITSLeeds/TDS/master/courses/2day/images/autocomplete.jpg", "figures/autocomplete.jpg")
knitr::include_graphics("figures/autocomplete.jpg")
```
When there is more than one option, you can select from the list using the mouse or arrow keys.
Within a function, you can get a list of arguments by pressing `Tab`.
```{r help, echo=FALSE}
# knitr::include_graphics("https://raw.githubusercontent.com/ITSLeeds/TDS/master/courses/2day/images/fucntionhelp.jpg")
# download.file("https://raw.githubusercontent.com/ITSLeeds/TDS/master/courses/2day/images/fucntionhelp.jpg", "figures/functionhelp.jpg")
knitr::include_graphics("figures/functionhelp.jpg")
```
Test RStudio's amazing autocompletion capabilities by typing the beginning of object names and functions and pressing `Tab` to see what suggestions pop-up.
Try pressing `Up` and `Down` after pressing `Tab` to select different options.
**Bonus**: try autocompleting file names by typing `""` (the closing quote mark should be added automatically) and pressing `Tab` when your cursor is between the quote marks. What happens when you type `"~/"` and press `Tab` with your cursor just after the tilde (`~`) symbol. What does this mean? (**Hint:** it involves the word 'home' and you can search the web to find a full answer)
<!-- Todo: put this answer somewhere! -->
<!-- It means that you can use RStudio as a command line file browser, ~ means you 'home folder'! -->
## Getting help
Every function in R has a help page.
You can view the help using `?`. For example, `?sum` and `?plot`.
Many packages also contain 'vignettes', which are long form help documents containing examples and guides.
`vignette()` will show a list of all the vignettes available, or you can also view a specific vignette. For example, `vignette(topic = "sf1", package = "sf")`.
Try getting help on the `stats19` package by typing the following and pressing `Tab` when your cursor is just to the left of the closing bracket `)`. Autocompletion works for more than just R objects and files - try making RStudio autocomplete and run the command `vignette("stats19-vehicles")`. For example:
```{r, eval=FALSE}
vignette(stats19)
```
You can further search and explore R's help files using the **Help** panel in the bottom right window in RStudio.
## Commenting Code
It is good practice to use comments in your code to explain what it does. You can comment code using `#`
For example:
```{r}
# Create vector objects (a whole line comment)
x = 1:5 # a seqence of consecutive integers (inline comment)
y = c(0, 1, 3, 9, 18.1)
```
You can comment/uncomment a whole block of text by selecting it and using `Ctrl+Shift+C`.
<!-- not sure about the next statement so commenting out (RL) -->
<!-- and you can reformat a block of code using `Ctrl+Shift + /`. -->
**Pro tip:** You can add a comment section using `Ctrl+Shift+R`.
## The global environment
The 'Environment' tab shows all the objects in your environment. This includes datasets, parameters, and any functions you have created.
By default, new objects appear in the Global Environment, but you can see other environments with the dropdown menu.
For example, each package has its own environment.
Sometimes you wish to remove things from your environment, perhaps because you no longer need them or because things are getting cluttered.
You can remove an object with the `rm()` function e.g. `rm(x)` or `rm(x, y)`. Alternatively, you can clear your whole environment with the 'broom' button on the 'Environment' Tab.
1. Remove the object `x` that was created in a previous section.
2. What happens when you try to print the `x` by entering it into the console?
3. Try running the following commands in order: `save.image(); rm(list = ls()); load(".RData")`. What happened?
4. How big (how many bytes) is the `.RData` file in your project's folder?
5. Tidy up by removing the `.Rdata` file with `file.remove(".Rdata")`.
## Debugging Code
All the code shown so far is reproducible and, unless you introduced typos, is 'bug free', meaning that it runs without errors.
Typos are common though and even experienced R users frequently see error messages as they undertake interactive data analysis.
For that reason, learning to fix typos in R code is an important skill.
RStudio comes to the rescue here with helpful debugging features.
To test them, write some code that fails, as shown in the code chunk and exercises below \@ref(fig:debug), then answer the questions below by interacting with RStudio:
```{r, error=TRUE}
x = 1:5
y = c(0, 1, 3, 9 18.1) # R code with a typo
```
```{r debug, echo=FALSE, out.width="60%", fig.cap="Debugging code with RStudio: notice the wavy red line highlighting a typo."}
# knitr::include_graphics("https://raw.githubusercontent.com/ropensci/stats19/master/inst/rstudio-autocomplete.png")
# download.file("https://raw.githubusercontent.com/ropensci/stats19/master/inst/rstudio-autocomplete.png", "figures/rstudio-autocomplete.png")
knitr::include_graphics("figures/rstudio-autocomplete.png")
```
1. Try running the faulty code. How can the error message help debug the code?
2. What is the problem with the code shown in the figure?
3. Create other types of error in the code you have run (e.g. no symetrical brackets and other typos).
4. Does RStudio pick up on the errors? And what happens when you try to run buggy code?
**Always address debugging prompts to ensure your code is reproducible**
## Productivity boosting features
<!-- Todo: should these bullet points be subsubsections? -->
Finally, we look at functionality in RStudio that goes beyond the features described above.
RStudio is an advanced and powerful IDE and is highly customisable in myriad ways, especially since the launch of the RStudio Addins add-on system in [2016](https://cran.r-project.org/web/packages/addinslist/readme/README.html).
Rather than try to be comprehensive (an impossible task), this section provides a list of additional RStudio features, starting simple, that have been tried and tested, with links to the relevant documentation rather than extended descriptions.
- Zoom levels and appearance settings: it is important for code and other text to be the right size. Too small and it's hard to see, too big and you end up frequently scrolling up and down. The appropriate text size varies: if you're doing a screen share, big text is appropriate; if you're writing copious amounts of text (as I was when writing this prose in RStudio), smaller text will be handy. On Windows and Linux you can zoom with the shortcuts `Ctl+Shift++` and `Ctl+-` for zooming in and out respectively. See 'Tools > Global Options' menu (which can be launched with the shortcut `Alt+T` and then `G`) for more advanced 'Appearance' settings.
- Global search (and replace): in addition to search and replace functionality for single files (accessed in the standard way, by pressing `Ctl+F`), RStudio has a powerful global search feature inbuilt. Launch this feature by pressing `Ctl+Shift+F` and you can search any file types (e.g. only files ending in `.R`) for any string within an entire project. This feature is very handy when working with large, multi-file projects.
- Shortcuts: there are many, many shortcuts built into RStudio. In fact, there is even a shortcut to show the list of shortcuts. Try pressing `Alt+Shift+K` to get the complete list. Nobody I know can remember, let alone use, all of these. However, over time I expect that you will learn to love some of them. My top 5 RStudio-specific shortcuts are:^[
See online articles such as "[23 RStudio Tips, Tricks and Shortcuts](https://www.dataquest.io/blog/rstudio-tips-tricks-shortcuts/)" for more comprehensive lists of useful RStudio shortcuts.
]
- `Ctl+Enter` to send a line of code from the code editor (called the Source Editor in RStudio) to be executed or 'run' in the console. Amazingly, some other prominent IDEs such as Microsoft's VSCode editor, lack this important feature by default.
- `Ctl+1` and `Ctl+2` to switch between the console (for writing test code and 'run once' commands) and the code editor (for writing code to keep).
- `Alt+Up/Down` and `Alt+Shift+Up/Down` to move and copy lines of code up and down, handy when you want to re-order your code or make small changes to a copy of a line of code.
- [`Ctl+Shift+M`](https://nhsrcommunity.com/blog/r-studio-shortcuts/) will create the pipe operator (` %>% `, this pipe was created using the shortcut!), saving time when creating `dplyr` pipelines, as discussed in Section \@ref(data).
- `Ctl+Shift+F10` when you want to restart R, leaving you with a 'blank slate' in which packages are not loaded and objects are removed from the global environment.
<!-- - Any other key ones? Todo: think about this... -->
- Git integration for collaboration: RStudio provides two mechanisms for sharing your code with others via sites such as GitHub and GitLab, with the 'Git' panel in the top right pane and via the Terminal panel described in the next section.
- Support for Python, C++ and other languages: a joke on Twitter said "What's the best Python editor? RStudio." Although most Python programmers would probably disagree, the joke is true in the sense that R has good support for some other languages: Python and C++ in particular. If you open a Python script in RStudio on a computer that has Python and the `reticulate` R package installed, the R console will magically convert into a Python console when you press `Ctl+Enter` to execute a line of Python code, as described in the article "[Reticulated Python](https://blog.rstudio.com/2018/10/09/rstudio-1-2-preview-reticulated-python/)" on the RStudio website.
Like R package, an active community of developers is developing a range of extensions and RStudio itself is gradually evolving to meet the evolving needs of 21^st^ Century data scientists.
If there are any features that you would like to see, you can always ask others for pointers, e.g. on the [RStudio Community forum](https://community.rstudio.com/).