Using RStudio in the cloud

This lesson is designed to be run on posit.cloud in a workshop situation, but there is no reason it can’t be completed using RStudio Desktop if you are familiar with it. The lessons would need to be downloaded from Github and packages installed, but directions for that are beyond the scope of this lesson.

Reading these lessons

These lessons are written using Markdown in Quarto notebooks, so you will be directly reading the code. Most everything will look like text and be perfectly readable, but some constructions may be new to you. So some quick tips.

  • Text that starts with a dash or asterisk is part of list, and it is information for you to consume.
  • You might see formatting that starts and ends with tick marks like this: variable_name. This is a way to designate code or a file name within Markdown prose. If you are asked to use a value in some way, don’t include the tick marks.
  • You might also see links to web pages like this: tidyverse.
  • You’ll see lots of other Markdown syntax like headlines that hopefully will make sense over time.

If you see items in an ordered list that start with numbers, that is an indication YOU SHOULD DO WHAT IT SAYS.

  1. An item like this is a direction. Do what it says.
  2. This would be the next direction.

We’ll talk more about why we use Markdown and Quarto in a bit.

The story so far …

We are assuming at this point that you have ALREADY done these things:

  • Using a web browser signed up and logged into posit.cloud
  • You’ve opened a copy of the project and are reading this file.

If all that is true, then you are running a “virtual machine” in your browser that has R and RStudio installed, and you have a “project” open, and in that project you reading this file.

Part One: Getting familiar with RStudio

RStudio is your dashboard for writing R code. Open files will show up in the upper-left quadrant of RStudio.

The upper-right quadrant has several tabs, the first of which is Environment: Code objects stored in memory (such as a table of data), will show up there.

The lower-right quadrant also has several tabs, the first of which is Files: it shows what files exist in your current working directory, which is this workshop folder. Among those files are ones that end with .qmd, which are Quarto files we will edit. Once opened, they appear in the upper-left quadrant.

In the lower-left, you’ll see the Console. This will reflect all the R code you run, but we don’t do much with it when we’re working in Quarto files. There is also Terminal, which is a way to interface your computer through written commands. We won’t be using that today, but if you continue with this type of work you likely will.

I encourage writing R code within Quarto Document files (.qmd) because they enable you to both write comments (like these sentences) and to write code, in what we call a code chunk.

When you see a numbered list item like the one below, these are directions for you follow. So do what is says!

  1. In the code chunk below you’ll see a right-pointing green arrow at the top-right of the chunk. Click on that arrow and you’ll “run” all the code inside the chunk.
message <- "Hello, World!"

print(message)
[1] "Hello, World!"

In Quarto files, the results of the code in each code chunk will print below the chunk, allowing you to quickly see the results of your work. This allows us to intersperse our prose notes and explanations in between our code. The merits of this will hopefully become apparent as we work on this throughout the day.

You create a code chunk by typing ctrl+alt+i (cmd+option+i on a Mac) and you write your code in between the first and last lines of the code chunk. All your comments, questions, thoughts, and notes are written outside of the code chunks are written in Markdown, a documentation language that reads well as text but converts easily to HTML and other outputs. If you don’t know it, don’t worry about it for now … it’s really just text.

  1. Create a new code chunk after these directions using the keyboard commands.
  2. Type in the object message and then run the chunk.

The “Hello, World!” message will print out again because that text was saved into an object named message. We’ll talk about objects next.

We can usually just type an object name and it will print to the console. But we sometimes use print() to make sure the object prints to the console, even if it’s not the last line of code in a chunk.

Part Two: Terminology

Some important terminology has come up already:

  • notebook: certain type of R file that allows us to write text and code together in the same document. We are using Quarto Documents, which are a kind of interactive notebook.
  • code chunk: created by typing ctrl+alt+i (cmd+option+i on a Mac). Type your code inside code chunks.
  • Environment: this is basically your workspace memory for every R session. It’s empty until you start storing information in objects. You should see an object named message in your environment now.
  • objects and variables: While technically different things, people sometimes use the term object and variable interchangeably. An object is something stored into memory within an R session. Objects can be data, functions or anything we can reference later. A variable is the name of an object. Think of variables like nametags and objects like the actual items they’re describing. Variable is also the term R uses for the names of columns inside data.

For example, the following code stores the word “spaghetti” into an object named “x”, using the assignment operator <-, and then prints the contents of that object to the console below the code chunk.

  1. Use the green run arrow to run the code chunk below.
x <- "spaghetti"

x
[1] "spaghetti"

You can also use keyboard commands to run code: ctrl+Enter (or cmd-Return on a Mac) will run a line of code you are on or have selected. ctrl+Shift+Enter (or cmd-Shift-Return on Mac) will run all the code within a chunk.

In the example above, “spaghetti” is a string of text, or characters. You can also store numbers in objects (which do not require double quotes):

y <- 3

y
[1] 3

While you can name variables whatever you want, there are some rules and conventions:

  • You can’t start a variable name with a number.
  • Variable names must be a single unit with no spaces. Convention is to use an underscore _ to separate terms in a name.
  • Use short variable names that describe what the object represents. Don’t be generic. If your variable stores a number value of someone’s age, Use age not my_var.
  • data types: The kind of data we are working with …
    • character : commonly referred to as a string; can be letters, numbers, punctuation, etc. Always enclosed in “double quotes”.
    • integer : a whole number, such as 1, 5, 10000. No decimal places.
    • numeric : a number that can have decimal places, such as 5.2 or 100.37
    • date : real dates understand things like months and weekdays.
    • logical : either TRUE or FALSE (not quoted)
    • there are some other data types we won’t get into today.
  • vector: this is a common feature of R that you will use regularly. A vector is a series of values. Vectors can only store elements of the same data type, for example all strings or all numbers, and are created with the c() function:
  1. Use the green arrow or the keyboard commands to run the code chunk below.
food_list <- c("spagetti sauce", "noodles", "parmesan")
numbs <- c(1,2,3)

food_list
[1] "spagetti sauce" "noodles"        "parmesan"      
numbs
[1] 1 2 3
  • function: these are the action verbs of programming. (Not just in R, but also spreadsheets, database managers and other programming languages). Every function has a particular structure and does a particular thing. The structure is: function_name(arguments). For example, the sum() function works in R the same way it works in other programs.
  1. Use the green arrow to run the chunk below and see the sum of x and y.
a <- 1
b <- 2
sum(a,b)
[1] 3
  • package: a set of features and functions that are not a part of base R, but that you can add to R to increase its functionality. You install packages on your computer once by using the install.packages() function. Then every time you want to use a package (such as tidyverse) in your file, you load it into your environment using the library() function. All the necessary packages for this class have already been installed on your virtual machine.
  • pipe: a pipe moves information from one function to the next. There are two ways to write a pipe (long story) but they work the same: |> or %>%. A shortcut for typing it is ctrl+shift+m (cmd+shift+m on a Mac).
  1. Run the code chunk below to show how to use the typeof() function to find the data type of “food_list”. The second line does the same thing, but it is moving our object food_list into the function using using the pipe.
typeof(food_list)
[1] "character"
food_list |> typeof()
[1] "character"

As we go through the day you’ll hopefully see why pipes make our code much easier to understand.

Part Three: Where the data goes

The key to understanding a programming language like R is to understand how information is passed around. For the sake of this class, let’s refer to information as data, although it won’t always be tabular. Data can be stored in an object or printed to the console (which is directly below a code chunk in our notebooks). Data stored in an object shows up in the Environment, and can be referred to later in your script.

Additionally, data can be passed (or piped) through functions that filter, sort, aggregate, and/or mutate it in some way. But it will always end up either printing to the console or being stored in a variable.

Part Four: Coming together with Quarto

As you’ve seen within this document, we can mix together writing and explanation in our Quarto document. This allows us to publish our work as a repeatable, annotated document as HTML. To do this, we render our documents.

  1. Click on the Render button at the top-center of your Quarto document, or use cntl+Shift+k (cmd+Shift+k on a Mac), to turn your document into HTML.

This will open a web page either within RStudio’s “Viewer” pane, or in a separate web browser window. Look through this page and compare how your text and code looks in your Quarto document vs what is displayed on the web page.

(If you got an error when you rendered, raise your hand so someone can help troubleshoot it.)

You can use Quarto and Markdown to create websites, books, slideshows and more.

Let’s learn some more code. Save and close this document by clicking on the small x after it’s name, then open the 02-tidyverse.qmd file by clicking on it in your Files pane.