02-report-template-example

title: "02-report-template-example"
params:
  album: "1989"
echo: true

If you are looking at this code in RStudio, you might see the YAML twice; this is so we can show the YAML when we render it on the Quarto website.

What does all of this YAML mean?

title: the title of the file.
params: the value(s) we’re filtering for; here, the album name.
echo: ‘true’ shows the code from every chunk in the rendered file, ‘false’ hides it.
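
To make the role of params concrete, here is a minimal sketch of how a separate render script could pass a different album into this template. The input file name with its .qmd extension, the output file naming scheme, and both helper functions are assumptions for illustration; only quarto::quarto_render() and its input, output_file and execute_params arguments come from the quarto package.

```r
# Hypothetical helper: build the arguments for rendering one album's report.
# The input file name and the "report-" prefix are assumptions.
album_render_args <- function(album, input = "02-report-template-example.qmd") {
  # Turn the album value into a file-name-friendly slug
  slug <- gsub("[^a-z0-9]+", "-", tolower(album))
  list(
    input = input,
    output_file = paste0("report-", slug, ".html"),
    execute_params = list(album = album)  # overrides `params: album:` in the YAML
  )
}

# Hypothetical wrapper: rendering needs the quarto package and the Quarto CLI,
# so the call lives in a function and is not run when this block is sourced.
render_album <- function(album) {
  do.call(quarto::quarto_render, album_render_args(album))
}

args <- album_render_args("1989,Red")
str(args)
```

Calling render_album("1989,Red") would then render the template with two albums instead of the default one.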

Set up

#| label: setup
#| message: false
#| warning: false
#| echo: false

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(janitor)


Attaching package: 'janitor'

The following objects are masked from 'package:stats':

    chisq.test, fisher.test

Importing our files

taylor_songs <- read_rds("data-processed-taylor/taylor_disco.rds")

taylor_songs |> glimpse()

Rows: 582
Columns: 16
$ name             <chr> "Fortnight (feat. Post Malone)", "The Tortured Poets …
$ album            <chr> "THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY", "THE …
$ release_date     <chr> "2024-04-19", "2024-04-19", "2024-04-19", "2024-04-19…
$ track_number     <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
$ uri              <chr> "spotify:track:6dODwocEuGzHAavXqTbwHv", "spotify:trac…
$ acousticness     <dbl> 0.5020, 0.0483, 0.1370, 0.5600, 0.7300, 0.3840, 0.624…
$ danceability     <dbl> 0.504, 0.604, 0.596, 0.541, 0.423, 0.521, 0.330, 0.53…
$ energy           <dbl> 0.386, 0.428, 0.563, 0.366, 0.533, 0.720, 0.483, 0.57…
$ instrumentalness <dbl> 1.53e-05, 0.00e+00, 0.00e+00, 1.00e-06, 2.64e-03, 0.0…
$ liveness         <dbl> 0.0961, 0.1260, 0.3020, 0.0946, 0.0816, 0.1350, 0.111…
$ loudness         <dbl> -10.976, -8.441, -7.362, -10.412, -11.388, -7.684, -9…
$ speechiness      <dbl> 0.0308, 0.0255, 0.0269, 0.0748, 0.3220, 0.1040, 0.039…
$ tempo            <dbl> 192.004, 110.259, 97.073, 159.707, 160.218, 79.943, 8…
$ valence          <dbl> 0.281, 0.292, 0.481, 0.168, 0.248, 0.438, 0.340, 0.39…
$ popularity       <int> 82, 79, 80, 82, 80, 81, 78, 79, 82, 81, 77, 80, 82, 8…
$ duration_min     <dbl> 3.816083, 4.884133, 3.396683, 4.353800, 4.382900, 5.6…

Defining parameter(s)

Here we’ll turn our parameter into a vector of album names.

albums <- str_split_1( # <1>
  params$album, ",") # <2>

albums

1: str_split_1() splits a single string into pieces and returns them as one character vector.
2: We pass ‘params$album’ as the string for str_split_1() to split. In the initial template, that is the single value ‘1989’. The comma is the separator that marks where one value ends and the next begins; this is useful when you want to pass several values in one parameter.

[1] "1989"

Album Names: 1989
Use this to check your work ^
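
The template only passes one album, so here is a small sketch of what the split looks like with two. The value "1989,Red" is a made-up parameter, and the str_trim() step is an addition that is not part of the template.

```r
library(stringr)

# Hypothetical parameter value with two albums, separated by a comma
album_param <- "1989,Red"

# One string in, one character vector with two elements out
albums <- str_split_1(album_param, ",")
albums

# A space after the comma would stay attached to the second value,
# so trimming whitespace is a cheap safeguard (not in the original template)
albums_trimmed <- str_trim(str_split_1("1989, Red", ","))
albums_trimmed
```

Without the trim, the second value would be " Red" with a leading space, which would not match an album named "Red" in the filter step.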

Songs from 1989

Now let’s filter our data for the album(s) we are looking at.

songs <- taylor_songs |> 
  filter(album %in% albums) # <1>

songs

1: Keep every row where album (the column) matches any of the values in albums (the vector we built from the parameter).
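
A quick sketch of why the filter uses %in%: it checks each row against every value in the vector, so it keeps working when the parameter holds more than one album. The rows below are made up for illustration and are not taken from the real data.

```r
library(dplyr)
library(tibble)

# Made-up rows for illustration only
toy_songs <- tibble(
  name = c("Song A", "Song B", "Song C"),
  album = c("1989", "Red", "Lover")
)

# Pretend the parameter contained two albums
albums <- c("1989", "Red")

# %in% keeps a row when its album matches any element of `albums`
matched <- toy_songs |>
  filter(album %in% albums)

matched
```

With a single album, == would give the same result, but with several values == compares element by element and can silently drop rows, so %in% is the safer choice for a template.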

Let’s do some analysis!

Let’s look at acousticness first. On Kaggle, where we got the data, the author defines acousticness like this: “A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.”

acousticness <- songs |> arrange(acousticness |> desc()) |> 
  select(
    name, album, acousticness
  )

acousticness

Now I want to look at the most danceable songs. Our data author tells us: “Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.”

danceability <- songs |> arrange(danceability |> desc()) |> 
  select(
    name, album, danceability
  )

danceability

What about popularity? It’s unclear how popularity is calculated, but it is on a scale from 0 to 100.

popularity <- songs |> arrange(popularity |> desc()) |> 
  select(
    name, album, popularity
  )

popularity
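
The three chunks above follow the same pattern: sort by one column, highest first, then keep the song name, the album, and that column. As a sketch, the pattern could be wrapped in a small function. rank_songs() is a hypothetical helper, not part of the template, and the toy rows are made up.

```r
library(dplyr)
library(tibble)

# Hypothetical helper: sort by one column (highest first) and keep
# the name, album, and that column. {{ }} passes the bare column name through.
rank_songs <- function(data, column) {
  data |>
    arrange(desc({{ column }})) |>
    select(name, album, {{ column }})
}

# Made-up rows for illustration only
toy_songs <- tibble(
  name = c("Song A", "Song B", "Song C"),
  album = "Example Album",
  popularity = c(50L, 90L, 70L),
  tempo = c(120, 95, 160)
)

ranked <- rank_songs(toy_songs, popularity)
ranked
```

In the report itself this would be called as rank_songs(songs, acousticness), rank_songs(songs, danceability), and so on.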

Now, I want to look at tempo vs danceability. We already saw the definition of danceability; tempo is measured in beats per minute (BPM). Let’s make a chart.

ggplot(songs, aes(x = tempo, y = danceability)) +
  geom_point() +
  scale_x_continuous(name = "Tempo (BPM)", n.breaks = 10) +
  scale_y_continuous(name = "Danceability", limits = c(0,1)) +
  labs(title = str_wrap(str_glue("How does the tempo affect danceability of Taylor Swift songs from album(s): {album_names}", album_names = params$album)))

Note that above we used ‘params$album’ in the chart title (it currently shows as 1989), so the title will change based on which album(s) we pass in from our render file.
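
To see the title logic on its own, here is a minimal sketch that builds the same string outside of ggplot. The params list is a stand-in for the one Quarto creates at render time, "1989,Red" is a made-up value, and the width of 60 is an assumption chosen to force a line break.

```r
library(stringr)

# Stand-in for the params list Quarto creates at render time
params <- list(album = "1989,Red")

# str_glue() fills in {album_names}; str_wrap() inserts line breaks
# so a long title does not run off the edge of the chart
chart_title <- str_wrap(
  str_glue(
    "How does the tempo affect danceability of Taylor Swift songs from album(s): {album_names}",
    album_names = params$album
  ),
  width = 60
)

cat(chart_title)
```

Changing params$album is all it takes to change the title, which is what makes the chart reusable across rendered reports.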