Combine results

Author

Media Innovation Group

This notebook compiles Voting Tabulation District (VTD) election returns from different years, processes them into statewide results, then filters them for State Representative races.

Originally created by MIG data fellow Isabella Zeff, it has since been refactored by Christian McDonald.

The data comes from the Texas Legislative Council’s data portal. The documentation is available here. We used the 2024 General VTDs Election Data CSV version, which includes 2012 - 2024 election data reported by 2024 primary election VTDs.

Setup

library(tidyverse)
library(janitor)

Functions

This function sums each candidate's votes across the county-by-county rows to create statewide totals.

fun_totals_all <- function(.data){
  .data |> 
    group_by(year, election, office, name, party, incumbent) |> 
    summarize(candvotes = sum(votes), .groups = "drop") |> 
    arrange(year, office, candvotes |> desc())
}
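As a minimal sketch of what this function does, the toy tibble below has two rows per candidate, and the function collapses them into one row per candidate. The candidate names, vote counts and column values are made up for illustration; they are not real returns.

```r
library(dplyr)

# Same function as above, repeated so this sketch runs on its own.
fun_totals_all <- function(.data) {
  .data |>
    group_by(year, election, office, name, party, incumbent) |>
    summarize(candvotes = sum(votes), .groups = "drop") |>
    arrange(year, office, desc(candvotes))
}

# Hypothetical rows: two candidates with two precinct-level rows each.
toy_returns <- tibble(
  year = "2024",
  election = "General",
  office = "State Rep 5",
  name = c("Candidate A", "Candidate A", "Candidate B", "Candidate B"),
  party = c("R", "R", "D", "D"),
  incumbent = c("Y", "Y", "N", "N"),
  votes = c(100, 150, 80, 90)
)

# One row per candidate, highest vote total first.
toy_totals <- toy_returns |> fun_totals_all()
toy_totals
```

Note that `sum()` returns `NA` for a candidate if any of their `votes` values is `NA`, so a failed number conversion upstream would show up here as a missing total.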

Import and combine

all_files <- list.files(
  "data-original/vdt-returns",
  # The pattern is a regular expression: escape the dot and anchor at the end
  pattern = "\\.csv$",
  full.names = TRUE)

# all_files

Check if we have all files

This makes sure that we have the main results for the time period we are interested in, which is very specific to the Texas House spending analysis. It does not take special elections into account. At one point we were missing runoff results for 2024.

main_races <- c(
  "Democratic_Primary",
  "Democratic_Runoff",
  "Republican_Primary",
  "Republican_Runoff",
  "General"
)

all_files |>
  as_tibble() |> 
  mutate(
    value = str_remove(value, "data-original/vdt-returns/"),
    value = str_remove(value, "_Election_Returns.csv"),
    year = str_sub(value, 1, 4) |> as.numeric(),
    election = str_remove(value, "^\\d{4}_")
  ) |> 
  filter(year >= 2016) |> 
  filter(election %in% main_races) |> 
  count(year, sort = TRUE)
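The count above shows how many main-race files exist per year, but not which ones are absent. A possible way to list the missing ones is sketched below. It assumes file names follow the `YYYY_Election_Name_Election_Returns.csv` pattern implied by the string handling above, and `example_files` is a hypothetical file list that leaves out the 2024 runoffs on purpose.

```r
library(dplyr)
library(stringr)
library(tidyr)

main_races <- c(
  "Democratic_Primary",
  "Democratic_Runoff",
  "Republican_Primary",
  "Republican_Runoff",
  "General"
)

# Hypothetical file names, not the real directory listing.
example_files <- c(
  "2024_Democratic_Primary_Election_Returns.csv",
  "2024_Republican_Primary_Election_Returns.csv",
  "2024_General_Election_Returns.csv"
)

# Parse year and election out of each file name.
found <- tibble(value = example_files) |>
  mutate(
    value = str_remove(value, fixed("_Election_Returns.csv")),
    year = str_sub(value, 1, 4) |> as.numeric(),
    election = str_remove(value, "^\\d{4}_")
  ) |>
  select(year, election)

# Every year/election pair we expect, minus the pairs we found.
missing_files <- expand_grid(year = 2024, election = main_races) |>
  anti_join(found, by = c("year", "election"))

missing_files
```

With the real file list, `year = 2024` in `expand_grid()` could be swapped for the full range of years under review.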

Combine the files

all_raw <- all_files |> 
  set_names(basename) |> 
  map(\(x) read_csv(x, col_types = cols(.default = col_character()))) |> 
  list_rbind(names_to = "source") |>
  clean_names()

Clean source

Here we use the name of the file to find the election year and name. We also turn votes into a number.

all_returns <- all_raw |> 
  mutate(
    year = str_sub(source, 1, 4),
    election = str_sub(source, 6, -22) |> str_replace_all("_", " "),
    .before = county
  ) |> 
  mutate(votes = votes |> as.numeric()) |> 
  select(-source)

all_returns |> head()
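The `str_sub(source, 6, -22)` call depends on two fixed widths: a five-character prefix (the year plus an underscore) and the 21-character suffix `_Election_Returns.csv`. The sketch below walks through that on a sample file name, assuming the naming pattern implied by the code above, and shows an alternative that does not rely on counting characters.

```r
library(stringr)

# Sample file name following the assumed naming pattern.
source_name <- "2024_Republican_Primary_Election_Returns.csv"

# The suffix is 21 characters long, so position -22 is the last one to keep.
suffix_length <- str_length("_Election_Returns.csv")

year <- str_sub(source_name, 1, 4)
election <- str_sub(source_name, 6, -22) |> str_replace_all("_", " ")

# Alternative: strip the known prefix and suffix by pattern instead of position.
election_alt <- source_name |>
  str_remove("^\\d{4}_") |>
  str_remove(fixed("_Election_Returns.csv")) |>
  str_replace_all("_", " ")
```

The positional version is shorter, but it would silently return the wrong text if a file name ever used a different suffix.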

Tally votes

all_totals <- all_returns |> 
  fun_totals_all() |> 
  arrange(year, election, office, candvotes |> desc())

all_totals |> head()

Filter for State Reps

Keep only the State Rep races and pull the district number out of the office name.

rep_totals <- all_totals |> 
  filter(str_detect(office, "State Rep")) |> 
  mutate(district = parse_number(office), .after = office)
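To illustrate how the filter and `parse_number()` work together, here is a small sketch. The office labels are hypothetical examples of the kind of values the `office` column might hold; the real labels in the data may differ.

```r
library(dplyr)
library(stringr)
library(readr)

# Hypothetical office labels, for illustration only.
offices <- tibble(
  office = c("State Rep 5", "State Rep 101", "U.S. Rep 10", "State Sen 14")
)

reps <- offices |>
  # Keep rows whose office contains "State Rep"
  filter(str_detect(office, "State Rep")) |>
  # parse_number() extracts the first number it finds in the string
  mutate(district = parse_number(office), .after = office)

reps
```

Because `parse_number()` returns a numeric value, `district` sorts in numeric order rather than alphabetically.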

Export results

We export both the full statewide totals and the State Rep totals.

all_totals |> 
  write_rds("data-processed/01-all-totals.rds")

rep_totals |> 
  write_rds("data-processed/01-house-totals.rds")

Checking unopposed races

This is to confirm that unopposed races are the reason, or at least one reason, why we don’t have results from every House district.

2024 General

Of all the Texas House results we have, here are the districts for the 2024 general election.

rep_totals |> 
  distinct(year, election, district) |> 
  arrange(year, election, district) |> 
  filter(year == 2024, election == "General")

Districts 1, 3, 9 and 11 are missing, for starters.

If we look at results for this election on Ballotpedia, we can see those same races did not have more than one candidate on the ballot. Even a race with a write-in candidate made it into the data (District 5).
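Rather than scanning the list by eye, the gaps can be computed with `setdiff()`. This is a sketch, not part of the original analysis: `toy_rep_totals` is a hypothetical stand-in for `rep_totals`, built to match the gaps described above and limited to the first 11 districts. For the real data, `1:150` would cover all 150 Texas House districts.

```r
library(dplyr)

# Hypothetical stand-in for rep_totals, covering districts 1 through 11 only.
toy_rep_totals <- tibble(
  year = "2024",
  election = "General",
  district = c(2, 4, 5, 6, 7, 8, 10)
)

# Districts that appear in the results for this election.
contested <- toy_rep_totals |>
  filter(year == 2024, election == "General") |>
  distinct(district) |>
  pull(district)

# Districts we expect but did not find.
missing_districts <- setdiff(1:11, contested)
missing_districts
```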

Republican Primary

And then if we look at the same for the Republican Primary:

rep_totals |> 
  distinct(year, election, district) |> 
  arrange(year, election, district) |> 
  filter(year == 2024, election == "Republican Primary")

If we compare the list above with Ballotpedia’s primary election list, we see the first district missing is 3, which tracks with Cecil Bell Jr. being the only candidate. District 6 is also unopposed, and so on.

Democratic Primary

For the Democrats, you can look at that same list and see there was no contested primary race in the first 18 districts. Each had either one unopposed candidate or no candidate, and the primary was canceled.

rep_totals |> 
  distinct(year, election, district) |> 
  arrange(year, election, district) |> 
  filter(year == 2024, election == "Democratic Primary")