By Christian McDonald, Assistant Professor of Practice
School of Journalism and Media, Moody College of Communication
University of Texas at Austin
The purpose of this notebook is to look at the number of deliveries per doctor and some subsets of that, answering the following questions:
I DO NOT filter for complications, fetal deaths or other factors. This could be revisited later, but the idea is to see how many people are served for these procedures regardless of complications.
I use the ATTENDING_PHYSICIAN_UNIF_ID
field to identify the attending physician, described as “Unique identifier assigned to the licensed physician expected to certify medical necessity of services rendered, with primary responsibility for the patient’s medical care and treatment.” That indication does not guarantee the doctor performed a specific procedure but billing experts tell us this is a valid indication of a maternal patient’s doctor.
The ATTENDING_PHYSICIAN_UNIF_ID values are suppressed if there are fewer than 5 in a quarter, or if the license was temporary our could not be matched. That appears to be a significant suppression for many hospitals.
library(fs)
library(tidyverse)
library(dplyr)
library(janitor)
library(DT)
library(jsonlite)
# suppresses grouping warning
options(dplyr.summarise.inform = FALSE)
I start with a subset of the THCIC in-patient public use data files that include all deliveries as outlined in AHRQ’s Inpatient Quality Indicators Technical Specifications. The DELOCMD
specification in IQI 33 is “All deliveries, identified by any-listed ICD-10-CM diagnosis code for outcome of delivery”. Some cleaning was applied as outlined in the 01-process-ahrq-del-loop
notebook. The result is imported here.
(I also have test data options, which is a small subset of the data for faster processing in testing.)
test_flag <- F
### test data
path_test <- "data-test/ahrq_del_all_loop_test.rds"
### production data
path_prod <- "data-processed/ahrq_del_cleaned.rds"
### cleaned providers list
providers_full <- read_rds("data-processed/providers_full.rds")
### import based on flag
if (test_flag == T) del <- read_rds(path_test) else del <- read_rds(path_prod)
del %>% nrow()
## [1] 1457413
These are lists from various AHRQ and Leapfrog definitions. See 01-process-lists for details.
# ICD-10-PCS procedure codes for Cesarean delivery
prcsecp_list <- read_rds("procedures-lists/ahrq_prcsecp.rds") %>% .$prcsecp
# Episiotomy codes
epi_list <- read_rds("procedures-lists/lf_epi.rds") %>% .$epi
epi_list
## [1] "0W8NXZZ"
# surgical procedure columns
surg_cols <- read_rds("procedures-lists/cols_surg.rds") %>% .$surg
Later we want to get rates for Cesarean deliveries. I create a marker column so we can do that.
AHRQ Inpatient Quality Indicator for IQI 21 Cesarean Delivery Rate, Uncomplicated defines a Cesarean delivery as such.
Number of Cesarean deliveries among cases meeting the inclusion and exclusion rules for the denominator. Cesarean deliveries are identified by any-listed ICD-10-PCS procedure codes for Cesarean delivery (PRCSECP) and without any-listed ICD-10-PCS procedure codes for hysterotomy (PRCSE2P).
I don’t apply the hysterotomy filter since we want all Cesareans.
del_csec <- del %>%
mutate(
PRCSECP = case_when(
PRINC_SURG_PROC_CODE %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_1 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_2 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_3 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_4 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_5 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_6 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_7 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_8 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_9 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_10 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_11 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_12 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_13 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_14 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_15 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_16 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_17 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_18 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_19 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_20 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_21 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_22 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_23 %in% prcsecp_list ~ TRUE,
OTH_SURG_PROC_CODE_24 %in% prcsecp_list ~ TRUE,
TRUE ~ FALSE
)
)
# peek at result
del_csec %>%
count(PRCSECP) %>%
rename(CASES = n)
I use the 2019 Leapfrog Hospital Survey p103 for the episiotomy definition. Of note is we are applying this to all deliveries as defined by AHRQ instead of Leapfrog’s original definition. The definition difference should be minimal.
# list of codes for episiotomy, which is really one: 0W8NXZZ
del_csec_epi <- del_csec %>%
mutate(
EPI = case_when(
PRINC_SURG_PROC_CODE %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_1 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_2 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_3 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_4 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_5 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_6 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_7 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_8 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_9 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_10 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_11 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_12 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_13 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_14 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_15 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_16 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_17 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_18 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_19 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_20 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_21 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_22 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_23 %in% epi_list ~ TRUE,
OTH_SURG_PROC_CODE_24 %in% epi_list ~ TRUE,
TRUE ~ FALSE
)
)
del_csec_epi %>%
count(EPI) %>%
rename(RECORD = n)
The ATTENDING_PHYSICIAN_UNIF_ID is “Suppressed when the number of physicians represented in a DRG for a hospital is less than the minimum cell size of five.”
I create a variable if doctor is suppressed …
suppressed = c("9999999998", "9999999999")
# add value of suppressed doctor to data
del_csec_epi_suppflag <- del_csec_epi %>%
mutate(
PSUPP = if_else(ATTENDING_PHYSICIAN_UNIF_ID %in% suppressed, T, F)
)
At this point our data is processed.
Use the created PSPP
field to count overall suppressions to get idea of scope.
del_csec_epi_suppflag %>%
tabyl(PSUPP) %>%
rename(count = n) %>%
adorn_pct_formatting()
The ATTENDING_PHYSICIAN is suppressed in 6.5% of deliveries. For some hospitals, like Brownwood Regional Medical Center, all the physician data is suppressed.
del_csec_epi_suppflag %>%
filter(str_detect(PROVIDER_NAME, "Brownwood Regional Medical Center")) %>%
tabyl(PSUPP) %>%
rename(count = n) %>%
adorn_pct_formatting()
Since our hospitals of interest are in Laredo, let’s check how many of those are suppressed as well.
del_csec_epi_suppflag %>%
filter(str_detect(PROVIDER_NAME, "Laredo")) %>%
tabyl(PSUPP, PROVIDER_NAME) %>%
adorn_percentages("col") %>%
adorn_pct_formatting(digits = 2) %>%
adorn_ns()
Later we want to know which hospitals have high doctor suppression rates, which would indicate that our deliveries per doctor calculations could be suspect. If the value for SUSPECT
is TRUE
, then the suppression rate is higher than 10% or otherwise suspect.
supp_rate_hosp <- del_csec_epi_suppflag %>%
group_by(THCIC_ID, PROVIDER_NAME, PSUPP) %>%
summarize(
CNT = n()
) %>%
pivot_wider(names_from = PSUPP, values_from = CNT) %>%
rename(
NSUPP = `FALSE`,
YSUPP = `TRUE`
) %>%
mutate(
S_RT = (YSUPP / (NSUPP + YSUPP)) %>% round_half_up(2),
SUSPECT = case_when(S_RT < .1 ~ F,
TRUE ~ T)
)
supp_rate_hosp %>% head(10)
To achieve a count of deliveries per doctor, I remove the records where the ATTENDING_PHYSICIAN was suppressed as this could combine multiple doctors into the same group. This suppression may have more affect on some hospitals than others.
del_supp <- del_csec_epi_suppflag %>%
filter(
!ATTENDING_PHYSICIAN_UNIF_ID %in% suppressed
)
How many deliveries per doctor were there in 2019? How does that extrapolate to average deliveries per week per doctor? Starts with suppressed doctors removed.
While we have all four quarters of 2019, there could be some records missing as hospitals have 60 days past end of quarter to file those records.
# Weeks in a year
week_var = 52.143
del_doc_hosp_2019 <- del_supp %>%
filter(YR == 2019) %>%
group_by(THCIC_ID,PROVIDER_NAME) %>%
summarize(
PHYSICIANS = n_distinct(ATTENDING_PHYSICIAN_UNIF_ID),
DELIVERIES = n()
) %>%
mutate(
DELSPERDOC = (DELIVERIES / PHYSICIANS) %>% round_half_up(1),
DELSPERWK = (DELSPERDOC / week_var) %>% round_half_up(1)
) %>%
arrange(DELSPERWK %>% desc())
# get suppression suspect value
srh <- supp_rate_hosp %>%
select(THCIC_ID, PROVIDER_NAME, SUSPECT)
del_doc_hosp_2019 <- del_doc_hosp_2019 %>%
left_join(srh, by = c("THCIC_ID", "PROVIDER_NAME")) %>%
# filter out hospitals with suspect numbers due to suppression
filter(SUSPECT == F) %>%
select(-SUSPECT)
del_doc_hosp_2019 %>% datatable()
Total deliveries after filtering years with fewer than 30 deliveries. This does NOT filter out suppressed doctors since we are looking for a total delivery count.
# group and count
deliveries_yr <- del_csec_epi_suppflag %>%
group_by(YR, THCIC_ID, PROVIDER_NAME) %>%
summarize(BIRTHS = n()) %>%
filter(BIRTHS > 30)
# pivot for table
deliveries_yr %>%
pivot_wider(names_from = YR, values_from = BIRTHS) %>%
datatable()
We get a distinct count of the hospitals studied to include in the story.
deliveries_yr %>%
ungroup() %>%
select(-YR) %>%
distinct(THCIC_ID, PROVIDER_NAME) %>%
nrow()
## [1] 263
We do the same for just 2019.
deliveries_yr %>%
ungroup() %>%
filter(YR == "2019") %>%
distinct(THCIC_ID, PROVIDER_NAME) %>%
nrow()
## [1] 223
Here we look at all deliveries to see how many mothers have Cesareans. Note this starts with all ATTENDING_PHYSICIANS, including those who’s IDs are suppressed. These are also all Cesareans, complicated or not. This is a simple rate of Cesarean births / All births in order to get an idea how many are performed.
Excludes hospitals with fewer than 120 deliveries in our time span.
csec_rate_hosp <- del_csec_epi_suppflag %>%
group_by(THCIC_ID, PROVIDER_NAME, PRCSECP) %>%
count(PRCSECP) %>%
rename(CASES = n) %>%
pivot_wider(names_from = PRCSECP, values_from = CASES) %>%
rename(NCSEC = "FALSE", CSEC = "TRUE") %>%
mutate(
TOTAL = NCSEC + CSEC,
RATE = round_half_up((CSEC / TOTAL) * 100, 1)
) %>%
filter(
TOTAL >= 120
) %>%
arrange(RATE %>% desc())
csec_rate_hosp %>%
datatable()
csec_rate_hosp %>%
filter(str_detect(PROVIDER_NAME, "Laredo"))
This is the same calculation as above, but by year, showing only the rates. Excludes hospitals with fewer than 30 deliveries in a given year.
csec_hosp_rate_yr <- del_csec_epi_suppflag %>%
group_by(YR, THCIC_ID, PROVIDER_NAME, PRCSECP) %>%
count(YR, PRCSECP) %>%
rename(CASES = n) %>%
pivot_wider(names_from = PRCSECP, values_from = CASES) %>%
rename(NCSEC = "FALSE", CSEC = "TRUE") %>%
mutate(
TOTAL = NCSEC + CSEC,
CRATE = round_half_up((CSEC / TOTAL) * 100, 1)
) %>%
filter(
TOTAL >= 30
)
Present the rate as table.
csec_hosp_rate_yr_table <- csec_hosp_rate_yr %>%
select(THCIC_ID, YR, PROVIDER_NAME, CRATE) %>%
pivot_wider(names_from = YR, values_from = CRATE) %>%
arrange(`2019` %>% desc())
csec_hosp_rate_yr_table %>%
datatable()
csec_hosp_rate_yr_table %>%
filter(str_detect(PROVIDER_NAME, "Laredo"))
Like the simple Cesarean rate above, here we look at all deliveries and note how many involved episiotomies. Note this starts with all ATTENDING_PHYSICIANS, including those who’s IDs are suppressed. Excludes hospitals with fewer than 120 deliveries in our time span.
epi_rate_hosp <- del_csec_epi_suppflag %>%
group_by(THCIC_ID, PROVIDER_NAME, EPI) %>%
count(EPI) %>%
rename(CASES = n) %>%
pivot_wider(names_from = EPI, values_from = CASES) %>%
rename(
EPI = "TRUE",
NEPI = "FALSE"
) %>%
mutate(
TOTAL = NEPI + EPI,
ERATE = round_half_up((EPI / TOTAL * 100), 1)
) %>%
filter(TOTAL >= 120) %>%
arrange(ERATE %>% desc())
epi_rate_hosp %>%
datatable()
epi_rate_hosp %>%
filter(str_detect(PROVIDER_NAME, "Laredo"))
Values exluded where there were fewer than 30 deliveries in a given year.
epi_rate_hosp_yr <- del_csec_epi_suppflag %>%
group_by(YR, THCIC_ID, PROVIDER_NAME, EPI) %>%
count(YR, EPI) %>%
rename(CASES = n) %>%
pivot_wider(names_from = EPI, values_from = CASES) %>%
rename(NEPI = "FALSE", EPI = "TRUE") %>%
mutate(
TOTAL = NEPI + EPI,
ERATE = round_half_up((EPI / TOTAL) * 100, 1)
) %>%
filter(
TOTAL >= 30
) %>%
select(YR, THCIC_ID, PROVIDER_NAME, ERATE)
# pivot for table
epi_rate_hosp_yr_table <- epi_rate_hosp_yr %>%
pivot_wider(names_from = YR, values_from = ERATE) %>%
arrange(`2019` %>% desc())
#peek at table
epi_rate_hosp_yr_table %>%
datatable()
epi_rate_hosp_yr %>%
filter(str_detect(PROVIDER_NAME, "Laredo"))
This looks to see if a mother had either a cesarean or an episiotomy. All the same things apply: All doctors, all complications, but hospitals with fewer than 120 deliveries in our time period are excluded.
epicsec_rate_hosp <- del_csec_epi_suppflag %>%
mutate(
EPICSEC = if_else((EPI == T | PRCSECP) == T, T, F)
) %>%
count(THCIC_ID, PROVIDER_NAME, EPICSEC) %>%
rename(CASES = n) %>%
pivot_wider(names_from = EPICSEC, values_from = CASES) %>%
rename(
EPICSEC = "TRUE",
NEPICSEC = "FALSE"
) %>%
mutate(
TOTAL = EPICSEC + NEPICSEC,
ECRATE = round_half_up((EPICSEC / TOTAL * 100), 1)
) %>%
filter(TOTAL >= 120) %>%
arrange(ECRATE %>% desc())
epicsec_rate_hosp %>%
datatable()
epicsec_rate_hosp %>%
filter(str_detect(PROVIDER_NAME, "Laredo"))
This uses all deliveries. Hospital excluded if fewer than 30 deliveries a year.
# prep data for table
epicsec_rate_hosp_yr <- del_csec_epi_suppflag %>%
mutate(
EPICSEC = if_else((EPI == T | PRCSECP) == T, T, F)
) %>%
count(THCIC_ID, YR, PROVIDER_NAME, EPICSEC) %>%
rename(CASES = n) %>%
pivot_wider(names_from = EPICSEC, values_from = CASES) %>%
rename(
EPICSEC = "TRUE",
NEPICSEC = "FALSE"
) %>%
mutate(
TOTAL = EPICSEC + NEPICSEC,
ECRATE = round_half_up((EPICSEC / TOTAL * 100), 1)
) %>%
filter(TOTAL >= 30)
# pivot data for table
epicsec_rate_hosp_yr_table <- epicsec_rate_hosp_yr %>%
select(THCIC_ID, YR, PROVIDER_NAME, ECRATE) %>%
pivot_wider(names_from = YR, values_from = ECRATE) %>%
arrange(`2019` %>% desc())
epicsec_rate_hosp_yr_table %>%
datatable()
epicsec_rate_hosp_yr_table %>%
filter(str_detect(PROVIDER_NAME, "Laredo"))
One value we don’t have yet is the percentage of patients using Medicaid.
mediciad_2019 <- del_csec_epi_suppflag %>%
# filter for 2019
filter(
YR == 2019
) %>%
# create MC col
mutate(
MC = if_else(FIRST_PAYMENT_SRC == "MC", T, F),
MC = if_else(is.na(MC), F, MC)
) %>%
# group and count MC
group_by(THCIC_ID, PROVIDER_NAME, PROVIDER_CITY) %>%
count(MC) %>%
# pivot to shape to get rate
pivot_wider(names_from = MC, values_from = n) %>%
# get rate
mutate(
MCRATE = ((`TRUE` / (`TRUE` + `FALSE`)) * 100) %>% round_half_up(1)
)
# peek as table
mediciad_2019 %>% datatable()
Building a single dataframe that has summary data by hospital. This is for the interactive. Before we can join them all, we need to pare some calcs down to 2019 only.
Extrating 2019 for joining later.
epi_rate_hosp_2019 <- epi_rate_hosp_yr %>%
filter(
YR == 2019
) %>%
ungroup() %>%
select(THCIC_ID, ERATE)
csec_hosp_rate_2019 <- csec_hosp_rate_yr %>%
filter(
YR == 2019
) %>%
ungroup() %>%
select(THCIC_ID, CRATE)
In this view I remove doctors with fewer than 10 deliveries in a given year to give a more accurate look at the staff doctors that do the bulk of deliveries. Suppressed doctor records are removed.
The choice of 10 deliveries as a threshold is arbitrary and could be changed if we deem necessary.
del_hosp_doc_freq <- del_supp %>%
group_by(YR, THCIC_ID, PROVIDER_NAME, ATTENDING_PHYSICIAN_UNIF_ID) %>%
summarize(DELIVERIES = n()) %>%
rename(PHYSICIAN = ATTENDING_PHYSICIAN_UNIF_ID) %>%
filter(DELIVERIES >= 10) %>%
group_by(THCIC_ID, PROVIDER_NAME) %>%
summarize(
DELIVERIES = sum(DELIVERIES),
PHYSICIANS = n_distinct(PHYSICIAN)
) %>%
mutate(
DRATE = round_half_up(DELIVERIES/PHYSICIANS,1)
) %>%
arrange(DRATE %>% desc())
del_hosp_doc_freq %>%
datatable()
Delivery rate per doctor by year. Unknown doctors are suppressed. Infrequent deliveries (<20/year) are filtered out.
del_supp %>%
filter(str_detect(PROVIDER_NAME, "Laredo")) %>%
group_by(YR, THCIC_ID, PROVIDER_NAME, ATTENDING_PHYSICIAN_UNIF_ID) %>%
summarize(DELIVERIES = n()) %>%
rename(PHYSICIAN = ATTENDING_PHYSICIAN_UNIF_ID) %>%
filter(DELIVERIES >= 10) %>%
group_by(YR, THCIC_ID, PROVIDER_NAME) %>%
summarize(
DELIVERIES = sum(DELIVERIES),
PHYSICIANS = n_distinct(PHYSICIAN)
) %>%
mutate(
RATE = round_half_up(DELIVERIES/PHYSICIANS, 1)
) %>%
select(YR, THCIC_ID, PROVIDER_NAME, RATE) %>%
pivot_wider(names_from = YR, values_from = RATE)
Just how much does doctor suppression affect these rates in Laredo? We’ll do the same calculation on the data but without suppressing unknown doctors. Note this has limitations because multiple doctors could be counted in the two suppression values.
del %>%
filter(str_detect(PROVIDER_NAME, "Laredo")) %>%
group_by(YR, THCIC_ID, PROVIDER_NAME, ATTENDING_PHYSICIAN_UNIF_ID) %>%
summarize(DELIVERIES = n()) %>%
rename(PHYSICIAN = ATTENDING_PHYSICIAN_UNIF_ID) %>%
filter(DELIVERIES >= 10) %>%
group_by(YR, THCIC_ID, PROVIDER_NAME) %>%
summarize(
DELIVERIES = sum(DELIVERIES),
PHYSICIANS = n_distinct(PHYSICIAN)
) %>%
mutate(
RATE = round_half_up(DELIVERIES/PHYSICIANS, 1)
) %>%
select(YR, THCIC_ID, PROVIDER_NAME, RATE) %>%
pivot_wider(names_from = YR, values_from = RATE)
Get the rate of medicaid deliveries statewide.
medi_rate_tx_yr_summary <- del %>%
mutate(
MC = if_else(FIRST_PAYMENT_SRC == "MC", T, F),
MC = if_else(is.na(MC), F, MC)
) %>%
count(YR, MC) %>%
rename(CASES = n) %>%
pivot_wider(names_from = MC, values_from = CASES) %>%
rename(MCF = `FALSE`, MCT = `TRUE`) %>%
mutate(
SUMMARY = "TX",
CATEGORY = "MEDICARE",
MEASUREMENT = "RATE",
TOTAL = MCF + MCT,
VALUE = round_half_up((MCT / TOTAL) * 100, 1) # RATE
) %>%
select(YR, SUMMARY, CATEGORY, MEASUREMENT, VALUE)
medi_rate_tx_yr_summary
Get the rate of cesearean deliveries across the state.
csec_rate_tx_yr_summary <- del_csec %>%
group_by(YR, PRCSECP) %>%
count(PRCSECP) %>%
rename(CASES = n) %>%
pivot_wider(names_from = PRCSECP, values_from = CASES) %>%
rename(NCSEC = "FALSE", CSEC = "TRUE") %>%
mutate(
SUMMARY = "TX",
CATEGORY = "CESAREAN",
MEASUREMENT = "RATE",
TOTAL = NCSEC + CSEC,
VALUE = round_half_up((CSEC / TOTAL) * 100, 1) # RATE
) %>%
select(YR, SUMMARY, CATEGORY, MEASUREMENT, VALUE)
csec_rate_tx_yr_summary
Get the percentage of deliveries where either an episiotomy or cesarean were performed.
epicsec_rate_tx_yr_summary <- del_csec_epi %>%
mutate(
EPICSEC = if_else((EPI == T | PRCSECP) == T, T, F)
) %>%
group_by(YR, EPICSEC) %>%
count(EPICSEC) %>%
rename(CASES = n) %>%
pivot_wider(names_from = EPICSEC, values_from = CASES) %>%
rename(
EPICSEC = "TRUE",
NEPICSEC = "FALSE"
) %>%
mutate(
SUMMARY = "TX",
CATEGORY = "EPI_OR_CSEC",
MEASUREMENT = "RATE",
TOTAL = EPICSEC + NEPICSEC,
VALUE = round_half_up((EPICSEC / TOTAL * 100), 1) # RATE
) %>%
select(YR, SUMMARY, CATEGORY, MEASUREMENT, VALUE)
epicsec_rate_tx_yr_summary
epicsec_rate_hosp_yr_summary <- epicsec_rate_hosp_yr %>%
ungroup() %>%
group_by(YR) %>%
summarize(
SUMMARY = "HOSPITAL",
CATEGORY = "EPI_OR_CSEC",
MEASUREMENT = "MEAN_OF_RATE",
# MEDIAN = median(RATE),
VALUE = round_half_up(mean(ECRATE, na.rm = TRUE),1) # MEAN OF RATE
)
epicsec_rate_hosp_yr_summary
Here we start with cleaned hospitals list and combine to get the medicare rate, deliveries per week and maternal levels. Some values were calculated from data where doctors were suppressed, and so may be NA.
The maternal levels were manually collected into a spreadsheet (and updated Nov. 24, 2020) and then used here.
# Import maternal levels
maternal_url <- "https://docs.google.com/spreadsheets/d/1efUxctHiY4cCgrqqHvBVx7jZobbx5Dgmpbf_YCd7tcE/gviz/tq?tqx=out:csv"
maternal_levels <- read_csv(maternal_url) %>%
mutate(
THCIC_ID = as.character(THCIC_ID)
)
## Parsed with column specification:
## cols(
## THCIC_ID = col_double(),
## PROVIDER_NAME = col_character(),
## LEVEL = col_character()
## )
blurbs_2019 <- providers_full %>%
rename(PROVIDER_NAME = PROVIDER_NAME_CLEANED) %>%
select(-PROVIDER_CITY) %>%
left_join(
mediciad_2019 %>%
ungroup() %>%
select(THCIC_ID, MCRATE),
by = "THCIC_ID") %>%
# get deliveries per wk
left_join(
del_doc_hosp_2019 %>%
ungroup() %>%
select(THCIC_ID, DELSPERWK),
by = "THCIC_ID"
) %>%
left_join(maternal_levels %>% select(-PROVIDER_NAME))
## Joining, by = "THCIC_ID"
blurbs_2019 %>% head(10)
For export:
Plus summary files for another notebook.
if (test_flag == F) medi_rate_tx_yr_summary %>%
write_rds("data-processed/medi_rate_tx_yr_summary.rds")
if (test_flag == F) csec_rate_tx_yr_summary %>%
write_rds("data-processed/csec_rate_tx_yr_summary.rds")
if (test_flag == F) epicsec_rate_tx_yr_summary %>%
write_rds("data-processed/epicsec_rate_tx_yr_summary.rds")
if (test_flag == F) epicsec_rate_hosp_yr_summary %>%
write_rds("data-processed/epicsec_rate_hosp_yr_summary.rds")
if (test_flag == F) blurbs_2019 %>%
arrange(PROVIDER_NAME) %>%
write_json("exports/blurbs_2019.json")
# A klaxon to indicate the processing is complete
beepr::beep(4)