2 Intro: Cancer Experiment
library(tidyverse) # super helpful everything!
library(haven) # importing SPSS, SAS, & Stata data files
library(psych) # lots of nice tidbits
2.1 Source of Data
Mid-Michigan Medical Center, Midland, Michigan, 1999: A study of oral condition of cancer patients.
2.2 Description of the Study
The data set contains part of the data for a study of oral condition of cancer patients conducted at the Mid-Michigan Medical Center. The oral conditions of the patients were measured and recorded at the initial stage, at the end of the second week, at the end of the fourth week, and at the end of the sixth week. The variables age, initial weight and initial cancer stage of the patients were recorded. Patients were divided into two groups at random: One group received a placebo and the other group received aloe juice treatment.
Sample size n = 25 patients with neck cancer.
The treatment is Aloe Juice.
2.3 Variables
- ID: patient identification number
- trt: treatment group
  - 0: placebo
  - 1: aloe juice
- age: patient’s age, in years
- weighin: patient’s weight (pounds) at the initial stage
- stage: initial cancer stage, coded 1 through 4
- totalcin: oral condition at the initial stage
- totalcw2: oral condition at the end of week 2
- totalcw4: oral condition at the end of week 4
- totalcw6: oral condition at the end of week 6
2.4 Import Data
The Cancer dataset is saved in SPSS format, which is evident from the .sav ending on the file name.
The haven package is installed as part of the tidyverse set of packages, but is not automatically loaded. It must have its own library() function call (see above). The haven::read_spss() function works very similarly to the readxl::read_excel() function we used last chapter (Wickham, Miller, and Smith 2022).
- Make sure the dataset is saved in the same folder as this file
- Make sure that folder is the working directory
cancer_raw <- haven::read_spss("https://github.com/CEHS-research/PSY-6600_students/raw/master/Data/Cancer.sav")
tibble::glimpse(cancer_raw)
Rows: 25
Columns: 9
$ ID <dbl> 1, 5, 6, 9, 11, 15, 21, 26, 31, 35, 39, 41, 45, 2, 12, 14, 16…
$ TRT <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1…
$ AGE <dbl> 52, 77, 60, 61, 59, 69, 67, 56, 61, 51, 46, 65, 67, 46, 56, 4…
$ WEIGHIN <dbl> 124.0, 160.0, 136.5, 179.6, 175.8, 167.6, 186.0, 158.0, 212.8…
$ STAGE <dbl> 2, 1, 4, 1, 2, 1, 1, 3, 1, 1, 4, 1, 1, 2, 4, 1, 2, 1, 4, 2, 1…
$ TOTALCIN <dbl> 6, 9, 7, 6, 6, 6, 6, 6, 6, 6, 7, 6, 8, 7, 6, 4, 6, 6, 12, 5, …
$ TOTALCW2 <dbl> 6, 6, 9, 7, 7, 6, 11, 11, 9, 4, 8, 6, 8, 16, 10, 6, 11, 7, 11…
$ TOTALCW4 <dbl> 6, 10, 17, 9, 16, 6, 11, 15, 6, 8, 11, 9, 9, 9, 11, 8, 11, 6,…
$ TOTALCW6 <dbl> 7, 9, 19, 3, 13, 11, 10, 15, 8, 7, 11, 6, 10, 10, 9, 7, 14, 6…
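The checklist above applies when the file is read from a local folder rather than from a URL. As a minimal sketch of that local workflow, the block below writes a small made-up data set to a temporary .sav file and reads it back. The object names toy, toy_path, and toy_raw and all of the values are hypothetical; only haven::write_sav(), haven::read_spss(), and tibble::glimpse() are real functions.

```r
library(haven)   # read_spss() and write_sav()
library(tibble)  # tibble() and glimpse()

# Hypothetical toy data standing in for the real file (values are made up)
toy <- tibble::tibble(ID  = c(1, 2, 3),
                      TRT = c(0, 1, 1),
                      AGE = c(52, 60, 47))

# Write to a temporary .sav file so the example does not depend on any folder
toy_path <- tempfile(fileext = ".sav")
haven::write_sav(toy, toy_path)

# Fail early with a clear error if the path is wrong
stopifnot(file.exists(toy_path))

# Read the file back in, exactly as one would read a local copy
toy_raw <- haven::read_spss(toy_path)

tibble::glimpse(toy_raw)
```

With a real local copy, the path passed to haven::read_spss() would be the file name relative to the working directory, and the file.exists() check gives a quick way to confirm that both checklist items hold.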
2.5 Wrangle Data
cancer_clean <- cancer_raw %>%
  dplyr::rename_all(tolower) %>%
  dplyr::mutate(id = factor(id)) %>%
  dplyr::mutate(trt = factor(trt,
                             labels = c("Placebo",
                                        "Aloe Juice"))) %>%
  dplyr::mutate(stage = factor(stage))
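The labelling step relies on factor() pairing each label with the sorted unique values of the variable, so 0 becomes "Placebo" and 1 becomes "Aloe Juice". The sketch below illustrates this on a short made-up vector and shows an alternative that names the levels explicitly. The vector trt_codes and both result names are hypothetical and are not part of the Cancer data.

```r
# Hypothetical treatment codes, using the 0/1 coding from the variable list
trt_codes <- c(0, 1, 1, 0)

# As in the pipeline: labels are paired with the sorted unique values (0, then 1)
trt_default <- factor(trt_codes,
                      labels = c("Placebo", "Aloe Juice"))

# Alternative: state the levels so the code-to-label mapping is visible
trt_explicit <- factor(trt_codes,
                       levels = c(0, 1),
                       labels = c("Placebo", "Aloe Juice"))

# Count the observations in each group
table(trt_explicit)

# Check whether the two approaches give the same factor
identical(trt_default, trt_explicit)
```

Naming the levels is slightly more verbose, but it keeps the mapping correct even if one of the codes happens to be absent from the data.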
2.6 Overview
2.6.2 Variable Names
names(cancer_clean)
[1] "id" "trt" "age" "weighin" "stage" "totalcin" "totalcw2"
[8] "totalcw4" "totalcw6"
2.6.3 Quick Glimpse
tibble::glimpse(cancer_clean)
Rows: 25
Columns: 9
$ id <fct> 1, 5, 6, 9, 11, 15, 21, 26, 31, 35, 39, 41, 45, 2, 12, 14, 16…
$ trt <fct> Placebo, Placebo, Placebo, Placebo, Placebo, Placebo, Placebo…
$ age <dbl> 52, 77, 60, 61, 59, 69, 67, 56, 61, 51, 46, 65, 67, 46, 56, 4…
$ weighin <dbl> 124.0, 160.0, 136.5, 179.6, 175.8, 167.6, 186.0, 158.0, 212.8…
$ stage <fct> 2, 1, 4, 1, 2, 1, 1, 3, 1, 1, 4, 1, 1, 2, 4, 1, 2, 1, 4, 2, 1…
$ totalcin <dbl> 6, 9, 7, 6, 6, 6, 6, 6, 6, 6, 7, 6, 8, 7, 6, 4, 6, 6, 12, 5, …
$ totalcw2 <dbl> 6, 6, 9, 7, 7, 6, 11, 11, 9, 4, 8, 6, 8, 16, 10, 6, 11, 7, 11…
$ totalcw4 <dbl> 6, 10, 17, 9, 16, 6, 11, 15, 6, 8, 11, 9, 9, 9, 11, 8, 11, 6,…
$ totalcw6 <dbl> 7, 9, 19, 3, 13, 11, 10, 15, 8, 7, 11, 6, 10, 10, 9, 7, 14, 6…
2.6.4 Top and Bottom Rows
psych::headTail(cancer_clean)
    id        trt age weighin stage totalcin totalcw2 totalcw4 totalcw6
1    1    Placebo  52     124     2        6        6        6        7
2    5    Placebo  77     160     1        9        6       10        9
3    6    Placebo  60   136.5     4        7        9       17       19
4    9    Placebo  61   179.6     1        6        7        9        3
5 <NA>       <NA> ...     ...  <NA>      ...      ...      ...      ...
6   42 Aloe Juice  73   181.5     0        8       11       16     <NA>
7   44 Aloe Juice  67     187     1        5        7        7        7
8   50 Aloe Juice  60     164     2        6        8       16     <NA>
9   58 Aloe Juice  54   172.8     4        7        8       10        8
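For comparison, a similar top-and-bottom view can be approximated in base R by stacking head() and tail(). This is only a rough sketch on a made-up data frame, not a description of how psych::headTail() works internally. The names toy and top_bottom and all values are hypothetical.

```r
# Hypothetical data frame with ten rows (values are made up)
toy <- data.frame(id = 1:10,
                  score = seq(5, 50, by = 5))

# First two and last two rows, stacked; the original row names are kept
top_bottom <- rbind(head(toy, n = 2),
                    tail(toy, n = 2))

top_bottom
```

Unlike the output above, this version has no separator row between the two halves, so the row names are the only sign that rows were skipped.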