class: center, middle, inverse, title-slide # Standard and Normal ## Cohen Chapter 4
.small[EDUC/PSY 6600] --- class: center, middle ## How do all these unusuals strike you, Watson? <br> Their cumulative effect is certainly considerable, and yet each of them is quite possible in itself. ### -- Sherlock Holmes and Dr. Watson, *The Adventure of Abbey Grange* --- # Exploring Quantitative Data ### Building on what we've already discussed: .large[.large[ 1. Always plot your data: .dcoral[make a graph]. 2. Look for the overall pattern (.nicegreen[shape, center, and spread]) and for striking departures such as .nicegreen[outliers]. 3. Calculate a numerical summary to briefly .bluer[describe center and spread]. 4. Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve. ]] --- <iframe width="1000" height="750" src="https://www.youtube.com/embed/Txlm4ORI4Gs?controls=0&start=2" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> --- # Let's Start with Density Curves .huge[ A .dcoral[density curve] is a curve that: ] .large[.large[ - is always .nicegreen[on or above] the horizontal axis - has an .dcoral[area] of .nicegreen[exactly 1] underneath it It describes the overall pattern of a distribution and highlights proportions of observations as the .dcoral[area]. ]] --- background-image: url(figures/fig_density_normal1.png) background-position: 50% 70% background-size: 1200px # Density Curves and Normal Distributions --- <img src="figures/move_normal_height_histograms.gif" width="75%" style="display: block; margin: auto;" /> --- # Density Curves and Normal Distributions <img src="figures/fig_normal_height_histograms.png" width="1664" style="display: block; margin: auto;" /> --- # Normal Distribution .pull-left[ .large[.large[ Many dependent variables are assumed to be normally distributed ] - Many statistical procedures assume this - Correlation, regression, t-tests, and ANOVA - Also called the Gaussian distribution - for Karl Gauss ]] .pull-right[ <img src="ch4_z_scores_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" /> ] --- class: inverse ## Interactive Shiny App .large[Probability Distribution Viewer] [link 1](http://shiny.calpoly.sh/Prob_View/)], [link 2](https://calpolystat1.shinyapps.io/prob_view/) <img src="figures/app_prob_view.png" width="75%" style="display: block; margin: auto;" /> > Cal Poly > Statistics Department --- <img src="figures/fig_normal_height_density_shade.png" width="80%" style="display: block; margin: auto;" /> -- <img src="figures/fig_normal_height_density_rule.png" width="80%" style="display: block; margin: auto;" /> --- <iframe width="1000" height="750" src="https://www.youtube.com/embed/mtbJbDwqWLE?controls=0&start=2" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> --- background-image: url(figures/fig_6895rule.png) background-position: 50% 70% background-size: 1100px --- background-image: url(figures/fig_normal.png) background-position: 50% 70% background-size: 1100px --- background-image: url(figures/fig_z_table_top.png) background-position: 50% 50% background-size: 15 7 # The z-Table *(A.1)* [Download here](https://cehs-research.github.io/PSY-6600_public/psy6600%20statistical%20tables.pdf) --- # Do We Have a Normal Distribution? .huge[Check Plot!] .pull-left[Bell shaped curve?] .pull-right[Points on the line?] <img src="ch4_z_scores_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" /> --- # Z-Scores, Computation .pull-left[ .huge[.dcoral[Standardizing]] .large[ Convert a value to a standard score ("z-score") - First subtract the mean - Then divide by the standard deviation .large[ $$ z = \frac{X - \mu}{\sigma} = \frac{X - \bar{x}}{s} $$ ] ] ] .pull-right[ <img src="ch4_z_scores_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" /> ] --- # Z-Scores, Units .large[ - z-scores are in .nicegreen[SD units] - Represent SD distances away from the mean (M = 0) - if z-score = -0.50 then it is `\(\frac{1}{2}\)` of SD below mean - Can compare z-scores from 2 or more variables originally measured in differing units .dcoral[Note: Standardizing does NOT "normalize" the data] ] --- class: inverse, center, middle # Let's Apply This to an Exmple Situation --- background-image: url(figures/fig_blank_normal_slide.png) background-position: 50% 80% background-size: 950px # Example 1: Draw a Picture .huge[95% of students at a school are between 1.1 and 1.7 meters tall ] Assuming this data is .nicegreen[normally distributed], can you calculate the .dcoral[MEAN] and .dcoral[STANDARD DEVIATION]? --- background-image: url(figures/fig_school_height_1-1.jpg) background-position: 50% 80% background-size: 950px # Example 1: Draw a Picture .huge[95% of students at a school are between 1.1 and 1.7 meters tall ] Assuming this data is .nicegreen[normally distributed], can you calculate the .dcoral[MEAN] and .dcoral[STANDARD DEVIATION]? --- background-image: url(figures/fig_school_height_1-2.jpg) background-position: 50% 80% background-size: 950px # Example 1: Draw a Picture .huge[95% of students at a school are between 1.1 and 1.7 meters tall ] Assuming this data is .nicegreen[normally distributed], can you calculate the .dcoral[MEAN] and .dcoral[STANDARD DEVIATION]? --- background-image: url(figures/fig_school_height_1-3.jpg) background-position: 50% 80% background-size: 950px # Example 1: Draw a Picture .huge[95% of students at a school are between 1.1 and 1.7 meters tall ] Assuming this data is .nicegreen[normally distributed], can you calculate the .dcoral[MEAN] and .dcoral[STANDARD DEVIATION]? --- background-image: url(figures/fig_school_height_1-4.jpg) background-position: 50% 80% background-size: 950px # Example 1: Draw a Picture .huge[95% of students at a school are between 1.1 and 1.7 meters tall ] Assuming this data is .nicegreen[normally distributed], can you calculate the .dcoral[MEAN] and .dcoral[STANDARD DEVIATION]? --- background-image: url(figures/fig_blank_normal_slide.png) background-position: 50% 80% background-size: 950px # Example 2: Calculate a z-Score .huge[You have a friend who is 1.85 meters tall. ] .nicegreen[Class: M = 1.4 meters, SD = 0.15 meters] .dcoral[How far] is 1.85 from the mean? How many .dcoral[standard deviations] is that? --- background-image: url(figures/fig_school_height_2-1.jpg) background-position: 50% 80% background-size: 950px # Example 2: Calculate a z-Score .huge[You have a friend who is 1.85 meters tall. ] .nicegreen[Class: M = 1.4 meters, SD = 0.15 meters] .dcoral[How far] is 1.85 from the mean? How many .dcoral[standard deviations] is that? --- background-image: url(figures/fig_school_height_2-2.jpg) background-position: 50% 80% background-size: 950px # Example 2: Calculate a z-Score .huge[You have a friend who is 1.85 meters tall. ] .nicegreen[Class: M = 1.4 meters, SD = 0.15 meters] .dcoral[How far] is 1.85 from the mean? How many .dcoral[standard deviations] is that? --- background-image: url(figures/fig_school_height_2-3.jpg) background-position: 50% 80% background-size: 950px # Example 2: Calculate a z-Score .huge[You have a friend who is 1.85 meters tall. ] .nicegreen[Class: M = 1.4 meters, SD = 0.15 meters] .dcoral[How far] is 1.85 from the mean? How many .dcoral[standard deviations] is that? --- background-image: url(figures/fig_z_table_top.png) background-position: 50% 50% background-size: 15 7 # Using the z-Table [Download here](https://cehs-research.github.io/PSY-6600_public/psy6600%20statistical%20tables.pdf) --- ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = ______ 2. The .nicegreen[height] of a student with a z-socre of -2.65 = ______ 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = ______ 4. The .dcoral[90th percentile] for students heights = ______ --- background-image: url(figures/fig_school_height_3-1.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = ______ 2. The .nicegreen[height] of a student with a z-socre of -2.65 = ______ 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = ______ 4. The .dcoral[90th percentile] for students heights = ______ --- background-image: url(figures/fig_school_height_3-2.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = ______ 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = ______ 4. The .dcoral[90th percentile] for students heights = ______ --- background-image: url(figures/fig_school_height_3-3.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = .dcoral[**1.00 meters**] 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = ______ 4. The .dcoral[90th percentile] for students heights = ______ --- background-image: url(figures/fig_school_height_3-4.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = .dcoral[**1.00 meters**] 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = ______ 4. The .dcoral[90th percentile] for students heights = ______ --- background-image: url(figures/fig_school_height_3-5.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = .dcoral[**1.00 meters**] 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = ______ 4. The .dcoral[90th percentile] for students heights = ______ --- <img src="figures/fig_school_height_3blank_ztab.png" width="75%" style="display: block; margin: auto;" /> --- <img src="figures/fig_school_height_3a_ztab.png" width="75%" style="display: block; margin: auto;" /> --- background-image: url(figures/fig_school_height_3-6.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = .dcoral[**1.00 meters**] 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = ______ 4. The .dcoral[90th percentile] for students heights = ______ --- background-image: url(figures/fig_school_height_3-7.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = .dcoral[**1.00 meters**] 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = .dcoral[**81st percentile**] 4. The .dcoral[90th percentile] for students heights = ______ --- background-image: url(figures/fig_school_height_3-8.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = .dcoral[**1.00 meters**] 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = .dcoral[**81st percentile**] 4. The .dcoral[90th percentile] for students heights = ______ --- <img src="figures/fig_school_height_3b_ztab.jpg" width="75%" style="display: block; margin: auto;" /> --- background-image: url(figures/fig_school_height_3-9.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = .dcoral[**1.00 meters**] 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = .dcoral[**81st percentile**] 4. The .dcoral[90th percentile] for students heights = ______ --- background-image: url(figures/fig_school_height_3-10.jpg) background-position: 50% 80% background-size: 950px ## Example 3: Standardizing Scores Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] 1. The .dcoral[z-score] for a student 1.63 m tall = .dcoral[**1.53 SDs**] 2. The .nicegreen[height] of a student with a z-socre of -2.65 = .dcoral[**1.00 meters**] 3. The .dcoral[Pecentile Rank] of a student that is 1.51 m tall = .dcoral[**81st percentile**] 4. The .dcoral[90th percentile] for students heights = .dcoral[**1.59 meters**] --- background-image: url(figures/fig_3_blank_normal_slide.png) background-position: 50% 50% ## Example 4: Find the Probability That... Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) .dcoral[More than] 1.63 m tall (2) .dcoral[Less than] 1.2 m tall (3) .dcoral[between] 1.2 and 1.63 tall --- background-image: url(figures/fig_school_height_4-1.jpg) background-position: 50% 80% background-size: 950px ## Example 4: Find the Probability That... Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) .dcoral[More than] 1.63 m tall (2) .dcoral[Less than] 1.2 m tall (3) .dcoral[between] 1.2 and 1.63 tall --- background-image: url(figures/fig_school_height_4-2.jpg) background-position: 50% 80% background-size: 950px ## Example 4: Find the Probability That... Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) .dcoral[More than] 1.63 m tall (2) .dcoral[Less than] 1.2 m tall (3) .dcoral[between] 1.2 and 1.63 tall --- background-image: url(figures/fig_school_height_4-3.jpg) background-position: 50% 80% background-size: 950px ## Example 4: Find the Probability That... Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) .dcoral[More than] 1.63 m tall (2) .dcoral[Less than] 1.2 m tall (3) .dcoral[between] 1.2 and 1.63 tall --- background-image: url(figures/fig_school_height_4-4.jpg) background-position: 50% 80% background-size: 950px ## Example 4: Find the Probability That... Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) .dcoral[More than] 1.63 m tall (2) .dcoral[Less than] 1.2 m tall (3) .dcoral[between] 1.2 and 1.63 tall --- background-image: url(figures/fig_school_height_4-5.jpg) background-position: 50% 80% background-size: 950px ## Example 4: Find the Probability That... Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) .dcoral[More than] 1.63 m tall (2) .dcoral[Less than] 1.2 m tall (3) .dcoral[between] 1.2 and 1.63 tall --- <img src="figures/fig_school_height_4blank_ztab.png" width="75%" style="display: block; margin: auto;" /> --- <img src="figures/fig_school_height_4_ztab.jpg" width="75%" style="display: block; margin: auto;" /> --- background-image: url(figures/fig_school_height_4-6.jpg) background-position: 50% 80% background-size: 950px ## Example 4: Find the Probability That... Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) .dcoral[More than] 1.63 m tall = .dcoral[**.0630**] (2) .dcoral[Less than] 1.2 m tall = .dcoral[**.0918**] (3) .dcoral[between] 1.2 and 1.63 tall = .dcoral[**.8452**] --- background-image: url(figures/fig_2_blank_normal_slide.png) background-position: 50% 50% ## Example 5: Percentiles Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[perentile rank] of a 1.7 m tall Student = ______ (2) The .dcoral[height] of a studnet in the 15th percentile = ______ --- background-image: url(figures/fig_school_height_5-1.jpg) background-position: 50% 80% background-size: 950px ## Example 5: Percentiles Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[perentile rank] of a 1.7 m tall Student = ______ (2) The .dcoral[height] of a studnet in the 15th percentile = ______ --- background-image: url(figures/fig_school_height_5-2.jpg) background-position: 50% 80% background-size: 950px ## Example 5: Percentiles Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[perentile rank] of a 1.7 m tall Student = ______ (2) The .dcoral[height] of a studnet in the 15th percentile = ______ --- background-image: url(figures/fig_school_height_5-3.jpg) background-position: 50% 80% background-size: 950px ## Example 5: Percentiles Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[perentile rank] of a 1.7 m tall Student = ______ (2) The .dcoral[height] of a studnet in the 15th percentile = ______ --- <img src="figures/fig_school_height_5blank_ztab.png" width="75%" style="display: block; margin: auto;" /> --- <img src="figures/fig_school_height_5a_ztab.png" width="75%" style="display: block; margin: auto;" /> --- background-image: url(figures/fig_school_height_5-4.jpg) background-position: 50% 80% background-size: 950px ## Example 5: Percentiles Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[perentile rank] of a 1.7 m tall Student = .dcoral[**98th percentile**] (2) The .dcoral[height] of a studnet in the 15th percentile = ______ --- background-image: url(figures/fig_school_height_5-5.jpg) background-position: 50% 80% background-size: 950px ## Example 5: Percentiles Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[perentile rank] of a 1.7 m tall Student = .dcoral[**98th percentile**] (2) The .dcoral[height] of a studnet in the 15th percentile = ______ --- <img src="figures/fig_school_height_5b_ztab.png" width="75%" style="display: block; margin: auto;" /> --- background-image: url(figures/fig_school_height_5-6.jpg) background-position: 50% 80% background-size: 950px ## Example 5: Percentiles Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[perentile rank] of a 1.7 m tall Student = .dcoral[**98th percentile**] (2) The .dcoral[height] of a studnet in the 15th percentile = .dcoral[**1.24 meters**] --- class: inverse, center, middle # Into Theory Mode Again --- background-image: url(figures/fig_pop_sample_slide.png) background-position: 50% 50% # Parameters vs. Statistics --- # Statistical Estimation .large[ - The process of .nicegreen[statistical inference] involves using information .dcoral[from a sample] to draw conclusions about a wider population. ] -- .large[ - Different random samples yield different statistics. We need to be able to describe the .dcoral[sampling distribution] of possible statistic values in order to perform .nicegreen[statistical inference]. ] -- .large[ - We can think of a statistic as a .nicegreen[random variable] because it takes numerical values that describe the outcomes of the .dcoral[random sampling process]. ] --- background-image: url(figures/fig_samp_dist_slide.png) background-position: 50% 50% # Sampling Distribution The .dcoral[LAW of LARGE NUMBERS] assures us that if we measure .nicegreen[enough] subjects, the statistic .nicegreen[x-bar] will eventually get .dcoral[very close to] the unknown parameter .nicegreen[mu]. If we took every one of the possible samples of a certain size, calculated the sample mean for each, and graphed all of those values, we'd have a .dcoral[sampling distribution]. --- class: inverse ## Interactive Shiny App .pull-leftsmall[ .large[LINK: [Sampling Distribution Demo](http://shiny.calpoly.sh/Sampling_Distribution/)] > Cal Poly > Statistics Department ] .pull-rightbig[ <img src="figures/app_sample_dist.png" width="1267" style="display: block; margin: auto;" /> ] --- background-image: url(figures/fig_samp_dist_forMEAN_slide.png) background-position: 50% 50% # Sampling Distribution for the MEAN .large[ The .dcoral[MEAN] of a sampling distribution .nicegreen[for a sample mean] is just as likely to be above or below the .dcoral[population mean], even if the distribution of the raw data is skewed. The .dcoral[STANDARD DEVIATION] of a sampling distribution .nicegreen[for a sample mean] is is SMALLER than the standard deviation for the population by a factor of the .dcoral[square-root of n]. ] --- background-image: url(figures/fig_all_samples_slide.png) background-position: 50% 50% # Normally Distributed Population .large[If the population is NORMALLY distributed:] --- background-image: url(figures/fig_bank_calls_slide.png) background-position: 50% 50% ## Skewed Population .pull-left[ The distribution of lengths of .dcoral[all] customer service calls received by a bank in a month. ] .pull-right[ The distribution of the .dcoral[sample means] (x-bar) for 500 random samples of size 80 from this population. The scales and histogram classes are exactly the same in both panels ] --- <iframe width="1000" height="750" src="https://www.youtube.com/embed/jvoxEYmQHNM?controls=0&start=2" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> --- background-image: url(figures/fig_clt_slide.png) background-position: 50% 50% # The Central Limit Theorem --- # The Central Limit Theorem .huge[ When a sample size (n) is .nicegreen[large], the sampling distribution of the .dcoral[sample MEAN] is **approximately normally distributed** about the .nicegreen[mean of the population] with the stadard deviation less than than of the population by a factor of .nicegreen[the square root of n]. ] --- class: inverse, center, middle # Back to the Example Situation --- background-image: url(figures/fig_2_blank_normal_slide.png) background-position: 50% 50% ## Example 6: Probabilities Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[probability] a randomly selected .nicegreen[student] is more than 1.63 m tall = ______ (2) The .dcoral[probability] a randomly selected .nicegreen[sample] of 16 students .nicegreen[average] more than 1.63 m tall = ______ --- background-image: url(figures/fig_school_height_6-1.jpg) background-position: 50% 80% background-size: 950px ## Example 6: Probabilities Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[probability] a randomly selected .nicegreen[student] is more than 1.63 m tall = ______ (2) The .dcoral[probability] a randomly selected .nicegreen[sample] of 16 students .nicegreen[average] more than 1.63 m tall = ______ --- background-image: url(figures/fig_school_height_6-2.jpg) background-position: 50% 80% background-size: 950px ## Example 6: Probabilities Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[probability] a randomly selected .nicegreen[student] is more than 1.63 m tall = ______ (2) The .dcoral[probability] a randomly selected .nicegreen[sample] of 16 students .nicegreen[average] more than 1.63 m tall = ______ --- <img src="figures/fig_school_height_6_ztab.png" width="75%" style="display: block; margin: auto;" /> --- background-image: url(figures/fig_school_height_6-3.jpg) background-position: 50% 80% background-size: 950px ## Example 6: Probabilities Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[probability] a randomly selected .nicegreen[student] is more than 1.63 m tall = ______ (2) The .dcoral[probability] a randomly selected .nicegreen[sample] of 16 students .nicegreen[average] more than 1.63 m tall = ______ --- background-image: url(figures/fig_school_height_6-4.jpg) background-position: 50% 80% background-size: 950px ## Example 6: Probabilities Assume: School's population of students heights are .nicegreen[normal (M = 1.4m, SD = 0.15m)] (1) The .dcoral[probability] a randomly selected .nicegreen[student] is more than 1.63 m tall = .dcoral[**.063**] (2) The .dcoral[probability] a randomly selected .nicegreen[sample] of 16 students .nicegreen[average] more than 1.63 m tall = .dcoral[**< .001**] --- class: inverse, center, middle # Let's Apply This to the Cancer Dataset --- # Read in the Data ```r library(tidyverse) # Loads several very helpful 'tidy' packages library(haven) # Read in SPSS datasets library(furniture) # Nice tables (by our own Tyson Barrett) library(psych) # Lots of nice tid-bits ``` ```r cancer_raw <- haven::read_spss("cancer.sav") ``` -- ### And Clean It ```r cancer_clean <- cancer_raw %>% dplyr::rename_all(tolower) %>% dplyr::mutate(id = factor(id)) %>% dplyr::mutate(trt = factor(trt, labels = c("Placebo", "Aloe Juice"))) %>% dplyr::mutate(stage = factor(stage)) ``` --- ## Standardize a variable with `scale()` .pull-left[ ```r cancer_clean %>% furniture::table1(age) ``` ``` ----------------------- Mean/Count (SD/%) n = 25 age 59.6 (12.9) ----------------------- ``` ] .pull-right[ ```r cancer_clean %>% dplyr::mutate(agez = (age - 59.6) / 12.9) %>% dplyr::mutate(ageZ = scale(age))%>% dplyr::select(id, trt, age, agez, ageZ) %>% head() ``` ``` # A tibble: 6 x 5 id trt age agez ageZ[,1] <fct> <fct> <dbl> <dbl> <dbl> 1 1 Placebo 52 -0.589 -0.591 2 5 Placebo 77 1.35 1.34 3 6 Placebo 60 0.0310 0.0278 4 9 Placebo 61 0.109 0.105 5 11 Placebo 59 -0.0465 -0.0495 6 15 Placebo 69 0.729 0.724 ``` ] --- ## Standardize a variable - not normal .pull-left[ ```r cancer_clean %>% dplyr::mutate(ageZ = scale(age)) %>% furniture::table1(age, ageZ) ``` ``` ------------------------ Mean/Count (SD/%) n = 25 age 59.6 (12.9) ageZ -0.0 (1.0) ------------------------ ``` ] .pull-right[ ```r cancer_clean %>% dplyr::mutate(ageZ = scale(age)) %>% ggplot(aes(ageZ)) + geom_histogram(bins = 14) ``` <img src="ch4_z_scores_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" /> ] --- class: inverse, center, middle # Questions? --- class: inverse, center, middle # Next Topic ### Intro to Hypothesis Testing: 1 Sample z-test