Introduction

Chapter 18 covered several extensions of linear regression in the form of GLMs. At the end of the chapter, additional linear models were discussed, including multilevel models and structural equation models. The following examples illustrate how these approaches can be used in research with R.

  1. Let’s start by loading the tidyverse package, the furniture package, and the educ7610 package.
library(tidyverse)
library(furniture)
library(educ7610)
  1. Import the comp and gss data sets found in educ7610 into R.
data(comp)
data(gss)

Generalized Linear Models

Logistic Regression

  1. Let’s look at the comp data set first. Do the size of the company (size) and the leadership of the CEO (ceo) predict whether or not the company had a profit (profit; dichotomous, where 1 = had a profit and 0 = otherwise)? Let’s use a logistic regression to understand the relationship.
logit_fit <- glm(profit ~ size + ceo, data = comp, family = binomial(link = "logit"))
summary(logit_fit)
## 
## Call:
## glm(formula = profit ~ size + ceo, family = binomial(link = "logit"), 
##     data = comp)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.5066  -0.3870  -0.1962   0.3858   1.3746  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  -8.1491     3.7994  -2.145  0.03197 * 
## size          4.6237     1.7179   2.692  0.00711 **
## ceo           1.7899     0.9624   1.860  0.06292 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 33.104  on 23  degrees of freedom
## Residual deviance: 13.691  on 21  degrees of freedom
## AIC: 19.691
## 
## Number of Fisher Scoring iterations: 6
  1. What do the coefficients from this model mean here? Is this in an intuitive metric?
  2. We can get more interpretable effects using Odds Ratios and Average Marginal Effects (AME). First, we’ll get the odds ratios using the educ7610::oddsratios() function.
oddsratios(logit_fit)
##                       OR        2.5 %       97.5 %
## (Intercept) 2.890071e-04 1.536655e-08 1.488494e-01
## size        1.018716e+02 6.846841e+00 1.079402e+04
## ceo         5.988990e+00 1.097680e+00 6.327124e+01
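The same odds ratios can be reproduced by hand by exponentiating the coefficients and their confidence intervals; a minimal sketch (the interval method may differ slightly from the one oddsratios() uses):

# Odds ratios and 95% CIs by hand: exponentiate the log-odds coefficients and their CIs
exp(cbind(OR = coef(logit_fit), confint(logit_fit)))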
  1. Interpret the odds ratios and the confidence intervals of these odds ratios in context of the research question.
  2. Now we’ll get the AMEs. We’ll use the margins package again.
library(margins)
margins(logit_fit) %>%
  summary()
##  factor    AME     SE       z      p  lower  upper
##     ceo 0.1509 0.0651  2.3186 0.0204 0.0233 0.2784
##    size 0.3897 0.0334 11.6568 0.0000 0.3242 0.4552
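For intuition, the AME of a numeric predictor in a logit model is the average of the observation-level slopes on the probability scale, which for a logit is beta * p * (1 - p). A minimal sketch for size (this hand computation should be close to what margins() reports):

# Average marginal effect of size by hand: average the slope on the probability scale
p <- predict(logit_fit, type = "response")
mean(coef(logit_fit)["size"] * p * (1 - p))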
  1. Interpret these AMEs in context of the research question.
  2. Overall, the AME is the more reasonable estimate here, due to an issue with the data. Specifically, logistic regression runs into trouble when the outcome is perfectly (or nearly perfectly) predicted by a predictor, a problem known as separation. If we use furniture::tableX(), we can see that size and profit are nearly perfectly related. When this happens, the odds ratios can get a bit crazy (huge estimates with extremely wide confidence intervals).
comp %>%
  tableX(size, profit)
##        profit
## size    0  1  Total
##   0     12 2  14   
##   1     1  9  10   
##   Total 13 11 24
  1. Let’s check the diagnostics of this logistic regression model.
diagnostics(logit_fit) %>%
  round(3)
##    profit size ceo   pred  resid dfb.1_ dfb.size dfb.ceo  dffit cov.r
## 1       1    1 3.4  2.560  0.386 -0.064    0.141   0.065  0.170 1.248
## 2       0    0 2.7 -3.316 -0.267 -0.081    0.066   0.070 -0.090 1.220
## 3       0    0 3.2 -2.421 -0.413 -0.121    0.117   0.093 -0.162 1.217
## 4       1    1 2.7  1.307  0.692  0.085    0.313  -0.087  0.538 1.312
## 5       1    1 3.4  2.560  0.386 -0.064    0.141   0.065  0.170 1.248
## 6       1    1 3.8  3.276  0.272 -0.056    0.083   0.058  0.097 1.228
## 7       1    1 4.2  3.992  0.191 -0.040    0.047   0.041  0.056 1.211
## 8       1    1 3.4  2.560  0.386 -0.064    0.141   0.065  0.170 1.248
## 9       0    1 3.7  3.097 -2.507  0.710   -1.130  -0.730 -1.321 0.158
## 10      1    1 4.1  3.813  0.209 -0.044    0.054   0.045  0.064 1.215
## 11      1    1 4.5  4.529  0.146 -0.028    0.030   0.029  0.037 1.199
## 12      1    0 4.3 -0.452  1.375 -0.263   -0.256   0.516  1.150 0.832
## 13      1    0 4.8  0.443  0.996 -0.711    0.034   0.947  1.309 1.297
## 14      0    0 3.2 -2.421 -0.413 -0.121    0.117   0.093 -0.162 1.217
## 15      0    0 2.6 -3.495 -0.244 -0.073    0.058   0.064 -0.079 1.218
## 16      0    0 2.9 -2.958 -0.318 -0.098    0.084   0.082 -0.114 1.222
## 17      0    0 3.4 -2.063 -0.489 -0.129    0.141   0.088 -0.204 1.208
## 18      0    0 3.1 -2.600 -0.378 -0.114    0.105   0.091 -0.144 1.219
## 19      0    0 2.4 -3.853 -0.205 -0.058    0.044   0.052 -0.061 1.213
## 20      0    0 2.7 -3.316 -0.267 -0.081    0.066   0.070 -0.090 1.220
## 21      1    1 2.1  0.233  1.080  1.017    0.519  -1.046  1.897 1.301
## 22      0    0 4.3 -0.452 -0.992  0.181    0.176  -0.354 -0.790 1.122
## 23      0    0 2.3 -4.032 -0.188 -0.052    0.039   0.046 -0.054 1.210
## 24      0    0 3.2 -2.421 -0.413 -0.121    0.117   0.093 -0.162 1.217
##    cook.d   hat
## 1   0.003 0.105
## 2   0.001 0.067
## 3   0.003 0.087
## 4   0.036 0.233
## 5   0.003 0.105
## 6   0.001 0.074
## 7   0.000 0.052
## 8   0.003 0.105
## 9   0.702 0.080
## 10  0.000 0.057
## 11  0.000 0.039
## 12  0.206 0.232
## 13  0.223 0.389
## 14  0.003 0.087
## 15  0.001 0.063
## 16  0.002 0.075
## 17  0.005 0.096
## 18  0.002 0.083
## 19  0.000 0.055
## 20  0.001 0.067
## 21  0.466 0.479
## 22  0.083 0.232
## 23  0.000 0.051
## 24  0.003 0.087
par(mfrow = c(2,2))
plot(logit_fit)

  1. These diagnostics will look different from those in linear regression due to the dichotomous outcome. However, some of the weirdness in these diagnostics (especially “Residuals vs Fitted” and “Scale-Location”) has to do with the very small sample size for a logistic regression. Generally, we want a much larger sample size to use logistic regression.
  2. Overall, we likely have an issue with normality. Point “9” may be problematic based on its Cook’s distance. This could be more of a problem of misspecification than anything else. What variables could we be omitting here?

Poisson Regression

  1. To see how Poisson regression works, we’ll use tvhours from the gss data set. Does education (educ) predict the number of hours spent watching TV per day (tvhours) when controlling for age (age) and sex (sex)?
pois_fit <- glm(tvhours ~ educ + age + sex, data = gss, family = poisson(link = "log"))
summary(pois_fit)
## 
## Call:
## glm(formula = tvhours ~ educ + age + sex, family = poisson(link = "log"), 
##     data = gss)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.0091  -1.0097  -0.3247   0.4255   7.8367  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  1.4005132  0.0845090  16.572  < 2e-16 ***
## educ        -0.0458439  0.0049655  -9.232  < 2e-16 ***
## age          0.0069276  0.0008975   7.719 1.18e-14 ***
## sexMale     -0.0813783  0.0320024  -2.543    0.011 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 2487.2  on 1323  degrees of freedom
## Residual deviance: 2317.2  on 1320  degrees of freedom
##   (699 observations deleted due to missingness)
## AIC: 5843.5
## 
## Number of Fisher Scoring iterations: 5
  1. What does the coefficient on educ mean here?
  2. We can also get risk ratios for Poisson regression with educ7610::riskratios(). Note that the risk ratio and odds ratio functions do exactly the same thing: they grab the coefficients and confidence intervals and exponentiate them.
riskratios(pois_fit)
##                    RR     2.5 %    97.5 %
## (Intercept) 4.0572818 3.4376898 4.7871254
## educ        0.9551911 0.9459294 0.9645026
## age         1.0069517 1.0051795 1.0087223
## sexMale     0.9218449 0.8657443 0.9814656
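Because the Poisson model is multiplicative on the count scale, the effect of a larger change in a predictor comes from exponentiating the coefficient times that change; a small sketch for education (the four-year contrast is just an illustrative choice):

# Multiplicative change in expected TV hours for one more year of education...
exp(coef(pois_fit)["educ"])
# ...and for four more years, assuming the effect is linear on the log scale
exp(coef(pois_fit)["educ"] * 4)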
  1. Interpret the risk ratios in context of the research question.
  2. Let’s also get the AMEs.
margins(pois_fit) %>% 
  summary()
##   factor     AME     SE       z      p   lower   upper
##      age  0.0207 0.0027  7.6609 0.0000  0.0154  0.0259
##     educ -0.1367 0.0150 -9.1344 0.0000 -0.1660 -0.1074
##  sexMale -0.2420 0.0950 -2.5487 0.0108 -0.4282 -0.0559
  1. Interpret these AMEs in context of the research question.

Ordinal Logistic Regression

  1. To consider how to use ordinal logistic regression, we are going to split the income variable (income06) into three discrete categories.
gss3 <- gss %>%
  mutate(inc_cat = cut_number(income06, 3),
         inc_cat = factor(inc_cat, labels = c("Low", "Mid", "High")))
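A quick sanity check on the new variable (cut_number() should produce roughly equal-sized groups):

# How many respondents fall in each income category (including missing values)
table(gss3$inc_cat, useNA = "ifany")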
  1. Using this new variable, let’s use an ordinal logistic regression to test whether educ and hompop predict this categorical income variable (inc_cat). We will use the glm_olr() function from the educ7610 package. Note that this function is a wrapper around MASS::polr(); see the polr() documentation for more information.
olr_fit <- glm_olr(inc_cat ~ educ + hompop, data = gss3)
olr_fit
## Ordinal Logistic Regression
##            Value Std. Error t value    p value
## educ     0.29783   0.017622  16.901 4.4103e-64
## hompop   0.34114   0.033895  10.065 7.9166e-24
## Low|Mid  4.08444   0.264444  15.445 8.1029e-54
## Mid|High 6.01402   0.286829  20.967 1.3060e-97
## ---
  1. This gives us the coefficients of the model (both educ and hompop) and the “cut-offs” or “thresholds” of the model (Low|Mid and Mid|High). Generally, we only interpret the coefficients; the thresholds are needed for the model to work but are rarely of substantive interest. With this in mind, what is the interpretation of the coefficients here?
  2. We can get the odds ratios of this model.
oddsratios(olr_fit)
##              OR    2.5 %   97.5 %
## educ   1.346929 1.301701 1.394819
## hompop 1.406546 1.316680 1.503844
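Predicted probabilities offer another way to interpret an ordinal model. Since glm_olr() wraps MASS::polr(), one option is to refit the model with polr() directly and use its predict() method; a minimal sketch (refitting is only needed because the object returned by glm_olr() may differ from a polr object):

library(MASS)   # note: loading MASS masks dplyr::select(), hence the explicit dplyr::select() calls later on
olr_fit2 <- polr(inc_cat ~ educ + hompop, data = gss3, Hess = TRUE)
# Predicted probability of Low, Mid, and High income for the first few respondents
head(predict(olr_fit2, type = "probs"))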
  1. What effect does educ have on the ordinal income variable? What about hompop?

Other Linear Models

We will provide a very brief introduction to the following methods: survival analysis, time series analysis, multilevel modeling, and structural equation modeling.

Note that you should seek more training before using these methods without guidance (i.e., you’ll know just enough to be dangerous!).

Survival Analysis

  1. Survival analysis is for “censored” data: data where the outcome is an event that may or may not have happened before the end of the study. Instead of assuming that an event that had not happened by the end of the study would never happen, survival analysis lets us model time-to-event outcomes without that assumption. To do a survival analysis, we will need the following packages, the bmt data set, and a few minor variable adjustments.
library(survival)
library(OIsurv)
library(ranger)

data(bmt)
bmt2 <- bmt %>%
  dplyr::select(-c(t2,d2,d3)) %>%
  dplyr::mutate(group = factor(group),
                da = factor(da),
                dc = factor(dc),
                dp = factor(dp),
                z3 = factor(z3),
                z4 = factor(z4),
                z5 = factor(z5),
                z6 = factor(z6),
                z8 = factor(z8),
                z9 = factor(z9),
                z10 = factor(z10)
)
  1. To understand survival analysis, we will use two of its most important tools: 1) the Kaplan-Meier (KM) plot and 2) the Cox proportional hazards model. First, the KM plot: we want to see whether survival differs across the groups. There’s some data manipulation happening in this chunk, so feel free to ignore that and focus on the output.
fit <- bmt2 %>% 
  filter(complete.cases(t1,d1,group)) %>%
  survfit(Surv(t1,factor(d1)) ~ group, data = .)

surv_curve <- function(survfit){
  df <- survfit$pstate %>%
    data.frame %>%
    setNames(c("left", "remaining")) %>%
    dplyr::mutate(group = rep(seq_along(survfit$n), 
                              times = survfit$strata)) %>%
    dplyr::mutate(time = survfit$time)
  
  ggplot(df, aes(time, remaining, group = factor(group), color = factor(group))) +
    geom_line() +
    geom_point(alpha = .2, shape = 3) +
    theme_bw() +
    labs(y = "Survival Rate",
         x = "Time",
         color = "Group") +
    coord_cartesian(ylim = c(0,1))
  
}
surv_curve(fit)

  1. What does this plot tell us about the survival in each group? Which group has the best survival rate?
  2. Now let’s use a Cox Proportional Hazards model to understand the differences in the risk across the groups.
bmt2 %>%
  coxph(Surv(t1,d1) ~ group, data = .)
## Call:
## coxph(formula = Surv(t1, d1) ~ group, data = .)
## 
##           coef exp(coef) se(coef)      z      p
## group2 -0.6552    0.5193   0.2937 -2.231 0.0257
## group3  0.3702    1.4480   0.2674  1.384 0.1662
## 
## Likelihood ratio test=15  on 2 df, p=0.0005539
## n= 137, number of events= 81
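The coefficients are on the log-hazard scale, so exponentiating them (the exp(coef) column) gives hazard ratios; confidence intervals for the hazard ratios can be obtained the same way. A minimal sketch:

cox_fit <- coxph(Surv(t1, d1) ~ group, data = bmt2)
# Hazard ratios for groups 2 and 3 relative to group 1, with 95% confidence intervals
exp(cbind(HR = coef(cox_fit), confint(cox_fit)))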
  1. Are group 2 or 3 significantly different than group 1 (the reference group) in risk?
  2. We can also use car::linearHypothesis() to test whether groups 2 and 3 differ from each other.
bmt2 %>%
  coxph(Surv(t1,d1) ~ group, data = .) %>%
  car::linearHypothesis("group2 = group3")
## Linear hypothesis test
## 
## Hypothesis:
## group2 - group3 = 0
## 
## Model 1: restricted model
## Model 2: Surv(t1, d1) ~ group
## 
##   Res.Df Df  Chisq Pr(>Chisq)    
## 1    136                         
## 2    135  1 14.219  0.0001627 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  1. Are they significantly different? Is this surprising given the coefficients from the model?
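One additional check worth knowing about: the Cox model assumes proportional hazards, and that assumption can be tested with survival::cox.zph(); a minimal sketch:

# Test of the proportional hazards assumption
# (non-significant p-values are consistent with proportional hazards)
bmt2 %>%
  coxph(Surv(t1, d1) ~ group, data = .) %>%
  cox.zph()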

Time Series Analysis

  1. Because time series is not used extensively in most of our fields, we won’t cover an example here. Know, however, that many of the methods used are deeply related to linear regression as we’ve discussed it.

Multilevel Modeling

  1. Multilevel models are particularly helpful when we have issues of non-independence. For example, kids nested within classes are not independent, especially if an intervention is delivered to the class as a whole. Another example is longitudinal data, where time points are nested within individuals. In these situations, we have a couple of options. We could estimate an effect for each classroom or each individual, but this eats up degrees of freedom, and we are rarely interested in whether classroom 1 differs significantly from classroom 2 or whether Susan specifically performed better than Ralphy. Still, we don’t want these differences to affect the estimates we do care about.
  2. One way around this is to use what are called random effects. These differ from what we normally estimate (fixed effects) because they don’t estimate a separate coefficient for each group, yet they still account for the variation produced by the nesting.
  3. Let’s use the lme4 and lmerTest packages to estimate these types of models. We’ll also use the InstEval data from lme4, where students are nested within classes (specifically, we have the professor, not the class).
library(lme4)
library(lmerTest)
data(InstEval)
InstEval <- InstEval %>%
  mutate(studage = factor(studage, labels = c("Youngest", "Young-Mid", "Old-Mid", "Oldest"),
                          ordered = FALSE),
         lectage = factor(lectage, labels = c("Youngest", "Young", "Mid", "Old-Mid", "Old", "Oldest"),
                          ordered = FALSE))
str(InstEval)
## 'data.frame':    73421 obs. of  7 variables:
##  $ s      : Factor w/ 2972 levels "1","2","3","4",..: 1 1 1 1 2 2 3 3 3 3 ...
##  $ d      : Factor w/ 1128 levels "1","6","7","8",..: 525 560 832 1068 62 406 3 6 19 75 ...
##  $ studage: Factor w/ 4 levels "Youngest","Young-Mid",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ lectage: Factor w/ 6 levels "Youngest","Young",..: 2 1 2 2 1 1 1 1 1 1 ...
##  $ service: Factor w/ 2 levels "0","1": 1 2 1 2 1 1 2 1 1 1 ...
##  $ dept   : Factor w/ 14 levels "15","5","10",..: 14 5 14 12 2 2 13 3 3 3 ...
##  $ y      : int  5 2 5 3 2 4 4 5 5 4 ...
  1. Let’s look at the effect of student age (studage), lecturer/professor age (lectage), and whether the class was in the lecturer’s/professor’s own department (service) on the rating students gave of the class (y), while controlling for the nesting of students (s), professors (d), and departments (dept). To control for the nesting, we use a new notation, + (1 | var), which tells the model to control for var in the simplest way (a random intercept for each level of var).
fit_mlm <- lmer(y ~ studage + lectage + service + (1 | s) + (1 | d) + (1 | dept),
                data = InstEval)
summary(fit_mlm)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: y ~ studage + lectage + service + (1 | s) + (1 | d) + (1 | dept)
##    Data: InstEval
## 
## REML criterion at convergence: 237615.8
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -3.09054 -0.74589  0.04066  0.77162  3.15582 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  s        (Intercept) 0.106724 0.32669 
##  d        (Intercept) 0.260800 0.51069 
##  dept     (Intercept) 0.006925 0.08321 
##  Residual             1.383442 1.17620 
## Number of obs: 73421, groups:  s, 2972; d, 1128; dept, 14
## 
## Fixed effects:
##                    Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)       3.291e+00  3.325e-02  2.604e+01  98.976  < 2e-16 ***
## studageYoung-Mid  5.197e-02  2.318e-02  4.090e+03   2.242  0.02502 *  
## studageOld-Mid    7.216e-02  2.398e-02  4.533e+03   3.010  0.00263 ** 
## studageOldest     1.363e-01  2.639e-02  5.063e+03   5.166 2.48e-07 ***
## lectageYoung     -8.074e-02  1.538e-02  6.834e+04  -5.249 1.53e-07 ***
## lectageMid       -1.102e-01  1.672e-02  7.205e+04  -6.588 4.49e-11 ***
## lectageOld-Mid   -1.892e-01  1.960e-02  6.845e+04  -9.650  < 2e-16 ***
## lectageOld       -1.644e-01  2.141e-02  6.859e+04  -7.681 1.59e-14 ***
## lectageOldest    -2.460e-01  2.049e-02  5.807e+04 -12.009  < 2e-16 ***
## service1         -7.278e-02  1.348e-02  3.994e+04  -5.399 6.73e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) stdY-M stdO-M stdgOl lctgYn lctgMd lctO-M lectgOld
## studgYng-Md -0.322                                                   
## studgOld-Md -0.353  0.517                                            
## studagOldst -0.346  0.491  0.575                                     
## lectageYong -0.195 -0.009 -0.023 -0.025                              
## lectageMid  -0.034 -0.197 -0.236 -0.259  0.342                       
## lectgOld-Md -0.057 -0.170 -0.209 -0.234  0.473  0.403                
## lectageOld   0.000 -0.136 -0.274 -0.283  0.300  0.417  0.374         
## lectagOldst  0.004 -0.144 -0.297 -0.373  0.406  0.447  0.498  0.490  
## service1    -0.142  0.017  0.056  0.054 -0.033 -0.044 -0.063 -0.072  
##             lctgOlds
## studgYng-Md         
## studgOld-Md         
## studagOldst         
## lectageYong         
## lectageMid          
## lectgOld-Md         
## lectageOld          
## lectagOldst         
## service1    -0.121
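Beyond the fixed effects, the random-effect variances can be summarized as intraclass correlations (ICCs), the share of the total variance attributable to each grouping factor; a small sketch computed from the fitted model’s variance components:

# Proportion of total variance due to students (s), professors (d), departments (dept), and residual
vc <- as.data.frame(VarCorr(fit_mlm))
data.frame(group = vc$grp, icc = round(vc$vcov / sum(vc$vcov), 3))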
  1. Interpretation of the coefficients is similar to linear regression. What is the effect of student age here (hint: look at the Fixed effects)? Is it significant? Note that the p-values here rely on an approximation of the degrees of freedom called the Satterthwaite approximation.

Structural Equation Modeling

  1. SEM is a complex set of tools that encompasses a wide range of research questions. This will be only the briefest of introductions. To do so, we will use the main SEM package, lavaan (short for LAtent VAriable ANalysis), with the FacialBurns data set provided therein.
library(lavaan)
data(FacialBurns)
str(FacialBurns)
## 'data.frame':    77 obs. of  6 variables:
##  $ Selfesteem: int  35 40 38 30 27 35 35 28 29 26 ...
##  $ HADS      : int  6 2 2 4 11 5 4 13 19 16 ...
##  $ Age       : int  23 61 34 29 46 18 64 20 47 41 ...
##  $ TBSA      : num  3.5 6 8 6 19.2 ...
##  $ RUM       : num  5 4 4 5 13 8 6 8 13 8 ...
##  $ Sex       : int  1 1 1 1 1 1 1 1 1 1 ...
  1. To use lavaan, we create a model specification that is just a long string, as shown below. Here, we look at the effect of age (Age), sex (Sex), rumination (RUM), anxiety (HADS), and burned surface area (TBSA) on self-esteem (Selfesteem).
model <- "
          ## Just using SEM for a multiple regression
          Selfesteem ~ Age + Sex + RUM + HADS + TBSA
"
sem(model, data = FacialBurns) %>% summary()
## lavaan 0.6-3 ended normally after 19 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                          6
## 
##   Number of observations                            77
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       0.000
##   Degrees of freedom                                 0
##   Minimum Function Value               0.0000000000000
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   Selfesteem ~                                        
##     Age               0.029    0.035    0.824    0.410
##     Sex               2.262    1.268    1.784    0.074
##     RUM              -0.374    0.170   -2.194    0.028
##     HADS             -0.421    0.074   -5.668    0.000
##     TBSA             -0.127    0.076   -1.683    0.092
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .Selfesteem       17.004    2.740    6.205    0.000
  1. What is the effect of anxiety on self-esteem? Is it significant?
  2. We can also get standardized effect sizes using the following. The standardized effects are in the Std.all column, which means both the outcome and the predictors were standardized.
sem(model, data = FacialBurns) %>% summary(standardized = TRUE)
## lavaan 0.6-3 ended normally after 19 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                          6
## 
##   Number of observations                            77
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       0.000
##   Degrees of freedom                                 0
##   Minimum Function Value               0.0000000000000
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Selfesteem ~                                                          
##     Age               0.029    0.035    0.824    0.410    0.029    0.070
##     Sex               2.262    1.268    1.784    0.074    2.262    0.165
##     RUM              -0.374    0.170   -2.194    0.028   -0.374   -0.235
##     HADS             -0.421    0.074   -5.668    0.000   -0.421   -0.529
##     TBSA             -0.127    0.076   -1.683    0.092   -0.127   -0.162
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Selfesteem       17.004    2.740    6.205    0.000   17.004    0.524
  1. What is the standardized effect of anxiety on self-esteem? How would you interpret that?
  2. Note that essentially the same results would be obtained using linear regression (the standard errors differ slightly because lavaan uses maximum likelihood rather than least squares):
lm(Selfesteem ~ Age + Sex + RUM + HADS + TBSA,
   data = FacialBurns) %>%
  summary()
## 
## Call:
## lm(formula = Selfesteem ~ Age + Sex + RUM + HADS + TBSA, data = FacialBurns)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.276  -2.725  -0.347   2.773  10.357 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 36.19221    2.08571  17.352  < 2e-16 ***
## Age          0.02896    0.03659   0.792   0.4313    
## Sex          2.26199    1.32033   1.713   0.0910 .  
## RUM         -0.37365    0.17737  -2.107   0.0387 *  
## HADS        -0.42051    0.07726  -5.443 7.09e-07 ***
## TBSA        -0.12744    0.07886  -1.616   0.1105    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.294 on 71 degrees of freedom
## Multiple R-squared:  0.4761, Adjusted R-squared:  0.4392 
## F-statistic: 12.91 on 5 and 71 DF,  p-value: 6.201e-09
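As a quick check on the Std.all column, the same standardized coefficients can be recovered from lm() by standardizing all of the variables first; a minimal sketch:

# Standardized regression coefficients; these should match lavaan's Std.all column
lm(scale(Selfesteem) ~ scale(Age) + scale(Sex) + scale(RUM) + scale(HADS) + scale(TBSA),
   data = FacialBurns) %>%
  coef() %>%
  round(3)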

Conclusion

The approaches closely related to linear regression are wide-reaching, allowing one to model all sorts of outcome variables in various ways. Nearly all of these approaches are similar enough to linear regression that, using what you already know, you can learn whichever ones you need for your research with far less effort than if you did not know linear regression.