Multiple regression: R splits Variable into multiple
Hey there, I want to explore the effect of Age and Gender on the points of a test via multiple linear regression (MLR). Yet when I type
model <- lm(punkte~ Age + Gender, data = df)
R gives me the following results:
(Intercept)   5.677369   0.176482  32.170  < 2e-16 ***
Age          -0.017953   0.004932  -3.640 0.000300 ***
GenderFemale  0.595369   0.154697   3.849 0.000134 ***
GenderDivers -1.416150   0.684191  -2.070 0.038964 *  
But I don't want the Gender variable to be split into multiple coefficients. Also, GenderMale is missing and I don't know why. Help would be very much appreciated.
Solution 1:[1]
"Male" is missing because the model uses it as the reference level of the categorical variable Gender. The GenderFemale and GenderDivers coefficients are the estimated differences from "Male" at the same Age.
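To see where the extra coefficients come from, here is a minimal sketch on made-up data (the values and the explicit level order are assumptions, not the asker's data). It prints the design matrix columns that lm() builds from a three-level factor: one 0/1 column per non-reference level, and none for the reference level.

```r
# Toy data (hypothetical values); "Male" is deliberately placed first
# so that it becomes the reference level, as in the question's output.
df <- data.frame(
  Gender = factor(c("Male", "Female", "Divers", "Female", "Male"),
                  levels = c("Male", "Female", "Divers"))
)

# The first factor level is the reference category.
ref_level <- levels(df$Gender)[1]
print(ref_level)

# Design matrix used for a formula containing Gender:
# an intercept plus one indicator column per non-reference level.
X <- model.matrix(~ Gender, data = df)
print(colnames(X))
```

The reference level is absorbed into the intercept, which is why no GenderMale row shows up in the summary.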
You can always change the reference level with something like:
df <- within(df, Gender <- relevel(factor(Gender), ref = "Female"))
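As a rough illustration of what releveling changes, the sketch below fits the question's formula on simulated data before and after switching the reference level. The data are randomly generated for this example only, and the column names Age, Gender and punkte are taken from the question's model call.

```r
set.seed(1)

# Simulated stand-in for the asker's data (not real results).
df <- data.frame(
  Age    = sample(18:70, 60, replace = TRUE),
  Gender = factor(rep(c("Male", "Female", "Divers"), each = 20),
                  levels = c("Male", "Female", "Divers"))
)
df$punkte <- 5 - 0.02 * df$Age + rnorm(60)

# With "Male" as reference, Female and Divers get their own coefficients.
before <- names(coef(lm(punkte ~ Age + Gender, data = df)))
print(before)

# Make "Female" the reference level and refit.
df <- within(df, Gender <- relevel(factor(Gender), ref = "Female"))
after <- names(coef(lm(punkte ~ Age + Gender, data = df)))
print(after)
```

Releveling only changes which group the other groups are compared against; the fitted values of the model stay the same.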
You can only merge "Female" and "Divers" into a single coefficient by recoding the data itself (and normally we don't do that). For example, combine those two into one category such as "non-male" or "others".
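If you do decide to collapse the categories, one possible way is sketched below on toy data. The new column name Gender2 and the label "Other" are hypothetical choices for this example.

```r
# Toy data (hypothetical values, not the asker's data).
df <- data.frame(
  Gender = c("Male", "Female", "Divers", "Female", "Male"),
  stringsAsFactors = FALSE
)

# Recode into a two-level factor: "Male" versus everything else.
# Gender2 is an illustrative name; the original column is left untouched.
df$Gender2 <- factor(ifelse(df$Gender == "Male", "Male", "Other"),
                     levels = c("Male", "Other"))

print(table(df$Gender2))
```

Using Gender2 in place of Gender in the model formula would then yield a single Gender2Other coefficient, at the cost of no longer distinguishing "Female" from "Divers".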
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source | 
|---|---|
| Solution 1 | ElleryC | 
