![]() If 6 out of 10 Komodo dragon eggs raised at 30 ☌ were female, and 15 out of 30 eggs raised at 32☌ were female, the 60% female at 30☌ and 50% at 32☌ would get equal weight in a linear regression, which is inappropriate. One problem is that linear regression treats all of the proportions equally, even if they are based on much different sample sizes. This is not horrible, but it's not strictly correct. Often the proportions are arc-sine transformed, because that makes the distributions of proportions more normal. When there are multiple observations of the nominal variable for each value of the measurement variable, as in the Komodo dragon example, you'll often sees the data analyzed using linear regression, with the proportions treated as a second measurement variable. It would be silly to compare the mean incubation temperatures between male and female hatchlings, and test the difference using an anova or t–test, because the incubation temperature does not depend on the sex of the offspring you've set the incubation temperature, and if there is a relationship, it's that the sex of the offspring depends on the temperature. You raise 10 eggs at 30 ☌, 30 eggs at 32☌, 12 eggs at 34☌, etc., then determine the sex of the hatchlings. For example, let's say you are studying the effect of incubation temperature on sex determination in Komodo dragons. However, if you wanted to predict the probability that a 55-year-old woman with a particular cholesterol level would have a heart attack in the next ten years, so that doctors could tell their patients "If you reduce your cholesterol by 40 points, you'll reduce your risk of heart attack by X%," you would have to use logistic regression.Īnother situation that calls for logistic regression, rather than an anova or t–test, is when you determine the values of the measurement variable, while the values of the nominal variable are free to vary. those who didn't, and that would be a perfectly reasonable way to test the null hypothesis that cholesterol level is not associated with heart attacks if the hypothesis test was all you were interested in, the t–test would probably be better than the less-familiar logistic regression. You could do a two-sample t–test, comparing the cholesterol levels of the women who did have heart attacks vs. For example, imagine that you had measured the cholesterol level in the blood of a large number of 55-year-old women, then followed up ten years later to see who had had a heart attack. One clue is that logistic regression allows you to predict the probability of the nominal variable. You can also analyze data with one nominal and one measurement variable using a one-way anova or a Student's t–test, and the distinction can be subtle. Because this species is endangered, another goal would be to find an equation that would predict the probability of a wolf spider population surviving on a beach with a particular sand grain size, to help determine which beaches to reintroduce the spider to. One goal of this study would be to determine whether there was a relationship between sand grain size and the presence or absence of the species, in hopes of understanding more about the biology of the spiders. Spider presence or absence is the dependent variable if there is a relationship between the two variables, it would be sand grain size affecting spiders, not the presence of spiders affecting the sand. Sand grain size is a measurement variable, and spider presence or absence is a nominal variable. (2006) measured sand grain size on 28 beaches in Japan and observed the presence or absence of the burrowing wolf spider Lycosa ishikariana on each beach. Grain sizeĪs an example of simple logistic regression, Suzuki et al. One goal is to see whether the probability of getting a particular value of the nominal variable is associated with the measurement variable the other goal is to predict the probability of getting a particular value of the nominal variable, given the measurement variable. Simple logistic regression is analogous to linear regression, except that the dependent variable is nominal, not a measurement. ![]() Many people lump all logistic regression together, but I think it's useful to treat simple logistic regression separately, because it's simpler. ![]() I'm separating simple logistic regression, with only one independent variable, from multiple logistic regression, which has more than one independent variable. ![]() The nominal variable is the dependent variable, and the measurement variable is the independent variable. ![]() Use simple logistic regression when you have one nominal variable with two values (male/female, dead/alive, etc.) and one measurement variable. Use simple logistic regression when you have one nominal variable and one measurement variable, and you want to know whether variation in the measurement variable causes variation in the nominal variable. ![]()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |