MadSci Network: Other

Subject: Normalization variables in regression analysis question

Date: Thu Mar 1 14:23:26 2001
Posted by Tim
Grade level: nonaligned School: not in school
City: Toronto State/Province: Ontario Country: Canada
Area of science: Other
ID: 983474606.Ot

I'm attempting to perform a discriminant analysis on a fairly large data set 
(approx. 440 in my experimental group and 150 in my control group).  I have a 
couple of problems that I need some help with.

I know that multiple regressions depend on variables being continuous and 
normally distributed.  I have a number of dichotomous (yes/no) variables that I 
would like to include in my analysis.  I have coded them as dummy variables 
(0=no, 1=yes).  The problem is that many of them are heavily skewed (80-95% of 
responses are "no").  One text I was reading suggested performing log or square 
root transformations.  While this appears to correct the skew for continuous 
variables (I tested it out on "weight" which is one of the continuous variables 
in my data set), it doesn't have any effect on my dummy variables.  Do you have 
any suggestions as to how I might fix this problem so that these variables come 
closer to normality?  If not, are there any guidelines for how much a variable 
can deviate from normality without having a major impact on the overall analysis?
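
The observation that the transformations have no effect on the dummy variables can be checked with a minimal Python sketch. It assumes a hypothetical dummy variable with 90% "no" responses; the `skewness` helper and the sample values are illustrative and do not come from the data set described above. A 0/1 variable has only two distinct values, so a square root or log(x + 1) transform merely relabels those two values, and the skewness stays the same.

```python
import math

def skewness(values):
    """Population skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical dummy variable: 90 "no" (0) and 10 "yes" (1) responses.
dummy = [0] * 90 + [1] * 10

raw_skew = skewness(dummy)
sqrt_skew = skewness([math.sqrt(v) for v in dummy])
# log(x + 1) is used because log(0) is undefined.
log_skew = skewness([math.log(v + 1) for v in dummy])

# All three values agree up to floating-point rounding.
print(raw_skew, sqrt_skew, log_skew)
```

The skew of such a variable is fixed by the proportion of "yes" responses alone, which is why no monotone transformation can bring it closer to normality.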

I'm also wondering whether you have any suggestions as to how to deal with 
missing values?  The analysis program I'm using (SPSS) removes all cases with 
missing values for any variable.  This reduces my N considerably, even though 
each variable has only a few missing values.
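
The way a few missing values per variable can add up under listwise deletion can be sketched in Python. The 590 cases match the approximate group sizes given above; the 20 variables, the 12 missing values per variable, and the placement of the missing values are assumptions made only for illustration. The missing values are placed so that no case is missing more than one variable, which is the worst case for listwise deletion.

```python
# Hypothetical data set: 590 cases, 20 variables, 12 missing values per
# variable (about 2%). Variable count and missing rate are assumptions.
N_CASES = 590
N_VARS = 20
MISSING_PER_VAR = 12

# Start with complete data, then blank out a different block of cases
# for each variable so that missing values never overlap.
rows = [[1.0] * N_VARS for _ in range(N_CASES)]
for var in range(N_VARS):
    start = var * MISSING_PER_VAR
    for case in range(start, start + MISSING_PER_VAR):
        rows[case][var] = None  # None marks a missing value

missing_per_variable = [sum(row[var] is None for row in rows) for var in range(N_VARS)]

# Listwise deletion keeps only cases with no missing value on any variable.
complete_cases = [row for row in rows if all(value is not None for value in row)]

print(max(missing_per_variable))  # 12 missing values in each variable
print(len(complete_cases))        # 350 cases remain out of 590
```

Under these assumptions each variable is about 98% complete, yet roughly 40% of the cases are dropped. When missing values overlap across cases, the loss is smaller.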

Re: Normalization variables in regression analysis question


MadSci Network,
© 1995-2001. All rights reserved.