
unneeded dummy variable

Posted: Fri Nov 10, 2017 9:02 am
by Williambert
How do I get rid of an unneeded dummy variable/response category? I entered a categorical variable into my model as an explanatory variable, and although there are no observations in one of the categories of this variable, a dummy variable for this category was still included in the model. How can I remove it?

Re: unneeded dummy variable

Posted: Fri Nov 10, 2017 9:40 am
by ChrisCharlton
If you highlight the column in the names window and click the regenerate button then it will remove any categories that do not have any associated data and add categories if there are values that do not have a label. Note that you will need to remove the variable from the model before doing this, and then add it back again afterwards.
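To make the effect of regenerating concrete, here is a minimal Python sketch of the logic described above. It is not MLwiN code. The function name `regenerate_categories`, the code-to-label dictionary, and the choice to label a new category with its own value are all assumptions made for illustration.

```python
def regenerate_categories(values, labels):
    """Return a new code -> label mapping that matches the data.

    values: category codes found in the column (None marks a missing value)
    labels: dict mapping category code -> label text
    """
    observed = {v for v in values if v is not None}
    # Keep labels only for codes that actually occur in the data.
    regenerated = {code: text for code, text in labels.items() if code in observed}
    # Add a default label for any observed code that had no label.
    for code in sorted(observed - set(labels)):
        regenerated[code] = str(code)
    return regenerated


column = [1, 1, 2, 4, 2, 1]
labels = {1: "low", 2: "medium", 3: "high"}  # category 3 has no data
new_labels = regenerate_categories(column, labels)
print(new_labels)
```

In this sketch category 3 is dropped, so its label text "high" is discarded, and the unlabelled value 4 gains a category of its own.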

Re: unneeded dummy variable

Posted: Tue Dec 05, 2017 1:57 pm
by Kurthens
ChrisCharlton wrote: Fri Nov 10, 2017 9:40 am If you highlight the column in the names window and click the regenerate button then it will remove any categories that do not have any associated data and add categories if there are values that do not have a label. Note that you will need to remove the variable from the model before doing this, and then add it back again afterwards.
Hi Chris, could this somehow be added as a feature, so it automatically does this?

Re: unneeded dummy variable

Posted: Thu Dec 07, 2017 10:39 am
by ChrisCharlton
Unfortunately this wouldn't be very practical as this would require all the categories to be checked against the data every time the data in the column changed. It would also mean potentially losing category labelling that had been entered by the user, as once categories are removed there would be no way to know what their text was previously.

One possible change that might work, however, is to only include categories that actually have associated data when adding dummies to the model. This could still have complications related to missing data, though: the variable may have data for a category, but those rows are excluded from the model because they have missing data in other variables.
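The missing-data complication can be shown with a small Python sketch. Again, this is not MLwiN code. It assumes listwise deletion, meaning rows with a missing value in any model variable are dropped, and the function `dummy_columns` and the variable names `group` and `x` are hypothetical.

```python
def dummy_columns(rows, cat_var, other_vars):
    """Build 0/1 dummies for categories present among complete cases only."""
    model_vars = [cat_var, *other_vars]
    # Listwise deletion: keep rows with no missing value in any model variable.
    complete = [r for r in rows if all(r[v] is not None for v in model_vars)]
    categories = sorted({r[cat_var] for r in complete})
    dummies = {
        c: [1 if r[cat_var] == c else 0 for r in complete] for c in categories
    }
    return categories, dummies


rows = [
    {"group": "a", "x": 1.0},
    {"group": "b", "x": 2.5},
    {"group": "c", "x": None},  # only row in category "c", but x is missing
    {"group": "a", "x": 0.3},
]

# Categories that have data in the column itself.
in_column = sorted({r["group"] for r in rows if r["group"] is not None})
# Categories that still have data once incomplete rows are dropped.
in_model, dummies = dummy_columns(rows, "group", ["x"])
print(in_column)
print(in_model)
```

Here category "c" has data in the column but none among the rows the model can use, so checking the column alone would still produce an empty dummy.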