Recoding, computation of new variables and missing values

This page is part of the PSPP Guide.

Recode command

Recoding combines several values of a variable into fewer categories. For example, we may want to recode the age of respondents into three categories: up to 25 years, 26 to 50, and older than 50.

A variable can be recoded into the same variable or into a new one. It is usually better to recode into a new variable, because the original values may be needed for later statistical analyses.

To do this, go to Transform – Recode into different variables. Select the variable to recode, set the name and label of the new variable, and click Change. Then click Old and new values, set an old value and its new value, and add this rule to the recode list; repeat for each rule. Click Continue and then OK. The new, recoded variable is computed (you can see it in data view and compute frequencies for it).

Recode command.


Recode - old and new values.
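
The same recoding can also be run from a syntax file. The following is a minimal sketch, assuming the age is stored as whole years in a variable named age and the result goes into a new variable named age_group; both names and the label texts are illustrative assumptions, not taken from the survey file.

* The names age and age_group are assumptions made for this sketch.
RECODE age (LOWEST THRU 25 = 1) (26 THRU 50 = 2) (51 THRU HIGHEST = 3) INTO age_group.
VARIABLE LABELS age_group 'Age in three categories'.
VALUE LABELS age_group 1 'up to 25' 2 '26 to 50' 3 'older than 50'.
* Check the result with a frequency table.
FREQUENCIES /VARIABLES=age_group.

Because of INTO, the original variable stays unchanged. Values that match none of the rules end up as system-missing in the new variable.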


Compute command

With the Compute command we can compute new variables, for instance the age of respondents from their year of birth. To do this, go to Transform – Compute, type the name of the target variable (how_old) and enter the expression (2009 - yearBirth; the survey was conducted in 2009). The type and label of the target variable can also be set directly in this window. Click OK, and the new variable is computed (you can see it in data view and compute frequencies for it).

Compute command.
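
The same computation can be written in a syntax file. This is a small sketch that uses the variable names from the text above; the label text is an assumption.

* Age in the survey year 2009.
COMPUTE how_old = 2009 - yearBirth.
* The label text is an assumption made for this sketch.
VARIABLE LABELS how_old 'Age of respondent in 2009'.
FREQUENCIES /VARIABLES=how_old.

If yearBirth is missing for a respondent, how_old is system-missing for that respondent as well.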


PSPP offers many other operators and functions for computations. For a complete list, please see the manual.
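
As an illustration, the sketch below uses two functions, TRUNC and MEAN. The variables q1, q2 and q3 and both target names are hypothetical and only serve as an example.

* q1, q2 and q3 are hypothetical numeric variables.
* Decade of birth, for example 1975 becomes 1970.
COMPUTE birth_decade = TRUNC(yearBirth / 10) * 10.
* Mean of three items for each respondent.
COMPUTE mean_score = MEAN(q1, q2, q3).
EXECUTE.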

Count command

With the Count command we can count, for each case, how many times one or more specific values occur in a given variable or list of variables. The result is stored in a new variable. This command is not available through the graphical interface, but it can be run from a syntax file. For more information, see the reference manual.

Syntax command that flags respondents who are 40 years old (old40 is 1 for these respondents and 0 for all others):

COUNT old40 = how_old (40).

Syntax command that flags respondents who are 30 or 40 years old (old40 is 1 for these respondents and 0 for all others):

COUNT old40 = how_old (40, 30).
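
Because COUNT works case by case, the number of matching respondents can then be read from a frequency table of the new variable. The sketch below also shows a range of values and a count across several variables; the variables q1, q2 and q3 and the target names in_30s and n_high are hypothetical.

* Number of respondents aged 40 is the frequency of the value 1.
COUNT old40 = how_old (40).
FREQUENCIES /VARIABLES=old40.
* Flag respondents aged 30 to 39.
COUNT in_30s = how_old (30 THRU 39).
* q1, q2 and q3 are hypothetical items, counted when the answer is 4 or 5.
COUNT n_high = q1 q2 q3 (4, 5).
EXECUTE.

With several variables on the right-hand side, the new variable can take values from 0 up to the number of variables listed.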

Missing values

PSPP also includes support for unknown numeric data values. Missing observations are assigned a special value, called the system missing value. This “value” actually indicates the absence of a value; it means that the actual value is unknown. Procedures automatically exclude from analyses those observations or cases that have missing values. Details of missing value exclusion depend on the procedure and can often be controlled by the user. The system missing value exists only for numeric variables. String variables always have a defined value, even if it is only a string of spaces.
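
As an example of such control, the DESCRIPTIVES procedure has a MISSING subcommand. The sketch below, which uses the variables from the Compute section, is only an illustration.

* By default, missing values are excluded separately for each variable.
DESCRIPTIVES /VARIABLES=how_old yearBirth.
* LISTWISE drops a case from all statistics if any listed variable is missing.
DESCRIPTIVES /VARIABLES=how_old yearBirth /MISSING=LISTWISE.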

Variables, whether numeric or string, can have designated user-missing values. Every user-missing value is an actual value for that variable. However, most of the time user-missing values are treated in the same way as the system-missing value. String variables that are wider than a certain width, usually 8 characters (depending on computer architecture), cannot have user-missing values.

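To mark values as missing, you can recode them to the system-missing value with the Recode command, or declare them as user-missing in the Missing Values column of variable view (the MISSING VALUES command in syntax). System-missing values are shown as a dot in PSPP data view.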

Setting missing values.
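
In a syntax file, the two approaches might look like the sketch below. The code 999 for "no answer" is an assumption made for this example; use whatever code your data actually contains.

* Option 1: keep the value 999 in the data but treat it as user-missing.
MISSING VALUES how_old (999).
* Option 2: replace 999 with the system-missing value.
RECODE how_old (999 = SYSMIS).
EXECUTE.

Only one of the two options is needed. The first keeps the original code visible in data view, while the second replaces it with a dot.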


Categories: PSPP
Keywords: PSPP