Types of Variables
For all types of variables (vectors), you may use the c() command
to "concatenate" elements into a vector, the : operator to
generate a sequence of consecutive integer values, the seq() command to
generate a sequence with an arbitrary (possibly non-integer) step, or the rep() function
to repeat a value to a specified length. In addition, you may use the
<- operator to save variables (or any other objects) to the
workspace. For example:
> logic <- c(TRUE, FALSE, TRUE, TRUE, TRUE) # Creates `logic' (5 T/F values).
> var1 <- 10:20 # All integers between 10 and 20.
> var2 <- seq(from = 5, to = 10, by = 0.5) # Sequence from 5 to 10 by
# intervals of 0.5.
> var3 <- rep(NA, length = 20) # 20 `NA' values.
> var4 <- c(rep(1, 15), rep(0, 15)) # 15 `1's followed by 15 `0's.
For the seq() command, you may alternatively specify length.out (which R lets you abbreviate to length) instead of by to create a variable with a specific
number (given by the length.out argument) of evenly spaced
elements.
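As a minimal sketch of the two ways of calling seq(), the lines below build the same sequence once from a requested number of elements and once from an explicit step. The names var5 and var6 are hypothetical and chosen only for this illustration.

```r
# Eleven evenly spaced values from 5 to 10; R works out the step size.
var5 <- seq(from = 5, to = 10, length.out = 11)

# The same sequence written with an explicit step, for comparison.
var6 <- seq(from = 5, to = 10, by = 0.5)

length(var5)            # number of elements in var5
all.equal(var5, var6)   # compares the two sequences up to numerical tolerance
```

Specifying the number of elements is convenient when the endpoints are fixed and the step would otherwise have to be computed by hand.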
- Numeric variables are real numbers and the default
variable class for most dataset values. You can perform any type of
math or logical operation on numeric values. If var1 and var2 are numeric variables of the same length, we can compute:
> var3 <- log(var2) - 2*var1 # Create `var3' using math operations.
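Inf (infinity), -Inf (negative infinity),
NA (missing value), and NaN (not a number) are
special values that most math operations cannot turn into ordinary
numbers: arithmetic on them does not stop with an error, but the
result is usually itself NA, NaN, or infinite. Comparisons involving
Inf and -Inf work as expected, whereas comparisons involving NA or
NaN return NA; use is.na(), is.nan(), or is.finite() to test for
these values. Use as.numeric() to
transform variables into numeric variables. Integers are a special
class of numeric variable.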
- Logical variables contain values of either
TRUE or FALSE. R supports the following logical
operators: ==, exactly equals; >, greater than; <,
less than; >=, greater than or equals; <=, less than or
equals; and !=, not equals. The = symbol is not
a logical operator; it is used for assignment and for naming
function arguments. Refer to the section on
logical operators for more detail. If var1 and var2 both have n
observations, commands such as
> var3 <- var1 < var2
> var3 <- var1 == var2
create n TRUE/FALSE observations such that the ith
observation in var3 evaluates whether the logical statement is
true for the ith value of var1 with respect to the ith
value of var2. Logical variables should usually be converted
to integer values prior to analysis; use the as.integer()
command.
- Character variables are sets of text strings. Note
that text strings are always enclosed in quotes to denote that the
string is a value, not an object in the workspace or an argument for
a function (neither of which takes quotes). Variables of class
character are not normally used in data analysis, but serve as
descriptive fields. If a character variable is used in a
statistical operation, it must first be transformed into a factor
variable.
- Factor variables may contain values consisting of
either integers or character strings. Use factor() or as.factor() to convert character or integer variables into factor
variables. Factor variables separate unique values into levels.
These levels may be either ordered or unordered. In practice,
including a factor variable among the explanatory
variables is equivalent to creating a dummy variable for each level
(less a baseline level when the model includes an intercept).
In addition, some models (ordinal logit, ordinal probit, and
multinomial logit) require that the dependent variable be a factor
variable.
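The variable classes described above can be sketched together in a few lines. This is only an illustration: the object names (x, a, b, less, party, party.f) and the values they hold are made up for the example and do not come from any dataset.

```r
# Numeric: arithmetic on special values returns special values,
# so keep only the ordinary numbers before summarizing.
x <- c(1.5, NA, Inf, 4)
x + 1                     # the NA stays NA and the Inf stays Inf
mean(x[is.finite(x)])     # mean of the finite values only

# Logical: comparisons are made element by element.
a <- c(1, 5, 3)
b <- c(2, 5, 1)
less <- a < b             # one TRUE/FALSE value per pair of elements
as.integer(less)          # TRUE becomes 1, FALSE becomes 0

# Character to factor: each unique string becomes a level.
party <- c("left", "right", "left", "center")
party.f <- factor(party)
levels(party.f)           # the distinct values, in sorted order
class(party.f)            # "factor"
```

Converting with factor() before modeling keeps the original labels as level names, which tends to make model output easier to read than hand-coded integers.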
Gary King
2007-06-01