
Is blood pressure a continuous variable?

The answer here seems pretty obvious – you must understand the data type of each
variable in order to record its values in a consistent manner. This probably won’t
require much thought in most cases, but consider the following example. Suppose you
are interested in the variable creatinine but plan to analyze it as a binary variable
by classifying patients as creatinine < 1.8 or creatinine ≥ 1.8. You could simply
collect which of these categories each individual falls into, but this probably isn’t
the best choice. If a categorical variable is based on the value of a continuous
variable, it is generally a good idea to collect the continuous variable. A continuous
variable provides more information than a binary variable, which usually translates
into more statistical power to detect differences among patients. If, in the analysis
phase, you decide that you really do want to use the binary version of the variable,
you can easily use a formula in a spreadsheet or statistical software package to
create the binary variable from the continuous one you collected. On the other hand,
if you only collect the binary variable, you do not have the source measurement
recorded to go back to if necessary.
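
As a rough illustration of that last step, the sketch below derives a binary variable from recorded continuous values in Python. The creatinine values and the names `dichotomize` and `creatinine_high` are made up for the example; only the 1.8 cut point comes from the text above.

```python
# Hypothetical creatinine values (illustrative only, not real patient data).
creatinine = [0.9, 1.8, 2.4, 1.2, 1.79]

THRESHOLD = 1.8  # cut point taken from the example in the text


def dichotomize(values, threshold=THRESHOLD):
    """Return 1 where the value is >= threshold, otherwise 0."""
    return [1 if v >= threshold else 0 for v in values]


# The binary variable can be created at any time from the continuous one.
creatinine_high = dichotomize(creatinine)
print(creatinine_high)  # [0, 1, 1, 0, 0]
```

The reverse is not possible: given only `creatinine_high`, there is no way to recover the original measurements or to try a different cut point later.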

You are probably frequently exposed to terms such as mean, median, frequency,
proportion, two-sample t-test, chi-square test, regression, correlation, logistic
regression, etc. These are all statistical calculations or procedures, but which ones
do you use – and when? The appropriate statistical calculation or procedure is driven
in large part by the data types.

the variables body temperature (°C) and diabetes (0 = No diabetes, 1 = Yes diabetes)
among 1420 hospitalized cancer patients.
Diabetes is a nominal variable with only two possible values. Thus, we want to know the
number (frequency) of patients with diabetes and what proportion of the total sample they
represent. Because body temperature is a continuous variable with many possible values,
we summarize its distribution by reporting statistics such as the median, minimum,
maximum, mean and standard deviation. Clearly it would not be feasible or helpful to
summarize the number and proportion of patients who had each specific body
temperature value, just as it would make no sense to calculate the mean of the diabetes
variable.

e a continuous variable (body temperature)
for each of two groups (females and males). A statistical quantity used to summarize the
distribution of a continuous variable is the mean. We see that the mean body temperature
for males was 36.90°, compared to 36.99° for females. Just as we compare means in the
two groups in our descriptive statistical analysis, we need a procedure that will
statistically compare the mean among males to the mean among females. One statistical
test for comparing means between two groups is a two-sample t-test.
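
To make the comparison concrete, here is a minimal sketch of the test statistic for a two-sample t-test, assuming the equal-variance (pooled) form; the text does not say which variant was used. The temperature values are invented, and `pooled_t_statistic` is a hypothetical helper name. The sketch stops at the t statistic and degrees of freedom; a p-value would normally come from a statistics package (for example `scipy.stats.ttest_ind`).

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their own df.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two group means.
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df


# Hypothetical body temperatures in °C (illustrative only).
males = [36.7, 36.9, 37.0, 36.8, 37.1]
females = [36.9, 37.0, 37.2, 36.8, 37.1]

t_stat, df = pooled_t_statistic(males, females)
print(f"t = {t_stat:.3f}, df = {df}")
```

A negative t here simply reflects that the first group's mean is lower than the second's; swapping the argument order flips the sign without changing the magnitude.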