STATISTICS with R

A comprehensive guide to statistical analysis in R

Paired Samples T-test in R

A paired samples t-test, also known as a dependent samples t-test, is a statistical method used to compare the means of two related sets of measurements. It is appropriate when the same participants are measured at two points in time or under two conditions (for example, before and after an intervention), or when the observations come from related units such as siblings or matched pairs.

Introduction to Paired Samples T-test

When data are collected from the same participants (i.e., one group) on two occasions (e.g., before and after implementing a new teaching method), the independence of the data is compromised, so an independent samples t-test cannot be used. In such cases, a dependent samples t-test is appropriate, as the data points are related due to the same participants being measured twice. For instance, a researcher might study the effect of a new teaching method on students’ math scores. The researcher administers a math test at the beginning of the course and the same test at the end, after students have experienced the new teaching method. The researcher then compares the students’ mean scores before and after the intervention.

The primary goal of a dependent samples t-test is to determine whether there is a statistically significant difference between the paired observations. This test calculates the differences between paired observations and then assesses whether the average difference is significantly different from zero. Key assumptions for the dependent samples t-test include normally distributed difference scores and the absence of outliers. Researchers commonly use this test in fields such as psychology, education, and medicine to evaluate changes or effects resulting from specific treatments or interventions.
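To make the calculation concrete, the sketch below computes the t statistic by hand from the difference scores and compares it with the value returned by t.test. The six before and after math scores are hypothetical values invented for illustration; they are not data from any study.

```r
# Hypothetical math scores for six students (illustrative values only)
before <- c(62, 70, 55, 81, 66, 74)
after  <- c(68, 75, 59, 80, 73, 79)

# Difference scores: second occasion minus first occasion
d <- after - before

# t statistic by hand: mean difference divided by its standard error
n <- length(d)
t_manual <- mean(d) / (sd(d) / sqrt(n))

# The built-in paired test uses the same statistic with n - 1 degrees of freedom
res <- t.test(after, before, paired = TRUE)

t_manual
unname(res$statistic)
unname(res$parameter)
```

The two t values should agree, which shows that a paired samples t-test is a one-sample t-test on the difference scores.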

An important assumption when conducting a paired samples t-test is that the difference scores (i.e., scores from the first occasion subtracted from the second occasion) should be normally distributed.
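One possible way to screen the difference scores is sketched below, using the Shapiro-Wilk test for normality and the boxplot rule for outliers. The scores are hypothetical, and the choice of these two checks is our assumption; other diagnostics, such as a Q-Q plot, would serve equally well.

```r
# Hypothetical before/after scores for ten participants (illustrative only)
before <- c(62, 70, 55, 81, 66, 74, 59, 68, 77, 63)
after  <- c(68, 75, 59, 80, 73, 79, 61, 75, 80, 70)

# Difference scores: second occasion minus first occasion
d <- after - before

# Shapiro-Wilk test: a small p value suggests the differences depart from normality
sw <- shapiro.test(d)
sw$p.value

# Differences flagged by the boxplot rule (beyond 1.5 * IQR from the hinges)
out <- boxplot.stats(d)$out
out
```

With small samples the Shapiro-Wilk test has little power, so a non-significant result should be read as the absence of strong evidence against normality rather than as proof of it.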

In the following sections, we present an example research scenario where a paired samples t-test will be used to analyze the data. We will demonstrate how to perform a paired samples t-test in R step-by-step and how to interpret the results.

Paired Samples T-test Example

Does the consumption of cardamoms have an effect on blood pressure levels?

Figure 0: Does the consumption of cardamoms have an effect on blood pressure levels? Photo courtesy: Ace of Net, Unsplash

A health researcher is interested in knowing if consuming cardamom on a daily basis has a noticeable effect on blood pressure. The researcher randomly selects 30 participants and measures their blood pressure at the start of the study. During the study, the participants consume 3 grams of powdered cardamom daily for 14 weeks. At the end of the experiment, the researcher measures their blood pressure again to see if there is any improvement due to cardamom consumption.

In this study, there are two measurement occasions (before and after the cardamom consumption period) and one group. Therefore, the researcher uses a dependent samples t-test to compare the means of the same group over two measurement occasions. Table 1 includes the blood pressure readings of five participants on both occasions.

Table 1: Blood Pressure Readings Before and After Cardamom Consumption.

Participant      BP Before   BP After
Participant 1    132         134
Participant 2    137         117
Participant 3    163         129
Participant 4    141         129
Participant 5    141         128

The health researcher enters the data into a spreadsheet in the hospital computer lab. The data for this example can be downloaded in CSV format.

Analysis: Paired Samples T-test in R

In the first step, the data are read into RStudio. The data are arranged in the long format, the same structure used for the independent samples t-test. We will create three columns (variables) in the spreadsheet: the participant ID, the occasion on which blood pressure was measured (time: BP Before = 1, BP After = 2), and the blood pressure value.
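The long layout described above can be sketched with the five readings from Table 1. The column names BP and time match those used in Listing 1; the name id for the participant column is an assumption, because the header of the CSV file is not shown here.

```r
# The five readings from Table 1, first in wide form (one row per participant)
wide <- data.frame(
  id     = 1:5,
  before = c(132, 137, 163, 141, 141),
  after  = c(134, 117, 129, 129, 128)
)

# Long form: one row per participant per occasion (time 1 = before, 2 = after)
long <- data.frame(
  id   = rep(wide$id, times = 2),
  time = rep(c(1, 2), each = nrow(wide)),
  BP   = c(wide$before, wide$after)
)

# Sort so that each participant's two rows sit together
long <- long[order(long$id, long$time), ]
long
```

Each participant contributes two rows, so five participants give ten rows in the long format.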

To perform a dependent samples t-test on the data, we use the t.test function in R with the formula notation y ~ x and the option paired = TRUE. The code in Listing 1 shows the formula approach. Unlike the independent samples t-test, the paired test does not require equal variances at the two times, because it is computed from the difference scores.

Listing 1: R code to run paired samples t-test.
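# Read the long-format data (one row per participant per occasion)
dfCardamom <- read.csv("dfCardamom.csv")
# Paired test: rows are matched in order within each level of time
t.test(BP ~ time, data = dfCardamom, paired = TRUE)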

	Paired t-test

data:  BP by time
t = -5.3702, df = 29, p-value = 9.069e-06
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
 -24.25692 -10.87641
sample estimates:
mean difference 
      -17.56667 

As the results in Listing 1 show, the mean difference between BP After and BP Before is -17.57, and the t value is -5.37 with 29 degrees of freedom. The p value reported by t.test is two-sided, and at 9.069e-06 (p < 0.001) it is well below the 0.05 criterion. Therefore, we conclude that the decrease of 17.57 units in blood pressure after the consumption of cardamom was statistically significant. In addition, the 95% confidence interval for the mean difference ranged from -24.26 to -10.88. Because this interval does not include zero, it supports the same conclusion.
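The same test can also be run by passing the two sets of readings as separate vectors. To the best of our knowledge, R 4.4.0 and later reject paired = TRUE in the formula method of t.test, so this form may be needed on recent installations. The sketch below uses only the five participants from Table 1, so its results will differ from the full 30-participant output in Listing 1.

```r
# The five readings from Table 1 (not the full data set)
before <- c(132, 137, 163, 141, 141)
after  <- c(134, 117, 129, 129, 128)

# Paired test on two vectors; the difference is computed as after - before
res <- t.test(after, before, paired = TRUE)

# Pull the individual pieces out of the result object
unname(res$estimate)   # mean of the difference scores
unname(res$statistic)  # t value
unname(res$parameter)  # degrees of freedom (n - 1)
res$p.value            # two-sided p value
res$conf.int           # 95% confidence interval for the mean difference
```

Putting after first makes a negative mean difference correspond to a drop in blood pressure, which matches the direction reported above.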
