STATISTICS with R

A comprehensive guide to statistical analysis in R

Independent Samples T-test in R

An independent samples t-test is a statistical technique used to evaluate whether there is a significant difference between the mean scores of two separate groups on the same variable. The term “independent samples” refers to the fact that the data from each group are collected independently, with no overlap between participants.

Introduction to Independent Samples T-test

Researchers frequently collect data from two separate groups—such as males versus females, patients on different medications, online versus in-person students, or drinkers versus non-drinkers—to examine potential differences in a shared outcome. These comparative studies aim to assess whether the average scores on a particular measure (like reading ability, blood pressure, engagement levels, or reaction time) differ meaningfully between the two groups. Because each individual belongs to only one group, the samples are considered independent, meaning the data from one group do not influence or relate to the other.

The researcher calculates each group's average score on the shared measure, for instance the academic performance of students in online versus in-person programs. The goal is to assess whether the observed difference in means is large relative to the variability in the data, so that it is unlikely to have arisen by chance alone. The independent samples t-test is used to determine whether the mean scores of the two separate groups differ significantly. The test uses Student's t distribution to judge whether the difference is statistically significant, which is why it is known as a t-test.
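For reference, the equal-variance form of the test statistic is commonly written as shown below. This is the standard textbook formula rather than something derived in this guide, and the symbols are notation introduced here for illustration: \bar{x}_1 and \bar{x}_2 are the two group means, s_1^2 and s_2^2 the group variances, n_1 and n_2 the group sizes, and s_p^2 the pooled variance.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
df = n_1 + n_2 - 2
```

The statistic is compared against a t distribution with n_1 + n_2 - 2 degrees of freedom, so a study with 60 students in total has 58 degrees of freedom.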

In the following sections, we present an example research scenario where an independent samples t-test will be used to analyze the data. We will demonstrate how to perform an independent samples t-test in R step-by-step and how to interpret the results.

Independent Samples T-test Example

Does a new teaching method have an effect on students’ math improvement?

Figure 0: Is the new method of teaching mathematics better than the traditional method? Photo courtesy: Anoushka Puri, Unsplash

A middle school teacher carried out a study to evaluate which of two teaching methods—one New and one Traditional—would lead to greater improvement in students’ math performance. Sixty students were randomly selected and then randomly assigned to either the New or the Traditional teaching method. After a full year of instruction, both groups took the same math test, and their scores were recorded. To assess which method was more effective, the teacher compared the average math scores of the two groups. Since the study involved two independent groups and a continuous outcome variable (math scores), an independent samples t-test was the appropriate statistical tool for analyzing the results. Table 1 displays the math scores of five students from the study.

Table 1: Students’ math scores in New method and Traditional method groups
Student      Group                 Score
Student 1    New Method            97
Student 2    New Method            87
Student 3    New Method            85
Student 4    Traditional Method    84
Student 5    Traditional Method    87

The teacher enters the data in a spreadsheet program in the school computer lab and saves the file in CSV format. The data for this example can be downloaded in CSV format.

Analysis: Independent Samples T-test in R

In the first step, the data are read into R (here, using RStudio). The data are stored in long format, with one row per student and three columns (variables): Student ID, Method (New or Traditional), and the test score.
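As a rough illustration of the long format, the five rows from Table 1 could be entered by hand as follows. This is only a sketch: it covers the table excerpt rather than the full data set, and the object name dfExample and the column name StudentID are assumptions, while Method and Score follow the names used in the listings.

```r
# Illustrative only: the five rows shown in Table 1, not the full 60-student data set.
# dfExample and StudentID are hypothetical names; Method and Score match the listings.
dfExample <- data.frame(
  StudentID = 1:5,
  Method    = c("New", "New", "New", "Traditional", "Traditional"),
  Score     = c(97, 87, 85, 84, 87)
)

# Long format: one row per student, with group membership held in a single column.
str(dfExample)

# Group sizes and group means for this excerpt.
table(dfExample$Method)
aggregate(Score ~ Method, data = dfExample, FUN = mean)
```

Keeping the grouping variable in one column is what allows the formula notation (outcome ~ group) to be used later.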

To perform an independent samples t-test on the data, we use the t.test function in R with the formula notation y ~ x (read as "y as a function of x"). Here, Score is the outcome and Method is the grouping variable. The code in Listing 1 shows the formula approach to running an independent samples t-test in R, assuming the variances of the two groups are equal.

Listing 1: R code to run independent samples t-test.
dfScores <- read.csv("dsMathTeachingMethods.csv")
t.test(Score ~ Method, data=dfScores, var.equal=TRUE)

	Two Sample t-test

data:  Score by Method
t = 6.6616, df = 58, p-value = 1.082e-08
alternative hypothesis: true difference in means between group New and group Traditional is not equal to 0
95 percent confidence interval:
  6.435542 11.964458
sample estimates:
        mean in group New mean in group Traditional 
                 91.13333                  81.93333 

As the results in Listing 1 show, the difference in means between the New method and the Traditional method is statistically significant (t = 6.662, df = 58, p < 0.001). In addition, the 95% confidence interval for the difference in means [6.436, 11.964] does not include 0 (the null value), which is consistent with the significant test result. The mean score in the New teaching method group is 91.133 and the mean score in the Traditional teaching method group is 81.933, a difference of 9.2 points. We conclude that the New method of teaching was more effective.
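The numbers in the printed output can also be read from the object that t.test returns, and the t statistic can be recomputed by hand as a cross-check. The sketch below uses only the five rows from Table 1 (stored under the hypothetical name dfExample), so its values will not match Listing 1; it is meant to show where each quantity comes from, not to reproduce the study's result.

```r
# Illustrative data: the five Table 1 rows only, so results will NOT match Listing 1.
dfExample <- data.frame(
  Method = c("New", "New", "New", "Traditional", "Traditional"),
  Score  = c(97, 87, 85, 84, 87)
)

# t.test() returns a list of class "htest"; its components can be read directly.
res <- t.test(Score ~ Method, data = dfExample, var.equal = TRUE)
res$statistic   # t value
res$parameter   # degrees of freedom
res$p.value     # two-sided p-value
res$conf.int    # 95% confidence interval for the difference in means
res$estimate    # the two group means

# Recompute the pooled-variance t statistic and p-value by hand.
x  <- dfExample$Score[dfExample$Method == "New"]
y  <- dfExample$Score[dfExample$Method == "Traditional"]
n1 <- length(x)
n2 <- length(y)
pooled_var <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)
t_manual   <- (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
p_manual   <- 2 * pt(-abs(t_manual), df = n1 + n2 - 2)

# Both comparisons should agree with the t.test() output.
all.equal(unname(res$statistic), t_manual)
all.equal(res$p.value, p_manual)
```

With the full data set, the same components could be used to report the results programmatically instead of copying them from the console.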

One assumption underlying the independent samples t-test in Listing 1 is that the variances of the two groups are equal. In practice, this assumption may not hold. In that case, we can set the argument var.equal = FALSE (which is also the default in t.test) to run an independent samples t-test that does not assume equal variances. This version is called the Welch two sample t-test. Listing 2 displays the code to run the Welch two sample t-test, or t-test with unequal variances.

Listing 2: R code to run Welch two sample t-test.
dfScores <- read.csv("dsMathTeachingMethods.csv")
t.test(Score ~ Method, data=dfScores, var.equal=FALSE)

	Welch Two Sample t-test

data:  Score by Method
t = 6.6616, df = 54.699, p-value = 1.382e-08
alternative hypothesis: true difference in means between group New and group Traditional is not equal to 0
95 percent confidence interval:
  6.431986 11.968014
sample estimates:
        mean in group New mean in group Traditional 
                 91.13333                  81.93333 

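The results from the Welch two sample t-test are very similar to those of the equal-variances independent samples t-test, showing a statistically significant difference between the means of the New and Traditional teaching methods (t = 6.662, df = 54.699, p < 0.001). The t statistic itself is unchanged, as is expected when the two groups are of equal size. The degrees of freedom are slightly smaller, and the p-value and confidence interval shift slightly as a result.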
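The fractional degrees of freedom in the Welch output come from the Welch-Satterthwaite approximation. The sketch below recomputes the Welch t statistic and degrees of freedom by hand and compares them with t.test. It again uses only the five rows from Table 1 under the hypothetical name dfExample, so the numbers will differ from Listing 2.

```r
# Illustrative data: the five Table 1 rows only, so results will NOT match Listing 2.
dfExample <- data.frame(
  Method = c("New", "New", "New", "Traditional", "Traditional"),
  Score  = c(97, 87, 85, 84, 87)
)

x <- dfExample$Score[dfExample$Method == "New"]
y <- dfExample$Score[dfExample$Method == "Traditional"]

# Squared standard error of each group mean.
v1 <- var(x) / length(x)
v2 <- var(y) / length(y)

# Welch t statistic: the group variances are not pooled.
t_welch <- (mean(x) - mean(y)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (v1 + v2)^2 / (v1^2 / (length(x) - 1) + v2^2 / (length(y) - 1))

# var.equal = FALSE is the default, so it can be omitted.
res_welch <- t.test(Score ~ Method, data = dfExample)

# Both comparisons should agree with the t.test() output.
all.equal(unname(res_welch$statistic), t_welch)
all.equal(unname(res_welch$parameter), df_welch)
```

Because the approximation weights each group by its own variance and sample size, the resulting degrees of freedom are generally not a whole number.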
