Confidence Intervals for Sample Size Less Than 30

In the preceding discussion we have been using σ, the population standard deviation, to compute the standard error. However, we don't actually know the population standard deviation, since we are working from samples. To get around this, we have been using the sample standard deviation (s) as an estimate. This is not a problem if the sample size is 30 or greater because of the central limit theorem. However, if the sample is small (<30), we have to adjust and use a t-value instead of a Z score in order to account for the smaller sample size and the use of the sample SD.

Therefore, if n<30, use the appropriate t score instead of a Z score, and note that the t-value will depend on the degrees of freedom (df) as a reflection of sample size. When using the t-distribution to compute a confidence interval, df = n-1.

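Calculation of a 95% confidence interval when n<30 will then use the appropriate t-value in place of Z in the formula:

95% CI = x̄ ± t × (s / √n)

where x̄ is the sample mean, s is the sample standard deviation, and n is the sample size.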

The T-distribution

One way to think about the t-distribution is that it is really a large family of distributions that are similar in shape to the standard normal distribution, but adjusted to account for smaller sample sizes. A t-distribution for a small sample size would look like a squashed-down version of the standard normal distribution, but as the sample size increases the t-distribution will get closer and closer to approximating the standard normal distribution.
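The "squashed down" shape can be checked numerically. The following is a minimal sketch in R; the evaluation points and degrees of freedom are arbitrary choices for illustration, not values from the text. It compares the height of several t densities with the standard normal density at the center and in the tails.

```r
# Evaluate densities at the center (x = 0) and at two tail points.
x  <- c(0, 2, 3)
df <- c(2, 5, 30)               # illustrative degrees of freedom

# One column per t-distribution, plus a final column for the standard normal.
dens <- sapply(df, function(d) dt(x, df = d))
dens <- cbind(dens, dnorm(x))
dimnames(dens) <- list(paste0("x=", x),
                       c(paste0("t, df=", df), "normal"))

round(dens, 4)
```

In the row for x=0 the t densities are lower than the normal density (a flatter peak), while in the row for x=3 they are higher (heavier tails). The gap narrows as df increases.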

The table below shows a portion of the table for the t-distribution. Notice that sample size is represented by the "degrees of freedom" in the first column. For determining the confidence interval, df = n-1. Notice also that this table is set up a lot differently than the table of Z scores. Here, only five levels of probability are shown in the column titles, whereas in the table of Z scores, the probabilities were in the interior of the table. Consequently, the levels of probability are much more limited here, because t-values depend on the degrees of freedom, which are listed in the rows.

Confidence Level           80%      90%      95%      98%      99%
Two-sided test p-values    .20      .10      .05      .02      .01
One-sided test p-values    .10      .05      .025     .01      .005

Degrees of Freedom (df)
 1                         3.078    6.314    12.71    31.82    63.66
 2                         1.886    2.920    4.303    6.965    9.925
 3                         1.638    2.353    3.182    4.541    5.841
 4                         1.533    2.132    2.776    3.747    4.604
 5                         1.476    2.015    2.571    3.365    4.032
 6                         1.440    1.943    2.447    3.143    3.707
 7                         1.415    1.895    2.365    2.998    3.499
 8                         1.397    1.860    2.306    2.896    3.355
 9                         1.383    1.833    2.262    2.821    3.250
10                         1.372    1.812    2.228    2.764    3.169
11                         1.362    1.796    2.201    2.718    3.106
12                         1.356    1.782    2.179    2.681    3.055
13                         1.350    1.771    2.160    2.650    3.012
14                         1.345    1.761    2.145    2.624    2.977
15                         1.341    1.753    2.131    2.602    2.947
16                         1.337    1.746    2.120    2.583    2.921
17                         1.333    1.740    2.110    2.567    2.898
18                         1.330    1.734    2.101    2.552    2.878
19                         1.328    1.729    2.093    2.539    2.861
20                         1.325    1.725    2.086    2.528    2.845
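The table above can also be regenerated in R rather than looked up. The sketch below assumes the two-sided confidence levels in the column titles and uses the base R quantile function qt(); the object names (conf_levels, upper_p, t_table) are chosen here for illustration only.

```r
conf_levels <- c(0.80, 0.90, 0.95, 0.98, 0.99)
df <- 1:20

# For a two-sided interval, half of (1 - level) sits in each tail,
# so the cumulative probability below the upper critical value is:
upper_p <- 1 - (1 - conf_levels) / 2

# Rows are degrees of freedom, columns are confidence levels.
t_table <- outer(df, upper_p, function(d, p) qt(p, df = d))
dimnames(t_table) <- list(as.character(df),
                          c("80%", "90%", "95%", "98%", "99%"))

round(t_table, 3)
```

Small differences in the last printed digit relative to a published table can arise from rounding.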

Notice that the value of t is larger for smaller sample sizes (i.e., lower df). When we use "t" instead of "Z" in the equation for the confidence interval, it will result in a larger margin of error and a wider confidence interval reflecting the smaller sample size.

With an infinitely large sample size the t-distribution and the standard normal distribution will be the same, and for samples greater than 30 they will be similar, but the t-distribution will be somewhat more conservative. Consequently, one can always use a t-distribution instead of the standard normal distribution. However, when you want to compute a 95% confidence interval for an estimate from a large sample, it is easier to just use Z=1.96.
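To see how quickly the t critical value approaches Z=1.96, one can compare the two directly. This is a small sketch using base R's qt() and qnorm(); the particular df values are arbitrary choices for illustration.

```r
df     <- c(5, 10, 30, 100, 1000)   # illustrative degrees of freedom
t_crit <- qt(0.975, df = df)        # two-sided 95% critical values for t
z_crit <- qnorm(0.975)              # two-sided 95% critical value for Z

data.frame(df            = df,
           t             = round(t_crit, 3),
           excess_over_z = round(t_crit - z_crit, 3))
```

The excess over Z is always positive, which is the sense in which the t-distribution is more conservative, and it shrinks toward zero as df grows.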

Because the t-distribution is, if anything, more conservative, R relies heavily on the t-distribution.

Test Yourself

Problem #1

Using the table above, what is the critical t score for a 95% confidence interval if the sample size (n) is 11?

Answer

Problem #2

A sample of n=10 patients free of diabetes have their body mass index (BMI) measured. The mean is 27.26 with a standard deviation of 2.10. Generate a 90% confidence interval for the mean BMI among patients free of diabetes.

Link to Answer in a Word file

Confidence Intervals for a Mean Using R

Instead of using the table, you can use R to generate t-values. For example, to generate t-values for calculating a 95% confidence interval, use the function qt(1-tail area, df), where the first argument is the cumulative probability to the left of the t-value.

For example, if the sample size is 15, then df=14, and we can calculate the t-scores for the lower and upper tails of the 95% confidence interval in R:

> qt(0.025,14)
[1] -2.144787
> qt(0.975,14)
[1] 2.144787

Then, to compute the 95% confidence interval we could plug t=2.144787 into the equation:

95% CI = x̄ ± t × (s / √n)
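As a rough sketch of that calculation in R, the code below builds the interval from summary statistics for n=15. The sample mean (xbar) and standard deviation (s) are hypothetical values invented for illustration; they do not come from the text.

```r
# Hypothetical summary statistics, for illustration only
n    <- 15
xbar <- 25.0     # assumed sample mean
s    <- 3.0      # assumed sample standard deviation

t_crit <- qt(0.975, df = n - 1)          # critical t for a 95% CI, df = 14
se     <- s / sqrt(n)                    # estimated standard error of the mean
ci     <- xbar + c(-1, 1) * t_crit * se  # lower and upper limits

ci
```

Changing 0.975 to 0.95 would give a 90% interval, since 5% would then sit in each tail.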

Confidence Intervals from Raw Data Using R

It is also easy to compute the point estimate and 95% confidence interval from a raw data set using the "t.test" function in R. For instance, in the data set from the Weymouth Health Survey I could compute the mean and 95% confidence interval for BMI as follows. First, I would load the data set and give it a short nickname. Then I would attach the data set and use the following command:

> t.test(bmi)

The output would look like this:

One Sample t-test

data:  bmi
t = 228.5395, df = 3231, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
26.66357 27.12504

sample estimates:
mean of x
26.8943

R defaults to computing a 95% confidence interval, but you can specify the confidence level as follows:

> t.test(bmi,conf.level=.90)

This would compute a 90% confidence interval.
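The interval can also be pulled out of the t.test result and checked against the formula. The sketch below does not use the Weymouth data; it simulates a small hypothetical sample (bmi_sim, an invented name) so that it runs on its own.

```r
set.seed(1)                                # make the simulated sample reproducible
bmi_sim <- rnorm(25, mean = 27, sd = 4)    # hypothetical values, not the survey data

res <- t.test(bmi_sim, conf.level = 0.90)
res$conf.int                               # lower and upper limits of the 90% CI
res$estimate                               # the sample mean

# The same interval computed by hand from the t-distribution
n      <- length(bmi_sim)
manual <- mean(bmi_sim) + c(-1, 1) * qt(0.95, df = n - 1) * sd(bmi_sim) / sqrt(n)
manual
```

The two sets of limits should agree, which confirms that t.test is applying the t-based formula with df = n-1.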

Test Yourself

Lozoff and colleagues compared developmental outcomes in children who had been anemic in infancy to those in children who had not been anemic. Some of the data are shown in the table below.

Mean ± SD            Anemia in Infancy    Non-anemic in Infancy
                     (n=30)               (n=133)
Gross Motor Score    52.4±14.3            58.7±12.5
Verbal IQ            101.4±13.2           102.9±12.4

Source: Lozoff et al.: Long-term Developmental Outcome of Infants with Iron Deficiency, NEJM, 1991

Compute the 95% confidence interval for verbal IQ using the t-distribution.

Link to the Answer in a Word file