Point Estimates, Confidence Intervals, Z-Test vs T-Tests (One sample) [Part 1]

Igorps
2 min read · Jan 14, 2022


Using Scipy.Stats

Hello! In this post we’ll take a look at what differentiates Z-tests from T-tests.

Code and data come first, then we’ll show all the definitions and theory.

Let’s code:

Point Estimates Method

  • Start with random data to compose the population

We can see that the population average age is approximately 43 years old.
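As a minimal sketch of this step, the population could be built from two Poisson-shaped age groups. The distribution parameters, group sizes, seed, and the name `population_ages` are assumptions chosen for illustration so that the mean lands near 43; they are not taken from the original notebook.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # fixed seed (assumed) so the run is repeatable

# Two age groups, both shifted to start at 18 years old (assumed parameters).
ages_older = stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng)
ages_younger = stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng)

# Mixing the two groups gives a bimodal population.
population_ages = np.concatenate((ages_older, ages_younger))

population_mean = population_ages.mean()
print(population_mean)
```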

  1. All right, now let’s suppose that we want to run a survey, so we randomly select people from this population.
  2. Then we check whether our sample really reflects our population.

It shows that there is only a small difference between the sample mean and the population mean.
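A sketch of the survey step, under assumptions: the population is rebuilt with assumed parameters so the block runs on its own, and the sample size of 500 matches the "surveys of 500 people" mentioned later in the post.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # assumed seed

# Assumed bimodal population with a mean age close to 43.
population_ages = np.concatenate((
    stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng),
    stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng),
))

# Survey 500 distinct people picked at random.
sample_ages = rng.choice(population_ages, size=500, replace=False)

# The sample mean is our point estimate of the population mean.
point_estimate = sample_ages.mean()
difference = population_ages.mean() - point_estimate
print(point_estimate, difference)
```

The size of `difference` changes from sample to sample, which is exactly why a single point estimate should be read with some caution.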

Sampling Distributions and The Central Limit Theorem

- Note that the histogram of our sample looks similar to the population’s, only a bit more “jagged”. Since that shape is not normal, it would suggest that we can’t apply techniques that assume a normal distribution.

In fact, we can, by applying the Central Limit Theorem.

To show this, let’s:

  1. Take 200 samples from our population.
  2. Make a point estimate of the mean from each one (200 point estimates in total).

Even though we started from a bimodal distribution, the density curve of the means of those samples looks approximately normal.
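The experiment above can be sketched as follows. The population parameters and seed are assumptions, and the plot is left out so the block only needs NumPy and SciPy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # assumed seed

# Assumed bimodal population with a mean age close to 43.
population_ages = np.concatenate((
    stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng),
    stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng),
))

# 200 samples of 500 people each, one point estimate (mean) per sample.
point_estimates = np.array([
    rng.choice(population_ages, size=500, replace=False).mean()
    for _ in range(200)
])

# Center and spread of the sampling distribution of the mean.
estimate_of_mean = point_estimates.mean()
spread = point_estimates.std(ddof=1)
print(estimate_of_mean, population_ages.mean() - estimate_of_mean, spread)
```

Passing `point_estimates` to a histogram or density plot should give a single bell-shaped hump, even though the population itself has two.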

  • That’s how the Central Limit Theorem assists us.
  • We can see that the sampling distribution of the mean looks similar to a normal distribution and is centered close to the true population mean.
  • We are off by 0.08 from the true population mean.

  1. Point estimates can give a rough idea of a population parameter such as the mean.
  2. But estimates are prone to error.
  3. Taking multiple samples is usually not viable.
  4. Running 200 surveys of 500 people each is not viable because of time and cost.
  5. That’s why we usually work with Confidence Intervals.

Confidence Intervals

Basically, take a point estimate and add/subtract a margin of error.

We’ll assume that we want a 95% confidence interval around the point estimate of the mean:

Z-Test

  • Z-Critical Value: 1.959963984540054
  • Confidence Intervals: (41.690040474997915, 43.35595952500209)

This confidence interval captures the population mean that we calculated at the beginning (43 years old).
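A hedged sketch of how a z-based interval like this can be computed with scipy.stats. The sample size of 1000, the seed, and the population parameters are assumptions; the key point is that the z version uses the population standard deviation, which we only know here because we generated the population ourselves.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # assumed seed

# Assumed bimodal population with a mean age close to 43.
population_ages = np.concatenate((
    stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng),
    stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng),
))

sample_size = 1000  # assumed sample size
sample = rng.choice(population_ages, size=sample_size, replace=False)
sample_mean = sample.mean()

# For 95% confidence we leave 2.5% in each tail, so we ask for the 97.5% quantile.
z_critical = stats.norm.ppf(0.975)

# Margin of error = z* x (population standard deviation / sqrt(n)).
pop_stdev = population_ages.std()
margin_of_error = z_critical * pop_stdev / np.sqrt(sample_size)

confidence_interval = (sample_mean - margin_of_error,
                       sample_mean + margin_of_error)
print(z_critical, confidence_interval)
```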

T-Test

  • A T-distribution closely resembles a normal distribution, but it gets wider as the sample gets smaller.
  1. T-Critical: 2.0638985616280205
  2. Confidence Intervals: (37.68972829149538, 47.939901338134256)

This confidence interval ALSO captures the population mean that we calculated at the beginning (43 years old), but with a wider range.
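A sketch of the t-based version. The T-Critical value above corresponds to 24 degrees of freedom, so a sample of 25 people is used here; the seed and population parameters are assumptions. This time the spread is estimated from the sample itself, because in practice the population standard deviation is rarely known.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # assumed seed

# Assumed bimodal population with a mean age close to 43.
population_ages = np.concatenate((
    stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng),
    stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng),
))

sample_size = 25
sample = rng.choice(population_ages, size=sample_size, replace=False)
sample_mean = sample.mean()
sample_stdev = sample.std(ddof=1)  # sample standard deviation (n - 1 in the denominator)

# The t-distribution needs degrees of freedom: n - 1.
t_critical = stats.t.ppf(0.975, df=sample_size - 1)

standard_error = sample_stdev / np.sqrt(sample_size)
margin_of_error = t_critical * standard_error
confidence_interval = (sample_mean - margin_of_error,
                       sample_mean + margin_of_error)

# scipy can also build the same interval in a single call.
scipy_interval = stats.t.interval(0.95, df=sample_size - 1,
                                  loc=sample_mean, scale=standard_error)
print(t_critical, confidence_interval, scipy_interval)
```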

The main idea is that the larger the sample, the less difference it makes whether we use a z-test or a t-test.
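One quick way to see this is to compare the two critical values as the sample grows. This is only an illustration; the sample sizes in the loop are arbitrary choices.

```python
from scipy import stats

# 95% two-sided critical value of the standard normal distribution.
z_critical = stats.norm.ppf(0.975)

gaps = []
for n in (5, 25, 100, 1000):
    # Matching critical value of the t-distribution with n - 1 degrees of freedom.
    t_critical = stats.t.ppf(0.975, df=n - 1)
    gaps.append(t_critical - z_critical)
    print(n, round(t_critical, 4), round(t_critical - z_critical, 4))
```

The gap between the two critical values keeps shrinking as `n` grows, so the intervals they produce become almost the same.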

Which one should I choose? (One sample)


For further information:

  1. Analytics Steps
  2. Analytics Vidhya

More content, such as:

Hypothesis tests, types of errors, and p-values will be covered in [Part 2].
