Point Estimates, Confidence Intervals, Z-Test vs T-Tests (One sample) [Part 1]
Using Scipy.Stats
Hello!
Basically, we’ll take a look at how Z-tests differ from T-tests.
Code and data come first, then we’ll go through the definitions and theory.
Let’s code:
Point Estimates Method
- Start with random data to compose the population
We can see that the population average age is approximately 43 years old.
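The original code cell is not reproduced here, so the following is only a minimal sketch of how such a population could be built. The Poisson parameters, group sizes, seed and variable names are all assumptions, chosen so that the average age lands near 43 and the distribution is bimodal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # arbitrary seed, only for reproducibility

# Assumed population: two Poisson-distributed age groups, shifted by 18 years
ages_older = stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng)
ages_younger = stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng)
population_ages = np.concatenate((ages_older, ages_younger))

population_mean = population_ages.mean()
print(f"Population mean age: {population_mean:.2f}")
```

Mixing two groups with different centers is what makes this hypothetical population bimodal.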
- All right, now let’s suppose that we want to run a survey, and for this we will randomly select people from this population
- Then, we will check whether our sample really reflects our population
The result shows only a small difference between the sample mean and the population mean.
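As a rough sketch of this step, the snippet below draws one random sample and compares its mean with the population mean. The population is rebuilt with assumed parameters, and the sample size of 500 is also an assumption.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # arbitrary seed

# Assumed bimodal population (hypothetical parameters)
population_ages = np.concatenate((
    stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng),
    stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng),
))

sample_size = 500  # assumed survey size
sample_ages = rng.choice(population_ages, size=sample_size, replace=False)

# The sample mean is our point estimate of the population mean
point_estimate = sample_ages.mean()
difference = population_ages.mean() - point_estimate
print(f"Point estimate: {point_estimate:.2f}, difference: {difference:.2f}")
```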
Sampling Distributions and The Central Limit Theorem
- Note that the histogram of our sample looks similar to the population’s, only a bit more “jagged”. Since that shape is not normal, it would suggest that we can’t apply techniques designed for normal distributions.
In fact, we can, by applying the Central Limit Theorem:
To show this, let’s:
- Take 200 samples from our population.
- Make 200 point estimates of the sample mean.
Even though we start from a bimodal distribution, we can plot the density curve of the means of those samples.
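The experiment can be sketched roughly as follows. The population parameters and the sample size of 500 are assumptions, and instead of drawing a figure the snippet only evaluates a kernel density estimate that could be handed to any plotting library.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # arbitrary seed

# Assumed bimodal population (hypothetical parameters)
population_ages = np.concatenate((
    stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng),
    stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng),
))

sample_size = 500  # assumed
n_samples = 200

# One point estimate (sample mean) per sample
point_estimates = np.array([
    rng.choice(population_ages, size=sample_size).mean()
    for _ in range(n_samples)
])

# Density curve of the sample means, evaluated on a grid
density = stats.gaussian_kde(point_estimates)
grid = np.linspace(point_estimates.min(), point_estimates.max(), 100)
curve = density(grid)  # e.g. plot (grid, curve) with a plotting library

sampling_mean = point_estimates.mean()
print(f"Mean of sample means: {sampling_mean:.2f}")
print(f"Off from population mean by: {population_ages.mean() - sampling_mean:.2f}")
```

The spread of these sample means should be close to the population standard deviation divided by the square root of the sample size, which is the quantity the confidence intervals rely on.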
- That’s how the Central Limit Theorem assists us.
- We can see that the sampling distribution of the mean looks similar to a normal distribution and that its center approaches the true population mean.
- We are off by 0.08 from the true population mean.
- Point estimates can give a rough idea of a population parameter, such as the mean.
- But estimates are prone to error.
- Taking multiple samples is usually not viable.
- Running 200 surveys of 500 people each is not viable due to time and cost.
- That’s why we usually work with Confidence Intervals.
Confidence Intervals
Basically, we take a point estimate and add/subtract a margin of error, which gives a range that should contain the population parameter.
We’ll assume that we want a 95% confidence interval around the point estimate of the mean:
Z-Test
- Z-Critical Value: 1.959963984540054
- Confidence Intervals: (41.690040474997915, 43.35595952500209)
This confidence interval captures the population mean that we calculated at the beginning (43 years old).
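A z-based interval like the one above could be computed along these lines. This is a sketch under assumptions: the population is rebuilt with hypothetical parameters, the sample size is assumed to be 500, and the population standard deviation is treated as known, which is what justifies using the normal distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # arbitrary seed

# Assumed bimodal population (hypothetical parameters)
population_ages = np.concatenate((
    stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng),
    stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng),
))

sample_size = 500  # assumed
sample_ages = rng.choice(population_ages, size=sample_size, replace=False)
sample_mean = sample_ages.mean()

# 95% confidence leaves 2.5% in each tail, hence the 0.975 quantile
z_critical = stats.norm.ppf(q=0.975)

# Population standard deviation, treated as known
pop_stdev = population_ages.std()

margin_of_error = z_critical * pop_stdev / np.sqrt(sample_size)
confidence_interval = (sample_mean - margin_of_error,
                       sample_mean + margin_of_error)

print("Z-Critical Value:", z_critical)
print("Confidence Interval:", confidence_interval)
```

The exact interval depends on the sample drawn, so the numbers will not match the ones listed above.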
T-Test
- A T-distribution closely resembles a normal distribution, but it gets wider as the sample gets smaller.
- T-Critical: 2.0638985616280205
- Confidence Intervals: (37.68972829149538, 47.939901338134256)
This confidence interval ALSO captures the population mean that we calculated at the beginning (43 years old), but with a wider range.
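The T-critical value quoted above corresponds to 24 degrees of freedom, so the sketch below assumes a small sample of 25 people. The population parameters are again assumptions. Here the standard deviation is estimated from the sample itself, which is the situation the t-distribution is meant for.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)  # arbitrary seed

# Assumed bimodal population (hypothetical parameters)
population_ages = np.concatenate((
    stats.poisson.rvs(mu=35, loc=18, size=150_000, random_state=rng),
    stats.poisson.rvs(mu=10, loc=18, size=100_000, random_state=rng),
))

sample_size = 25  # assumed; gives 24 degrees of freedom
small_sample = rng.choice(population_ages, size=sample_size, replace=False)
sample_mean = small_sample.mean()

# Degrees of freedom = sample size - 1
t_critical = stats.t.ppf(q=0.975, df=sample_size - 1)

# Sample standard deviation (ddof=1), since sigma is not assumed known
sample_stdev = small_sample.std(ddof=1)
standard_error = sample_stdev / np.sqrt(sample_size)

margin_of_error = t_critical * standard_error
confidence_interval = (sample_mean - margin_of_error,
                       sample_mean + margin_of_error)

# The same interval obtained directly from scipy
scipy_interval = stats.t.interval(0.95, df=sample_size - 1,
                                  loc=sample_mean, scale=standard_error)

print("T-Critical:", t_critical)
print("Confidence Interval:", confidence_interval)
```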
The main idea is that “with larger sample sizes, it makes little difference whether you use a z-test or a t-test”.
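One way to see this is to compare the two critical values as the sample grows. The sample sizes below are arbitrary choices for illustration.

```python
from scipy import stats

z_critical = stats.norm.ppf(0.975)  # 95% two-sided critical value

sample_sizes = (5, 25, 100, 1000)  # arbitrary illustrative sizes
gaps = []
for n in sample_sizes:
    t_critical = stats.t.ppf(0.975, df=n - 1)
    gaps.append(t_critical - z_critical)
    print(f"n={n:>4}  t={t_critical:.4f}  z={z_critical:.4f}  gap={gaps[-1]:.4f}")
```

The gap between the two critical values shrinks as the sample size increases, so the resulting intervals become nearly identical for large samples.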
Which one should I choose? (One sample)
For further information:
More content, such as hypothesis tests, types of errors, and p-values, will be covered in [Part 2].