Business Statistics and Analysis 2
Course materials are from Rice University's Business Statistics and Analysis
III. Business Applications of Hypothesis Testing and Confidence Interval Estimation
Week 1 – Confidence Interval – Introduction
1. Introducing the t-distribution, the T.DIST Function
 Does not have any standalone business application
 Used as an intermediate tool for constructing confidence intervals and conducting hypothesis tests
 Has an associated probability density function
 Associated Excel functions are =T.DIST and =T.INV
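For readers working outside Excel, the density and the two spreadsheet functions can be approximated with standard-library Python. This is a minimal sketch, not Excel's implementation: the helper names t_pdf, t_cdf and t_inv are hypothetical, t_cdf is meant to mirror =T.DIST(x, df, TRUE) (left-tail cumulative probability) and t_inv is meant to mirror =T.INV(p, df). Simpson's rule and bisection are chosen here only for simplicity.

```python
import math

def t_pdf(x, df):
    """Probability density of the t-distribution with df degrees of freedom."""
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df):
    """Left-tail probability P(T <= x), in the spirit of =T.DIST(x, df, TRUE)."""
    a = abs(x)
    steps = 2 * max(100, int(50 * a))  # even number of Simpson panels
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_inv(p, df):
    """Value x with P(T <= x) = p, in the spirit of =T.INV(p, df)."""
    lo, hi = -100.0, 100.0  # assumed search range, ample for common p and df
    for _ in range(60):  # bisection on the increasing function t_cdf
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_cdf(2.228, 10), 3))  # left-tail probability at x = 2.228, 10 df
print(round(t_inv(0.975, 10), 3))  # cutoff that leaves 2.5% in the upper tail
```

The second call gives the kind of cutoff value that is later used to build a 95% confidence interval from a sample of 11 observations.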
2. Introducing Confidence Interval
 It is an ‘interval’ with some ‘confidence’ or probability attached to it
 An interval for some unknown characteristic of the population data
A 95% confidence interval for the vote share of candidate A:
[55.7%, 64.3%]
Interpretation: 95% of similarly constructed confidence intervals are expected to contain the actual vote share for A.
Loosely speaking, there is a 0.95 probability that the actual vote share for A lies between 55.7% and 64.3%.
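The "similarly constructed intervals" reading can be checked by simulation. The sketch below is illustrative only: the true vote share (60%), the poll size (500) and the number of repeated polls (2000) are hypothetical choices, and the interval uses the usual large-sample z formula for a proportion.

```python
import math
import random
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample (z-based) confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(true_share, n, trials, seed=0):
    """Fraction of simulated polls whose interval contains the true share."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each simulated voter supports candidate A with probability true_share.
        votes_for_a = sum(rng.random() < true_share for _ in range(n))
        low, high = proportion_ci(votes_for_a, n)
        hits += low <= true_share <= high
    return hits / trials

# Hypothetical setting: true share 60%, 500 voters per poll, 2000 polls.
print(coverage(true_share=0.60, n=500, trials=2000))
```

The printed fraction should land close to 0.95: each individual interval either contains the true share or does not, but about 95% of them do.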
We wish to find out μ
Example:
US Presidential Election, predicting the proportion of votes for a particular candidate
confidence interval for the ‘population proportion’
Example:
Average starting salary of all business students who graduated last year in New York city
confidence interval for the ‘population mean’
3. The z-statistic and the t-statistic
4. Using z- and t-statistics to Construct Confidence Intervals
Week 2 – Confidence Interval – Applications
1. Application of Confidence Interval
 Conceptual understanding
 Examples
 Stylized problem
When the population standard deviation (σ) is not known,
 we replace it with the sample standard deviation (s)
 The z-statistic gets replaced by the t-statistic
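Written out, the substitution takes the standard textbook form below. The symbols $\bar{x}$ (sample mean), $n$ (sample size) and $1-\alpha$ (confidence level) are notation assumed here, not taken from the notes.

```latex
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
\qquad\longrightarrow\qquad
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \quad \text{with } n - 1 \text{ degrees of freedom}

\text{Confidence interval for } \mu: \quad
\bar{x} \pm t_{\alpha/2,\, n-1} \cdot \frac{s}{\sqrt{n}}
```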
A 95% confidence interval for the average (population mean) house size: [3048.2, 3428.3] square feet
Answer: The 90% confidence interval is: [0.453, 0.567] or [45.3%, 56.7%]
Summary:
2. Sample Size Calculation
A pollster wanting to make a prediction about a particular candidate’s vote share in the US presidential election.
A quality control manager at a battery manufacturer wanting to estimate the average number of defective batteries contained in a box shipped by the company.
Different industries may have different rule-of-thumb strategies for sample size selection.
The pollster may want a margin of error of ±3% with a confidence level of 95%.
The quality control manager may want to assess the average number of defectives in a box with a margin of error of ±0.3 batteries and a confidence level of 95%.
The standard deviation needed for the calculation can be estimated from a small pilot survey, e.g. std = 0.9.
Many industries use rule-of-thumb strategies/heuristics.
Here we provide some basis for choosing an appropriate sample size.
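The two requirements can be turned into sample sizes with the standard margin-of-error formulas, n = (z·σ/E)² for a mean and n = z²·p(1−p)/E² for a proportion. The sketch below rests on two assumptions: the standard deviation of 0.9 belongs to the battery example, and the pollster uses the conservative guess p = 0.5. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(std_dev, margin, confidence=0.95):
    """Smallest n whose margin of error for a mean is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * std_dev / margin) ** 2)

def sample_size_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error for a proportion is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)

# Pollster: margin of 3 percentage points, 95% confidence, p = 0.5 assumed.
print(sample_size_for_proportion(margin=0.03))        # 1068

# Quality control manager: margin of 0.3 batteries, pilot std of 0.9 assumed.
print(sample_size_for_mean(std_dev=0.9, margin=0.3))  # 35
```

Halving the margin of error roughly quadruples the required sample size, because the margin enters the formula squared.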
Week 3 – Hypothesis Testing
Hypothesis tests are an important tool for analyzing data and drawing inferences from it.
All hypothesis tests follow the same basic logic:
 An assumption or a claim is made.
 If your data contradict this assumption or claim, you conclude that it must be wrong.
What if the average amount of beverage per bottle across these 10 bottles is 170 milliliters?
It is easy to conclude that the bottling unit is not properly adjusted.
What if the average amount of beverage per bottle across these 10 bottles is 200 milliliters?
Again, the conclusion seems easy given this evidence.
What if the average amount of beverage per bottle across these 10 bottles is 199.9 milliliters or 200.1 milliliters?
Perhaps, giving the benefit of the doubt, you would conclude that the unit is properly adjusted.
But what if the average amount per bottle in the sample turns out to be 199.1 ml? 198 ml? 202 ml? …
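For the in-between cases, a t-statistic makes the judgment explicit by measuring how far the sample mean is from the target relative to the sampling variation. The sketch below uses made-up volumes for 10 bottles and assumes a target fill of 200 ml, as the scenarios above imply. The cutoff 2.262 is the standard two-tail t-table value for α = 0.05 with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Hypothetical volumes (ml) of 10 sampled bottles; not data from the course.
volumes = [199.2, 198.7, 200.4, 198.9, 199.5, 199.0, 198.6, 199.8, 199.1, 199.3]
target = 200.0  # assumed fill level implied by the scenarios above

n = len(volumes)
standard_error = stdev(volumes) / math.sqrt(n)
t_stat = (mean(volumes) - target) / standard_error

# Two-tail cutoff for alpha = 0.05 and n - 1 = 9 degrees of freedom,
# taken from a standard t-table (=T.INV.2T(0.05, 9) in Excel).
cutoff = 2.262

print(f"sample mean = {mean(volumes):.2f} ml, t = {t_stat:.2f}")
if abs(t_stat) > cutoff:
    print("Reject the claim that the mean fill is 200 ml")
else:
    print("Do not reject the claim that the mean fill is 200 ml")
```

With these made-up numbers the sample mean is only 0.75 ml below the target, yet the bottle-to-bottle variation is small enough that the t-statistic falls in the rejection region. The same sample mean with more variation could lead to the opposite conclusion.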
1. The Logic of Hypothesis Testing
2. Conducting a Hypothesis Test, the Four Steps
3. Single Tail and Two Tail Hypothesis Tests
 Formulate Hypothesis
 Calculate the t-statistic
 Cutoff values for the t-statistic
 Check whether t-statistic falls in the rejection region
Interpret results of hypothesis test as applied to the particular business application
1. Two-tailed test
2. Single-tailed test
Claim made: increase in fuel efficiency is 3 miles per gallon or more
To test the claim:
 Random selection of 150 small cars.
 Their fuel efficiency measured before and after the use of fuel additive.
 150 measurements obtained for the increase in miles per gallon achieved.
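A single-tail test of this claim could be sketched as follows. The null hypothesis is that the mean increase is at least 3 miles per gallon, and it is rejected only if the sample mean falls far enough below 3. The sample mean (2.8) and standard deviation (1.5) are hypothetical placeholders, and the normal quantile is used as a large-sample approximation to the t cutoff with 149 degrees of freedom.

```python
import math
from statistics import NormalDist

def lower_tail_test(sample_mean, sample_sd, n, claimed_min, alpha=0.05):
    """Test H0: mean >= claimed_min against Ha: mean < claimed_min."""
    t_stat = (sample_mean - claimed_min) / (sample_sd / math.sqrt(n))
    # With n = 150 the t cutoff is very close to the normal one, so the
    # normal quantile serves here as a large-sample approximation.
    cutoff = NormalDist().inv_cdf(alpha)
    return t_stat, cutoff, t_stat < cutoff

# Hypothetical summary of the 150 measured increases (miles per gallon).
t_stat, cutoff, reject = lower_tail_test(
    sample_mean=2.8, sample_sd=1.5, n=150, claimed_min=3.0
)
print(f"t = {t_stat:.3f}, cutoff = {cutoff:.3f}, reject claim: {reject}")
```

In this made-up case the sample mean is below 3, but not by enough to reject the claim at α = 0.05.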
4. Guidelines, Formulas and an Application of Hypothesis Test
5. Hypothesis Test for a Population Proportion
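As a rough illustration of the idea, the sketch below runs a two-tail test of a claimed population proportion using the common large-sample z form. The survey counts (230 of 400) and the claimed proportion (0.5) are hypothetical, and the course material may use a t cutoff rather than the z cutoff assumed here.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-tail test of H0: population proportion = p0."""
    p_hat = successes / n
    # The standard error is computed under the null hypothesis value p0.
    z_stat = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    return z_stat, cutoff, abs(z_stat) > cutoff

# Hypothetical survey: 230 of 400 respondents favour the product.
z_stat, cutoff, reject = proportion_z_test(successes=230, n=400, p0=0.5)
print(f"z = {z_stat:.2f}, cutoff = ±{cutoff:.2f}, reject H0: {reject}")
```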
6. Type I and Type II Errors in a Hypothesis Test
Suppose that Sam’s true ability is indeed ≥ 40.
However the 10 days were not good for Sam.
He gave a low sample average.
You reject the Null hypothesis.
Type I error: Rejecting the Null Hypothesis when it is true.
Type II error: Not rejecting the Null Hypothesis when it is false
Sam’s true ability is NOT ≥ 40.
However the 10 days were lucky for Sam. He gave a high sample average.
You DID NOT reject the Null hypothesis.
Reducing the probability of Type I and Type II errors
 Probability of Type I error is set by our choice of α
Typically α = 0.05 or 0.01
Probability of Type II error can be reduced by taking a larger sample size.
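Both statements can be illustrated with a small simulation of Sam's example, testing H0: mean ≥ 40 with a lower-tail test. Everything numeric below is an assumption made for the sketch: daily results are drawn from a normal distribution with a known standard deviation of 5 (so a z-test is used for simplicity instead of a t-test), and "false null" means a true mean of 37.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_mean, n, trials=5000, sigma=5.0, alpha=0.05, seed=1):
    """Share of simulated samples in which H0: mean >= 40 gets rejected."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(alpha)  # lower-tail cutoff, about -1.645
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z_stat = (sample_mean - 40) / (sigma / math.sqrt(n))
        rejections += z_stat < cutoff
    return rejections / trials

# Type I error: H0 is true (mean exactly 40) but the sample leads to rejection.
type_1 = rejection_rate(true_mean=40, n=10)

# Type II error: H0 is false (mean 37) but the sample does not lead to rejection.
type_2_small = 1 - rejection_rate(true_mean=37, n=10)
type_2_large = 1 - rejection_rate(true_mean=37, n=40)

print(f"Type I rate (n=10):  {type_1:.3f}")
print(f"Type II rate (n=10): {type_2_small:.3f}")
print(f"Type II rate (n=40): {type_2_large:.3f}")
```

The Type I rate stays near the chosen α regardless of sample size, while the Type II rate drops sharply when the sample grows from 10 to 40 days.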
Week 4 – Hypothesis Test – Differences in Mean
1. Introducing the Difference-in-Means Hypothesis Test
 Equal variance t-test for differences in means.
 Unequal variance t-test for differences in means.
 Paired t-test for differences in means.
2. The Paired t-Test for Means
 Used to test claims around difference in two population means.
 Requires a sense of ‘pairing’ across individual observations in the two samples.
Paired t-test, some important considerations
 Sense of pairing across individual observations of the two samples.
 The two samples should have an equal number of observations.
 The conclusion may be reversed when a paired t-test is used instead of the equal or unequal variance t-test.
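The last point can be seen on a small made-up data set in which every unit improves slightly but the units differ a lot from one another. The numbers below are hypothetical, and only the t-statistics are computed. For reference, the two-tail 5% cutoff with 9 degrees of freedom is 2.262, and no two-tail 5% t cutoff is below 1.96.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical before/after measurements on the same 10 units.
before = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
after = [21, 27, 31, 37, 41, 47, 51, 57, 61, 67]
n = len(before)

# Paired t-test: work with the within-unit differences only.
diffs = [a - b for a, b in zip(after, before)]
t_paired = mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Unequal variance t-test: treat the two columns as independent samples.
se_unpaired = math.sqrt(variance(after) / n + variance(before) / n)
t_unequal = (mean(after) - mean(before)) / se_unpaired

print(f"paired t-statistic:           {t_paired:.2f}")
print(f"unequal variance t-statistic: {t_unequal:.2f}")
```

The paired statistic is far beyond any usual cutoff, while the unpaired one is nowhere near it. The large unit-to-unit spread swamps the small, consistent improvement unless the pairing is used.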
Three kinds:
 Paired t-test for differences in means.
 Used when there is a sense of ‘pairing’ in the data
 t-test for differences in means ‘assuming equal variance’.
 t-test for differences in means ‘assuming unequal variance’.
Difference in means test