Statistics and Probability
ANS; Since our calculated test statistic (-2.98) is less than the critical value
(-1.66), we reject the null hypothesis (H0: μ ≥ 5.4) in favour of the alternative
hypothesis (H1: μ < 5.4).
This means that there is evidence to suggest that telecommuting has reduced
the mean number of sick days taken by employees at this firm.
QN2; The pH of an acid solution used to etch aluminium varies somewhat from batch to
batch. In a sample of 50 batches, the mean pH was 2.6, with a standard deviation of 0.3. Let
μ represent the mean pH for batches of this solution. If H0: μ ≤ 2.5 versus H1: μ > 2.5, is
there evidence that the mean pH is greater than 2.5?
SOLN; Hint; This is a right tailed test problem since we are given that H0: μ≤ 2.5 versus
H1: µ> 2.5
Step 1; We need to calculate the test statistic. (Use a t-test, as the population
standard deviation is unknown.)
We can do this using the formula:
t = (x̄ - μ) / (s / √n)
Where;
x̄ is the sample mean (2.6),
μ is the hypothesized population mean (2.5),
s is the sample standard deviation (0.3),
and n is the sample size (50).
Putting in these values, we get t = (2.6 - 2.5) / (0.3 / √50) ≈ 2.357.
Step 2; We need to find the critical value for a right-tailed test using a Student's
t-distribution table.
The critical value for a 0.05 level of significance is approximately 1.68. (NB;
read from the t-table for a one-tailed test with 49 degrees of freedom at significance
level 0.05.)
Step 3; Since our calculated test statistic (2.357) is greater than the critical value
(1.68), we reject the null hypothesis (H0: μ ≤ 2.5) in favour of the alternative
hypothesis (H1: μ > 2.5).
This means that there is evidence to suggest that the mean pH of the acid
solution used to etch aluminium is greater than 2.5.
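The three steps for QN2 can be sketched in a few lines of Python. This is only an illustrative check, not part of the original solution: the function name one_sample_t is made up for this example, and the critical value 1.68 is simply the table reading quoted above rather than something the code computes.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """Return the one-sample t statistic (x̄ - μ0) / (s / √n)."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

# QN2 values: x̄ = 2.6, hypothesized mean 2.5, s = 0.3, n = 50.
t_stat = one_sample_t(sample_mean=2.6, mu0=2.5, s=0.3, n=50)

# Table reading quoted in the solution (one-tailed, 49 df, 0.05 level).
t_crit = 1.68

# Right-tailed test: reject H0 when the statistic exceeds the critical value.
reject_h0 = t_stat > t_crit
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Swapping in the numbers from another right-tailed question only requires changing the four arguments and the table value.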
QN3; The article "Evaluation of Mobile Mapping Systems for Roadway Data Collection"
(H. Karimi, A. Khattak, and J. Hummer, Journal of Computing in Civil Engineering, 2000: 168-173)
describes a system for remotely measuring roadway elements such as the width of
lanes and the heights of traffic signs. For a sample of 160 such elements, the average error
(in percent) in the measurements was 1.90, with a standard deviation of 21.20. Let μ
represent the mean error in this type of measurement. If H0: μ = 0 versus H1: μ ≠ 0,
can we reject H0?
SOLN; Hint; This is a two-tailed test since we are told that H0: μ = 0 versus H1: μ ≠ 0.
Step 1; We need to calculate the test statistic.
We can do this using the formula:
t = (x̄ - μ) / (s / √n)
Where;
x̄ is the sample mean (1.90),
μ is the hypothesized population mean (0),
s is the sample standard deviation (21.20),
and n is the sample size (160).
Plugging in these values, we get t = (1.90 - 0) / (21.20 / √160) ≈ 1.13.
Step 2; Next, we need to find the critical values for a two-tailed test using a Student's
t-distribution table. With 159 degrees of freedom, the critical values for a 0.05 level of
significance are;
For left tail; t = -1.98
For right tail; t = 1.98
Step 3; Since our calculated test statistic (1.13) is less than the critical value (1.98)
at the right tail and greater than the critical value (-1.98) at the left tail, we fail to
reject the null hypothesis (H0: μ = 0).
This means that there is not enough evidence to suggest that the mean error in this
type of measurement is different from 0.
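For a two-tailed test, the two comparisons against -1.98 and 1.98 collapse into a single check on the absolute value of t. The snippet below is a minimal sketch of that idea using the QN3 numbers; the helper name two_tailed_decision is hypothetical, and 1.98 is the table value quoted in the solution.

```python
import math

def two_tailed_decision(sample_mean, mu0, s, n, t_crit):
    """Return (t, reject) for H0: μ = mu0 against H1: μ ≠ mu0."""
    t = (sample_mean - mu0) / (s / math.sqrt(n))
    # Reject only when t lands in either tail, i.e. |t| exceeds the critical value.
    return t, abs(t) > t_crit

# QN3 values: x̄ = 1.90, μ0 = 0, s = 21.20, n = 160, table value 1.98.
t_qn3, reject_qn3 = two_tailed_decision(1.90, 0.0, 21.20, 160, t_crit=1.98)
print(f"t = {t_qn3:.2f}, reject H0: {reject_qn3}")
```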
QN4; In a process that manufactures tungsten-coated silicon wafers, the target resistance for
a wafer is 85 mΩ. In a simple random sample of 50 wafers, the sample mean resistance was
84.8 mΩ, and the standard deviation was 0.5 mΩ. Let μ represent the mean resistance of
the wafers manufactured by this process. A quality engineer tests H0: μ = 85 versus H1: μ ≠
85. Do you believe it is plausible that the mean is on target, or are you convinced that the
mean is not on target? Explain your reasoning.
SOLN;
Step 1; We need to calculate the test statistic from
t = (x̄ - μ) / (s / √n)
where;
x̄ is the sample mean (84.8 mΩ),
μ is the hypothesized population mean (85 mΩ),
s is the sample standard deviation (0.5 mΩ), and n is the sample size (50).
Then t = (84.8 - 85) / (0.5 / √50) ≈ -2.828.
Step 2; We need to find the critical values for a two-tailed test using a Student's
t-distribution table. For a two-tailed test at significance level 0.05 with 49 degrees of
freedom, the critical values are approximately -2.01 (left tail) and 2.01 (right tail).
Step 3; Since our calculated test statistic (-2.828) is less than the critical value
for the left tail (-2.01), it falls in the rejection region.
We reject the null hypothesis (H0: μ = 85) in favour of the alternative hypothesis (H1:
μ ≠ 85).
This means that there is evidence to suggest that the mean resistance of the wafers
manufactured by this process is different from 85 mΩ and that it is not on target.
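Another way to look at "is the target plausible?" is to build the matching 95% confidence interval and see whether 85 mΩ lies inside it; a two-tailed test at the 0.05 level rejects exactly when the hypothesized value falls outside that interval. The sketch below assumes the same table value 2.01 used in the solution, and the function name t_interval is chosen only for this illustration.

```python
import math

def t_interval(sample_mean, s, n, t_star):
    """Return the interval x̄ ± t* · s/√n as a (low, high) tuple."""
    margin = t_star * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# QN4 values; 2.01 is the two-tailed 0.05 table value for 49 degrees of freedom.
low, high = t_interval(sample_mean=84.8, s=0.5, n=50, t_star=2.01)

target = 85.0
on_target_plausible = low <= target <= high
print(f"95% CI: ({low:.3f}, {high:.3f}) mΩ; target inside: {on_target_plausible}")
```

Because the target sits above the upper end of the interval, this check agrees with the decision to reject H0.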
QN5; There is concern that increased industrialization may be increasing the mineral
content of river water. Ten years ago, the silicon content of the water in a certain river was
5 mg/L. Eighty-five water samples taken recently from the river have mean silicon content
5.4 mg/L and standard deviation 1.2 mg/L.
a) State the null and alternative hypothesis
b) Do you believe it is plausible that the silicon content of the water is not greater than it
was 10 years ago, or are you convinced that the level has increased? Explain your
reasoning.
SOLN;
a) In this case, we want to test whether the mean silicon content of the river water has
increased compared to its level 10 years ago. The null and alternative hypotheses can be
stated as follows:
• Null hypothesis (H0): The mean silicon content of the river water is not greater than it was
10 years ago (i.e., μ ≤ 5 mg/L).
• Alternative hypothesis (H1): The mean silicon content of the river water has increased
compared to its level 10 years ago (i.e., μ > 5 mg/L).
b) To determine whether it is plausible that the mean silicon content of the river water is
not greater than it was 10 years ago, or whether there is evidence to suggest that the level
has increased, we can conduct a hypothesis test using the sample data provided.
HINT; This is a right-tailed test.
First, we need to calculate the test statistic. We can do this using the formula:
t = (x̄ - μ) / (s / √n),
where x̄ is the sample mean (5.4 mg/L), μ is the hypothesized population mean (5 mg/L),
s is the sample standard deviation (1.2 mg/L), and n is the sample size (85). Putting in
these values, we get:
t = (5.4 - 5) / (1.2 / √85) ≈ 3.073.
Next, we need to find the critical value for a right-tailed test using a Student's
t-distribution table.
With 84 degrees of freedom, the critical value for a 0.05 level of significance is
approximately 1.66.
Since our calculated test statistic (3.073) is greater than the critical value (1.66), we reject
the null hypothesis (H0: μ ≤ 5 mg/L) in favour of the alternative hypothesis (H1: μ > 5
mg/L).
This means that there is evidence to suggest that the mean silicon content of the river water
has increased compared to its level 10 years ago.
QN6; A certain type of stainless steel powder is supposed to have a mean particle diameter
of μ=15μm. A random sample of 87 particles had a mean diameter of 15.2μm, with a
standard deviation of 1.8μm. A test is made of H0: μ= 15 versus H1: μ≠15.
b) Do you believe it is plausible that the mean diameter is 15μm, or are you convinced that
it differs from 15μm? Explain your reasoning.
d) Use the output and an appropriate table to compute a 99% confidence interval for μ
SOLN;
a) In this case, we want to test whether the mean particle diameter of the stainless steel
powder is equal to 15μm.
• Null hypothesis (H0): The mean particle diameter of the stainless steel powder is equal to
15μm (i.e., μ = 15).
• Alternative hypothesis (H1): The mean particle diameter of the stainless steel powder is not
equal to 15μm (i.e., μ ≠ 15).
b) To determine whether it is plausible that the mean particle diameter of the stainless steel
powder is equal to 15μm, or whether there is evidence to suggest that it differs from 15μm,
we can conduct a hypothesis test using the sample data provided.
We can calculate the test statistic using either the t-distribution or the z-distribution, but
the most appropriate method for this question is the t-distribution, hence the
formula:
t = (x̄ - μ) / (s / √n),
where x̄ = 15.2μm, μ = 15μm, s = 1.8μm and n = 87.
Putting in these values, we get t = (15.2 - 15) / (1.8 / √87) ≈ 1.036.
Next, we need to find the critical values for a two-tailed test using a Student's
t-distribution table. With 86 degrees of freedom, the critical values for a 0.05 level of
significance are approximately 1.99 for the right tail and -1.99 for the left tail. (From t-table)
Lastly; Since our calculated test statistic (1.036) is less than the critical value (1.99) in
the right tail and greater than -1.99 in the left tail, we fail to reject the null hypothesis
(H0: μ = 15).
This means that there is not enough evidence to suggest that the mean particle
diameter of the stainless steel powder differs from 15μm.
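c) This is a two-tailed test because we are interested in whether the population mean differs
from the hypothesized value (in this case, 15μm), regardless of whether it is greater or less
than this value.
d) The confidence interval is CI = x̄ ± t* (s / √n), where x̄ is the sample mean (15.2μm),
s is the sample standard deviation (1.8μm), n is the sample size (87), and t* is the
critical value for the desired level of confidence.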
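The formula CI = x̄ ± t* (s / √n) can be evaluated for the 99% level without a printed table. The sketch below is one possible way to do it with only the Python standard library: it integrates the Student's t density numerically and finds t* by bisection. The helper names t_pdf, t_cdf and t_critical are made up for this example, and the numerical approach is an assumption of this sketch, not the method the original solution used.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Two-sided critical value t*, found by bisection on t_cdf."""
    target = 1 - (1 - confidence) / 2  # 0.995 for 99% confidence
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# QN6 values: x̄ = 15.2 μm, s = 1.8 μm, n = 87, so df = 86.
x_bar, s, n = 15.2, 1.8, 87
t_star = t_critical(0.99, df=n - 1)
margin = t_star * s / math.sqrt(n)
ci_99 = (x_bar - margin, x_bar + margin)
print(f"t* = {t_star:.3f}, 99% CI = ({ci_99[0]:.2f}, {ci_99[1]:.2f}) μm")
```

A t-table reading for 86 degrees of freedom should agree with the computed t* to about two decimal places.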
QN7; In each of the following situations, state the most appropriate null hypothesis
regarding the population mean μ
a) A new type of epoxy will be used to bond wood pieces if it can be shown to have a
mean shear stress greater than 10MPa.
b) A quality control inspector will recalibrate a flow meter if the mean flow rate differs
from 20 mL/s.
c) A new type of battery will be installed in heart pacemakers if it can be
shown to have a mean lifetime greater than eight years.
SOLN;
a) H0: μ ≤ 10
b) H0: μ = 20
c) H0: μ ≤ 8
QN8; The installation of a radon abatement device is recommended in any home where the
mean radon concentration is 4.0 picocuries per liter (pCi/L) or more, because it is thought
that long-term exposure to sufficiently high doses of radon can increase the risk of cancer.
Seventy-five measurements are made in a particular home. The mean concentration was
3.72pCi/L, and the standard deviation was 1.93pCi/L.
a) The home inspector who performed the test says that since the mean measurement is less
than 4.0, radon abatement is not necessary. Explain why this reasoning is incorrect.
SOLN;
a) In this case, the mean radon concentration was 3.72pCi/L and the standard deviation was
1.93pCi/L. The value 3.72pCi/L is the mean of a sample of 75 measurements, so it is only an
estimate of the true mean concentration in the home and is subject to sampling variation.
The standard error of the sample mean is 1.93 / √75 ≈ 0.22pCi/L, so the sample mean lies only
about 1.3 standard errors below 4.0pCi/L, and a true mean of 4.0pCi/L or more remains
plausible.
The individual measurements are also quite variable. If we assume that the radon
concentrations follow a normal distribution, then about 68% of the measurements would fall
within one standard deviation of the mean (i.e., between 3.72 - 1.93 = 1.79pCi/L and
3.72 + 1.93 = 5.65pCi/L), so some of the measurements were likely above 4.0pCi/L, which is
the threshold for recommending radon abatement. That is why this reasoning is incorrect.
b) Abatement is recommended when the mean is 4.0pCi/L or more, so to conclude that abatement
is not needed we must show that the mean is below 4.0pCi/L. The appropriate null hypothesis
(H0) is that the mean radon concentration in the home is greater than or equal to 4.0pCi/L
(H0: μ ≥ 4.0). The alternative hypothesis (H1) is that the mean radon concentration in the
home is less than 4.0pCi/L (H1: μ < 4.0).
We need to perform a one-sample t-test. A t-test is used in this case because the population
standard deviation is unknown.
t = (x̄ - μ) / (s / √n) = (3.72 - 4.0) / (1.93 / √75)
≈ -1.256.
Using a t-distribution table with 74 degrees of freedom, we find that the P-value (the area
to the left of t) is approximately 0.106.
Since the P-value is greater than the commonly used significance level of 0.05, we do not
have sufficient evidence to reject the null hypothesis. This means that we cannot conclude
that the mean radon concentration in the home is less than 4.0pCi/L.
Based on this statistical analysis, the data do not show that radon abatement is unnecessary
in this particular case.
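The test statistic and the quoted P-value of about 0.106 can be cross-checked quickly. The sketch below uses the standard normal distribution from Python's statistics module as a large-sample stand-in for the t-distribution with 74 degrees of freedom; that substitution is an assumption of this example, so the result is expected to come out slightly smaller than the t-table value.

```python
import math
from statistics import NormalDist

# QN8 values: x̄ = 3.72, threshold 4.0, s = 1.93, n = 75.
x_bar, mu0, s, n = 3.72, 4.0, 1.93, 75
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Area to the left of the statistic, approximated with the standard normal curve.
p_approx = NormalDist().cdf(t_stat)

print(f"t = {t_stat:.3f}, approximate left-tail P-value = {p_approx:.3f}")
print("P-value above 0.05:", p_approx > 0.05)
```

The normal approximation gives a P-value close to 0.105, which leads to the same comparison against 0.05 as the table value.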
SOLUTION;
b) If H0 is not rejected, it means that we do not have sufficient evidence to conclude that
the population mean reading on the scale (μ) is different from 10. However, this does not
necessarily mean that the scale is in calibration. It is still possible that the scale is out of
calibration, but our sample did not provide enough evidence to detect it. In this case, the
best conclusion would be (iii) The scale might be in calibration.
NB; When we perform a hypothesis test, we are trying to determine whether the observed
data provide enough evidence to reject the null hypothesis. In this case, the null hypothesis
is that the population mean reading on the scale (μ) is equal to 10 (i.e., the scale is in
calibration).
If we do not reject H0, it means that our sample did not provide enough evidence to
conclude that μ is different from 10. However, this does not necessarily mean that μ is
equal to 10. It is still possible that μ is different from 10 (i.e., the scale is out of calibration),
but our sample did not provide enough evidence to detect it.
This can happen for several reasons. For example, our sample size may be too small to
detect a small difference between μ and 10. Or, there may be too much variability in the
measurements, making it difficult to detect a difference even if one exists.
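The effect of sample size and variability on the test can be seen by holding the observed difference fixed and changing n or s. The numbers below are purely hypothetical (they are not taken from the scale problem) and are chosen only to show the direction of the effect.

```python
import math

def t_statistic(diff, s, n):
    """t statistic for an observed difference x̄ - μ0 equal to diff."""
    return diff / (s / math.sqrt(n))

# HYPOTHETICAL values: the same observed difference of 0.1 in every case.
diff = 0.1

t_small = t_statistic(diff, s=0.5, n=10)    # small sample
t_large = t_statistic(diff, s=0.5, n=400)   # large sample, same spread
t_noisy = t_statistic(diff, s=2.0, n=400)   # large sample, much more spread

for label, t in [("n=10, s=0.5", t_small),
                 ("n=400, s=0.5", t_large),
                 ("n=400, s=2.0", t_noisy)]:
    print(f"{label}: t = {t:.2f}")
```

With a two-tailed critical value near 2, only the large, low-variability sample would lead to rejecting H0, even though the observed difference is identical in all three cases.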
QN10; Leakage from underground fuel tanks has been a source of water pollution. In a
random sample of 87 gasoline stations, 13 were found to have at least one leaking
underground tank.
a) Find a 95% confidence interval for the proportion of gasoline stations with at least one
leaking underground tank.
b) Find a 90% confidence interval for the proportion of gasoline stations with at least one
leaking underground tank.
c) How many stations must be sampled so that a 95% confidence interval specifies the
proportion to within ± 0.04?
d) How many stations must be sampled so that a 90% confidence interval specifies the
proportion to within ±0.04?
SOLUTION;
a) To find a 95% confidence interval for the proportion of gasoline stations with at least
one leaking underground tank, we can use the formula for a confidence interval for a
proportion:
p̂ ± z*√(p̂(1-p̂)/n),
where p̂ is the sample proportion, n is the sample size, and z* is the critical value for the
desired level of confidence. In this case, p̂ = 13/87 ≈ 0.149, n = 87, and z* ≈ 1.96 for a
95% confidence level.
When we enter these values into the formula, we get: 0.149 ± 1.96√(0.149(1-0.149)/87) ≈
(0.074, 0.224). So, we are 95% confident that the true proportion of gasoline stations with
at least one leaking underground tank is between 7.4% and 22.4%.
b) To find a 90% confidence interval for the proportion of gasoline stations with at least
one leaking underground tank, we can use the same formula as in part (a), but with a
different critical value for z*. For a 90% confidence level, z* ≈ 1.645.
Putting these values into the formula, we get: 0.149 ± 1.645√(0.149(1-0.149)/87) ≈ (0.086,
0.212). So, we are 90% confident that the true proportion of gasoline stations with at least
one leaking underground tank is between 8.6% and 21.2%.
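The intervals in parts (a) and (b) can be recomputed with a short script. This is a minimal sketch of the large-sample formula p̂ ± z*√(p̂(1-p̂)/n); the function name proportion_ci is made up for this example, and z* is obtained from Python's statistics.NormalDist instead of being read from a table.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Large-sample interval p̂ ± z*·√(p̂(1-p̂)/n) as a (low, high) tuple."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# QN10: 13 of 87 stations had at least one leaking tank.
ci_95 = proportion_ci(13, 87, 0.95)
ci_90 = proportion_ci(13, 87, 0.90)

print(f"95% CI: ({ci_95[0]:.3f}, {ci_95[1]:.3f})")
print(f"90% CI: ({ci_90[0]:.3f}, {ci_90[1]:.3f})")
```

The script keeps p̂ = 13/87 unrounded, so its bounds can differ in the third decimal place from hand calculations that start from p̂ ≈ 0.149.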
c) To determine how many stations must be sampled so that a 95% confidence interval
specifies the proportion to within ±0.04, we can use the formula for the margin of error
of a confidence interval for a proportion: E = z√(p̂(1-p̂)/n). Solving this formula for n,
we get: n = z²p̂(1-p̂)/E².
Since we don't know the true value of p, we can use the conservative estimate of p̂ = 0.5 to
calculate the sample size. Plugging in the values for z*, E, and p̂ = 0.5 into the formula
above, we get: n = 1.96² * 0.5 * (1 - 0.5) / 0.04² ≈ 600.25, which we round up to 601.
So, we would need to sample at least 601 stations to be able to specify the proportion of
gasoline stations with at least one leaking underground tank to within
±0.04 with 95% confidence.
d) To determine how many stations must be sampled so that a 90% confidence interval
specifies the proportion to within ±0.04, we can use the same approach as in part (c), but
with a different critical value for z*. For a 90% confidence level, z* ≈ 1.645.
Plugging in the values for z*, E, and p̂ = 0.5 into the formula above, we get: n =
1.645² * 0.5 * (1 - 0.5) / 0.04² ≈ 422.8, which we round up to 423.
So, we would need to sample at least 423 stations to be able to specify the proportion of
gasoline stations with at least one leaking underground tank to within ±0.04 with 90%
confidence.
NB; The reason we use p̂ = 0.5 as a conservative estimate is because the sample size
formula n = z²p̂ (1-p̂ )/E² is maximized when p̂ = 0.5. This means that if we use p̂ = 0.5 to
calculate the sample size, we will get the largest possible sample size for a given margin of
error (E) and confidence level (z).
By using the largest possible sample size, we ensure that our sample will be large enough to
achieve the desired margin of error regardless of the true value of p. This is why using p̂ =
0.5 is considered a conservative approach.
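Both points in this note, the sample-size formula and the fact that p̂(1-p̂) is largest at p̂ = 0.5, can be checked with a small script. This is an illustrative sketch only: the function name required_sample_size is hypothetical, z* comes from statistics.NormalDist, and the result is rounded up because a fraction of a station cannot be sampled.

```python
import math
from statistics import NormalDist

def required_sample_size(margin, confidence, p=0.5):
    """Smallest whole n with z*·√(p(1-p)/n) <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z_star ** 2 * p * (1 - p) / margin ** 2)

# QN10 (c) and (d): margin of error 0.04 with the conservative p = 0.5.
n_95 = required_sample_size(0.04, 0.95)
n_90 = required_sample_size(0.04, 0.90)
print("95% confidence:", n_95, "stations")
print("90% confidence:", n_90, "stations")

# p(1-p) over a grid of candidate proportions peaks at 0.5.
grid = [k / 100 for k in range(1, 100)]
worst_p = max(grid, key=lambda p: p * (1 - p))
print("p(1-p) is largest at p =", worst_p)
```

Passing a different value for p, for example a pilot estimate, shows how much smaller the required sample becomes when the proportion is far from 0.5.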
a) What proportion of the automobiles in the sample had emission levels that exceed the
standard?
b) Find a 95% confidence interval for the proportion of automobiles in the state whose
emission levels exceed the standard.
c) Find a 98% confidence interval for the proportion of automobiles whose emission
levels exceed the standard.
d) How many automobiles must be sampled to specify the proportions that exceed the
standard to within ± 0.10 with 95% confidence?
e) How many automobiles must be sampled to specify the proportions that exceed the
standard to within ± 0.10 with 98% confidence
SOLN
a) The proportion of automobiles in the sample that had emission levels that exceed the
standard is 28/70 = 0.4.
b) To find a 95% confidence interval for the proportion of automobiles in the state whose
emission levels exceed the standard, we can use the formula for a confidence interval
for a proportion:
p̂ ± z*√(p̂(1-p̂)/n).
In this case, p̂ = 0.4, n = 70, and z* ≈ 1.96 for a 95% confidence level.
Putting these values into the formula, we get: 0.4 ± 1.96√(0.4(1-0.4)/70) ≈ (0.285, 0.515).
So, we are 95% confident that the true proportion of automobiles in the state whose
emission levels exceed the standard is between 28.5% and 51.5%.
c) To find a 98% confidence interval for the proportion of automobiles in the state
whose emission levels exceed the standard, we can use the same formula as in part
(b), but with a different critical value for z*. For a 98% confidence level, z* ≈ 2.33.
Putting these values into the formula, we get: 0.4 ± 2.33√(0.4(1-0.4)/70) ≈ (0.264, 0.536).
So, we are 98% confident that the true proportion of automobiles in the state whose
emission levels exceed the standard is between 26.4% and 53.6%.
d) To determine how many automobiles must be sampled to specify the proportion that
exceeds the standard to within ±0.10 with 95% confidence, we can use the formula for
the margin of error of a confidence interval for a proportion:
E = z√(p̂(1-p̂)/n). Solving this formula for n, we get: n = z²p̂(1-p̂)/E².
Since we don't know the true value of p, we can use the conservative estimate of p̂ = 0.5 to
calculate the sample size (remember the concept explained in the previous question). Plugging
in the values for z*, E, and p̂ = 0.5 into the formula above, we get: n = 1.96² * 0.5 * (1 - 0.5) /
0.10² ≈ 96.04, which we round up to 97.
So, we would need to sample at least 97 automobiles to be able to specify the proportion
that exceeds the standard to within ±0.10 with 95% confidence.
e) To determine how many automobiles must be sampled to specify the proportion that
exceeds the standard to within ±0.10 with 98% confidence, we can use the same approach
as in part (d), but with a different critical value for z*. For a 98% confidence level, z* ≈
2.33.
Plugging in the values for z*, E, and p̂ = 0.5 into the formula above, we get: n = 2.33² * 0.5
* (1 - 0.5) / 0.10² ≈ 136.
So, we would need to sample at least 136 automobiles to be able to specify the proportion
that exceeds the standard to within ±0.10 with 98% confidence.
Anyconcern0625774413@prayol