Biostatistics MCQ's

Group 8 Statistical inference for regression models

1. Which of the following statements is true about statistical inference in regression models?

a) It involves estimating the parameters of the regression equation.

b) It focuses on making predictions using the regression equation.

c) It is used to determine the correlation between two variables.

d) It is only applicable when the regression equation is linear.

2. In regression analysis, the p-value associated with a coefficient estimate measures:

a) The strength of the relationship between the predictor and the response variable.

b) The significance of the intercept in the regression model.

c) The probability that the coefficient estimate is zero.

d) The percentage of the variation in the response variable explained by the predictor.

3. In a regression model, if the p-value for a predictor variable is less than the significance level (e.g.,
0.05), what can you conclude?

a) There is a significant relationship between the predictor and the response variable.

b) The predictor has no effect on the response variable.

c) The coefficient estimate for the predictor is equal to zero.

d) The model is overfit and unreliable.

4. What is the purpose of conducting hypothesis tests in regression analysis?

a) To determine the strength of the relationship between variables.

b) To assess the overall goodness-of-fit of the regression model.

c) To evaluate the statistical significance of individual predictor variables.

d) To compare different regression models and select the best one.

5. Which of the following assumptions is NOT required for classical linear regression?

a) Linearity: The relationship between the predictors and the response is linear.

b) Independence: The observations are independent of each other.

c) Homoscedasticity: The variability of the errors is constant across all levels of predictors.

d) Multicollinearity: The predictor variables are not highly correlated with each other.

6. The coefficient of determination (R-squared) in a regression model measures:

a) The proportion of the variation in the predictor variables explained by the response variable.

b) The percentage of the variation in the response variable explained by the predictor variables.

c) The significance of the intercept in the regression model.

d) The slope of the regression line.

7. When performing inference on regression coefficients, which distribution is commonly used?

a) Normal distribution. b) Chi-square distribution. c) F-distribution. d) T-distribution.

8. In multiple linear regression, the standard error of the coefficient estimate measures:

a) The precision of the estimate of the predictor variable's effect on the response variable.

b) The strength of the relationship between predictor variables.

c) The variability of the response variable around the regression line.

d) The significance of the intercept in the regression model.
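
As a rough illustration of questions 7 and 8, the sketch below computes the standard error of a slope and the corresponding t statistic. It uses simple (one-predictor) regression rather than multiple regression to keep the arithmetic visible, and the data values are hypothetical, invented only for this example.

```python
import math

# Hypothetical toy data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual variance uses n - 2 degrees of freedom (two estimated parameters).
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
residual_variance = sse / (n - 2)

# Standard error of the slope and the t statistic for H0: slope = 0.
se_slope = math.sqrt(residual_variance / sxx)
t_stat = slope / se_slope

print(f"slope={slope:.3f}, se={se_slope:.3f}, t={t_stat:.2f}")
```

A smaller standard error means the slope is estimated more precisely; the t statistic is compared against a t-distribution with n - 2 degrees of freedom.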

9. The term "heteroscedasticity" in regression analysis refers to:

a) The violation of the linearity assumption in the regression model.

b) The presence of outliers in the predictor variables.

c) The non-normal distribution of the residuals.

d) The unequal variability of the errors across different levels of the predictor variables.

10. Which of the following statements is true about multicollinearity in regression models?

a) It refers to the violation of the independence assumption in the regression model.

b) It occurs when the predictor variables are highly correlated with each other.

c) It affects the precision of the coefficient estimates but does not bias them.

d) It leads to the violation of the homoscedasticity assumption in the regression model.

Group 9 Hypothesis testing for means (single and two samples)


1. A statement made about a population for testing purposes is called a?

a) Statistic b) Hypothesis c) Level of Significance d) Test-Statistic

2. The probability of rejecting the null hypothesis when it is true is called the?

a) Level of Confidence b) Level of Significance c) Level of Margin d) Level of Rejection

3. Consider a test of H0: ϕ = 23 against H1: ϕ < 23. The test is?

a) Right tailed b) Left tailed c) Center tailed d) Cross tailed


4. The probability of a Type 1 error is referred to as?

a) 1-α b) β c) α d) 1-β

5. If σ is unknown, then for a large sample the distribution of the sample mean follows the

a) chi distribution b) F distribution c) t-distribution d) normal distribution

6. The probability of a Type 2 error is referred to as?

a) 1-α b) β c) α d) 1-β

7. The point where the null hypothesis gets rejected is called the?

a) Significant value b) Rejection value c) Acceptance value d) Critical value

8. Type 1 error occurs when?

a) We reject H0 if it is True

b) We reject H0 if it is False

c) We accept H0 if it is True

d) We accept H0 if it is False
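
The link between the significance level and the Type 1 error rate can be checked by simulation. The sketch below repeatedly draws samples for which H0 is true and counts how often a two-sided z-test rejects it. The values of `mu0`, `sigma`, `n` and the number of trials are hypothetical choices, and σ is treated as known.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

alpha = 0.05
mu0, sigma, n = 23.0, 4.0, 30   # hypothetical values; sigma treated as known
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value

trials = 5000
rejections = 0
for _ in range(trials):
    # Every sample is drawn with true mean mu0, so H0 is true in each trial.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    if abs(z) > z_crit:
        rejections += 1  # rejecting a true H0 is a Type 1 error

type1_rate = rejections / trials
print(f"observed Type 1 error rate: {type1_rate:.3f}")
```

The observed rate should land close to α, since α is by definition the probability of rejecting a true null hypothesis.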

9. The smaller sx is, the ------- the confidence interval for a one-sample hypothesis will be.

a) Largest b) Smaller c) Both a & b d) Shortest

10. For a two-sample hypothesis test, it is assumed that the two samples came at random from ------
populations and that the two populations had the same variance.

a) Standard b) Binomial c) Normal d) Both b & c

Group 11 Basic concepts related to sampling and sampling procedures


1. What is sampling in statistics?

a) The process of collecting data from a population

b) The process of analyzing data collected from a population

c) The process of summarizing data collected from a population

d) The process of interpreting data collected from a population

2. What is a population in the context of sampling?

a) The individuals or objects from which data is collected

b) The statistical summary of collected data

c) The process of selecting a sample from a population


d) The analysis of collected data

3. What is a sample in statistics?

a) A subset of the population used to draw conclusions about the whole population

b) The complete set of individuals or objects from which data is collected

c) The process of collecting data from a population

d) The statistical summary of collected data

4. Which of the following is a random sampling method?

a) Convenience sampling b) Quota sampling c) Stratified sampling d) Snowball sampling

5. What is simple random sampling?

a) A sampling method where the population is divided into homogeneous groups, and a random
sample is selected from each group

b) A sampling method where the population is divided into mutually exclusive subgroups, and a
random sample is selected from each subgroup

c) A sampling method where each individual in the population has an equal chance of being
selected

d) A sampling method where individuals are selected based on their availability or convenience

6. What is sampling bias?

a) The tendency of individuals in a sample to respond in a socially desirable manner

b) The distortion of a statistical analysis due to errors or inaccuracies in the collected data

c) The systematic over- or under-representation of certain groups in a sample

d) The process of selecting a sample from a population

7. Which of the following is NOT a basic concept related to sampling?

(A) Population (B) Sample (C) Sampling frame (D) Sampling error

8. Which of the following is a type of probability sampling?

(A) Simple random sampling (B) Stratified sampling

(C) Cluster sampling (D) All of the above

9. Which of the following is a type of non-probability sampling?

(A) Convenience sampling (B) Judgment sampling (C) Quota sampling (D) All of the above

10. What is the purpose of sampling?


(A) The purpose of sampling is to reduce the cost of collecting data.

(B) The purpose of sampling is to reduce the time it takes to collect data.

(C) The purpose of sampling is to reduce the error in the data.

(D) All of the above.

Group 12 Sampling distribution of proportions and estimation


1. In sampling with replacement, the standard error of the sample proportion $\hat{p}$ is equal to:

a) $\sqrt{\dfrac{p(1-p)}{p}}$

b) $\sqrt{\dfrac{p(1-p)}{n}}$

c) $\dfrac{p(1-p)}{n}$

d) $\sqrt{\dfrac{p+q}{2}}$

2. When the sample size n is large, the mean of the sampling distribution of the proportion is:

a) P

b) π

c) μ p

d) μ

3. The standard error increases when the sample size:

a) Increases b) Is fixed c) Decreases d) Is more than 30

4. The difference between the expected value of a sample estimate and the value of the parameter being
estimated is called:

a) Error b) Difference c) Contradiction d) Bias

5. In which of the following types of sampling is the selection carried out according to the opinion of
an expert?

a) Judgement Sampling b) Quota Sampling

c) Purposive Sampling d) Convenience Sampling

6. The number of all possible samples when 6 items are selected at random without replacement from a
population containing 18 items is:

a) 15864 b) 20264 c) 18564 d) 21564
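
The count in question 6 is a combination, since order does not matter and items are not replaced. A minimal check, using Python's standard `math.comb`:

```python
import math

population_size = 18
sample_size = 6

# Unordered selections without replacement: 18! / (6! * 12!)
n_samples = math.comb(population_size, sample_size)
print(n_samples)  # 18564
```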

7. Find the proportion p for a cricket team having total 20 players with 8 overseas players:

a) 2/3 b) 1/3 c) 2/5 d) 3/10

8. Find the standard error of the sample proportion for sampling with replacement when the population
proportion is 0.5 and the sample size is 4:

a) 0.5 b) 0.25 c) 0.225 d) 0.125
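
The arithmetic in question 8 follows the formula sqrt(p(1 - p)/n). The sketch below wraps it in a helper; the function name `proportion_standard_error` is a hypothetical choice for this example.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of a sample proportion under sampling with replacement."""
    return math.sqrt(p * (1 - p) / n)

se = proportion_standard_error(0.5, 4)
print(se)  # 0.25

# Quadrupling the sample size halves the standard error.
se_larger_sample = proportion_standard_error(0.5, 16)
print(se_larger_sample)  # 0.125
```

The second call also illustrates question 3: the standard error grows as the sample size shrinks.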

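9. If $p_1 = p_2 = p$ and $n_1 \neq n_2$, then $\mathrm{S.E.}(\hat{p}_1 - \hat{p}_2)$ is:

a) $\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}$

b) $\dfrac{p_1 q_1}{n_1} - \dfrac{p_2 q_2}{n_2}$

c) $\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$

d) $\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
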
10. A sample was formed consisting of 8 students from a total of 56 students for a certain task. Find
the sampling fraction:

a) 1/7 b) 7 c) 49 d) 1/49

Group 13 Normal distribution


1. Which of the following statements about the normal distribution is true?

a) It is a symmetric distribution.

b) It is characterized by its mean and standard deviation.

c) Both (a) and (b)

d) None of the above

2. Suppose X is a normally distributed random variable with mean μ and standard deviation σ. What is
the z-score for X if the observed value is equal to the mean?

a) 0 b) 1 c) μ d) Not determinable without additional information.

3. The central limit theorem states that:

a) All random variables are normally distributed.


b) The distribution of the sample mean approaches a normal distribution as the sample size
increases, regardless of the distribution of the population.

c) The sample mean of a large number of independent and identically distributed random
variables is approximately normally distributed.

d) Both (b) and (c)

4. Which of the following is true about the area under the standard normal curve?

a) The total area under the curve is equal to 1.

b) The area under the curve to the left of the mean is equal to 0.5.

c) The area under the curve to the right of the mean is always negative.

d) The area under the curve represents the probability of an event occurring.

5. In a standard normal distribution, what percentage of data falls within one standard deviation of the
mean?

a) 34.13% b) 68.27% c) 95.45% d) 99.73%
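
The percentages listed in question 5 can be reproduced from the standard normal CDF. A minimal sketch using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(-k < Z < k) for k = 1, 2, 3 standard deviations
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, prob in coverage.items():
    print(f"within {k} sd: {prob:.2%}")
```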

6. Suppose X is normally distributed with mean 80 and standard deviation 10. What is the probability
that X is greater than 90?

a) 0.1587 b) 0.3413 c) 0.4772 d) 0.2877
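
An upper-tail probability such as the one in question 6 is one minus the CDF at the cutoff. A short sketch with the mean and standard deviation taken from the question:

```python
from statistics import NormalDist

x_dist = NormalDist(mu=80, sigma=10)

# P(X > 90) is the complement of the cumulative probability at 90.
p_greater_than_90 = 1 - x_dist.cdf(90)
print(round(p_greater_than_90, 4))  # 0.1587
```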

7. Which of the following is false about the 68-95-99.7 rule?

a) It only applies to skewed distributions.

b) It states that 68% of the data falls within one standard deviation of the mean.

c) It is used to estimate the percentage of data falling within a certain number of standard
deviations from the mean in a normal distribution.

d) It guarantees that 99.7% of the data falls within three standard deviations of the mean.

8. A company claims that the weights of its cereal boxes follow a normal distribution with a mean of 400
grams and a standard deviation of 10 grams. What percentage of boxes would weigh less than 380
grams?

a) 2.28% b) 15.87% c) 34.13% d) 50%
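
Question 8 can also be worked by standardizing first and then looking up the standard normal CDF, which mirrors how a z-table is used. A minimal sketch with the values from the question:

```python
from statistics import NormalDist

mean_weight = 400.0   # grams
sd_weight = 10.0      # grams
cutoff = 380.0        # grams

# Convert the cutoff to a z-score, then use the standard normal CDF.
z_score = (cutoff - mean_weight) / sd_weight
p_below = NormalDist().cdf(z_score)

print(f"z = {z_score}, P(weight < {cutoff}) = {p_below:.2%}")
```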

9. Which of the following statements about the standard normal distribution is correct?

a) It has a mean of 1 and a standard deviation of 1.

b) It is always skewed to the right.

c) It is a specific case of the normal distribution with a mean of 0 and a standard deviation of 1.

d) It cannot be used to calculate probabilities.

10. The Z-score for a data point measures:

a) The probability of the data point occurring.

b) The absolute difference between the data point and the mean.

c) The number of standard deviations the data point is above the mean.

d) The relative position of the data point in relation to the mean in terms of standard
deviations.

Group 14 Binomial Distribution


1. In the binomial distribution each trial is:

a) Dependent on the previous trial b) Independent of the previous trial

c) Randomly generated d) Determined by a continuous variable

2. Which of the following conditions must be satisfied for a distribution to be considered binomial:

a) Each trial has exactly two possible outcomes

b) The trials depend on each other

c) The number of trials is fixed in advance

d) The probability of success varies from trial to trial

3. The binomial distribution is commonly used to model:

a) Continuous variable b) Time series data

c) Categorical data d) Exponential growth

4. Consider a binomial random variable X. If X1, X2, ..., Xn are independent and identically distributed
samples from the distribution of X with sum Y, then the distribution of Y as n → ∞ can be approximated as:

a) Exponential b) Bernoulli

c) Binomial d) Normal

5. Let X ∼ N(μ, σ²). If μ² = σ² (μ > 0), then the value of P(X >

a) [1 - P(Z ≤ 1)] b) [1 - P(Z ≤ 2)]

c) 2[1 - P(Z ≤ 1)] d) 2[1 - P(Z ≤ 2)]

6. For larger values of n, the binomial distribution ____.

a) loses its discreteness b) tends to the Poisson distribution

c) stays as it is d) gives oscillatory values


7. The relation between mean and variance of binomial distribution is

a) np=npq b) np>npq

c) np<npq d) None of above
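
The mean of a binomial distribution is np and its variance is npq with q = 1 - p. The sketch below compares the two for a few hypothetical (n, p) pairs; the helper name `binomial_mean_and_variance` is an assumption made for this example.

```python
def binomial_mean_and_variance(n, p):
    """Return (np, npq) for a Binomial(n, p) distribution."""
    q = 1 - p
    return n * p, n * p * q

# Hypothetical parameter pairs with 0 < p < 1.
cases = [(10, 0.3), (20, 0.5), (50, 0.9)]
for n, p in cases:
    mean, variance = binomial_mean_and_variance(n, p)
    print(f"n={n}, p={p}: mean={mean:.2f}, variance={variance:.2f}")
```

Because q is below 1 whenever p is strictly between 0 and 1, multiplying the mean by q can only shrink it.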

8. The shape of binomial distribution depends upon

a) n b) p c) both n & p d) None of these

9. The distribution is positively skewed when

a) p>1/2 b) p=0 c) p<1/2 d) p=1/2

10. The binomial distribution is always symmetrical when

a) p=0 b) p>1/2 c) p<1/2 d) p=1/2
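
Questions 9 and 10 both follow from the skewness of a binomial distribution, (q - p)/sqrt(npq). The sketch below evaluates it for three values of p; the choice n = 10 is arbitrary and the function name is hypothetical.

```python
import math

def binomial_skewness(n, p):
    """Skewness of a Binomial(n, p) distribution: (q - p) / sqrt(n * p * q)."""
    q = 1 - p
    return (q - p) / math.sqrt(n * p * q)

skew_low_p = binomial_skewness(10, 0.2)    # p < 1/2
skew_half = binomial_skewness(10, 0.5)     # p = 1/2
skew_high_p = binomial_skewness(10, 0.8)   # p > 1/2

print(round(skew_low_p, 3), skew_half, round(skew_high_p, 3))
```

The sign of the result depends only on q - p, so the direction of skew is set by p alone, while n controls how pronounced it is.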

Group 15 Descriptive statistics for categorical data


1. Which of the following measures is used to summarize categorical data?

a. Mean b. Median c. Mode d. Standard deviation

2. The frequency distribution of categorical data shows:

a. The range of values in the data set b. The proportion of each category in the data set

c. The average value of the data set d. The variability of the data set

3. Which of the following is used to display the distribution of categorical data?

a. Histogram b. Scatter plot

c. Box plot d. Bar chart

4. The mode of a categorical data set represents:

a. The most frequently occurring category b. The average value of the data set

c. The highest value in the data set d. The variability of the data set

5. Which of the following is not a measure of central tendency for categorical data?

a. Mode b. Median c. Mean d. Range

6. The relative frequency of a category in a data set is calculated by:

a. Dividing the frequency of the category by the total number of observations

b. Adding the frequencies of all categories in the data set

c. Subtracting the mean from each observation in the data set

d. Multiplying the frequency of the category by the total number of observations
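
A frequency table, relative frequencies and the mode of a categorical variable can all be derived from simple counts. The sketch below uses a hypothetical list of blood groups invented for illustration.

```python
from collections import Counter

# Hypothetical categorical observations.
blood_groups = ["A", "O", "B", "O", "AB", "O", "A", "O", "B", "A"]

counts = Counter(blood_groups)            # frequency of each category
total = sum(counts.values())              # total number of observations

# Relative frequency = category frequency / total observations.
relative_freq = {category: count / total for category, count in counts.items()}

# The mode is the most frequently occurring category.
mode_category = counts.most_common(1)[0][0]

print(counts)
print(relative_freq)
print("mode:", mode_category)
```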


7. Which measure of dispersion is used for categorical data?

a. Standard deviation b. Range

c. Variance d. Interquartile range

8. A contingency table is used to:

a. Display the distribution of categorical data

b. Calculate the mean and standard deviation of categorical data

c. Compare two or more categorical variables

d. Calculate the mode of categorical data

9. The chi-square test is used to:

a. Test the association between two categorical variables

b. Calculate the mean of categorical data

c. Determine the range of categorical data

d. Compare the median of two categorical variables

10. Which of the following is a measure of association for categorical data?

a. Pearson correlation coefficient b. Standard error

c. Odds ratio d. Coefficient of determination

Group 16 Regression analysis


1. In regression analysis, R² is also called the

a) Residual b) Coefficient of correlation

c) Coefficient of determination d) Standard error of the estimate

2. The coefficient of determination must be

a) Between -1 and +1 b) Between -1 and 0

c) Between 0 and 1 d) Equal to SSE/(n-2)

3. The difference between the actual Y value and the predicted Y value found using a regression
equation is called the

a) Slope b) Residual c) Outlier d) Scatter plot

4. In the regression equation Y = 75.65 + 0.50X, the intercept is

a) 0.50 b) 75.65 c) 1.00 d) Indeterminable

5. In the regression equation Y = 21 - 3X, the slope is

a) 21 b) -21

c) 3 d) -3

6. The process of constructing a mathematical model or function that can be used to predict or
determine one variable by another variable is called

a) Regression b) Correlation

c) Residual d) Outlier plot

7. If X and Y in a regression model are totally unrelated:

a) The correlation coefficient would be -1

b) The coefficient of determination would be 0

c) The coefficient of determination would be 1

d) The SSE would be 0

8. The total of the squared residuals is called the

a) Coefficient of determination b) Sum of squares of error

c) Standard error of the estimate d) r-squared
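
Residuals and their squared total can be computed directly from a fitted equation. The sketch below reuses the equation Y = 21 - 3X from question 5; the observed data points are hypothetical, invented for illustration.

```python
def predict(x):
    """Predicted Y from the regression equation Y = 21 - 3X."""
    return 21 - 3 * x

# Hypothetical observations.
x_values = [1, 2, 3, 4]
y_values = [17.5, 15.0, 12.5, 9.0]

# Residual = actual Y - predicted Y.
residuals = [y - predict(x) for x, y in zip(x_values, y_values)]

# The total of the squared residuals.
sse = sum(e ** 2 for e in residuals)

print(residuals)  # [-0.5, 0.0, 0.5, 0.0]
print(sse)        # 0.5
```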

9. For a data set the regression equation is Y = 21 - 3X. The correlation coefficient for this data:

a) Must be 0 b) Is negative

c) Must be 1 d) Is positive

10. The coefficient of correlation for a problem was calculated to be 0.36. The coefficient of
determination for this would be

a) 0.6 b) Either -0.6 or +0.6

c) 0.13 d) 0.36
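
In simple linear regression the coefficient of determination is the square of the correlation coefficient, so the arithmetic for question 10 is a one-liner:

```python
r = 0.36               # coefficient of correlation from the question
r_squared = r ** 2     # coefficient of determination

print(round(r_squared, 2))  # 0.13
```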

Group 10 Sample size determination under different samples and test statistics


1. When determining the sample size for a proportion, which of the following factors affects the sample
size?

a. Population size b. Confidence level c. Margin of error

d. All of the above e. None of the above

2. Which of the following statements is true regarding the sample size determination for a mean?

a. The sample size decreases as the population standard deviation increases.


b. The sample size increases as the desired margin of error decreases.

c. The sample size is independent of the population size.

d. All of the above

e. None of the above

3. Which statistical test requires the largest sample size to achieve a desired power level?

a. Chi-square test b. T-test c. ANOVA (Analysis of Variance)

d. Z-test e. Mann-Whitney U test

4. In a survey, a researcher wants to estimate the proportion of people who prefer a certain brand of
soda with a 95% confidence level and a margin of error of 3%. What is the minimum required sample
size if there are no prior estimates available?

a. 1067 b. 384 c. 246 d. 169

e. It cannot be determined without prior estimates.
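
The usual large-population formula for this kind of problem is n = z²p(1 - p)/d², with p = 0.5 as the most conservative choice when no prior estimate exists. The sketch below applies it to the figures in question 4; the function name `sample_size_for_proportion` is hypothetical, and no finite-population correction is applied.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Unrounded n = z^2 * p * (1 - p) / margin^2 for estimating a proportion."""
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / margin ** 2

n_raw = sample_size_for_proportion(margin=0.03, confidence=0.95, p=0.5)

print(f"unrounded: {n_raw:.2f}")
print("rounded down:", math.floor(n_raw), "rounded up:", math.ceil(n_raw))
```

The unrounded value falls just above 1067. Some texts quote the figure rounded to the nearest integer, while rounding up is the more conservative convention.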

5. A researcher wants to estimate the mean height of a population using a confidence interval. The
population standard deviation is known to be 5 cm. What sample size is required to achieve a 99%
confidence level and a margin of error of 1 cm?

a. 16 b. 25 c. 64 d. 100

e. It cannot be determined without specifying the population size.

6. Which of the following factors affects the required sample size for a correlation analysis?

a. Desired level of significance b. Strength of the correlation

c. Sample size of the population d. All of the above

e. None of the above

7. In a clinical trial, a researcher wants to compare the means of two independent groups. How does the
effect size affect the required sample size?

a. A larger effect size requires a smaller sample size.

b. A larger effect size requires a larger sample size.

c. The effect size has no impact on the required sample size.

d. The required sample size cannot be determined without additional information.

8. A population that consists of an unlimited number of elements is called

A) Finite population B) Infinite population

C) Both a and b D) None of these

9. Values calculated from a population are called

A) Census B) Parameters

C) Statistics D) None of these

10. A numerical quantity calculated from a sample is called

A) Sample mean B) Sample variance C) Sample statistic D) Both a and b but not c

11. All the possible shoes made in the Bata shoe factory are an example of

A) Finite population B) Infinite population

C) Both a and b D) None of these

12. The Pakistan cricket team can be regarded as an example of

A) Simple random sampling B) Cluster sampling C) Judgement sampling D) None of these

13. Non-sampling errors are reduced by

A) Increasing the sample size B) Reducing the amount of data

C) Both a and b D) None of these

14. Sampling errors are reduced by

A) Increasing the sample size B) Decreasing the sample size

C) Both a and b D) None of these

15. The difference between a statistic and a parameter is called

A) Bias B) Standard error C) Error D) Both a and b but not c

16. Which of the following is impossible in sampling?

A) Destructive test B) Heterogeneous data

C) To make a voters list D) Both a and b but not c

17. For making a voter list in the country we need

A) Simple random sampling B) Systematic sampling

C) Quota sampling D) None of these

18. According to most statisticians, a good minimum sample size to represent the population is

A) 10% B) 5% C) 4% D) 2%

19. The optimal sample size minimises the

A) Sampling error B) Standard error C) Non-sampling error D) None of these

20. We want to estimate a population proportion to within 3% (d = 0.03) with 95% confidence, where the
rate among women is about 27%. What is the required sample size?

A) 842 B) 841 C) 840 D) 843

21. Which of the following requires the smallest sample size because of its efficiency?

A) Cluster sampling B) Simple random sampling

C) Voters sampling D) Stratified sampling
