Data Science Multiple Choice Questions on “Residual Variation and Multivariate”.
1. Which of the following is the correct formula for total variation?
a) Total Variation = Residual Variation – Regression Variation
b) Total Variation = Residual Variation + Regression Variation
c) Total Variation = Residual Variation * Regression Variation
d) All of the mentioned
Answer: b
Explanation: Total variation in the response splits into the part explained by the regression and the remainder, which is called the unexplained or residual variation.
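The decomposition can be checked numerically. The sketch below fits a simple least squares line to made-up data (the values and variable names are assumptions for illustration only) and compares the three sums of squares.

```python
# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept for y = b0 + b1 * x.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

# The three sums of squares named in the question.
total_variation = sum((yi - y_bar) ** 2 for yi in y)
regression_variation = sum((fi - y_bar) ** 2 for fi in fitted)
residual_variation = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Share of the total variation explained by the regression (R squared).
r_squared = regression_variation / total_variation

print(total_variation, residual_variation + regression_variation)
```

The two printed numbers agree up to floating point rounding, provided the model includes an intercept.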
2. Point out the correct statement.
a) A standard error is needed to create a prediction interval
b) The prediction interval must incorporate the variability in the data around the line
c) Investors use the residual variance to measure the accuracy of their predictions on the value of an asset
d) All of the mentioned
Answer: d
Explanation: All three statements are true. A prediction interval is built from a standard error that includes the variability of the data around the fitted line, and the residual variance measures how far observed values fall from the predictions.
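For simple linear regression, statements (a) and (b) can be made concrete with the usual prediction interval at a new point $x_0$. The notation ($\hat\sigma$ for the residual standard deviation, $t_{n-2,\,1-\alpha/2}$ for the t quantile) is assumed here rather than taken from the quiz.

```latex
\hat\beta_0 + \hat\beta_1 x_0 \;\pm\; t_{n-2,\,1-\alpha/2}\;
\hat\sigma \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar x)^2}{\sum_{i=1}^{n} (x_i - \bar x)^2}}
```

The leading 1 under the square root is the term that accounts for the variability of the data around the line; dropping it gives the narrower confidence interval for the line itself.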
3. Which of the following things can be accomplished with a linear model?
a) Flexibly fit complicated functions
b) Uncover complex multivariate relationships
c) Build accurate prediction models
d) All of the mentioned
Answer: d
Explanation: Linear models are among the most important applied statistical and machine learning techniques, and with suitably chosen terms they can accomplish all of the above.
4. Which of the following statements is incorrect with respect to outliers?
a) Outliers can have varying degrees of influence
b) Outliers can be the result of spurious or real processes
c) Outliers cannot conform to the regression relationship
d) None of the mentioned
Answer: c
Explanation: Outliers can conform to the regression relationship, for example a point with an extreme predictor value that still lies on the fitted line.
5. Point out the wrong statement.
a) The fraction of variance unexplained is an established concept in the context of linear regression
b) “Explained variance” is routinely used in principal component analysis
c) The general linear model extends simple linear regression (SLR) by adding terms linearly into the model
d) None of the mentioned
Answer: d
Explanation: Statements a, b and c are all correct, so none of them is the wrong statement. In (c), "linearly" means that the added terms enter the model as a linear combination of coefficients.
6. Which of the following can be useful for diagnosing data entry errors?
a) hat values
b) dffits
c) resid
d) all of the mentioned
Answer: a
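Explanation: Hat values measure leverage, so an unusually large one can flag a mistyped predictor value. resid returns the ordinary residuals, and dffits measures how much a fitted value changes when its observation is deleted.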
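As a rough illustration of why leverage helps here, the sketch below computes hat values for simple linear regression with an intercept, using the closed form h_i = 1/n + (x_i - x̄)² / Σ(x_j - x̄)². The function name `hat_values` and the data are hypothetical; this is not R's `hatvalues` itself.

```python
def hat_values(x):
    """Leverage of each point in a simple linear regression with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Hypothetical predictor values; the last one mimics a data entry error
# (say, 100 typed instead of 10).
x = [8.0, 9.0, 10.0, 11.0, 12.0, 100.0]
h = hat_values(x)

# Index of the observation with the largest leverage.
suspect = max(range(len(x)), key=lambda i: h[i])
print(suspect, round(h[suspect], 3))
```

Hat values lie between 0 and 1 and, in this two-parameter model, sum to 2, so a single value close to 1 stands out clearly.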
7. Multivariate regression estimates are exactly those having removed the linear relationship of the other variables from both the regressor and response.
a) True
b) False
Answer: a
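Explanation: The coefficient of a regressor in a multivariable regression equals the slope obtained after taking residuals of both that regressor and the response with respect to the other variables, which removes their linear relationship.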
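The statement in this question can be checked on a small example with two regressors. The sketch below is illustrative only: the data and the helper names `cross` and `residualize` are assumptions. It computes the coefficient of `x1` once from the full two-regressor fit and once by regressing residuals on residuals through the origin.

```python
def mean(v):
    return sum(v) / len(v)

def cross(u, v):
    """Centered cross-product: sum of (u_i - mean(u)) * (v_i - mean(v))."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))

def residualize(target, regressor):
    """Residuals of target after a simple regression (with intercept) on regressor."""
    slope = cross(regressor, target) / cross(regressor, regressor)
    intercept = mean(target) - slope * mean(regressor)
    return [t - (intercept + slope * r) for r, t in zip(regressor, target)]

# Hypothetical data with two correlated regressors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 8.2, 8.8, 13.1, 13.9]

# Coefficient of x1 from the full fit y ~ 1 + x1 + x2 (two-regressor normal equations).
s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)
beta1_full = (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

# Same coefficient: remove x2 (and the intercept) from both y and x1,
# then regress the residuals on each other through the origin.
e_y = residualize(y, x2)
e_x1 = residualize(x1, x2)
beta1_resid = sum(a * b for a, b in zip(e_x1, e_y)) / sum(a * a for a in e_x1)

print(beta1_full, beta1_resid)
```

Both numbers agree up to rounding, which is what the "True" answer asserts.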
8. Residual ______ plots investigate normality of the errors.
a) RR
b) PP
c) QQ
d) None of the mentioned
Answer: c
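Explanation: A QQ plot compares the quantiles of the residuals with those of a normal distribution; points falling close to a straight line are consistent with normal errors. More generally, patterns in residual plots indicate some poor aspect of model fit.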
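A residual QQ plot is built from pairs of theoretical and sample quantiles. The sketch below computes those pairs using only Python's standard library; the residual values, the function name `qq_points`, and the plotting positions (i - 0.5)/n are assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev

def qq_points(residuals):
    """Pair each sorted, standardized residual with a standard normal quantile."""
    n = len(residuals)
    center, spread = mean(residuals), stdev(residuals)
    sample = sorted((r - center) / spread for r in residuals)
    # Plotting positions (i - 0.5) / n are one common convention, assumed here.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sample))

# Hypothetical residuals from some fitted model.
residuals = [-1.2, 0.4, 0.1, -0.3, 0.9, -0.6, 1.5, -0.8]
points = qq_points(residuals)

for q, r in points:
    print(f"{q:7.3f} {r:7.3f}")
```

Plotting the second column against the first gives the QQ plot; marked curvature away from a straight line suggests non-normal errors.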
9. Which of the following shows residuals divided by their standard deviations?
a) rstudent
b) cooks.distance
c) rstandard
d) all of the mentioned
Answer: c
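Explanation: rstandard returns standardized residuals, that is, residuals divided by their estimated standard deviations. rstudent does the same but deletes the ith data point when computing the standard deviation, and cooks.distance measures the overall change in the coefficients when the ith point is deleted.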
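For simple linear regression, the quantity that rstandard reports can be sketched as e_i / (σ̂ · sqrt(1 - h_i)), where h_i is the hat value of point i. The Python function below is a hypothetical re-implementation on made-up data, not R's code.

```python
import math

def standardized_residuals(x, y):
    """Residuals divided by their estimated standard deviations (simple regression)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma_hat = math.sqrt(sum(e * e for e in resid) / (n - 2))
    # Hat values (leverage) for a model with intercept and one regressor.
    hat = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [e / (sigma_hat * math.sqrt(1.0 - h)) for e, h in zip(resid, hat)]

# Hypothetical data; the fifth response is deliberately out of line.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 9.0, 6.1]
r = standardized_residuals(x, y)

largest = max(range(len(r)), key=lambda i: abs(r[i]))
print(largest, [round(v, 2) for v in r])
```

Dividing by sqrt(1 - h_i) matters because ordinary residuals at high-leverage points have smaller variance than those near the center of the data.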
10. The least squares estimate for the coefficient of a multivariate regression model is exactly regression through the origin with the linear relationships.
a) True
b) False
Answer: b
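Explanation: The estimate equals regression through the origin only after the linear relationships with the other regressors have been removed from both the regressor and the response by taking residuals; the statement as written leaves out that removal. Multivariate regression adjusts a coefficient for the linear impact of the other variables.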
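For two regressors, the adjustment can be written out explicitly. In the sketch below, $e_{i,Y|X_2}$ and $e_{i,X_1|X_2}$ are assumed notation for the residuals of the response and of $X_1$ after each has been regressed on $X_2$ (with an intercept).

```latex
\hat\beta_1 \;=\; \frac{\sum_{i=1}^{n} e_{i,Y|X_2}\, e_{i,X_1|X_2}}{\sum_{i=1}^{n} e_{i,X_1|X_2}^{2}}
```

This is the slope of a regression through the origin, but it is computed on residuals rather than on the raw variables.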