R Programming Language Multiple Choice Questions on “Linear Regression”.
1. In practice, the line of best fit (regression line) is found when _____________
a) Sum of residuals (∑(Y – h(X))) is minimum
b) Sum of the absolute value of residuals (∑|Y-h(X)|) is maximum
c) Sum of the squares of residuals (∑(Y – h(X))²) is minimum
d) Sum of the squares of residuals (∑(Y – h(X))²) is maximum
Answer: c
Clarification: Squaring the residuals penalizes large errors much more heavily than small ones and stops positive and negative residuals from cancelling each other out. Minimizing the sum of squared residuals therefore gives a clear criterion for selecting the best-fit line.
2. If a linear regression model fits the training data perfectly, i.e., train error is zero, then _____________________
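To see the least-squares criterion at work, the following minimal sketch compares the sum of squared residuals of the line returned by lm() with two slightly perturbed lines. The toy data and the helper name ssr are assumptions made for illustration only.

```r
# Hypothetical toy data, chosen only for illustration
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Sum of squared residuals for a candidate line y = a + b * x
ssr <- function(a, b) sum((y - (a + b * x))^2)

# Least-squares fit from lm()
fit <- lm(y ~ x)
a_hat <- unname(coef(fit)[1])  # estimated intercept
b_hat <- unname(coef(fit)[2])  # estimated slope

ssr_fit     <- ssr(a_hat, b_hat)        # fitted line
ssr_shifted <- ssr(a_hat + 0.5, b_hat)  # same slope, shifted intercept
ssr_tilted  <- ssr(a_hat, b_hat + 0.1)  # same intercept, tilted slope

print(c(fit = ssr_fit, shifted = ssr_shifted, tilted = ssr_tilted))
```

Any line other than the fitted one should show a larger sum of squared residuals, which is what option c describes.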
a) Test error is also always zero
b) Test error is non zero
c) Nothing can be said about the test error
d) Test error is equal to Train error
Answer: c
Clarification: Test error depends on the test data. If the test data follow exactly the same relationship as the training data, the test error is also zero, but this is not guaranteed, so the train error alone does not determine the test error.
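A small sketch of this point, using made-up data: two training points are always fitted exactly by a straight line, yet the test error depends on where the test point lies. The data frames and the helper mse are hypothetical.

```r
# Two training points: a straight line passes through both, so train error is zero
train <- data.frame(x = c(1, 2), y = c(3, 5))
fit <- lm(y ~ x, data = train)
train_mse <- mean(residuals(fit)^2)

# Mean squared error of the fitted model on a new data frame
mse <- function(d) mean((d$y - predict(fit, newdata = d))^2)

test_on_line  <- data.frame(x = 3, y = 7)  # follows y = 1 + 2x exactly
test_off_line <- data.frame(x = 3, y = 8)  # one unit above that line

print(c(train = train_mse,
        on_line = mse(test_on_line),
        off_line = mse(test_off_line)))
```

The same zero-train-error model gives a zero test error on one test set and a non-zero test error on the other.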
3. Which of the following metrics can be used for evaluating regression models?
i) R-squared ii) Adjusted R-squared iii) F-statistic iv) RMSE / MSE / MAE
a) ii and iv
b) i and ii
c) ii, iii and iv
d) i, ii, iii and iv
Answer: d
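Clarification: All of these (R-squared, adjusted R-squared, the F-statistic, and RMSE / MSE / MAE) can be used to evaluate a regression model. R-squared and adjusted R-squared measure the proportion of variance explained, the F-statistic tests the overall significance of the model, and RMSE / MSE / MAE measure the size of the prediction errors.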
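As a rough illustration, these metrics can be read from or computed on a fitted model. The sketch below assumes the built-in cars dataset (columns speed and dist), which is not part of the original question.

```r
# Fit stopping distance against speed on the built-in cars dataset
fit <- lm(dist ~ speed, data = cars)
s <- summary(fit)

r2     <- s$r.squared      # R-squared
adj_r2 <- s$adj.r.squared  # adjusted R-squared
f_stat <- s$fstatistic     # named vector: value, numdf, dendf

# Error-based metrics computed from the residuals
res  <- residuals(fit)
mse  <- mean(res^2)
rmse <- sqrt(mse)
mae  <- mean(abs(res))

print(c(r2 = r2, adj_r2 = adj_r2, rmse = rmse, mae = mae))
print(f_stat)
```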
4. How many coefficients do you need to estimate in a simple linear regression model (One independent variable)?
a) 1
b) 2
c) 3
d) 4
Answer: b
Clarification: In simple linear regression there is one independent variable, so two coefficients are estimated: the intercept a and the slope b in Y = a + bx + error.
5. In a simple linear regression model (one independent variable), if we change the input variable by 1 unit, by how much will the output variable change?
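One way to check the count is to inspect the coefficients of a fitted model. This sketch again assumes the built-in cars dataset as an example.

```r
# Simple linear regression with one independent variable (speed)
fit <- lm(dist ~ speed, data = cars)

coefs <- coef(fit)
n_coefs <- length(coefs)     # number of estimated coefficients
coef_names <- names(coefs)   # intercept and the slope for speed

print(coefs)
```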
a) by 1
b) no change
c) by intercept
d) by its slope
Answer: d
Clarification: For linear regression, Y = a + bx + error. Neglecting the error term, Y = a + bx. If x increases by 1, the new value is a + b(x + 1) = a + bx + b, so Y changes by b, the slope.
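The same argument can be checked numerically. In this sketch (built-in cars dataset, chosen only as an example) the predictions at two inputs one unit apart differ by the fitted slope.

```r
fit <- lm(dist ~ speed, data = cars)
slope <- unname(coef(fit)["speed"])

# Predictions at speed = 10 and speed = 11 (inputs one unit apart)
preds <- predict(fit, newdata = data.frame(speed = c(10, 11)))
change <- unname(diff(preds))

print(c(slope = slope, change = change))
```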
6. Function used for linear regression in R is __________
a) lm(formula, data)
b) lr(formula, data)
c) lrm(formula, data)
d) regression.linear(formula, data)
Answer: a
Clarification: lm(formula, data) fits a linear model. Here formula is an object of class “formula” that describes the relation between the variables, and this formula is applied to data to create the relationship model.
7. In the syntax of the linear model lm(formula, data, ...), data refers to ______
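A minimal usage sketch, assuming the built-in cars dataset: the formula is created first as its own object and then passed to lm() together with the data.

```r
# A formula object: dist is modelled as a function of speed
f <- dist ~ speed
formula_class <- class(f)

# Apply the formula to the data to obtain the fitted model
fit <- lm(f, data = cars)
is_lm <- inherits(fit, "lm")

print(formula_class)
print(fit)
```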
a) Matrix
b) Vector
c) Array
d) List
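Answer: d
Clarification: The formula is a symbolic description of the relationship, and data supplies the variables named in it. The lm documentation describes data as an optional data frame, list or environment containing those variables. A data frame is itself a list of equal-length columns, and in general a data.frame is what is passed as data.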
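The lm documentation describes data as an optional data frame, list or environment. As a sketch with made-up variables (hours and score are hypothetical names), the same model can be fitted from a data frame and from a plain list of its columns.

```r
# Hypothetical data held in a data frame
as_df <- data.frame(hours = c(1, 2, 3, 4, 5),
                    score = c(52, 58, 61, 70, 74))

# The same columns as a plain list
as_list <- as.list(as_df)

fit_df   <- lm(score ~ hours, data = as_df)
fit_list <- lm(score ~ hours, data = as_list)

# A data frame is itself a list of columns
df_is_list <- is.list(as_df)

print(coef(fit_df))
print(coef(fit_list))
```

Both calls should give the same coefficients, since a data frame is a list whose elements are equal-length columns.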
8. In the mathematical equation of linear regression Y = β1 + β2X + ϵ, (β1, β2) refers to __________
a) (X-intercept, Slope)
b) (Slope, X-Intercept)
c) (Y-Intercept, Slope)
d) (slope, Y-Intercept)
Answer: c
Clarification: β1 is the Y-intercept and β2 is the slope. The X-intercept is –(β1 / β2). An intercept is the point where the line crosses an axis: the Y-intercept is the value of Y when X = 0, and the X-intercept is the value of X when Y = 0.
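Both intercepts can be recovered from a fitted model. The sketch below assumes the built-in cars dataset and checks that the prediction at X = 0 equals β1 and that the prediction at X = –(β1 / β2) is zero.

```r
fit <- lm(dist ~ speed, data = cars)
b <- unname(coef(fit))
beta1 <- b[1]  # Y-intercept
beta2 <- b[2]  # slope

# X-intercept: the value of X at which the fitted line gives Y = 0
x_int <- -beta1 / beta2

y_at_zero <- unname(predict(fit, newdata = data.frame(speed = 0)))
y_at_xint <- unname(predict(fit, newdata = data.frame(speed = x_int)))

print(c(beta1 = beta1, beta2 = beta2, x_intercept = x_int))
```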