Data Science Multiple Choice Questions on “Cross Validation”.
1. Which of the following is a correct use of cross-validation?
a) Selecting variables to include in a model
b) Comparing predictors
c) Selecting parameters in prediction function
d) All of the mentioned
Answer: d
Explanation: Cross-validation is also used to pick the type of prediction function to use.
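Selecting a parameter by cross-validation can be sketched as follows. This is a minimal illustration, not a reference implementation: the toy data, the 1-D nearest-neighbour predictor, and the names knn_predict and loo_cv_error are all assumptions made for the example.

```python
def knn_predict(train_x, train_y, x, k):
    """Predict y at x as the mean of the k nearest training targets."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def loo_cv_error(xs, ys, k):
    """Mean squared error of the k-NN predictor under leave-one-out CV."""
    total = 0.0
    for i in range(len(xs)):
        # Hold out point i, fit on the rest, and score the held-out point.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        pred = knn_predict(train_x, train_y, xs[i], k)
        total += (ys[i] - pred) ** 2
    return total / len(xs)


# Hypothetical toy data: y = x^2 on a small grid.
xs = [float(i) for i in range(12)]
ys = [x * x for x in xs]

# Score each candidate parameter on held-out data and keep the best one.
errors = {k: loo_cv_error(xs, ys, k) for k in (1, 2, 3, 5)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

The same loop works for comparing predictors or variable subsets: only the function being scored changes.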
2. Point out the wrong combination.
a) True negative=correctly rejected
b) False negative=incorrectly rejected
c) False positive=correctly identified
d) All of the mentioned
Answer: c
Explanation: A false positive is a negative case that is incorrectly identified as positive.
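The four outcomes can be counted directly from actual and predicted labels. The sketch below assumes boolean labels where True means positive; the function name confusion_counts and the sample data are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for boolean labels (True = positive)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1  # correctly identified
        elif p and not a:
            counts["FP"] += 1  # incorrectly identified
        elif not p and not a:
            counts["TN"] += 1  # correctly rejected
        else:
            counts["FN"] += 1  # incorrectly rejected
    return counts


# Hypothetical labels for six cases.
actual = [True, True, False, False, True, False]
predicted = [True, False, True, False, True, False]
counts = confusion_counts(actual, predicted)
print(counts)
```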
3. Which of the following is a common error measure?
a) Sensitivity
b) Median absolute deviation
c) Specificity
d) All of the mentioned
Answer: d
Explanation: Sensitivity and specificity are statistical measures of the performance of a binary classification test. Median absolute deviation is an error measure for continuous outcomes.
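A minimal sketch of the three measures follows. It assumes that median absolute deviation here means the median of the absolute differences between observed and predicted values; the function names and example numbers are illustrative only.

```python
from statistics import median


def sensitivity(tp, fn):
    """Share of actual positives that were identified: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives that were rejected: TN / (TN + FP)."""
    return tn / (tn + fp)


def median_absolute_deviation(observed, predicted):
    """Median of |observed - predicted| (assumed definition, see lead-in)."""
    return median(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical counts and values.
sens = sensitivity(tp=8, fn=2)
spec = specificity(tn=9, fp=1)
mad = median_absolute_deviation([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(sens, spec, mad)
```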
4. Which of the following is not a machine learning algorithm?
a) SVG
b) SVM
c) Random forest
d) None of the mentioned
Answer: a
Explanation: SVG stands for Scalable Vector Graphics, which is an image format. SVM stands for support vector machine.
5. Point out the wrong statement.
a) ROC curve stands for receiver operating characteristic
b) For time series, data must be in chunks
c) Random sampling must be done with replacement
d) None of the mentioned
Answer: c
Explanation: For cross-validation, random sampling must be done without replacement. Random sampling with replacement is the bootstrap.
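The difference between the sampling schemes can be illustrated with index lists. This sketch uses only the standard random module; the 70/30 split and the seed are arbitrary choices for the example.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
indices = list(range(10))

# Without replacement: shuffle once, then cut. No index appears twice.
shuffled = indices[:]
rng.shuffle(shuffled)
train, test = shuffled[:7], shuffled[7:]

# With replacement (bootstrap): the same index may be drawn several times.
boot = [rng.choice(indices) for _ in indices]

# Time series: keep contiguous chunks so the test chunk follows the training chunk.
ts_train, ts_test = indices[:7], indices[7:]

print(train, test, boot, ts_train, ts_test)
```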
6. Which of the following is a measure for categorical outcomes?
a) RMSE
b) RSquared
c) Accuracy
d) All of the mentioned
Answer: c
Explanation: RMSE (root mean squared error) and R-squared are measures for continuous outcomes. Accuracy applies to categorical outcomes.
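The contrast between the two kinds of measure can be sketched in a few lines. The function names and sample values below are illustrative assumptions.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, for continuous outcomes."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(actual, predicted):
    """Fraction of labels predicted exactly, for categorical outcomes."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical continuous and categorical predictions.
err = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
acc = accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])
print(err, acc)
```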
7. For k-fold cross-validation, a larger value of k implies more bias.
a) True
b) False
Answer: b
Explanation: For k-fold cross-validation, a larger value of k implies less bias, because each model is trained on a larger share of the data.
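The effect of k on the training-set size can be seen by generating the folds. This is a minimal sketch with contiguous, unshuffled folds; the helper k_fold_indices is hypothetical, and in practice the indices would usually be shuffled first.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over range(n)."""
    # Spread any remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


n = 20
for k in (2, 5, 10, 20):
    train, test = next(k_fold_indices(n, k))
    # The training share is (k - 1) / k, so it grows with k.
    print(k, len(train), len(test))
```

With k equal to n this becomes leave-one-out cross-validation, where each model sees all but one observation.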
8. Which of the following methods is used for trainControl resampling?
a) repeatedcv
b) svm
c) bag32
d) none of the mentioned
Answer: a
Explanation: repeatedcv stands for repeated cross-validation.
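The idea behind repeated cross-validation can be sketched in Python. This is not caret's implementation, only an illustration of the resampling scheme: each repeat reshuffles the data and runs a full k-fold pass. The function repeated_k_fold and its parameters are hypothetical. In caret, the corresponding call would be along the lines of trainControl(method = "repeatedcv", number = 10, repeats = 3).

```python
import random


def repeated_k_fold(n, k, repeats, seed=0):
    """Yield (train, test) index lists for `repeats` shuffled k-fold passes."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random partition for every repeat
        for fold in range(k):
            test = order[fold::k]  # every k-th shuffled index forms one fold
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


splits = list(repeated_k_fold(n=10, k=5, repeats=3))
print(len(splits))
```

Averaging the error over all splits gives a more stable estimate than a single k-fold pass, at the cost of fitting more models.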
9. Which of the following can be used to create the most common graph types?
a) qplot
b) quickplot
c) plot
d) all of the mentioned
Answer: a
Explanation: qplot() is short for quick plot. It is a convenience function in ggplot2 for creating common plot types.
10. For k-fold cross-validation, a smaller value of k implies less variance.
a) True
b) False
Answer: a
Explanation: A larger value of k implies more variance.